ML: Common Metrics for Judging Model Quality - Confusion Matrix & ROC Curve & AUC & Others

Original

2020-06-30 11:01

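Confusion Matrix

1. Basic concepts

Abbreviation	Full name	Meaning
TP	True Positive	Prediction is correct; the model predicted "Positive"
FN	False Negative	Prediction is wrong; the model predicted "Negative"
FP	False Positive	Prediction is wrong; the model predicted "Positive"
TN	True Negative	Prediction is correct; the model predicted "Negative"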
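
As a minimal sketch, the four counts above can be read off a small example with sklearn's `confusion_matrix`. The label arrays `y_true` and `y_pred` are made up for illustration and are not from the original post.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration: 1 = Positive, 0 = Negative
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])

# For binary labels {0, 1} the matrix is laid out as [[TN, FP], [FN, TP]],
# so ravel() yields the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(tp, fn, fp, tn)  # 3 1 1 5
```

Note that sklearn puts the Negative class in the first row and column, which is why the unpacking order starts with TN.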

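2. Metric definitions

Metric	Definition	Notes
Accuracy	(TP + TN) / (TP + TN + FP + FN)	Share of all samples that the model classified correctly
Precision	TP / (TP + FP)	Of the samples the model labelled Positive, the share that are actually Positive
Recall / Sensitivity	TP / (TP + FN)	Of the samples that are actually Positive, the share that the model found
F1 - Score	2 * Precision * Recall / (Precision + Recall)	Ranges from 0 to 1; 1 is best, 0 is worst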
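
The definitions can be checked by hand on a toy example. The sketch below counts the four outcomes directly, applies the formulas, and compares the results with the corresponding `sklearn.metrics` functions. The labels are made up for illustration.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical labels for illustration: 1 = Positive, 0 = Negative
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Count the four confusion-matrix cells directly
pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)
fn = sum(t == 1 and p == 0 for t, p in pairs)
fp = sum(t == 0 and p == 1 for t, p in pairs)
tn = sum(t == 0 and p == 0 for t, p in pairs)

accuracy = (tp + tn) / (tp + tn + fp + fn)           # 8 / 10 = 0.8
precision = tp / (tp + fp)                           # 3 / 4  = 0.75
recall = tp / (tp + fn)                              # 3 / 4  = 0.75
f1 = 2 * precision * recall / (precision + recall)   # 0.75

# The hand-computed values should agree with sklearn's implementations
sk_values = (
    accuracy_score(y_true, y_pred),
    precision_score(y_true, y_pred),
    recall_score(y_true, y_pred),
    f1_score(y_true, y_pred),
)
print((accuracy, precision, recall, f1))
print(sk_values)
```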

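ROC Curve

ROC stands for Receiver Operating Characteristic curve. It plots the true positive rate against the false positive rate as the decision threshold varies.

The more the ROC curve bows toward the upper-left corner, the better the model performs.

AUC: the area under the ROC curve (the shaded region beneath it), so it is not covered in more detail here.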
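
To make the relationship between the curve and the area concrete, here is a minimal sketch that builds the ROC points by sweeping a threshold over the scores and then integrates them with the trapezoidal rule. The helper `roc_points` and the four-sample data are hypothetical, written for illustration; in practice `sklearn.metrics.roc_curve` does the sweep.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical labels and predicted scores for illustration
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

def roc_points(y_true, y_score):
    """Return (FPR, TPR) arrays obtained by lowering the threshold step by step."""
    thresholds = np.sort(np.unique(y_score))[::-1]  # highest threshold first
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    # Start at (0, 0): a threshold above every score predicts nothing as Positive
    fpr, tpr = [0.0], [0.0]
    for t in thresholds:
        pred_pos = y_score >= t
        tpr.append(np.sum(pred_pos & (y_true == 1)) / n_pos)
        fpr.append(np.sum(pred_pos & (y_true == 0)) / n_neg)
    return np.array(fpr), np.array(tpr)

fpr, tpr = roc_points(y_true, y_score)

# Trapezoidal rule: sum of (width along FPR) * (average height along TPR)
manual_auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

print(manual_auc)                      # 0.75
print(roc_auc_score(y_true, y_score))  # 0.75
```

A curve that hugs the upper-left corner encloses more area, which is why a larger AUC and a "more convex" ROC curve describe the same thing.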

Note: for metrics whose names end in _score, a larger value means a better model; for those ending in _error or _loss, smaller is better.

Sklearn Example

By oopcode on Stack Overflow (with modifications).

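import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc, roc_auc_score

# Random toy data: 20 samples, 2 features, binary labels (seeded for repeatability)
rng = np.random.RandomState(0)
X = rng.rand(20, 2)
y = rng.randint(2, size=20)

lr = LogisticRegression()
lr.fit(X, y)

# ROC analysis needs continuous scores, so use the positive-class probability
# rather than the hard 0/1 labels returned by lr.predict(X)
y_score = lr.predict_proba(X)[:, 1]

# Route 1: build the curve, then integrate it
FP_rate, TP_rate, thresholds = roc_curve(y, y_score)
print(auc(FP_rate, TP_rate))
# Route 2: compute the same area directly from labels and scores
print(roc_auc_score(y, y_score))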

Appendix: sklearn's evaluation metrics (see the official documentation)

Scoring	Function	Notes
Classification
'accuracy'	metrics.accuracy_score
'balanced_accuracy'	metrics.balanced_accuracy_score
'average_precision'	metrics.average_precision_score
'brier_score_loss'	metrics.brier_score_loss
'f1'	metrics.f1_score	For binary targets
'f1_micro'	metrics.f1_score
'f1_macro'	metrics.f1_score
'f1_weighted'	metrics.f1_score
'f1_samples'	metrics.f1_score
'precision' etc.	metrics.precision_score	Suffixes apply as with `f1`
'recall' etc.	metrics.recall_score	Suffixes apply as with `f1`
'jaccard' etc.	metrics.jaccard_score	Suffixes apply as with `f1`
'neg_log_loss'	metrics.log_loss	Requires `predict_proba` support
'roc_auc'	metrics.roc_auc_score
Clustering
'adjusted_mutual_info_score'	metrics.adjusted_mutual_info_score
'adjusted_rand_score'	metrics.adjusted_rand_score
'completeness_score'	metrics.completeness_score
'fowlkes_mallows_score'	metrics.fowlkes_mallows_score
'homogeneity_score'	metrics.homogeneity_score
'mutual_info_score'	metrics.mutual_info_score
'normalized_mutual_info_score'	metrics.normalized_mutual_info_score
'v_measure_score'	metrics.v_measure_score
Regression
'explained_variance'	metrics.explained_variance_score
'r2'	metrics.r2_score
'max_error'	metrics.max_error
'neg_mean_absolute_error'	metrics.mean_absolute_error
'neg_mean_squared_error'	metrics.mean_squared_error
'neg_mean_squared_log_error'	metrics.mean_squared_log_error
'neg_median_absolute_error'	metrics.median_absolute_error
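
The strings in the first column are what sklearn accepts as the `scoring` argument of its model-selection tools. A minimal sketch with `cross_val_score` follows; the synthetic regression data and the choice of `LinearRegression` are assumptions made for illustration. Because sklearn scorers follow a "greater is better" convention, error metrics are exposed with a `neg_` prefix and come back negated.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic data for illustration: y is a noisy linear function of X
rng = np.random.RandomState(0)
X = rng.rand(50, 2)
y = X @ np.array([1.0, 2.0]) + 0.1 * rng.randn(50)

# 'neg_mean_squared_error' returns minus the MSE for each fold,
# so that a larger (closer to zero) value means a better model
scores = cross_val_score(
    LinearRegression(), X, y, cv=5, scoring="neg_mean_squared_error"
)

# Flip the sign to recover the ordinary, non-negative MSE per fold
mse_per_fold = -scores
print(scores)
print(mse_per_fold.mean())
```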
