ML: Common metrics for judging model quality - Confusion Matrix & ROC Curve & AUC & Others

Confusion Matrix


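  1. Prerequisite concepts
Abbreviation  Full name       Meaning
TP            True Positive   Prediction correct; the model predicted "Positive"
FN            False Negative  Prediction wrong; the model predicted "Negative"
FP            False Positive  Prediction wrong; the model predicted "Positive"
TN            True Negative   Prediction correct; the model predicted "Negative"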
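
To make the four counts concrete, here is a minimal sketch that reads them off sklearn's `confusion_matrix`. The label arrays are hypothetical and made up for illustration (1 = Positive, 0 = Negative); they are not from the original post.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration: 1 = Positive, 0 = Negative.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 1, 0, 1, 0, 0, 0, 0, 0])

# For binary labels {0, 1}, sklearn lays the matrix out as [[TN, FP], [FN, TP]],
# so ravel() yields the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} FN={fn} FP={fp} TN={tn}")
# TP=3 FN=1 FP=1 TN=5
```

Note that sklearn puts the true labels on the rows and the predictions on the columns, which may differ from the layout used in other textbooks or figures.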

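  2. Metric definitions

Metric                Definition                                      Remarks
Accuracy              (TP + TN) / (TP + TN + FP + FN)                 Fraction of all samples that are predicted correctly
Precision             TP / (TP + FP)                                  Of the samples the model labels Positive, the fraction that are actually Positive
Recall / Sensitivity  TP / (TP + FN)                                  Of the samples that are actually Positive, the fraction the model correctly finds
F1-Score              2 * Precision * Recall / (Precision + Recall)   Ranges from 0 to 1; 1 is best, 0 is worst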
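
The definitions above can be checked by hand. The sketch below counts TP, FN, FP and TN directly from a pair of hypothetical label lists (invented for illustration), applies the formulas, and compares the results with sklearn's own functions. Accuracy is taken here in its standard form, (TP + TN) / (TP + TN + FP + FN).

```python
import math
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical labels for illustration: 1 = Positive, 0 = Negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Count the four confusion-matrix cells directly from their definitions.
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

# The hand-computed values should agree with sklearn's implementations.
assert math.isclose(accuracy, accuracy_score(y_true, y_pred))
assert math.isclose(precision, precision_score(y_true, y_pred))
assert math.isclose(recall, recall_score(y_true, y_pred))
assert math.isclose(f1, f1_score(y_true, y_pred))

print(accuracy, precision, recall, f1)
```

With these particular labels, Precision and Recall happen to be equal, so F1 equals both; in general F1 is pulled toward the smaller of the two.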

ROC Curve

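Receiver Operating Characteristic (ROC) curve. It plots the true positive rate (TPR, i.e. Recall) against the false positive rate (FPR = FP / (FP + TN)) as the classification threshold varies.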

The more the ROC curve bulges toward the upper-left corner, the better the model.

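AUC: the area under the ROC curve. It lies between 0 and 1; 1 means a perfect ranking and 0.5 is no better than random guessing.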
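
To show where the curve and its area come from, here is a rough sketch that builds the ROC points by sweeping a threshold over the scores and then integrates with the trapezoidal rule. The labels and scores are hypothetical toy values, and the loop is a plain illustration of the idea, not how sklearn implements it internally.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical scores for illustration: higher means "more likely Positive".
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# Sweep the decision threshold from high to low; each threshold gives one (FPR, TPR) point.
# Starting at +inf guarantees the curve begins at (0, 0).
thresholds = np.concatenate(([np.inf], np.unique(y_score)[::-1]))
fpr, tpr = [], []
for th in thresholds:
    y_pred = y_score >= th
    tp = np.sum(y_pred & (y_true == 1))
    fp = np.sum(y_pred & (y_true == 0))
    tpr.append(tp / np.sum(y_true == 1))
    fpr.append(fp / np.sum(y_true == 0))

# Area under the piecewise-linear curve (trapezoidal rule).
fpr, tpr = np.array(fpr), np.array(tpr)
manual_auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2))

print(manual_auc, roc_auc_score(y_true, y_score))
# 0.75 0.75
```

The manual value matches `roc_auc_score` on this toy input, which is a useful sanity check that AUC really is just the area under the (FPR, TPR) curve.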

Note: for names ending in _score, a larger value means a better model; for names ending in _error or _loss, smaller is better.


Sklearn example

By oopcode on Stack Overflow (modified).

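import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc, roc_auc_score

# Random toy data: the labels are independent of the features,
# so the exact AUC carries no meaning and changes with the seed.
rng = np.random.default_rng(0)
X = rng.random((20, 2))
y = rng.integers(2, size=20)

lr = LogisticRegression()
lr.fit(X, y)

# ROC needs continuous scores rather than hard 0/1 labels,
# so use the predicted probability of the positive class instead of lr.predict(X).
y_score = lr.predict_proba(X)[:, 1]

FP_rate, TP_rate, thresholds = roc_curve(y, y_score)
print(auc(FP_rate, TP_rate))
# roc_auc_score computes the same area directly from labels and scores.
print(roc_auc_score(y, y_score))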

Appendix: sklearn evaluation metrics (see the official scikit-learn documentation)

Scoring                          Function                                  Remarks
Classification
'accuracy'                       metrics.accuracy_score
'balanced_accuracy'              metrics.balanced_accuracy_score
'average_precision'              metrics.average_precision_score
'brier_score_loss'               metrics.brier_score_loss
'f1'                             metrics.f1_score                          for binary targets
'f1_micro'                       metrics.f1_score                          micro-averaged
'f1_macro'                       metrics.f1_score                          macro-averaged
'f1_weighted'                    metrics.f1_score                          weighted average
'f1_samples'                     metrics.f1_score                          by multilabel sample
'precision' etc.                 metrics.precision_score                   suffixes apply as with 'f1'
'recall' etc.                    metrics.recall_score                      suffixes apply as with 'f1'
'jaccard' etc.                   metrics.jaccard_score                     suffixes apply as with 'f1'
'neg_log_loss'                   metrics.log_loss                          requires predict_proba support
'roc_auc'                        metrics.roc_auc_score
Clustering
'adjusted_mutual_info_score'     metrics.adjusted_mutual_info_score
'adjusted_rand_score'            metrics.adjusted_rand_score
'completeness_score'             metrics.completeness_score
'fowlkes_mallows_score'          metrics.fowlkes_mallows_score
'homogeneity_score'              metrics.homogeneity_score
'mutual_info_score'              metrics.mutual_info_score
'normalized_mutual_info_score'   metrics.normalized_mutual_info_score
'v_measure_score'                metrics.v_measure_score
Regression
'explained_variance'             metrics.explained_variance_score
'r2'                             metrics.r2_score
'max_error'                      metrics.max_error
'neg_mean_absolute_error'        metrics.mean_absolute_error
'neg_mean_squared_error'         metrics.mean_squared_error
'neg_mean_squared_log_error'     metrics.mean_squared_log_error
'neg_median_absolute_error'      metrics.median_absolute_error
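
The strings in the first column are what gets passed as the `scoring` argument to model-selection helpers. Below is a small sketch using `cross_val_score`; the datasets are synthetic (generated with `make_classification` and `make_regression`) and the model choices are arbitrary, both assumed purely for illustration. It also shows why the error metrics carry a `neg_` prefix: the values are negated so that "larger is better" holds for every scoring string.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data, generated only for illustration.
X_clf, y_clf = make_classification(n_samples=200, n_features=5, random_state=0)
X_reg, y_reg = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# 'roc_auc' is a score: higher is better.
auc_scores = cross_val_score(
    LogisticRegression(), X_clf, y_clf, cv=5, scoring="roc_auc"
)

# 'neg_mean_squared_error' is the negated MSE, so every value is <= 0
# and "higher is better" still holds during model selection.
neg_mse = cross_val_score(
    LinearRegression(), X_reg, y_reg, cv=5, scoring="neg_mean_squared_error"
)

print(auc_scores.mean())
print(-neg_mse.mean())  # flip the sign to recover the usual MSE
```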