Review: Implementing Precision and Recall

Original post

2020-06-03 13:26

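Precision: suppose the model predicts that 100 people have cancer. Precision is the fraction of those positive predictions that are correct. \(precision = \frac{TP}{TP + FP}\)

Precision is about exactness: how trustworthy a positive prediction is.

Recall: suppose 100 people actually have cancer. Recall is the fraction of them that the model correctly picks out, where \(P\) is the number of actual positives. \(recall = \frac{TP}{P} = \frac{TP}{TP + FN}\)

Recall is about coverage: how many of the actual positives the predictions capture.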
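
As a small worked example of the two formulas above, the sketch below counts TP, FP, and FN on a made-up set of ten labels and derives precision and recall from them. The label lists and variable names are illustrative assumptions, not data from this post.

```python
# Toy labels, invented for illustration: 1 = has cancer, 0 = healthy.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

pairs = list(zip(toy_true, toy_pred))
tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # predicted 1, actually 1
fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # predicted 1, actually 0
fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # predicted 0, actually 1

precision = tp / (tp + fp)  # correct share of the positive predictions
recall = tp / (tp + fn)     # share of the actual positives that were found

print(precision, recall)
```

Here the model is right in 2 of its 3 positive predictions (precision 2/3) but finds only 2 of the 4 actual positives (recall 0.5).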

Confusion matrix, precision, and recall in scikit-learn

# y_test and y_log_predict come from the logistic regression example further down

# Confusion matrix
from sklearn.metrics import confusion_matrix
confusion_matrix(y_test, y_log_predict)

# Precision
from sklearn.metrics import precision_score
precision_score(y_test, y_log_predict)

# Recall
from sklearn.metrics import recall_score
recall_score(y_test, y_log_predict)
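
For reference, here is a self-contained sketch of the same three calls on made-up labels (the label lists and the `toy_` names are illustrative assumptions). For binary 0/1 labels, scikit-learn's `confusion_matrix` puts true labels on the rows and predicted labels on the columns, so the result reads `[[TN, FP], [FN, TP]]`.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Toy labels, invented for illustration.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

cm = confusion_matrix(toy_true, toy_pred)
# Flattening the 2x2 matrix row by row gives TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()

precision = precision_score(toy_true, toy_pred)  # TP / (TP + FP)
recall = recall_score(toy_true, toy_pred)        # TP / (TP + FN)

print(cm)
print(precision, recall)
```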

Implementing them by hand

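import numpy as np
from sklearn import datasets

# Load the data
digits = datasets.load_digits()        # handwritten digit recognition
X = digits.data
y = digits.target.copy()     # copy, so digits.target itself is not modified
# print(X)
# print(y)

y[digits.target==9] = 1      # digit 9     -> label 1
y[digits.target!=9] = 0      # not digit 9 -> label 0

# Split the data
from sklearn.model_selection import train_test_split
# Split the dataset into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=666)

# Logistic regression
from sklearn.linear_model import LogisticRegression

log_reg = LogisticRegression()
log_reg.fit(X_train, y_train)
print(log_reg.score(X_test, y_test))      # this is the accuracy

# Note: the classes are heavily skewed (only about 10% of the samples are digit 9),
# so accuracy alone is not enough and other metrics are needed
# Predictions of the logistic regression model
y_log_predict = log_reg.predict(X_test)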

# Building blocks of the confusion matrix
# TN
def TN(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 0) & (y_predict == 0))    # predicted 0, correct: y_true is 0

# TN count
print(TN(y_test, y_log_predict))

# FP
def FP(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 0) & (y_predict == 1))    # predicted 1 (digit 9), wrong: y_true is 0

print(FP(y_test, y_log_predict))


# FN
def FN(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 1) & (y_predict == 0))   # predicted 0, wrong: y_true is 1

print(FN(y_test, y_log_predict))


# TP
def TP(y_true, y_predict):
    assert len(y_true) == len(y_predict)
    return np.sum((y_true == 1) & (y_predict == 1))   # predicted 1 (digit 9), correct: y_true is 1

print(TP(y_test, y_log_predict))


# Note: this definition shadows sklearn's confusion_matrix and uses the layout
# [[TP, FN], [FP, TN]]; sklearn returns [[TN, FP], [FN, TP]] for 0/1 labels.
def confusion_matrix(y_true, y_predict):
    return np.array([
                     [TP(y_true, y_predict), FN(y_true, y_predict)],
                     [FP(y_true, y_predict), TN(y_true, y_predict)]
                    ])

confusion_matrix(y_test, y_log_predict)

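# precision
def precision_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fp = FP(y_true, y_predict)
    # NumPy integer division by zero yields nan with a warning instead of
    # raising, so a try/except would never fire; check the denominator instead.
    if tp + fp == 0:
        return 0.0
    return tp / (tp + fp)

print("Precision: ", precision_score(y_test, y_log_predict))


# recall
def recall_score(y_true, y_predict):
    tp = TP(y_true, y_predict)
    fn = FN(y_true, y_predict)
    # Same guard as above: no actual positives means recall is reported as 0.0.
    if tp + fn == 0:
        return 0.0
    return tp / (tp + fn)

print("Recall:", recall_score(y_test, y_log_predict))

Output:

0.9755555555555555
403
2
9
36
[[ 36   9]
 [  2 403]]
Precision:  0.9473684210526315
Recall: 0.8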
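
The printed numbers can be cross-checked by hand from the four counts above. The sketch below plugs them into the formulas and, to illustrate why accuracy is misleading on skewed data, also computes the accuracy of a hypothetical baseline that always predicts 0. That baseline is an assumption added for illustration and is not part of the original code.

```python
# Counts copied from the output above.
tn, fp, fn, tp = 403, 2, 9, 36
total = tn + fp + fn + tp            # 450 test samples

accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Hypothetical baseline: always predict 0 (never "digit 9").
# Every actual negative is then correct and every actual positive is missed.
baseline_accuracy = (tn + fp) / total
baseline_recall = 0 / (tp + fn)

print(accuracy, precision, recall)
print(baseline_accuracy, baseline_recall)
```

The always-0 baseline still reaches 90% accuracy while its recall is 0, which is why precision and recall are worth checking on this dataset.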
