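【Model Selection and Evaluation 03】Model evaluation: quantifying the quality of predictions

There are 3 different APIs for evaluating the quality of a model's predictions: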

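【1】Estimator score method: Estimators have a score method that provides a default evaluation criterion for the problem they are designed to solve. It is not discussed on this page, but in each estimator's documentation.

【2】Scoring parameter: Model-evaluation tools that use cross-validation (such as model_selection.cross_val_score and model_selection.GridSearchCV) rely on an internal scoring strategy. This is discussed in the section on the scoring parameter: defining model evaluation rules.

【3】Metric functions: The metrics module implements functions that assess prediction error for specific purposes. These metrics are detailed in the sections on classification metrics, multilabel ranking metrics, regression metrics, and clustering metrics.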
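
As a rough illustration of the three APIs listed above, the sketch below scores one classifier in all three ways. The choice of LogisticRegression, the iris data, and the variable names are assumptions made for this example and are not part of the original notes.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 1. Estimator score method: for classifiers the default criterion is mean accuracy.
score_from_estimator = clf.score(X_test, y_test)

# 2. Scoring parameter: a cross-validation tool selects the metric by name.
cv_scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, scoring="accuracy"
)

# 3. Metric function: compare true labels with predictions directly.
score_from_metric = accuracy_score(y_test, clf.predict(X_test))

print(score_from_estimator, score_from_metric, cv_scores.mean())
```

Here the first and third values agree, because a classifier's score method computes the same accuracy that accuracy_score does on the same test split.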

1. Defining model evaluation rules with the scoring parameter

Model selection and evaluation tools, such as model_selection.GridSearchCV and model_selection.cross_val_score, take a scoring parameter that controls which metric they apply to the estimators being evaluated.

>>> from sklearn import svm, datasets
>>> from sklearn.model_selection import cross_val_score
>>> iris = datasets.load_iris()
>>> X, y = iris.data, iris.target
>>> clf = svm.SVC(probability=True, random_state=0)
>>> cross_val_score(clf, X, y, scoring='neg_log_loss') 
array([-0.07..., -0.16..., -0.06...])
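
The negative values above follow from a convention: scorer objects treat higher as better, so loss-type metrics are exposed under negated names such as neg_log_loss. The sketch below shows the same convention with neg_mean_squared_error; the synthetic regression data and the variable names are assumptions for illustration only.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Synthetic regression problem, used here only to demonstrate the sign convention.
X_reg, y_reg = make_regression(
    n_samples=200, n_features=5, noise=10.0, random_state=0
)

# The scorer returns the negated mean squared error, so values are never positive.
neg_mse = cross_val_score(
    LinearRegression(), X_reg, y_reg, scoring="neg_mean_squared_error"
)

# Flip the sign to recover the ordinary (non-negative) MSE per fold.
mse = -neg_mse
print(mse)
```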

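2. Defining your scoring strategy from metric functions

The sklearn.metrics module also exposes a set of simple functions that measure prediction error given the ground truth and the predictions. Many of these metrics need extra parameters (such as beta for fbeta_score), so they have no predefined scoring name; make_scorer wraps such a function into a scorer object that can be passed as the scoring argument.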

>>> from sklearn.metrics import fbeta_score, make_scorer
>>> ftwo_scorer = make_scorer(fbeta_score, beta=2)
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.svm import LinearSVC
>>> grid = GridSearchCV(LinearSVC(), param_grid={'C': [1, 10]}, scoring=ftwo_scorer)
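
make_scorer can also wrap a loss function, where lower is better, by passing greater_is_better=False; the resulting scorer then negates the function's return value. The following is a minimal sketch of that behaviour. The max_abs_error function is a hypothetical loss invented for this example, and DummyClassifier is used only to keep the example small.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import make_scorer


def max_abs_error(y_true, y_pred):
    """Hypothetical loss: largest absolute difference between labels."""
    return float(np.max(np.abs(np.asarray(y_true) - np.asarray(y_pred))))


# greater_is_better=False tells make_scorer to flip the sign of the result.
loss_scorer = make_scorer(max_abs_error, greater_is_better=False)

X_toy = [[1], [1]]
y_toy = [0, 1]
clf = DummyClassifier(strategy="most_frequent").fit(X_toy, y_toy)

raw_loss = max_abs_error(y_toy, clf.predict(X_toy))
scorer_value = loss_scorer(clf, X_toy, y_toy)  # equals -raw_loss

print(raw_loss, scorer_value)
```

A scorer built this way can be passed as scoring to GridSearchCV or cross_val_score, which will then prefer parameter settings with the smaller loss.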

3. Evaluating with multiple metrics

Scikit-learn also allows multiple metrics to be evaluated in GridSearchCV, RandomizedSearchCV, and cross_validate.

>>> from sklearn.model_selection import cross_validate
>>> from sklearn.metrics import confusion_matrix
>>> # A sample toy binary classification dataset
>>> X, y = datasets.make_classification(n_classes=2, random_state=0)
>>> svm = LinearSVC(random_state=0)
>>> # Rows of the confusion matrix are true classes, columns are predicted classes
>>> def tn(y_true, y_pred): return confusion_matrix(y_true, y_pred)[0, 0]
>>> def fp(y_true, y_pred): return confusion_matrix(y_true, y_pred)[0, 1]
>>> def fn(y_true, y_pred): return confusion_matrix(y_true, y_pred)[1, 0]
>>> def tp(y_true, y_pred): return confusion_matrix(y_true, y_pred)[1, 1]
>>> scoring = {'tp' : make_scorer(tp), 'tn' : make_scorer(tn),
...            'fp' : make_scorer(fp), 'fn' : make_scorer(fn)}
>>> # Pass the unfitted estimator; cross_validate fits a clone on each fold
>>> cv_results = cross_validate(svm, X, y, scoring=scoring)
>>> # Getting the test set true positive scores (values depend on version and CV splits)
>>> print(cv_results['test_tp'])
>>> # Getting the test set false negative scores
>>> print(cv_results['test_fn'])
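
Besides a dict of scorer objects, the scoring argument of cross_validate also accepts a list of predefined metric names. The sketch below assumes the two names precision_macro and recall_macro and uses illustrative variable names; each metric then appears in the result under a test_ prefixed key.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.svm import LinearSVC

# Toy binary classification data, mirroring the setup used above.
X_bin, y_bin = make_classification(n_classes=2, random_state=0)
clf = LinearSVC(random_state=0)

# A list of metric names yields one result array per metric.
results = cross_validate(
    clf, X_bin, y_bin, scoring=["precision_macro", "recall_macro"]
)

precision_per_fold = results["test_precision_macro"]
recall_per_fold = results["test_recall_macro"]
print(precision_per_fold, recall_per_fold)
```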

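4. Classification metrics, regression metrics, clustering metrics, and so on

Usage generally follows one pattern: import the function, then call it as function_name(y_true, y_pred).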
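
A small sketch of that calling pattern, with one metric from each family. The input values are made up for illustration.

```python
from sklearn.metrics import accuracy_score, adjusted_rand_score, mean_squared_error

# Classification: fraction of labels predicted correctly (3 of 4 here).
acc = accuracy_score([0, 1, 1, 0], [0, 1, 0, 0])

# Regression: mean of the squared differences between targets and predictions.
mse = mean_squared_error([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])

# Clustering: agreement between two labelings, ignoring how clusters are named.
ari = adjusted_rand_score([0, 0, 1, 1], [1, 1, 0, 0])

print(acc, mse, ari)  # 0.75 0.375 1.0
```

Note that some clustering metrics that need no ground truth, such as silhouette_score, take the data and the cluster labels instead of a true/predicted pair.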
