GridSearchCV: grid search for the best parameters

class sklearn.model_selection.GridSearchCV(estimator, param_grid, scoring=None, fit_params=None, n_jobs=1,iid=True, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', error_score='raise',return_train_score=True)

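Performs an exhaustive search over specified parameter values for an estimator.

The important members are fit and predict.

The estimator's parameters are optimized by cross-validated grid search over a parameter grid.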

Parameters:

1.estimator : estimator object.
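An object that implements the scikit-learn estimator interface. Either the estimator must provide a score function, or scoring must be passed. For example, estimator = GradientBoostingClassifier(parameter settings)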

2.param_grid : dict or list of dictionaries
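A dictionary with parameter names (strings) as keys and lists of parameter settings to try as values, or a list of such dictionaries, in which case the grid spanned by each dictionary in the list is explored. This makes it possible to search over any sequence of parameter settings.

For example, param_test = {'n_estimators': range(20, 81, 10)}, or: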

param_dist = {"max_depth": [3, None],
              "max_features": [1,5,7,11],
              "min_samples_split": [2,5,7,11],  # an integer value must be at least 2
              "min_samples_leaf": [1,5,7,11],
              "bootstrap": [True, False],
              "criterion": ["gini", "entropy"]}

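To make the difference between a single dict and a list of dicts concrete, here is a minimal sketch that expands both forms with ParameterGrid from sklearn.model_selection. The parameter names and values are illustrative only and are not taken from the text above.

```python
from sklearn.model_selection import ParameterGrid

# A single dict spans the full Cartesian product of its value lists.
single_grid = {"max_depth": [3, None], "criterion": ["gini", "entropy"]}

# A list of dicts spans each dict's grid separately; the candidates are
# then concatenated, so "gamma" is only explored together with "rbf".
grid_list = [
    {"kernel": ["linear"], "C": [1, 10]},
    {"kernel": ["rbf"], "C": [1, 10], "gamma": [0.1, 0.01]},
]

single_candidates = list(ParameterGrid(single_grid))
list_candidates = list(ParameterGrid(grid_list))

print(len(single_candidates))  # 2 * 2 combinations
print(len(list_candidates))    # 2 + (2 * 2) combinations
```

Counting the candidates this way before calling fit gives a quick estimate of the cost: the search fits every candidate once per cross-validation fold.
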
3.scoring : string, callable or None, default=None

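A string (see the model evaluation documentation) or a callable object/function with the signature scorer(estimator, X, y). If None, the estimator's score method is used.

For example, scoring='roc_auc'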
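The two forms of scoring can be compared side by side. The sketch below assumes synthetic data from make_classification and a LogisticRegression estimator, neither of which appears in the original text; accuracy_scorer is a hypothetical helper written only to show the scorer(estimator, X, y) signature.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV

# Synthetic data, assumed here only for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)


def accuracy_scorer(estimator, X, y):
    """Hypothetical scorer with the (estimator, X, y) signature."""
    return accuracy_score(y, estimator.predict(X))


param_grid = {"C": [0.1, 1.0, 10.0]}

# Scoring given as a callable.
search_callable = GridSearchCV(
    LogisticRegression(max_iter=1000), param_grid, scoring=accuracy_scorer, cv=3
)
search_callable.fit(X, y)

# Scoring given as the equivalent predefined string.
search_string = GridSearchCV(
    LogisticRegression(max_iter=1000), param_grid, scoring="accuracy", cv=3
)
search_string.fit(X, y)

print(search_callable.best_params_, search_string.best_params_)
```

A callable is mainly useful when no predefined string matches the metric you care about; higher return values must always mean better models.
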

4.fit_params : dict, optional

Parameters to pass to the fit method.

5.n_jobs : int, default=1

Number of jobs to run in parallel.

6.pre_dispatch : int, or string, optional

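Controls the number of jobs that get dispatched during parallel execution. Reducing this number can help avoid an explosion of memory consumption when more jobs are dispatched than the CPUs can process. This parameter can be:
None, in which case all the jobs are created and spawned immediately. Use this for lightweight and fast-running jobs, to avoid delays caused by spawning jobs on demand.
An int, giving the exact number of total jobs that are spawned.
A string, giving an expression as a function of n_jobs, such as '2*n_jobs'.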

7.iid : boolean, default=True

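If True, the data is assumed to be identically distributed across the folds, and the loss minimized is the total loss per sample rather than the mean loss across the folds.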

8.cv : int, cross-validation generator or an iterable, optional

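Determines the cross-validation splitting strategy. Possible inputs for cv are:
None, to use the default 3-fold cross-validation.
An integer, to specify the number of folds in a (Stratified)KFold.
An object to be used as a cross-validation generator.
An iterable yielding train, test splits.
For integer/None inputs, StratifiedKFold is used if the estimator is a classifier and y is either binary or multiclass. In all other cases, KFold is used.

For example, cv=5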
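The equivalence between an integer and an explicit splitter can be checked directly. This is a small sketch under assumptions: the synthetic data and the DecisionTreeClassifier are chosen here only for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data, assumed for illustration.
X, y = make_classification(n_samples=150, n_features=6, random_state=0)
param_grid = {"max_depth": [2, 4]}

# cv as an integer: for a classifier with binary y this means StratifiedKFold.
search_int = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=5
).fit(X, y)

# cv as an explicit cross-validation generator object.
search_obj = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=StratifiedKFold(n_splits=5)
).fit(X, y)

print(search_int.n_splits_, search_obj.n_splits_)
print(search_int.cv_results_["mean_test_score"])
print(search_obj.cv_results_["mean_test_score"])
```

Passing a splitter object becomes necessary when you need options the integer form cannot express, such as shuffling with a fixed random_state.
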

9.refit : boolean, default=True

Refits the best estimator on the whole dataset. If False, this GridSearchCV instance cannot be used to make predictions after fitting.
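The effect of refit is easy to observe on a fitted search. The following sketch assumes synthetic data and a DecisionTreeClassifier purely for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, assumed for illustration.
X, y = make_classification(n_samples=150, n_features=6, random_state=0)
param_grid = {"max_depth": [2, 4]}

with_refit = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=3, refit=True
).fit(X, y)
without_refit = GridSearchCV(
    DecisionTreeClassifier(random_state=0), param_grid, cv=3, refit=False
).fit(X, y)

# Only the refit search holds a final model trained on all of X.
print(hasattr(with_refit, "best_estimator_"))
print(hasattr(without_refit, "best_estimator_"))

# The winning parameter setting is still recorded without refitting.
print(without_refit.best_params_)
```

Setting refit=False saves one final fit, which can be worthwhile when you only want the ranking of candidates and plan to train the final model yourself.
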

10.verbose : integer

Controls the verbosity: the higher the value, the more messages.

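11.error_score : 'raise' (default) or numeric

Value to assign to the score if an error occurs while fitting the estimator. If set to 'raise', the error is raised.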

12.return_train_score : boolean, default=True

If False, the cv_results_ attribute will not include training scores.

Attributes:

cv_results_ : dict of numpy (masked) ndarrays

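A dict with keys as column headers and values as columns, which can be imported into a pandas DataFrame. Note that the 'params' key stores a list of the parameter setting dicts for all the parameter candidates.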

best_estimator_ : estimator

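The estimator chosen by the search, i.e. the estimator that gave the highest score (or the smallest loss, if specified) on the left-out data. Not available if refit=False.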

best_score_ : float

Score of best_estimator_ on the left-out data, i.e. its mean cross-validated score.

best_params_ : dict

The parameter setting that gave the best results on the held-out data.

best_index_ : int

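The index (into the cv_results_ arrays) that corresponds to the best candidate parameter setting.
The dict at search.cv_results_['params'][search.best_index_] gives the parameter setting for the best model, the one with the highest mean score (search.best_score_).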
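The relationship between cv_results_, best_index_, best_params_ and best_score_ can be verified on a small search. The data and estimator below are assumptions made for the sake of a runnable sketch.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic data, assumed for illustration.
X, y = make_classification(n_samples=150, n_features=6, random_state=0)

search = GridSearchCV(
    DecisionTreeClassifier(random_state=0), {"max_depth": [1, 2, 3, 4]}, cv=3
)
search.fit(X, y)

results = search.cv_results_   # dict of arrays, one entry per candidate
best_row = search.best_index_  # position of the winning candidate

# These two lookups reproduce best_params_ and best_score_.
print(results["params"][best_row])
print(results["mean_test_score"][best_row])
print(results["rank_test_score"][best_row])
```

Because every value in cv_results_ has one entry per candidate, the same index can be used to read the per-fold scores or timings of the best model.
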

scorer_ : function

Scorer function used on the held out data to choose the best parameters for the model.

n_splits_ : int

The number of cross-validation splits (folds/iterations).

Methods

decision_function(*args, **kwargs): calls decision_function on the estimator with the best found parameters.

Parameters:

X : indexable, length n_samples

Must fulfill the input assumptions of the underlying estimator.

fit(X, y=None, groups=None): runs fit with all sets of parameters.

Parameters:

X : array-like, shape = [n_samples, n_features]

Training vector, where n_samples is the number of samples and n_features is the number of features.

y : array-like, shape = [n_samples] or [n_samples, n_output], optional

Target relative to X for classification or regression; None for unsupervised learning.

groups : array-like, with shape (n_samples,), optional

Group labels for the samples used while splitting the dataset into train/test set.

get_params(deep=True): gets the parameters for this estimator.

Parameters:

deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : mapping of string to any

Parameter names mapped to their values.

inverse_transform(*args, **kwargs): calls inverse_transform on the estimator with the best found parameters.

Parameters:

Xt : indexable, length n_samples

Must fulfill the input assumptions of the underlying estimator.

predict(*args, **kwargs): calls predict on the estimator with the best found parameters.

Parameters:

X : indexable, length n_samples

Must fulfill the input assumptions of the underlying estimator.

predict_log_proba(*args, **kwargs): calls predict_log_proba on the estimator with the best found parameters.

Parameters:

X : indexable, length n_samples

Must fulfill the input assumptions of the underlying estimator.

predict_proba(*args, **kwargs): calls predict_proba on the estimator with the best found parameters.

Parameters:

X : indexable, length n_samples

Must fulfill the input assumptions of the underlying estimator.

score(X, y=None): returns the score on the given data, if the estimator has been refit.

Parameters:

X : array-like, shape = [n_samples, n_features]

Input data, where n_samples is the number of samples and n_features is the number of features.

y : array-like, shape = [n_samples] or [n_samples, n_output], optional

Target relative to X for classification or regression; None for unsupervised learning.

Returns:

score : float
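The methods above all delegate to the refit best estimator. A minimal sketch of that delegation, assuming synthetic data and a LogisticRegression estimator that are not part of the original text:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic data, assumed for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}, cv=3
)
search.fit(X, y)  # refit=True by default, so best_estimator_ is available

# Each call is forwarded to search.best_estimator_.
labels = search.predict(X)
probabilities = search.predict_proba(X)

# With scoring=None, score falls back to the estimator's own score method.
accuracy = search.score(X, y)

print(labels[:5])
print(probabilities.shape)
print(accuracy)
```

In practice this means a fitted GridSearchCV can be used as a drop-in replacement for the tuned estimator, as long as refit was left enabled.
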

Example:

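from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# train (a DataFrame), target and IDcol (column names) are assumed to be defined elsewhere
predictors = [x for x in train.columns if x not in [target, IDcol]]
param_test1 = {'n_estimators': range(20, 101, 10)}  # 20 to 100 inclusive, in steps of 10
gsearch1 = GridSearchCV(estimator=GradientBoostingClassifier(learning_rate=0.1, min_samples_split=500, min_samples_leaf=50, max_depth=8, max_features='sqrt', subsample=0.8, random_state=10),
                        param_grid=param_test1, scoring='roc_auc', n_jobs=4, iid=False, cv=5)
gsearch1.fit(train[predictors], train[target])
# grid_scores_ is deprecated in favor of cv_results_
gsearch1.cv_results_, gsearch1.best_params_, gsearch1.best_score_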
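The example above depends on a train DataFrame that is not shown. As a self-contained sketch of the same workflow, the version below swaps in synthetic data from make_classification and a smaller grid so that it runs quickly. The data, the grid values and the estimator settings are assumptions, and the iid argument is left out because it is not accepted by every scikit-learn release.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for train[predictors] and train[target].
X, y = make_classification(n_samples=300, n_features=10, random_state=10)

# Three candidates: 20, 40 and 60 boosting stages.
param_test = {"n_estimators": range(20, 61, 20)}

gsearch = GridSearchCV(
    estimator=GradientBoostingClassifier(
        learning_rate=0.1,
        max_depth=3,
        max_features="sqrt",
        subsample=0.8,
        random_state=10,
    ),
    param_grid=param_test,
    scoring="roc_auc",
    cv=5,
)
gsearch.fit(X, y)

# Mean ROC AUC per candidate, then the winner and its score.
for params, mean_score in zip(
    gsearch.cv_results_["params"], gsearch.cv_results_["mean_test_score"]
):
    print(params, round(mean_score, 4))
print(gsearch.best_params_, gsearch.best_score_)
```

Tuning n_estimators first with the other settings held fixed, as the original example does, keeps the grid small; the remaining parameters can then be searched in later rounds.
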
