sklearn GridSearchCV

    前言:記錄常用工具,方便以後使用時可以隨時查看,也希望能夠幫到尋找這方面資料的人們。

sklearn中函數定義:

sklearn.model_selection.GridSearchCV(estimator, param_grid, scoring=None, fit_params=None, n_jobs=1, iid=True, refit=True, cv=None, verbose=0, pre_dispatch='2*n_jobs', error_score='raise', return_train_score=True)[source]

param_grid : dict or list of dictionaries

cv :int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy. Possible inputs for cv are:
None, to use the default 3-fold cross validation,
integer, to specify the number of folds in a (Stratified)KFold,
An object to be used as a cross-validation generator.
An iterable yielding train, test splits.

使用管道(pipeline)可以儲存需要用GridSearch搜索的學習器,使用一個字典可以存儲學習器需要調節的參數如:

parameters = {
    'vect__max_df': (0.5, 0.75),
    'vect__max_features': (None, 5000, 10000),
    'tfidf__use_idf': (True, False),
#    'tfidf__norm': ('l1', 'l2'),
    'clf__alpha': (0.00001, 0.000001),
#    'clf__penalty': ('l2', 'elasticnet'),
    'clf__n_iter': (10, 50),
}
pipeline = Pipeline([
('vect',CountVectorizer()),
('tfidf',TfidfTransformer()),
('clf',SGDClassifier()),
]);

grid_search = GridSearchCV(pipeline,parameters,n_jobs = 1,verbose=1);
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章