如何使用Python進行超參數調優

{"type":"doc","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"italic"},{"type":"color","attrs":{"color":"#000000","name":"user"}},{"type":"strong"}],"text":"本文最初發佈於rubikscode.com網站,經原作者授權由InfoQ中文站翻譯並分享。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"blockquote","content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","text":"圍繞模型優化這一主題發展出來的許多子分支之間的差異之大往往令人難以置信。其中的一個子分支叫做超參數優化,或超參數調優。"}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在本文中你會學到:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"numberedlist","attrs":{"start":null,"normalizeStart":1},"content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":1,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"機器學習中的超參數"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":2,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"前置條件和數據"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":3,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"網格搜索超參數調優"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":4,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"隨機搜索超參數調優"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":5,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"貝葉斯超參數優化"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":6,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"減半網格搜索和減半隨機搜索"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":7,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"替代選項"}]}]}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"機器學習中的超參數"}]},{"type":"paragraph","attrs":{"indent"
:0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"超參數是所有機器學習和深度學習算法都包含的一部分。與由算法本身學習的標準機器學習參數(如線性迴歸中的w和b,或神經網絡中的連接權重)不同,"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"超參數由工程師在訓練流程之前設置"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"它們是完全由工程師定義的一項外部因素,用來控制學習算法的行爲。想看些例子?學習率是最著名的超參數之一,SVM中的C也是超參數,決策樹的最大深度同樣是一個超參數,等等。這些超參數都可以由工程師手動設置。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"但是,如果我們想運行多個測試,超參數用起來可能會很麻煩。於是我們就需要對超參數做優化了。這些技術的主要目標是找到給定機器學習算法的最佳超參數,以在驗證集上獲得最佳評估性能。在本教程中,我們探索了幾種可以爲你提供最佳超參數的技術。"}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"前置條件和數據"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"前置條件和庫"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"請安裝以下Python庫,爲本文接下來的內容做準備:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","att
rs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"NumPy——如果你需要安裝幫助,請參考這份"},{"type":"link","attrs":{"href":"https:\/\/numpy.org\/install\/","title":null,"type":null},"content":[{"type":"text","text":"指南"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"SciKit Learn——如果你需要安裝幫助,請參考這份"},{"type":"link","attrs":{"href":"https:\/\/scikit-learn.org\/stable\/install.html","title":null,"type":null},"content":[{"type":"text","text":"指南"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"SciPy——如果你需要安裝幫助,請參考這份"},{"type":"link","attrs":{"href":"https:\/\/www.scipy.org\/install.html","title":null,"type":null},"content":[{"type":"text","text":"指南"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","
attrs":{"color":"#494949","name":"user"}}],"text":"Sci-Kit Optimization——如果你需要安裝幫助,請參考這份"},{"type":"link","attrs":{"href":"https:\/\/scikit-optimize.github.io\/stable\/install.html","title":null,"type":null},"content":[{"type":"text","text":"指南"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"安裝完成後,請確保你已導入本教程中使用的所有必要模塊。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"import pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\n\nfrom sklearn.preprocessing import StandardScaler\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.metrics import f1_score\n\nfrom sklearn.model_selection import GridSearchCV, RandomizedSearchCV\n\nfrom sklearn.experimental import enable_halving_search_cv\nfrom sklearn.model_selection import HalvingGridSearchCV, HalvingRandomSearchCV\n\nfrom sklearn.svm import SVC\nfrom sklearn.ensemble import RandomForestRegressor\n\nfrom scipy import stats\nfrom skopt import BayesSearchCV\nfrom skopt.space import Real, 
Categorical"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"除此之外,你最好起碼熟悉一下線性代數、微積分和概率論的基礎知識。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"準備數據"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們在本文中使用的數據來自PalmerPenguins數據集。該數據集是最近發佈的,旨在作爲著名的Iris數據集的替代品。它由Kristen Gorman博士和南極洲LTER的帕爾默科考站共同創建。你可以在"},{"type":"link","attrs":{"href":"https:\/\/github.com\/allisonhorst\/palmerpenguins","title":null,"type":null},"content":[{"type":"text","text":"此處"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"或通過Kaggle獲取此數據集。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"該數據集本質上是由兩個數據集組成的,每個數據集包含344只企鵝的數據。就像Iris一樣,這個數據集也有來自帕爾默羣島3個島嶼的3個種類的企鵝。此外,這些數據集包含每個物種的"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"culmen"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"維度。culmen是鳥喙的上脊。在簡化的企鵝數據中,culmen長度和深度被重命名爲變量culmen_length_mm和culmen_depth_mm。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/fd\/91\/fd136bce4f63dd478585782620a2d791.jpg","alt":null,"title":"","style":[{"key":"wi
dth","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"由於這個數據集已經標記過了,我們應該能驗證我們的實驗結果。但實際情況往往沒這麼簡單,聚類算法結果的驗證通常是一個艱難而複雜的過程。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們先來加載並準備PalmerPenguins數據集。首先,我們加載數據集,刪除本文中不會用到的特徵:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"data = pd.read_csv('.\/data\/penguins_size.csv')\n\ndata = data.dropna()\ndata = data.drop(['sex', 'island', 'flipper_length_mm', 'body_mass_g'], axis=1)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"然後我們分離輸入數據並對其進行縮放:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"X = data.drop(['species'], axis=1)\n\nss = StandardScaler()\nX = ss.fit_transform(X)\n\ny = data['species']\nspecies_map = {'Adelie': 0, 'Chinstrap': 1, 'Gentoo': 2}\ny = [species_map[item] for item in y]\ny = np.array(y)
"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"最後,我們將數據拆分爲訓練和測試數據集:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=33)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"當我們繪製這裏的數據時,圖像是下面這個樣子:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/90\/41\/9085ab63af5f7a5a1e47d867ab4b8c41.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"網格搜索超參數調優"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"超參數調優的工作手動做起來又慢又煩人。所以我們開始探索第一個,也是最簡單的超參數優化技術——網格搜索。這種技術可以加快調優工作,是最常用的超參數優化技術之一。從本質上講,它會自動化試錯流程。對於這種技術,我們提供了一個包含所有超參數值的列表,然後該算法爲每個可能的組合構建模型,對其進行評估,並選擇提供最佳結果的值。它是一種通用技術,可以應用於任何模型。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{
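網格搜索的成本可以直接算出來:候選組合數等於各超參數取值個數的乘積,再乘以交叉驗證的折數就是總的模型訓練次數。下面用一段示意代碼按下文使用的搜索空間做這個計算(僅作說明,與正文的搜索代碼無關):

```python
from itertools import product

# 下文網格搜索使用的超參數候選值
hyperparameters = {
    'C': [0.1, 1, 100, 1000],
    'gamma': [0.0001, 0.001, 0.005, 0.1, 1, 3, 5],
    'kernel': ('linear', 'rbf'),
}

# 笛卡爾積枚舉出網格搜索要嘗試的所有組合
combinations = list(product(*hyperparameters.values()))
print('組合數:', len(combinations))                      # 4 * 7 * 2 = 56
print('5折交叉驗證的總訓練次數:', len(combinations) * 5)  # 56 * 5 = 280
```

可以看到,候選值稍一增多,訓練次數就會成倍增長,這正是後文引入隨機搜索等技術的動機之一。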
"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"在我們的示例中,我們使用SVM算法進行分類。我們考慮了三個超參數——C、gamma和kernel。想要更詳細地瞭解它們的話請查看這篇"},{"type":"link","attrs":{"href":"https:\/\/rubikscode.net\/2020\/08\/10\/back-to-machine-learning-basics-support-vector-machines\/","title":null,"type":null},"content":[{"type":"text","text":"文章"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。對於C,我們要檢查以下值:0.1、1、100、1000;對於gamma,我們使用值:0.0001、0.001、0.005、0.1、1、3、5,對於kernel,我們使用值:“linear”和“rbf”。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"網格搜索實現"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"下面是代碼中的樣子:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"hyperparameters = {\n 'C': [0.1, 1, 100, 1000],\n 'gamma': [0.0001, 0.001, 0.005, 0.1, 1, 3, 5],\n 'kernel': ('linear', 'rbf')\n}"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們這裏利用了Sci-Kit Learn庫及其SVC類,其中包含SVM分類實現。除此之外,我們還使用了GridSearchCV類,用於網格搜索優化。結合起來是這個樣子:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"grid = GridSearchCV(\n estimator=SVC(),\n param_grid=hyperparameters,\n cv=5, \n\tscoring='f1_micro', 
\n\tn_jobs=-1)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這個類通過構造器接收幾個參數:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"estimator——機器學習算法本身的實例。我們在這裏傳入SVC類的一個新實例。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"param_grid——包含超參數及其候選取值的字典。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"cv——確定交叉驗證拆分策略。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"scoring——用於評估預測的驗證指標。我們使用F1分數。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"n_jobs——表示要並行運行的作業數。值-1表示使用所有處理器。"}]}]}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs"
:{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"剩下要做的就是使用fit方法運行訓練過程:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"grid.fit(X_train, y_train)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"訓練完成後,我們可以查看最佳超參數和這些參數的得分:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"print(f'Best parameters: {grid.best_params_}')\nprint(f'Best score: {grid.best_score_}')"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"Best parameters: {'C': 1000, 'gamma': 0.1, 'kernel': 'rbf'}\nBest score: 0.9626834381551361\t"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"此外,我們可以打印出所有結果:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"print(f'All results: {grid.cv_results_}')"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"All results: {'mean_fit_time': array([0.00780015, 0.00280147, 0.00120015, 0.00219998, 0.0240006 ,\n 0.00739942, 0.00059962, 0.00600033, 0.0009994 , 0.00279789,\n 0.00099969, 0.00340114, 0.00059986, 0.00299864, 0.000597 ,\n 0.00340023, 0.00119658, 0.00280094, 0.00060058, 0.00179944,\n 0.00099964, 0.00079966, 0.00099916, 0.00100031, 0.00079999,\n 0.002 , 0.00080023, 0.00220037, 0.00119958, 0.00160012,\n 0.02939963, 0.00099955, 0.00119963, 0.00139995, 0.00100069,\n 0.00100017, 0.00140052, 0.00119977, 0.00099974, 0.00180006,\n 0.00100312, 0.00199976, 0.00220003, 0.00320096, 0.00240035,\n 0.001999 , 0.00319982, 0.00199995, 0.00299931, 0.00199928, \n..."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"好的,現在我們構建這個模型並檢查它在測試數據集上的表現:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"model = SVC(C=1000, gamma = 0.1, kernel = 'rbf')\nmodel.fit(X_train, y_train)\n\npredictions = model.predict(X_test)\nprint(f1_score(predictions, y_test, average='micro'))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
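cv_results_返回的原始字典並不好讀。一個常見的做法是把它轉成pandas的DataFrame,按交叉驗證名次排序後只看關鍵列。下面是一段可以獨立運行的示意代碼(爲了不依賴企鵝數據,這裏用sklearn的合成數據集代替;列名是GridSearchCV的標準輸出字段):

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# 合成一個小型分類數據集,代替正文中的企鵝數據,便於獨立運行
X, y = make_classification(n_samples=150, n_features=4, random_state=33)

grid = GridSearchCV(
    estimator=SVC(),
    param_grid={'C': [0.1, 1, 100], 'kernel': ('linear', 'rbf')},
    cv=3,
    scoring='f1_micro')
grid.fit(X, y)

# 轉成DataFrame,按名次升序排列,只保留最有用的幾列
results = (pd.DataFrame(grid.cv_results_)
           [['params', 'mean_test_score', 'rank_test_score']]
           .sort_values('rank_test_score'))
print(results.head())
```

這樣排序後,第一行就是best_params_對應的組合,也能順便看到次優組合與最優組合的差距有多大。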
"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"0.9701492537313433"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"結果很不錯,我們的模型用建議的超參數獲得了約97%的精度。下面是繪製時模型的樣子:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/2a\/20\/2a4a17fba74d3731a28a9e2270d07420.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"隨機搜索超參數調優"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"網格搜索非常簡單,但它的計算成本也很高。特別是在深度學習領域,訓練可能需要大量時間。此外,某些超參數可能比其他超參數更重要。於是人們提出了隨機搜索的想法,本文接下來會具體介紹。事實上,已有研究表明,隨機搜索在做超參數優化時計算成本比網格搜索更有優勢。這種技術也讓我們可以更精確地發現重要超參數的理想值。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"就像網格搜索一樣,隨機搜索會創建一個超參數值網格並選擇隨機組合來訓練模型。這種方法可能會錯過最佳組合,但與網格搜索相比,它找到更優結果的機率反而更高,而且需要的時間只有網格搜索的一小部分。"}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"隨機搜索實現"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們看看它是怎樣寫成代碼的。我們再次使用Sci-Kit Learn庫的SVC類,但這次我們使用RandomizedSearchCV類進行隨機搜索優化。"}]},{"t
ype":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"hyperparameters = {\n \"C\": stats.uniform(500, 1500),\n \"gamma\": stats.uniform(0, 1),\n 'kernel': ('linear', 'rbf')\n}\nrandom = RandomizedSearchCV(\n estimator = SVC(), \n param_distributions = hyperparameters, \n n_iter = 100, \n cv = 3, \n random_state=42, \n n_jobs = -1)\nrandom.fit(X_train, y_train)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"請注意,我們對C和gamma使用了均勻分佈。同樣,我們可以打印出結果:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"print(f'Best parameters: {random.best_params_}')\nprint(f'Best score: {random.best_score_}')"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"Best parameters: {'C': 510.5994578295761, 'gamma': 0.023062425041415757, 'kernel': 'linear'}\nBest score: 
0.9700374531835205"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"可以看到我們的結果接近網格搜索,但並不一樣。網格搜索的超參數C的值爲1000,而隨機搜索的值爲510.59。僅從這一點你就可以看到隨機搜索的好處,因爲我們不太可能將這個值放入網格搜索列表中。類似地,對於gamma,我們的隨機搜索結果爲0.023,而網格搜索爲0.1。真正令人驚訝的是隨機搜索選擇了線性kernel而不是RBF,並且它獲得了更高的F1分數。要打印所有結果,我們使用cv_results_屬性:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"print(f'All results: {random.cv_results_}')"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"All results: {'mean_fit_time': array([0.00200065, 0.00233404, 0.00100454, 0.00233777, 0.00100009,\n 0.00033339, 0.00099715, 0.00132942, 0.00099921, 0.00066725,\n 0.00266568, 0.00233348, 0.00233301, 0.0006667 , 0.00233285,\n 0.00100001, 0.00099993, 0.00033331, 0.00166742, 0.00233364,\n 0.00199914, 0.00433286, 0.00399915, 0.00200049, 0.01033338,\n 0.00100342, 0.0029997 , 0.00166655, 0.00166726, 0.00133403,\n 0.00233293, 0.00133729, 0.00100009, 0.00066662, 0.00066646,\n\t \n\t ...."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"我們來重複上面網格搜索的步驟:使用建議的超參數創建模型,檢查測試數據集的分數並繪製模型。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"model = SVC(C=510.5994578295761, gamma = 0.023062425041415757, kernel = 
'linear')\nmodel.fit(X_train, y_train)\npredictions = model.predict(X_test)\nprint(f1_score(predictions, y_test, average='micro'))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"0.9701492537313433"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"哇,測試數據集上的F1分數與我們使用網格搜索時的分數完全相同。查看模型:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/58\/08\/58b9ce80e945215903c1f5e83a6fb708.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"貝葉斯超參數優化"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"前兩種算法有一點很棒,那就是使用各種超參數值的所有實驗都可以"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"並行"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"運行。這可以爲我們節省很多時間。然而這也是它們最大的缺陷所在。由於每個實驗都是孤立運行的,我們不能在當前實驗中使用來自過去實驗的信息。有一個專門用於解決序列優化問題的領域——基於模型的序列優化(SMBO)。在該領域探索的那些算法會使用先前的實驗和對損失函數的觀察結果,然後基於它們來試圖確定下一個最佳點。其中一種算法是貝葉斯優化。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":nul
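回顧一下前面隨機搜索的搜索空間:SciPy的stats.uniform接收的是loc和scale兩個參數,所以stats.uniform(500, 1500)表示在[500, 2000]區間上均勻取樣,而不是[500, 1500]。前面抽到C=510.60正是落在這個區間裏。下面用一小段示意代碼驗證這一點:

```python
from scipy import stats

# scipy.stats.uniform(loc, scale) 在 [loc, loc + scale] 上均勻分佈
dist = stats.uniform(500, 1500)
samples = dist.rvs(size=10_000, random_state=42)

print(samples.min(), samples.max())
print(dist.support())  # 理論支撐區間:(500.0, 2000.0)
```

如果確實想在[500, 1500]上取樣,應該寫成stats.uniform(500, 1000)。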
l},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"這種算法就像來自SMBO組的其他算法一樣,使用先前評估的點(在這裏指的是超參數值,但我們可以推而廣之)來計算損失函數的後驗期望。該算法使用兩個重要的數學概念——"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"高斯過程"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"和"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"採集函數"},{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"。高斯分佈定義在隨機變量上,而高斯過程則是它在函數上的推廣。就像高斯分佈有均值和協方差一樣,高斯過程是用均值函數和協方差函數來描述的。"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"採集函數是我們用來評估當前損失值的函數。可以把它看作是損失函數的損失函數。它是損失函數的後驗分佈函數,描述了所有超參數值的效用。最流行的採集函數是Expected Improvement(EI):"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d6\/1d\/d6424ce7e6eaff7dfae82c785728de1d.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"其中f是損失函數,x'是當前最優的超參數集。當我們把它們放在一起時,貝葉斯優化分3個步驟完成:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
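按文中的約定(f爲損失函數、x'爲當前最優超參數集),上圖中的EI公式可以寫成如下的標準形式(這裏按最小化損失的情形給出):

```latex
\mathrm{EI}(x) = \mathbb{E}\left[\max\bigl(f(x') - f(x),\; 0\bigr)\right]
```

也就是說,EI衡量的是在點x處評估損失函數,相對當前最優點x'所能期望獲得的改進量。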
"}]},{"type":"bulletedlist","content":[{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"利用先前評估過的損失函數取值,通過高斯過程計算後驗期望。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"選擇最大化EI的新點。"}]}]},{"type":"listitem","attrs":{"listStyle":null},"content":[{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"計算新選擇點的損失函數值。"}]}]}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"貝葉斯優化實現"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"將其帶入代碼的最簡單方法是使用Sci-Kit optimization庫,通常稱爲skopt。按照我們在前面示例中使用的過程,我們可以執行以下操作:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"hyperparameters = {\n \"C\": Real(1e-6, 1e+6, prior='log-uniform'),\n \"gamma\": Real(1e-6, 1e+1, prior='log-uniform'),\n \"kernel\": Categorical(['linear', 'rbf']),\n}\nbayesian = BayesSearchCV(\n estimator = SVC(), \n search_spaces = hyperparameters, \n n_iter = 100, \n cv = 5, \n random_state=42, \n n_jobs = -1)\nbayesian.fit(X_train, 
y_train)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"同樣,我們爲超參數集定義了字典。請注意,我們使用了Sci-Kit優化庫中的Real和Categorical類。然後我們用和使用GridSearchCV或RandomSearchCV類相同的方式來使用BayesSearchCV類。訓練完成後,我們可以打印出最好的結果:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"print(f'Best parameters: {bayesian.best_params_}')\nprint(f'Best score: {bayesian.best_score_}')"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"Best parameters: \nOrderedDict([('C', 3932.2516133086), ('gamma', 0.0011646737978730447), ('kernel', 'rbf')])\nBest score: 0.9625468164794008"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"很有趣,不是嗎?使用這種優化我們得到了完全不同的結果。損失比我們使用隨機搜索時要高一些。我們甚至可以打印出所有結果:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"print(f'All results: {bayesian.cv_results_}')"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"All results: defaultdict(, {'split0_test_score': [0.9629629629629629,\n 0.9444444444444444, 0.9444444444444444, 
0.9444444444444444, 0.9444444444444444,\n 0.9444444444444444, 0.9444444444444444, 0.9444444444444444, 0.46296296296296297,\n 0.9444444444444444, 0.8703703703703703, 0.9444444444444444, 0.9444444444444444, \n 0.9444444444444444, 0.9444444444444444, 0.9444444444444444, 0.9444444444444444, \n ....."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"使用這些超參數的模型在測試數據集上的表現如何?我們來了解一下:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"model = SVC(C=3932.2516133086, gamma = 0.0011646737978730447, kernel = 'rbf')\nmodel.fit(X_train, y_train)\npreditions = model.predict(X_test)\nprint(f1_score(preditions, y_test, average='micro'))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"0.9850746268656716"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Fascinating: even though our result on the validation dataset was a bit worse, we got a better score on the test dataset. Here is the model:"}]},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/01\/d0\/01da0073yydcc8a02f0d7bb8b280a8d0.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Just for fun, we can put all of these models side by side:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/b6\/f0\/b62e8590fce37eab5f71dd5f9629aff0.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Halving Grid Search and Halving Random Search"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"A few months ago, the Scikit-Learn library introduced two new classes, HalvingGridSearchCV and HalvingRandomSearchCV. They claim that these classes \"can find a good parameter combination much faster\". These classes search the specified parameter values using successive halving: the technique starts by evaluating all candidates with a small amount of resources, then iteratively selects the best candidates while giving them more and more resources."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"For halving grid search, this means that in the first iteration all candidates are trained on a small amount of training data. The next iteration includes only the candidates that performed best in the previous one; those models get more resources, i.e. more training data, and are evaluated again. The process continues, keeping only the best candidates from each iteration, until a single one remains."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The whole process is controlled by two parameters: min_resources and factor. The first, min_resources, is the amount of data the process starts with; on each iteration this amount grows by the value defined by factor. The process is similar for HalvingRandomSearchCV."}]},{"type":"heading","attrs":{"align":null,"level":3},"content":[{"type":"text","text":"Halving Grid Search and Halving Random Search Implementation"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"The code here is similar to the previous examples; we just use different classes. Let's start with HalvingGridSearchCV:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"from sklearn.experimental import enable_halving_search_cv\nfrom sklearn.model_selection import HalvingGridSearchCV\n\nhyperparameters = {\n 'C': [0.1, 1, 100, 500, 1000],\n 'gamma': [0.0001, 0.001, 0.01, 0.005, 0.1, 1, 3, 5],\n 'kernel': ('linear', 'rbf')\n}\n\n\ngrid = HalvingGridSearchCV(\n estimator=SVC(),\n param_grid=hyperparameters,\n cv=5, \n scoring='f1_micro', \n n_jobs=-1)\ngrid.fit(X_train, y_train)"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Interestingly, this code ran in only 0.7 seconds; the same code using the GridSearchCV class took 3.6 seconds. The former is much faster, but the results are slightly different:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"print(f'Best parameters: {grid.best_params_}')\nprint(f'Best score: {grid.best_score_}')"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"Best parameters: {'C': 500, 'gamma': 0.005, 'kernel': 'rbf'}\nBest score: 0.9529411764705882"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"We got similar, but not identical, results. If we create a model with these values, we get the following score and plot:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"from sklearn.metrics import f1_score\n\nmodel = SVC(C=500, gamma = 0.005, kernel = 'rbf')\nmodel.fit(X_train, y_train)\npredictions = model.predict(X_test)\nprint(f1_score(predictions, y_test, average='micro'))"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" 
"}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"0.9850746268656716"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/d6\/7a\/d63a4d52435405e8d476a73ba309137a.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Then we repeat the process with halving random search. Interestingly, this approach gave us the strangest results; it is fair to say that the model created this way struggles to fit the data well:"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"from scipy import stats\nfrom sklearn.experimental import enable_halving_search_cv\nfrom sklearn.model_selection import HalvingRandomSearchCV\n\nhyperparameters = {\n \"C\": stats.uniform(500, 1500),\n \"gamma\": stats.uniform(0, 1),\n 'kernel': ('linear', 'rbf')\n}\nrandom = HalvingRandomSearchCV(\n estimator = SVC(), \n param_distributions = hyperparameters, \n cv = 3, \n random_state=42, \n n_jobs = -1)\nrandom.fit(X_train, y_train)\nprint(f'Best parameters: {random.best_params_}')\nprint(f'Best score: {random.best_score_}')"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"codeblock","attrs":{"lang":null},"content":[{"type":"text","text":"Best parameters: {'C': 530.8767414437036, 'gamma': 0.9699098521619943, 'kernel': 'rbf'}\nBest score: 0.9506172839506174"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"image","attrs":{"src":"https:\/\/static001.infoq.cn\/resource\/image\/2c\/29\/2c4a3b1f9613afb1252c5af6c1ab1729.jpg","alt":null,"title":"","style":[{"key":"width","value":"75%"},{"key":"bordertype","value":"none"}],"href":"","fromPaste":false,"pastePass":false}},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Alternatives"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In general, the methods described above are the most popular and most widely used. However, if none of them suits you, there are several alternatives to consider. One of them is gradient-based optimization of hyperparameter values: this technique computes the gradient with respect to the hyperparameters and then optimizes them with gradient descent. The problem with this approach is that gradient descent needs a convex and smooth function to work well, and this is not always the case in the hyperparameter domain. Another option is optimization with evolutionary algorithms."}]},{"type":"heading","attrs":{"align":null,"level":2},"content":[{"type":"text","text":"Conclusion"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null}},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"In this article, we covered several well-known hyperparameter optimization and tuning algorithms. We learned how to use grid search, random search, and Bayesian optimization to obtain the best values for hyperparameters, and how to do this in code using Scikit-Learn classes and methods."}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":"Thanks for reading!"}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}],"text":" "}]},{"type":"paragraph","attrs":{"indent":0,"number":0,"align":null,"origin":null},"content":[{"type":"text","marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}},{"type":"strong"}],"text":"Original article: "},{"type":"link","attrs":{"href":"https:\/\/rubikscode.net\/2021\/08\/17\/ml-optimization-pt-3-hyperparameter-optimization-with-python\/","title":null,"type":null},"content":[{"type":"text","text":"https:\/\/rubikscode.net\/2021\/08\/17\/ml-optimization-pt-3-hyperparameter-optimization-with-python\/"}],"marks":[{"type":"color","attrs":{"color":"#494949","name":"user"}}]}]}]}
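The successive-halving workflow described above can be sketched end to end. This is a minimal, self-contained example, not the article's exact notebook: it uses scikit-learn's built-in breast-cancer dataset as stand-in data (the article's own dataset is not shown in this excerpt) and a smaller parameter grid, and compares exhaustive grid search with halving grid search.

```python
# Minimal sketch of successive halving vs. exhaustive grid search.
# Assumption: the breast-cancer dataset stands in for the article's data.
from sklearn.experimental import enable_halving_search_cv  # noqa: F401 (enables the Halving* classes)
from sklearn.model_selection import GridSearchCV, HalvingGridSearchCV, train_test_split
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

hyperparameters = {
    "C": [0.1, 1, 100],
    "gamma": [0.0001, 0.01, 1],
    "kernel": ("linear", "rbf"),
}

# Exhaustive search: every combination is trained on the full training set.
grid = GridSearchCV(SVC(), hyperparameters, cv=3, scoring="f1_micro", n_jobs=-1)
grid.fit(X_train, y_train)

# Successive halving: all candidates start with few samples (min_resources);
# each iteration keeps the best candidates and multiplies resources by `factor`.
halving = HalvingGridSearchCV(SVC(), hyperparameters, cv=3, scoring="f1_micro",
                              factor=2, n_jobs=-1)
halving.fit(X_train, y_train)

print("grid:   ", grid.best_params_, round(grid.best_score_, 3))
print("halving:", halving.best_params_, round(halving.best_score_, 3))
```

The speedup comes from discarding weak candidates early: only the survivors of each round are refit on larger sample budgets, so the full training set is used for just a handful of configurations.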