TensorFlow 2.0 from Scratch ---- 5.3 Hands-On Hyperparameter Search with sklearn

every blog every motto: Until you make peace with who you are, you’ll never be content with what you have.

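0. Preface

This post uses sklearn's RandomizedSearchCV to search the hyperparameters of a Keras model.
Note: the search trains many models, so it takes a long time to run.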

1. Code

1. Import modules

import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import sklearn
import pandas as pd
import os
import sys
import time
import tensorflow as tf
from tensorflow import keras

print(tf.__version__)
print(sys.version_info)
for module in mpl,np,pd,sklearn,tf,keras:
    print(module.__name__,module.__version__)

2. Load the data

from sklearn.datasets import fetch_california_housing

# California housing dataset (house price regression)
housing = fetch_california_housing()
print(housing.DESCR)
print(housing.data.shape)
print(housing.target.shape)

3. Split the dataset

# Split into training, validation, and test sets
from sklearn.model_selection import train_test_split

x_train_all,x_test,y_train_all,y_test = train_test_split(housing.data,housing.target,random_state=7)
x_train,x_valid,y_train,y_valid = train_test_split(x_train_all,y_train_all,random_state=11)

print(x_train.shape,y_train.shape)
print(x_valid.shape,y_valid.shape)
print(x_test.shape,y_test.shape)
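
As a sanity check on the printed shapes, the split sizes can be worked out by hand. This is a minimal sketch, assuming train_test_split's default hold-out fraction of 0.25 (with the held-out part rounded up) and the 20,640 rows of the California housing dataset; the helper name split_sizes is made up for illustration.

```python
import math

def split_sizes(n_samples, test_fraction=0.25):
    """Return (n_train, n_test) for a single hold-out split.

    Assumption: the held-out part is rounded up, as with sklearn's
    default 25% split.
    """
    n_test = math.ceil(n_samples * test_fraction)
    return n_samples - n_test, n_test

n_total = 20640                               # rows in the California housing data
n_train_all, n_test = split_sizes(n_total)    # first split: train_all / test
n_train, n_valid = split_sizes(n_train_all)   # second split: train / valid
print(n_train, n_valid, n_test)
```

Two chained 75/25 splits leave roughly 56% of the data for training, 19% for validation, and 25% for testing.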

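4. Standardize the data

# Standardize features: fit the scaler on the training set only, then reuse it for validation and test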
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train_scaled = scaler.fit_transform(x_train)
x_valid_scaled = scaler.transform(x_valid)
x_test_scaled = scaler.transform(x_test)
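
To make the fit/transform distinction concrete, here is a minimal pure-Python sketch of what standardization does for a single feature column. The helper names and the sample values are made up for illustration; StandardScaler does the same per column, using the population standard deviation.

```python
import statistics

def fit_scaler(column):
    """Learn the mean and population standard deviation of one feature."""
    return statistics.mean(column), statistics.pstdev(column)

def transform(column, mean, std):
    """Apply previously learned statistics: (value - mean) / std."""
    return [(v - mean) / std for v in column]

train_feature = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values
valid_feature = [3.0, 6.0]                                # made-up values

# Statistics come from the training data only...
mean, std = fit_scaler(train_feature)
train_scaled = transform(train_feature, mean, std)
# ...and are reused unchanged for the validation data.
valid_scaled = transform(valid_feature, mean, std)
print(mean, std, valid_scaled)
```

Reusing the training statistics keeps information about the validation and test sets from leaking into preprocessing.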

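5. Build and train the model

RandomizedSearchCV
Steps:

  1. Wrap the Keras model as an sklearn model (implemented in the previous section)
  2. Define the parameter distributions (implemented in this section)
  3. Search the parameters (implemented in this section)
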

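def build_model(hidden_layers=1,layer_size=30,learning_rate=3e-3):
    # Argument names must match the keys of the search space defined later
    model = keras.models.Sequential()
    # First hidden layer; input_shape is the number of features
    model.add(keras.layers.Dense(layer_size,activation='relu',input_shape=x_train.shape[1:]))

    # Remaining hidden layers, all with the same width
    for _ in range(hidden_layers - 1):
        model.add(keras.layers.Dense(layer_size,activation='relu'))

    # Single linear output unit for regression
    model.add(keras.layers.Dense(1))
    optimizer = keras.optimizers.SGD(learning_rate)
    model.compile(loss="mse",optimizer=optimizer)

    return model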

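# Wrap as an sklearn model
# (keras.wrappers.scikit_learn ships with TF 2.0; later TensorFlow releases removed it in favor of the separate scikeras package)
sklearn_model = keras.wrappers.scikit_learn.KerasRegressor(build_model)

# Callbacks: stop once val_loss has failed to improve by at least 1e-3 for 5 epochs
callbacks = [keras.callbacks.EarlyStopping(patience=5,min_delta=1e-3)]
# Train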
history = sklearn_model.fit(x_train_scaled,y_train,epochs=100,validation_data=(x_valid_scaled,y_valid),callbacks=callbacks)
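
The patience and min_delta arguments of EarlyStopping can be read as the following stopping rule. This is a simplified sketch of the idea, not Keras's actual implementation; the function name and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=1e-3):
    """Return the index of the epoch at which training would stop, or None.

    An epoch counts as an improvement only if the loss drops below the
    best value seen so far by more than min_delta.
    """
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Made-up validation losses: large gains at first, then a plateau.
losses = [1.0, 0.8, 0.7, 0.6995, 0.6993, 0.6992, 0.6991, 0.6990, 0.5]
stop = early_stopping_epoch(losses)
print(stop)
```

In this toy run the tiny gains after the third epoch are all smaller than min_delta, so training stops after five of them and never reaches the final, lower value.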

6. Learning curves

# Learning curves
def plot_learning_curves(history):
    pd.DataFrame(history.history).plot(figsize=(8,5))
    plt.grid(True)
    plt.gca().set_ylim(0,1)
    plt.show()
plot_learning_curves(history)

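7. Hyperparameter search

Cross-validation: the training set is split into n parts; n-1 parts are used for training and the remaining part for validation, rotating so that each part serves as the validation set once.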
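
The splitting scheme just described can be sketched in a few lines of plain Python. This is an illustration only: the function name k_fold_indices is made up, and no shuffling is applied.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, valid_indices) for each fold, without shuffling."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [
        n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
        for i in range(n_folds)
    ]
    start = 0
    for size in fold_sizes:
        valid = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, valid
        start += size

folds = list(k_fold_indices(10, 3))
for train, valid in folds:
    print("train:", train, "valid:", valid)
```

Every sample lands in the validation part exactly once, so each candidate configuration is trained n times and its scores are averaged.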

from scipy.stats import reciprocal
# Reciprocal (log-uniform) distribution: f(x) = 1/(x*log(b/a)), a <= x <= b
param_distribution = {
    "hidden_layers":[1,2,3,4],
    "layer_size":np.arange(1,100),
    "learning_rate":reciprocal(1e-4,1e-2),
}

from sklearn.model_selection import RandomizedSearchCV

random_search_cv = RandomizedSearchCV(sklearn_model,param_distribution,n_iter = 10,n_jobs=1)
random_search_cv.fit(x_train_scaled,y_train,epochs=100,validation_data=(x_valid_scaled,y_valid),callbacks=callbacks)
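# Cross-validation: the training set is split into n parts; n-1 are used for training and the remaining one for validation.
# validation_data is passed through to Keras and only feeds the EarlyStopping callback.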
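
To show what drawing n_iter = 10 candidates from such a search space looks like, here is a minimal standard-library sketch. It is not how RandomizedSearchCV is implemented; the function names are made up, and the learning rate is sampled with the inverse CDF of the reciprocal distribution, x = a * (b/a)**u with u uniform on [0, 1).

```python
import random

def sample_log_uniform(rng, low, high):
    """Draw from the reciprocal (log-uniform) distribution on [low, high]."""
    u = rng.random()
    return low * (high / low) ** u

def sample_candidates(n_iter, seed=0):
    """Draw n_iter random hyperparameter combinations (illustrative helper)."""
    rng = random.Random(seed)
    candidates = []
    for _ in range(n_iter):
        candidates.append({
            "hidden_layers": rng.choice([1, 2, 3, 4]),
            "layer_size": rng.randrange(1, 100),                    # 1..99
            "learning_rate": sample_log_uniform(rng, 1e-4, 1e-2),
        })
    return candidates

candidates = sample_candidates(10)
for params in candidates:
    print(params)
```

A log-uniform draw spends as many samples between 1e-4 and 1e-3 as between 1e-3 and 1e-2, which suits a learning rate whose useful values span orders of magnitude.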

8. Inspect the best parameters, score, and model

# Best parameters, best score, and best estimator
print(random_search_cv.best_params_)
print(random_search_cv.best_score_)
print(random_search_cv.best_estimator_)
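
Conceptually, the best parameters are the candidate with the highest mean score across the cross-validation folds. The sketch below illustrates that selection with made-up per-fold scores; it assumes the score is a negated MSE, so larger (closer to zero) is better.

```python
from statistics import mean

# Hypothetical per-fold scores for three candidates (made-up numbers,
# assumed to be negated MSE so that larger is better).
cv_results = [
    ({"hidden_layers": 1, "layer_size": 30}, [-0.40, -0.42, -0.41]),
    ({"hidden_layers": 3, "layer_size": 60}, [-0.33, -0.35, -0.34]),
    ({"hidden_layers": 4, "layer_size": 10}, [-0.50, -0.45, -0.49]),
]

# Pick the candidate whose mean fold score is highest.
best_params, best_fold_scores = max(cv_results, key=lambda item: mean(item[1]))
best_score = mean(best_fold_scores)
print(best_params, best_score)
```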

9. Retrieve the best model and evaluate it on the test set

# Get the best model and evaluate it on the test set
model = random_search_cv.best_estimator_.model
model.evaluate(x_test_scaled,y_test)
