Implementing Stacking for Neural Networks in Keras

1. The general ensemble approach behind stacking

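A model averaging ensemble combines the predictions from several trained models.
  A limitation of this approach is that every model contributes equally to the ensemble prediction, regardless of how well it performs. A variation, the weighted average ensemble, weights each member's contribution by the trust placed in it, or by its expected performance, as measured on a holdout dataset.
Better-performing models therefore receive larger weights and weaker models smaller ones. A weighted average ensemble can offer a modest improvement over plain model averaging.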
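The difference between the two averaging schemes can be sketched with NumPy. The probabilities and the weights below are invented for illustration; in practice the weights would come from each model's performance on a holdout set.

```python
import numpy as np

# Hypothetical class-probability predictions from three models
# for four samples and three classes (every row sums to 1).
preds = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.8, 0.1, 0.1]],
    [[0.6, 0.3, 0.1], [0.2, 0.6, 0.2], [0.2, 0.2, 0.6], [0.3, 0.6, 0.1]],
    [[0.4, 0.5, 0.1], [0.3, 0.4, 0.3], [0.5, 0.3, 0.2], [0.2, 0.7, 0.1]],
])  # shape: (models, samples, classes)

# Model averaging: every member contributes equally.
avg_probs = preds.mean(axis=0)

# Weighted averaging: the weights (assumed here) favour the first model.
weights = np.array([0.5, 0.3, 0.2])
weighted_probs = np.tensordot(weights, preds, axes=1)

avg_labels = avg_probs.argmax(axis=1)
weighted_labels = weighted_probs.argmax(axis=1)
print(avg_labels)       # [0 1 2 1]
print(weighted_labels)  # [0 1 2 0]
```

The two schemes disagree only on the last sample, where the most trusted model is confident in class 0 and outvotes the other two once its weight is applied.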
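A further generalization replaces the linear weighted sum used to combine the sub-model predictions with an arbitrary learning algorithm. This approach is called stacked generalization, or stacking for short.
  In stacking, a higher-level model is trained with the sub-models' predictions as its input features and the original labels as its target, so that it learns how best to combine them.
  It helps to think of the stacking procedure as having two levels: level 0 and level 1.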

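  • Level 0: the level 0 data is the training portion of the original dataset; it is used to train the sub-models, which output predictions.
  • Level 1: the level 1 data uses the level 0 predictions as training inputs; it is used to train the meta-model, which outputs the final prediction.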
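The two levels can be sketched end to end with scikit-learn alone. This is only an illustration under stated assumptions: the three scikit-learn classifiers stand in for the neural network sub-models, and the three-way split and the helper name level1_features are choices made for this sketch.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_blobs(n_samples=600, centers=3, n_features=2, cluster_std=2, random_state=2)
# Three disjoint parts: level 0 training, level 1 training, final evaluation.
X0, y0 = X[:200], y[:200]
X1, y1 = X[200:400], y[200:400]
X_eval, y_eval = X[400:], y[400:]

# Level 0: sub-models learn from the original features.
level0 = [GaussianNB(),
          KNeighborsClassifier(n_neighbors=5),
          DecisionTreeClassifier(max_depth=3, random_state=0)]
for member in level0:
    member.fit(X0, y0)

def level1_features(models, inputs):
    # Concatenate each sub-model's class probabilities column-wise.
    return np.hstack([m.predict_proba(inputs) for m in models])

# Level 1: the meta-model learns from the sub-models' predictions.
meta = LogisticRegression(max_iter=1000)
meta.fit(level1_features(level0, X1), y1)
score = meta.score(level1_features(level0, X_eval), y_eval)
print('Stacked accuracy: %.3f' % score)
```

Fitting the meta-model on data the sub-models have not seen, and scoring it on a third part, keeps the final accuracy estimate from being inflated.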

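2. Defining the classification problem

Dataset summary:

  • Samples: 1000
  • Features: 2
  • Target: integer-valued, 3 classes

The make_blobs() function creates a sample set from the requested number of samples, input features, and classes.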

from sklearn.datasets import make_blobs
from keras.models import Sequential
from keras.models import load_model
from keras.layers import Dense
from keras.utils import to_categorical
from numpy import dstack
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from matplotlib import pyplot
from pandas import DataFrame

# generate 2d classification dataset
X, y = make_blobs(n_samples=1000, centers=3, n_features=2, cluster_std=2, random_state=2)
# scatter plot, dots colored by class value
df = DataFrame(dict(x=X[:,0], y=X[:,1], label=y))
colors = {0:'red', 1:'blue', 2:'green'}
fig, ax = pyplot.subplots()
grouped = df.groupby('label')
for key, group in grouped:
    group.plot(ax=ax, kind='scatter', x='x', y='y', label=key, color=colors[key])
pyplot.show()
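As a quick sanity check on the dataset properties listed above, the shapes and label values can be inspected directly. This sketch only prints what make_blobs returns for the same arguments.

```python
import numpy as np
from sklearn.datasets import make_blobs

X, y = make_blobs(n_samples=1000, centers=3, n_features=2, cluster_std=2, random_state=2)
print(X.shape, y.shape)  # feature matrix and label vector
print(np.unique(y))      # the distinct integer class labels
print(np.bincount(y))    # number of samples in each class
```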


3. Neural network: multilayer perceptron


# generate 2d classification dataset
X, y = make_blobs(n_samples=1100, centers=3, n_features=2, cluster_std=2, random_state=2)
# one hot encode output variable
y = to_categorical(y)
# split into train and test
n_train = 100
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
print(trainX.shape, testX.shape)

(100, 2) (1000, 2)
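The one-hot encoding step matters later, because the logistic regression meta-learner and accuracy_score work with integer labels rather than one-hot rows. The sketch below uses np.eye as a NumPy stand-in for to_categorical (an assumption made so the example runs without Keras) and shows that argmax reverses the encoding.

```python
import numpy as np

labels = np.array([0, 2, 1, 2])               # integer class labels
one_hot = np.eye(3, dtype='float32')[labels]  # NumPy stand-in for to_categorical
recovered = one_hot.argmax(axis=1)            # back to integer labels
print(one_hot)
print(recovered)  # [0 2 1 2]
```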

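Keras may raise this error at the softmax layer: TypeError: softmax() got an unexpected keyword argument 'axis'

Workaround: in C:\ProgramData\Anaconda3\Lib\site-packages\keras\backend\tensorflow_backend.py, find the softmax function (a little past line 3000), remove the axis argument, and restart Jupyter Notebook. The error usually indicates that the installed Keras and TensorFlow versions do not match, so installing a compatible pair is a cleaner fix than editing library code.

Reference:
https://blog.csdn.net/czp_374/article/details/80647940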

# define model
model = Sequential()
model.add(Dense(25, input_dim=2, activation='relu'))
model.add(Dense(3,activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit model
history = model.fit(trainX, trainy, validation_data=(testX, testy), epochs=500, verbose=0)
# evaluate the model
_, train_acc = model.evaluate(trainX, trainy, verbose=0)
_, test_acc = model.evaluate(testX, testy, verbose=0)
print('Train: %.3f, Test: %.3f' % (train_acc, test_acc))
Train: 0.830, Test: 0.820
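# learning curves of model accuracy
# (older Keras stores the metric as 'acc', newer releases as 'accuracy')
acc_key = 'acc' if 'acc' in history.history else 'accuracy'
pyplot.plot(history.history[acc_key], label='train')
pyplot.plot(history.history['val_' + acc_key], label='test')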
pyplot.legend()
pyplot.show()
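The model above ends in a softmax layer and is trained with categorical cross-entropy. As a rough NumPy sketch of what those two pieces compute (the logits are made-up values, and the function names are chosen for this example rather than taken from Keras):

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    # Mean negative log-probability assigned to the true class.
    y_prob = np.clip(y_prob, eps, 1.0)
    return float(-(y_true * np.log(y_prob)).sum(axis=1).mean())

logits = np.array([[2.0, 1.0, 0.1], [0.5, 0.5, 3.0]])  # made-up output-layer activations
y_true = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])  # one-hot targets
probs = softmax(logits)
loss = categorical_crossentropy(y_true, probs)
print(probs.sum(axis=1), loss)
```

Each row of probs sums to 1, which is what lets the sub-model outputs be treated as class probabilities when they are stacked later.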


4. Train and save the models

Create and train the MLP model

# fit model on dataset
def fit_model(trainX, trainy):
	# define model
	model = Sequential()
	model.add(Dense(25, input_dim=2, activation='relu'))
	model.add(Dense(3, activation='softmax'))
	model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
	# fit model
	model.fit(trainX, trainy, epochs=500, verbose=0)
	return model

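Create a directory to hold the models

# create directory for models (exist_ok avoids an error if it already exists)
import os
os.makedirs('tmp_models', exist_ok=True)

Create and save the MLP sub-models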

# fit and save models
n_members = 5
for i in range(n_members):
	# fit model
	model = fit_model(trainX, trainy)
	# save model
	filename = 'tmp_models/model_' + str(i + 1) + '.h5'
	model.save(filename)
	print('>Saved %s' % filename)
>Saved tmp_models/model_1.h5
>Saved tmp_models/model_2.h5
>Saved tmp_models/model_3.h5
>Saved tmp_models/model_4.h5
>Saved tmp_models/model_5.h5

5. Separate stacking model

Load the sub-models

# load models from file
def load_all_models(n_models):
	all_models = list()
	for i in range(n_models):
		# define filename for this ensemble
		filename = 'tmp_models/model_' + str(i + 1) + '.h5'
		# load model from file
		model = load_model(filename)
		# add to list of members
		all_models.append(model)
		print('>loaded %s' % filename)
	return all_models

# load all models
n_members = 5
members = load_all_models(n_members)
print('Loaded %d models' % len(members))
>loaded tmp_models/model_1.h5
>loaded tmp_models/model_2.h5
>loaded tmp_models/model_3.h5
>loaded tmp_models/model_4.h5
>loaded tmp_models/model_5.h5
Loaded 5 models
# note: testy is already one-hot encoded at this point; passing it through
# to_categorical again would give shape (1000, 3, 2) instead of (1000, 3),
# so the models below are evaluated on testy directly
print(testX.shape, trainX.shape, testy.shape, trainy.shape)
(1000, 2) (100, 2) (1000, 3) (100, 3)
members[0].evaluate(testX, testy)
1000/1000 [==============================] - 1s 570us/step
[0.4565311725139618, 0.81399999999999995]
# evaluate standalone models on test dataset
for model in members:
	# testy is already one-hot encoded, so no further encoding is needed
	_, acc = model.evaluate(testX, testy, verbose=0)
	print('Model Accuracy: %.3f' % acc)
Model Accuracy: 0.814
Model Accuracy: 0.799
Model Accuracy: 0.810
Model Accuracy: 0.807
Model Accuracy: 0.809

Train the meta-learner

# create stacked model input dataset as outputs from the ensemble
def stacked_dataset(members, inputX):
	stackX = None
	for model in members:
		# make prediction
		yhat = model.predict(inputX, verbose=0)
		# stack predictions into [rows, probabilities, members]
		if stackX is None:
			stackX = yhat
		else:
			stackX = dstack((stackX, yhat))
	# flatten predictions to [rows, probabilities x members]
	stackX = stackX.reshape((stackX.shape[0], stackX.shape[1]*stackX.shape[2]))
	return stackX
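To see what stacked_dataset produces, the sketch below repeats its dstack and reshape steps on stand-in arrays instead of real model predictions. The values are invented so that each one encodes where it came from: 10 times the member number plus the class index.

```python
import numpy as np

n_rows, n_classes, n_members = 4, 3, 5
# Stand-in "predictions": member k outputs 10 * (k + 1) + class index in every row.
fake_preds = [np.tile(10 * (k + 1) + np.arange(n_classes), (n_rows, 1))
              for k in range(n_members)]

stacked = np.dstack(fake_preds)
print(stacked.shape)  # (4, 3, 5): rows, classes, members
flat = stacked.reshape((n_rows, n_classes * n_members))
print(flat.shape)     # (4, 15)
print(flat[0])        # [10 20 30 40 50 11 21 31 41 51 12 22 32 42 52]
```

Each flattened row therefore groups the columns by class first and by member second: all five members' probabilities for class 0, then class 1, then class 2.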

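Example: logistic regression as the meta-learner

The first-level models are the individual neural networks.
The second-level model is a logistic regression.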

def fit_stacked_model(members, inputX, inputy):
	# create dataset using ensemble
	stackedX = stacked_dataset(members, inputX)
	# fit standalone model
	model = LogisticRegression()
	model.fit(stackedX, inputy)
	return model

# generate 2d classification dataset
X, y = make_blobs(n_samples=1100, centers=3, n_features=2, cluster_std=2, random_state=2)
# keep integer class labels here: LogisticRegression and accuracy_score expect them
# split into train and test
n_train = 100
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
# fit stacked model using the ensemble
# (the meta-learner is fit and scored on the same holdout data below, so the accuracy is likely optimistic)
model = fit_stacked_model(members, testX, testy)
# make a prediction with the stacked model
def stacked_prediction(members, model, inputX):
	# create dataset using ensemble
	stackedX = stacked_dataset(members, inputX)
	# make a prediction
	yhat = model.predict(stackedX)
	return yhat
# evaluate model on test set
yhat = stacked_prediction(members, model, testX)
acc = accuracy_score(testy, yhat)
print('Stacked Test Accuracy: %.3f' % acc)
Stacked Test Accuracy: 0.823

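6. Integrated stacking model: embedding the sub-networks in a larger neural network

The first-level models are neural networks.
The second-level model is also a neural network.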

# stacked generalization with neural net meta model on blobs dataset
from sklearn.datasets import make_blobs
from sklearn.metrics import accuracy_score
from keras.models import load_model
from keras.utils import to_categorical
from keras.utils import plot_model
from keras.models import Model
from keras.layers import Input
from keras.layers import Dense
from keras.layers import concatenate
from numpy import argmax

# load models from file
def load_all_models(n_models):
	all_models = list()
	for i in range(n_models):
		# define filename for this ensemble
		filename = 'tmp_models/model_' + str(i + 1) + '.h5'
		# load model from file
		model = load_model(filename) 
		# add to list of members
		all_models.append(model)
		print('>loaded %s' % filename)
	return all_models

# define stacked model from multiple member input models
def define_stacked_model(members):
	# update all layers in all models to not be trainable
	for i in range(len(members)):
		model = members[i]
		for layer in model.layers:  # freeze every layer of the already-trained model so it is not updated again
			# make not trainable
			layer.trainable = False
			# rename to avoid 'unique layer name' issue
			# (if your Keras version treats layer.name as read-only, assigning layer._name is a common workaround)
			layer.name = 'ensemble_' + str(i+1) + '_' + layer.name
	# define multi-headed input
	ensemble_visible = [model.input for model in members]  # input tensors of the n member models
	# concatenate merge output from each model
	ensemble_outputs = [model.output for model in members]  # output tensors of the n member models
	merge = concatenate(ensemble_outputs)  # a thin wrapper around tf.concat(); see https://blog.csdn.net/leviopku/article/details/82380710
	hidden = Dense(10, activation='relu')(merge)
	output = Dense(3, activation='softmax')(hidden)
	model = Model(inputs=ensemble_visible, outputs=output)
	# plot graph of ensemble
	plot_model(model, show_shapes=True, to_file='model_graph.png')
	# compile
	model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
	return model

# fit a stacked model
def fit_stacked_model(model, inputX, inputy):
	# prepare input data
	X = [inputX for _ in range(len(model.input))]
	# encode output data
	inputy_enc = to_categorical(inputy)
	# fit model
	model.fit(X, inputy_enc, epochs=300, verbose=0)

# make a prediction with a stacked model
def predict_stacked_model(model, inputX):
	# prepare input data
	X = [inputX for _ in range(len(model.input))]
	# make prediction
	return model.predict(X, verbose=0)

# generate 2d classification dataset
X, y = make_blobs(n_samples=1100, centers=3, n_features=2, cluster_std=2, random_state=2)
# split into train and test
n_train = 100
trainX, testX = X[:n_train, :], X[n_train:, :]
trainy, testy = y[:n_train], y[n_train:]
print(trainX.shape, testX.shape)
# load all models
n_members = 5
members = load_all_models(n_members)
print('Loaded %d models' % len(members))
# define ensemble model
stacked_model = define_stacked_model(members)
# fit stacked model on test dataset
fit_stacked_model(stacked_model, testX, testy)
# make predictions and evaluate
yhat = predict_stacked_model(stacked_model, testX)
yhat = argmax(yhat, axis=1)
acc = accuracy_score(testy, yhat)
print('Stacked Test Accuracy: %.3f' % acc)
Using TensorFlow backend.


(100, 2) (1000, 2)
>loaded tmp_models/model_1.h5
>loaded tmp_models/model_2.h5
>loaded tmp_models/model_3.h5
>loaded tmp_models/model_4.h5
>loaded tmp_models/model_5.h5
Loaded 5 models
Stacked Test Accuracy: 0.832
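Two details of the integrated model can be illustrated without Keras: the same input array is passed once per member, and the merge step joins the member outputs along the last axis before the Dense(10) layer. The NumPy sketch below mimics both with stand-in arrays; the values (10 times the member number plus the class index) are invented so the column order is visible.

```python
import numpy as np

n_rows, n_classes, n_members = 4, 3, 5
inputX = np.zeros((n_rows, 2))
# One reference to the same input per member, as in fit_stacked_model.
multi_input = [inputX for _ in range(n_members)]

# Stand-in member outputs: 10 * (member number) + class index in every row.
outputs = [np.tile(10 * (k + 1) + np.arange(n_classes), (n_rows, 1))
           for k in range(n_members)]
merged = np.concatenate(outputs, axis=-1)  # mimics concatenate(ensemble_outputs)
print(merged.shape)  # (4, 15)
print(merged[0])     # [10 11 12 20 21 22 30 31 32 40 41 42 50 51 52]
```

The hidden layer thus receives 15 features per sample, grouped member by member, and learns how to combine them while the frozen sub-models stay unchanged.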

Original article:
How to Develop a Stacking Ensemble for Deep Learning Neural Networks in Python With Keras
