Encoder-Decoder LSTM Model for Multi-Step Time Series Forecasting of Household Power Consumption

In this section, we can update the vanilla LSTM to use an encoder-decoder model. This means that the model will not output a vector sequence directly. Instead, the model will be comprised of two sub-models: an encoder that reads and encodes the input sequence, and a decoder that reads the encoded input sequence and makes a one-step prediction for each element in the output sequence. The difference is subtle, as in practice both approaches do predict a sequence output. The important difference is that an LSTM model is used in the decoder, allowing it both to know what was predicted for the prior day in the sequence and to accumulate internal state while outputting the sequence. Let's take a closer look at how this model is defined. As before, we define an LSTM hidden layer with 200 units. This is the encoder model, which will read the input sequence and output a 200-element vector (one output per unit) that captures features from the input sequence.

We will use 14 days of total power consumption as input.

# define model
model = Sequential()
# encoder: n_timesteps=14 days of input, n_features=1 (total daily power)
model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))

We will use a simple encoder-decoder architecture that is easy to implement in Keras and that has a lot of similarity to the architecture of an LSTM autoencoder. First, the internal representation of the input sequence is repeated multiple times, once for each time step in the output sequence. This sequence of vectors will be presented to the LSTM decoder.

model.add(RepeatVector(7))

We then define the decoder as an LSTM hidden layer with 200 units. Importantly, the decoder will output the entire sequence, not just the output at the end of the sequence as we did with the encoder. This means that each of the 200 units will output a value for each of the seven days, representing the basis for what to predict for each day in the output sequence.

model.add(LSTM(200, activation='relu', return_sequences=True))

We will then use a fully connected layer to interpret each time step in the output sequence before the final output layer. Importantly, the output layer predicts a single step in the output sequence, not all seven days at once. This means the same layers will be applied to every step in the output sequence: the same fully connected layer and output layer will be used to process each time step provided by the decoder. To achieve this, we wrap the interpretation layer and the output layer in a TimeDistributed wrapper, which allows the wrapped layers to be reused for each time step from the decoder.

model.add(TimeDistributed(Dense(100, activation='relu')))
model.add(TimeDistributed(Dense(1)))

This allows the LSTM decoder to figure out the context required for each step in the output sequence, while the wrapped dense layers interpret each time step separately, reusing the same weights to perform the interpretation. An alternative would be to flatten all of the structure created by the LSTM decoder and output the vector directly; you can try this as an extension to see how it compares, and a sketch of that variant follows the reshape snippet below. The net result is that the network outputs a three-dimensional vector with the same structure as the input, with the dimensions [samples, timesteps, features]. There is a single feature, the daily total power consumed, and there are always seven time steps, so a single one-week prediction will have the size [1, 7, 1]. Therefore, when training the model, we must restructure the output data (y) to have this three-dimensional structure instead of the two-dimensional [samples, features] structure used in the previous section.

# reshape output into [samples, timesteps, features]
train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
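
As mentioned above, an alternative to the TimeDistributed head is to flatten the structure created by the LSTM decoder and output the week as a flat vector. A minimal sketch of that variant is shown below; it is an illustrative assumption rather than the tutorial's model, and note that with a 2D output the reshape of train_y would no longer be needed.

# sketch of the flatten alternative: collapse the decoder's [7, 200]
# sequence output and predict all seven days at once as a flat vector
from keras.models import Sequential
from keras.layers import LSTM, RepeatVector, Flatten, Dense
model = Sequential()
model.add(LSTM(200, activation='relu', input_shape=(14, 1)))    # encoder
model.add(RepeatVector(7))                                      # one copy per output day
model.add(LSTM(200, activation='relu', return_sequences=True))  # decoder
model.add(Flatten())                                            # 7 x 200 -> 1400 values
model.add(Dense(7))                                             # all seven days at once
model.compile(loss='mse', optimizer='adam')

The updated build_model() function with all of these changes is listed below.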
# train the model
def build_model(train, n_input):
	# prepare data
	train_x, train_y = to_supervised(train, n_input)
	# define parameters
	verbose, epochs, batch_size = 0, 20, 16
	n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
	# reshape output into [samples, timesteps, features]
	train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
	# define model
	model = Sequential()
	model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
	model.add(RepeatVector(n_outputs))
	model.add(LSTM(200, activation='relu', return_sequences=True))
	model.add(TimeDistributed(Dense(100, activation='relu')))
	model.add(TimeDistributed(Dense(1)))
	model.compile(loss='mse', optimizer='adam')
	# fit network
	model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
	return model
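
Tying all of this together, the complete example of the encoder-decoder LSTM for multi-step forecasting is listed below.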
# univariate multi-step encoder-decoder lstm
from math import sqrt
from numpy import split
from numpy import array
from pandas import read_csv
from sklearn.metrics import mean_squared_error
from matplotlib import pyplot
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Flatten
from keras.layers import LSTM
from keras.layers import RepeatVector
from keras.layers import TimeDistributed

# split a univariate dataset into train/test sets
def split_dataset(data):
	# split into standard weeks
	train, test = data[1:-328], data[-328:-6]
	# restructure into windows of weekly data
	train = array(split(train, len(train)/7))
	test = array(split(test, len(test)/7))
	return train, test

# evaluate one or more weekly forecasts against expected values
def evaluate_forecasts(actual, predicted):
	scores = list()
	# calculate an RMSE score for each day
	for i in range(actual.shape[1]):
		# calculate mse
		mse = mean_squared_error(actual[:, i], predicted[:, i])
		# calculate rmse
		rmse = sqrt(mse)
		# store
		scores.append(rmse)
	# calculate overall RMSE
	s = 0
	for row in range(actual.shape[0]):
		for col in range(actual.shape[1]):
			s += (actual[row, col] - predicted[row, col])**2
	score = sqrt(s / (actual.shape[0] * actual.shape[1]))
	return score, scores

# summarize scores
def summarize_scores(name, score, scores):
	s_scores = ', '.join(['%.1f' % s for s in scores])
	print('%s: [%.3f] %s' % (name, score, s_scores))

# convert history into inputs and outputs
def to_supervised(train, n_input, n_out=7):
	# flatten data
	data = train.reshape((train.shape[0]*train.shape[1], train.shape[2]))
	X, y = list(), list()
	in_start = 0
	# step over the entire history one time step at a time
	for _ in range(len(data)):
		# define the end of the input sequence
		in_end = in_start + n_input
		out_end = in_end + n_out
		# ensure we have enough data for this instance
		if out_end < len(data):
			x_input = data[in_start:in_end, 0]
			x_input = x_input.reshape((len(x_input), 1))
			X.append(x_input)
			y.append(data[in_end:out_end, 0])
		# move along one time step
		in_start += 1
	return array(X), array(y)

# train the model
def build_model(train, n_input):
	# prepare data
	train_x, train_y = to_supervised(train, n_input)
	# define parameters
	verbose, epochs, batch_size = 0, 20, 16
	n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
	# reshape output into [samples, timesteps, features]
	train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
	# define model
	model = Sequential()
	model.add(LSTM(200, activation='relu', input_shape=(n_timesteps, n_features)))
	model.add(RepeatVector(n_outputs))
	model.add(LSTM(200, activation='relu', return_sequences=True))
	model.add(TimeDistributed(Dense(100, activation='relu')))
	model.add(TimeDistributed(Dense(1)))
	model.compile(loss='mse', optimizer='adam')
	# fit network
	model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size, verbose=verbose)
	return model

# make a forecast
def forecast(model, history, n_input):
	# flatten data
	data = array(history)
	data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
	# retrieve last observations for input data
	input_x = data[-n_input:, 0]
	# reshape into [1, n_input, 1]
	input_x = input_x.reshape((1, len(input_x), 1))
	# forecast the next week
	yhat = model.predict(input_x, verbose=0)
	# we only want the vector forecast
	yhat = yhat[0]
	return yhat

# evaluate a single model
def evaluate_model(train, test, n_input):
	# fit model
	model = build_model(train, n_input)
	# history is a list of weekly data
	history = [x for x in train]
	# walk-forward validation over each week
	predictions = list()
	for i in range(len(test)):
		# predict the week
		yhat_sequence = forecast(model, history, n_input)
		# store the predictions
		predictions.append(yhat_sequence)
		# get real observation and add to history for predicting the next week
		history.append(test[i, :])
	# evaluate predictions days for each week
	predictions = array(predictions)
	score, scores = evaluate_forecasts(test[:, :, 0], predictions)
	return score, scores

# load the new file
dataset = read_csv('household_power_consumption_days.csv', header=0, infer_datetime_format=True, parse_dates=['datetime'], index_col=['datetime'])
# split into train and test
train, test = split_dataset(dataset.values)
# evaluate model and get scores
n_input = 14
score, scores = evaluate_model(train, test, n_input)
# summarize scores
summarize_scores('lstm', score, scores)
# plot scores
days = ['sun', 'mon', 'tue', 'wed', 'thr', 'fri', 'sat']
pyplot.plot(days, scores, marker='o', label='lstm')
pyplot.show()
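
Running the example fits the model, reports the overall and per-day RMSE on the test set, and plots the daily scores; given the stochastic nature of neural networks, your specific results will vary. As a follow-on usage sketch (not part of the tutorial), the same forecast() helper could then produce the next, unseen week from the full history:

# hypothetical follow-on: forecast the week after the test period
model = build_model(train, n_input)
history = [x for x in train] + [x for x in test]
yhat_sequence = forecast(model, history, n_input)  # shape (7, 1)
print(yhat_sequence[:, 0])  # seven daily predictions of total power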
