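Continuing from the previous article, this post shows how to use a CNN-LSTM model for multi-step forecasting of household power consumption, with either univariate or multivariate input.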
1. CNN-LSTM
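1.1 The CNN model
A convolutional neural network (CNN) can serve as the encoder in an encoder-decoder architecture. CNNs were not designed for sequence input as such, but a 1D CNN can read across a sequence and automatically learn its salient features. An LSTM decoder can then interpret those features. A hybrid of a CNN and an LSTM is called a CNN-LSTM model; here the two are used together in an encoder-decoder architecture. The CNN expects input with the same 3D structure as the LSTM, [samples, timesteps, features]. The only difference is that multiple features are read as separate channels, and the effect is the same.
To keep the example simple, we focus on a CNN-LSTM with univariate input; updating it for multivariate input is straightforward and is left as an exercise. As before, the input sequence is 14 days of total daily power consumption. The encoder is a simple but effective CNN: two convolutional layers followed by a max pooling layer, whose output is then flattened.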
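The 3D layout can be made concrete with a small NumPy sketch. The toy series and the names `series`, `samples` and `multi` are illustrative assumptions and do not appear in the article's code; the only point is the shape [samples, timesteps, features] that Keras Conv1D and LSTM layers both expect.

```python
import numpy as np

# Hypothetical toy data: 28 daily totals of a single variable.
series = np.arange(28, dtype=float)

n_timesteps = 14  # length of each input window, as used in this article
n_features = 1    # univariate input: one channel

# Cut the series into two non-overlapping windows and add a channel axis.
samples = series.reshape((-1, n_timesteps, n_features))
print(samples.shape)

# Multivariate input only changes the last axis, e.g. 8 variables per day.
multi = np.zeros((2, n_timesteps, 8))
print(multi.shape)
```

With univariate data the last axis has length 1; with all eight columns of the daily data set it would have length 8, and the rest of the layout stays the same.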
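The first convolutional layer reads across the input sequence and projects the result onto feature maps. The second layer performs the same operation on the feature maps created by the first, attempting to amplify any salient features. Each convolutional layer uses 64 feature maps (filters=64) and reads the input with a kernel size of three time steps (kernel_size=3). The max pooling layer (pool_size=2) then simplifies the feature maps by downsampling them to half their length, keeping the largest value in each pair of time steps. The extracted feature maps are finally flattened into one long vector, which is used as input to the decoding process. In code: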
model.add(Conv1D(filters=64, kernel_size=3, activation='relu',
                 input_shape=(n_timesteps,n_features)))
model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
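To see what these four layers do to a 14-day input, the output lengths can be worked out by hand. The sketch below assumes the Keras defaults (padding='valid' and strides=1 for Conv1D, and a pooling stride equal to pool_size for MaxPooling1D); the helper names `conv1d_out_len` and `maxpool1d_out_len` are hypothetical and exist only for this illustration.

```python
def conv1d_out_len(n_in, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding."""
    return (n_in - kernel_size) // stride + 1


def maxpool1d_out_len(n_in, pool_size):
    """Output length of 1D max pooling whose stride equals pool_size."""
    return (n_in - pool_size) // pool_size + 1


n_timesteps, filters = 14, 64
after_conv1 = conv1d_out_len(n_timesteps, kernel_size=3)  # first Conv1D
after_conv2 = conv1d_out_len(after_conv1, kernel_size=3)  # second Conv1D
after_pool = maxpool1d_out_len(after_conv2, pool_size=2)  # MaxPooling1D
flattened = after_pool * filters                          # Flatten

print(after_conv1, after_conv2, after_pool, flattened)
```

Each convolution shortens the sequence by kernel_size - 1 = 2 steps, pooling halves it, and flattening multiplies the remaining length by the number of filters.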
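1.2 Complete code
The complete code demonstrates univariate multi-step forecasting. To switch to multivariate input, only the sliding_window() and forecast() functions need to change; see the previous article for reference.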
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# Font settings so that CJK characters render correctly in plots
plt.rcParams['font.sans-serif'] = ['Microsoft JhengHei']
plt.rcParams['axes.unicode_minus'] = False
import math
import sklearn.metrics as skm
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten
from tensorflow.keras.layers import RepeatVector, TimeDistributed
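def split_dataset(data):
    '''
    Split the daily data into training and test sets made up of whole weeks.
    '''
    # data holds the daily power-consumption totals, with shape (1442, 8).
    # The test set is the last 46 full weeks (322 days) of the final year;
    # the 159 weeks (1113 days) before it form the training set.
    train, test = data[1:-328], data[-328:-6]
    train = np.array(np.split(train, len(train) // 7))  # group days into weeks
    test = np.array(np.split(test, len(test) // 7))
    return train, test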
def evaluate_forecasts(actual, predicted):
    '''
    Evaluate one or more weekly forecasts against the expected values.
    Returns the overall RMSE and a list with the RMSE of each forecast day.
    '''
    scores = list()
    for i in range(actual.shape[1]):
        mse = skm.mean_squared_error(actual[:, i], predicted[:, i])
        rmse = math.sqrt(mse)
        scores.append(rmse)
    s = 0  # accumulate squared errors for the overall RMSE
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col]) ** 2
    score = math.sqrt(s / (actual.shape[0] * actual.shape[1]))
    print('actual.shape[0]:{}, actual.shape[1]:{}'.format(actual.shape[0], actual.shape[1]))
    return score, scores
def summarize_scores(name, score, scores):
    s_scores = ', '.join(['%.1f' % s for s in scores])
    print('%s: [%.3f] %s\n' % (name, score, s_scores))
def sliding_window(train, sw_width=7, n_out=7, in_start=0):
    '''
    Cut the series into input/output samples with a sliding window
    (default width 7) that moves one step at a time.
    '''
    # Flatten the weekly samples into a single daily sequence
    data = train.reshape((train.shape[0] * train.shape[1], train.shape[2]))
    X, y = [], []
    for _ in range(len(data)):
        in_end = in_start + sw_width
        out_end = in_end + n_out
        # Keep a sample only if both windows fit inside the series;
        # out_end is an exclusive slice bound, so it may equal len(data)
        if out_end <= len(data):
            train_seq = data[in_start:in_end, 0]
            train_seq = train_seq.reshape((len(train_seq), 1))
            X.append(train_seq)
            y.append(data[in_end:out_end, 0])
        in_start += 1  # slide the window forward by one day
    return np.array(X), np.array(y)
def cnn_lstm_model(train, sw_width, in_start=0, verbose_set=0, epochs_num=20, batch_size_set=4):
    '''
    Define and train the CNN-LSTM encoder-decoder model.
    '''
    train_x, train_y = sliding_window(train, sw_width, in_start=in_start)
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    # The decoder emits one value per output step: [samples, n_outputs, 1]
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    model = Sequential()
    # Encoder: two convolutional layers, max pooling, then flatten
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu',
                     input_shape=(n_timesteps,n_features)))
    model.add(Conv1D(filters=64, kernel_size=3, activation='relu'))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    # Decoder: repeat the encoded vector once per output step
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    # Regression task: MSE loss (classification accuracy is not meaningful here)
    model.compile(loss='mse', optimizer='adam')
    print(model.summary())
    model.fit(train_x, train_y,
              epochs=epochs_num, batch_size=batch_size_set, verbose=verbose_set)
    return model
def forecast(model, pred_seq, sw_width):
    '''
    Forecast the next week from the most recent sw_width days of history.
    '''
    data = np.array(pred_seq)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    input_x = data[-sw_width:, 0]  # the last sw_width days of the history
    input_x = input_x.reshape((1, len(input_x), 1))  # reshape to [1, sw_width, 1]
    yhat = model.predict(input_x, verbose=0)  # forecast the next week
    yhat = yhat[0]  # extract the forecast vector
    return yhat
def evaluate_model(model, train, test, sd_width):
    '''
    Evaluate the model with walk-forward validation over the test weeks.
    '''
    history_fore = [x for x in train]
    predictions = list()  # walk-forward validation results, one entry per week
    for i in range(len(test)):
        yhat_sequence = forecast(model, history_fore, sd_width)  # forecast the next week
        predictions.append(yhat_sequence)  # store the forecast
        history_fore.append(test[i, :])  # add the real observations to the history for the next forecast
    predictions = np.array(predictions)
    # Score the forecasts for each day of the week
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores
def model_plot(score, scores, days, name):
    '''
    Plot the RMSE for each forecast day.
    '''
    plt.figure(figsize=(8,6), dpi=150)
    plt.plot(days, scores, marker='o', label=name)
    plt.grid(linestyle='--', alpha=0.5)
    plt.ylabel(r'$RMSE$', size=15)
    plt.title('CNN-LSTM model forecast results', size=18)
    plt.legend()
    plt.show()
def main_run(dataset, sw_width, days, name, in_start, verbose, epochs, batch_size):
    '''
    Main routine: data preparation, model training and evaluation.
    '''
    # Split into training and test sets
    train, test = split_dataset(dataset.values)
    # Train the model, passing the caller's settings through
    model = cnn_lstm_model(train, sw_width, in_start, verbose_set=verbose,
                           epochs_num=epochs, batch_size_set=batch_size)
    # Compute the RMSE
    score, scores = evaluate_model(model, train, test, sw_width)
    # Print the scores
    summarize_scores(name, score, scores)
    # Plot
    model_plot(score, scores, days, name)
    print('------ Not enough hair? Wear a hat. ------')
if __name__ == '__main__':
    dataset = pd.read_csv('household_power_consumption_days.csv', header=0,
                          engine='c',
                          parse_dates=['datetime'], index_col=['datetime'])
    days = ['sun', 'mon', 'tue', 'wed', 'thu', 'fri', 'sat']
    name = 'CNN-LSTM'
    sliding_window_width = 14
    input_sequence_start = 0
    epochs_num = 20
    batch_size_set = 16
    verbose_set = 0
    main_run(dataset, sliding_window_width, days, name, input_sequence_start,
             verbose_set, epochs_num, batch_size_set)
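Running the example summarizes the model's performance on the test set. Experiments indicate that using two convolutional layers makes the model more stable than using a single layer. Because training is stochastic, results vary from run to run; in this run the model performs well, with an overall RMSE of about 390 kilowatts.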
Model: "sequential_9"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
conv1d (Conv1D)              (None, 12, 64)            256
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 10, 64)            12352
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 5, 64)             0
_________________________________________________________________
flatten (Flatten)            (None, 320)               0
_________________________________________________________________
repeat_vector_5 (RepeatVecto (None, 7, 320)            0
_________________________________________________________________
lstm_13 (LSTM)               (None, 7, 200)            416800
_________________________________________________________________
time_distributed_10 (TimeDis (None, 7, 100)            20100
_________________________________________________________________
time_distributed_11 (TimeDis (None, 7, 1)              101
=================================================================
Total params: 449,609
Trainable params: 449,609
Non-trainable params: 0
_________________________________________________________________
None
actual.shape[0]:46, actual.shape[1]:7
CNN-LSTM: [390.960] 411.0, 380.4, 339.6, 391.1, 375.2, 312.5, 499.6
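The parameter counts in the summary can be reproduced with the standard formulas for each layer type. This is only a sanity-check sketch: the helper names `conv1d_params`, `lstm_params` and `dense_params` are hypothetical, and the layer sizes are the ones used in the model above (univariate input, 64 filters, kernel size 3, 200 LSTM units, 100 dense units).

```python
def conv1d_params(kernel_size, in_channels, filters):
    # one kernel of size kernel_size x in_channels per filter, plus a bias each
    return kernel_size * in_channels * filters + filters


def lstm_params(input_dim, units):
    # four gates, each with input weights, recurrent weights and a bias
    return 4 * ((input_dim + units) * units + units)


def dense_params(input_dim, units):
    return input_dim * units + units


counts = {
    'conv1d': conv1d_params(3, 1, 64),
    'conv1d_1': conv1d_params(3, 64, 64),
    'lstm': lstm_params(320, 200),        # 320 = 5 time steps x 64 filters
    'dense_100': dense_params(200, 100),  # applied per step via TimeDistributed
    'dense_1': dense_params(100, 1),
}
total = sum(counts.values())
print(counts, total)
```

Pooling, Flatten and RepeatVector have no trainable weights, and a TimeDistributed wrapper shares one set of Dense weights across all seven output steps, which is why the LSTM accounts for most of the parameters.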
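RMSE curve:
With the day labels used in the code, Tuesday and Friday are the easiest days to forecast (RMSE 339.6 and 312.5), while Saturday is the hardest (RMSE 499.6).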
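The overall score and the per-day scores are linked: because every forecast day has the same number of test weeks, the overall RMSE equals the square root of the mean of the squared per-day RMSE values. The short check below uses the rounded per-day values printed in the run above, so it can only match the reported overall score approximately; the variable names are illustrative.

```python
import math

# Per-day RMSE values copied from the run above (rounded to one decimal).
days = ['sun', 'mon', 'tue', 'wed', 'thu', 'fri', 'sat']
daily_rmse = [411.0, 380.4, 339.6, 391.1, 375.2, 312.5, 499.6]

# Overall RMSE = sqrt(mean of squared per-day RMSEs) for equal sample counts.
overall = math.sqrt(sum(r ** 2 for r in daily_rmse) / len(daily_rmse))

easiest = days[daily_rmse.index(min(daily_rmse))]
hardest = days[daily_rmse.index(max(daily_rmse))]
print(overall, easiest, hardest)
```

Note that the overall RMSE is not the plain average of the seven daily values: squaring gives the worst day more weight.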
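This article used univariate input to explain how to build a CNN-LSTM model for time series forecasting. For multivariate forecasting, only the sliding_window() and forecast() functions need to change; see the previous article for reference.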
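As a rough idea of what that change could look like, here is a minimal sketch of a multivariate window function. It is not the version from the previous article: the name `sliding_window_multivariate`, the `target_col` parameter and the toy data are assumptions made for illustration, and column 0 is assumed to hold the value being forecast.

```python
import numpy as np


def sliding_window_multivariate(train, sw_width=14, n_out=7, target_col=0):
    """Hypothetical multivariate variant of sliding_window()."""
    # Flatten [weeks, 7, features] into [days, features].
    data = train.reshape((train.shape[0] * train.shape[1], train.shape[2]))
    X, y = [], []
    for in_start in range(len(data) - sw_width - n_out + 1):
        in_end = in_start + sw_width
        out_end = in_end + n_out
        X.append(data[in_start:in_end, :])          # every feature becomes a channel
        y.append(data[in_end:out_end, target_col])  # forecast the target column only
    return np.array(X), np.array(y)


# Toy data (assumption): 4 weeks of 8 variables.
toy = np.arange(4 * 7 * 8, dtype=float).reshape((4, 7, 8))
X, y = sliding_window_multivariate(toy)
print(X.shape, y.shape)
```

Under the same assumptions, forecast() would take data[-sw_width:, :] instead of a single column and reshape it to (1, sw_width, n_features) before calling model.predict().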
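Note that the implementation above simply uses a for loop to slide a window over the data one step at a time. In practice, a custom function can implement a sliding window with an arbitrary step; a later article will show how.
The next article introduces the ConvLSTM model and applies it to the same household power consumption data set.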
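One possible shape for such a function is sketched below. This is an assumption about how it might be written, not the implementation promised for the later article; the name `sliding_window_strided` and its `stride` parameter are hypothetical.

```python
import numpy as np


def sliding_window_strided(series, sw_width=14, n_out=7, stride=1):
    """Hypothetical sliding window over a 1D series with a configurable step."""
    series = np.asarray(series, dtype=float)
    # Last start index at which both the input and output windows still fit.
    last_start = len(series) - sw_width - n_out
    starts = range(0, last_start + 1, stride)
    X = np.array([series[s:s + sw_width] for s in starts])
    y = np.array([series[s + sw_width:s + sw_width + n_out] for s in starts])
    # Add the channel axis expected by Conv1D: [samples, sw_width, 1].
    return X.reshape((X.shape[0], sw_width, 1)), y


toy_series = np.arange(42)  # toy data (assumption): 6 weeks of daily values
X1, y1 = sliding_window_strided(toy_series, stride=1)
X7, y7 = sliding_window_strided(toy_series, stride=7)
print(X1.shape, y1.shape, X7.shape, y7.shape)
```

A step of 1 yields the most training samples, while a step of 7 keeps every sample aligned to whole weeks at the cost of far fewer samples.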