Continuing from the previous post, this article shows how to use a ConvLSTM model to forecast electricity consumption/generation.

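Articles in this series on LSTM-based electricity consumption/generation forecasting:

[Part 1] Encoder-Decoder LSTM model for electricity consumption/generation forecasting
[Part 2] CNN-LSTM model for electricity consumption/generation forecasting
[Part 3] This article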

Contents

1. ConvLSTM

1. ConvLSTM

1.1 CNN Model

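A further extension of the CNN-LSTM approach is to perform the CNN's convolutions (that is, the way the CNN reads the input sequence data) as part of the LSTM at each time step. This combination is called ConvLSTM and, like CNN-LSTM, it is also used for spatio-temporal data. Unlike an LSTM, which reads the data directly to compute its internal state and state transitions, and unlike a CNN-LSTM, which interprets the output of a CNN model, a ConvLSTM uses convolutions directly as part of reading the input into the LSTM units. The Keras library provides the ConvLSTM2D class, which supports ConvLSTM models for two-dimensional data. It can be configured for one-dimensional multivariate time series forecasting. By default, ConvLSTM2D expects input data of shape [samples, timesteps, rows, cols, channels].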

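Each time step of the data is defined as an image of (rows × cols) data points. We are working with a one-dimensional sequence of total power consumption, so if we use two weeks of data as input, rows is 1 and cols is 14. The ConvLSTM would then read this data in one go: the LSTM reads a single time step of 14 days and performs the convolution across those 14 days.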

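For our task, the 14 days can instead be split into two subsequences of 7 days each. The ConvLSTM then reads two time steps and performs the CNN processing on the 7 days of data within each time step. For this framing of the problem, the input shape for ConvLSTM2D is therefore [n, 2, 1, 7, 1], where:

Samples: n, the number of samples in the training dataset.
Timesteps: 2, because each 14-day sampling window is split into two subsequences.
Rows: 1, the one-dimensional shape of each subsequence, i.e. how many rows it has.
Cols: 7, the number of columns in each subsequence.
Channels: 1. This concept comes from image recognition, where it is the number of channels. In time series forecasting it is simply the number of features, a point stressed repeatedly in the earlier articles. Because the requirement in this example is to predict next week's daily total power consumption from past daily total power consumption, the number of channels (features) is 1, representing daily total power consumption. If other features are added, this dimension must change accordingly. A look at the dataset makes this clear.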
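The shape described above can be checked with a small NumPy sketch. The toy series below (the integers 0 to 29) is a made-up stand-in for daily total power, and the variable names `series`, `windows` and `x` are illustrative assumptions rather than part of the original code; only `n_steps`, `n_length` and `n_features` follow the article.

```python
import numpy as np

n_steps, n_length, n_features = 2, 7, 1
sw_width = n_steps * n_length  # 14-day input window

# Toy stand-in for the daily total power series (not real data)
series = np.arange(30, dtype=float)

# Overlapping 14-day windows taken with a stride of 1
windows = np.array([series[i:i + sw_width]
                    for i in range(len(series) - sw_width + 1)])

# Reshape to [samples, timesteps, rows, cols, channels]
x = windows.reshape((windows.shape[0], n_steps, 1, n_length, n_features))
print(x.shape)  # (17, 2, 1, 7, 1)

# The two time steps of the first sample hold days 0-6 and days 7-13
first_week = x[0, 0, 0, :, 0]
second_week = x[0, 1, 0, :, 0]
print(first_week, second_week)
```

Because the reshape only regroups values that are already contiguous, the first time step of each sample holds the earlier week and the second time step the later week.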

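Other configurations are worth exploring, such as using the previous 21 days of total power consumption as input split into 3 subsequences, and/or providing all eight features (channels) as input. ConvLSTM2D requires the training dataset to be reshaped into the structure [samples, timesteps, rows, cols, channels]. Compared with the complete CNN-LSTM code, the following changes are needed: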

1. Reshape the training samples:

train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))

2. Set the input shape of the ConvLSTM model:

model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu', input_shape=(n_steps, 1, n_length, n_features)))
model.add(Flatten())

3. Reshape the test samples:

input_x = input_x.reshape((1, n_steps, 1, n_length, 1))

1.2 Complete Code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Configure a font that can render Chinese characters in plots
plt.rcParams['font.sans-serif'] = ['Microsoft JhengHei']
plt.rcParams['axes.unicode_minus'] = False

import math
import sklearn.metrics as skm
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Flatten
from tensorflow.keras.layers import RepeatVector, TimeDistributed
from tensorflow.keras.layers import ConvLSTM2D


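def split_dataset(data):
    '''
    Split the data into training and test sets, organised in whole weeks.
    '''
    # data holds the daily power consumption statistics, with shape (1442, 8).
    # The test set is the final year: 46 weeks (322 days). The remaining 159 weeks
    # (1113 days) form the training set. The slices below implement this.
    train, test = data[1:-328], data[-328:-6]
    train = np.array(np.split(train, len(train) // 7)) # group the daily rows into weeks
    test = np.array(np.split(test, len(test) // 7))
    return train, test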

def evaluate_forecasts(actual, predicted):
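    '''
    Evaluate one or more weekly forecasts against the expected values.
    Approach: compute the RMSE for each forecast day, then an overall RMSE.
    '''
    scores = list()
    for i in range(actual.shape[1]):
        mse = skm.mean_squared_error(actual[:, i], predicted[:, i])
        rmse = math.sqrt(mse)
        scores.append(rmse)
    
    s = 0 # accumulate squared errors for the overall RMSE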
    for row in range(actual.shape[0]):
        for col in range(actual.shape[1]):
            s += (actual[row, col] - predicted[row, col]) ** 2
    score = math.sqrt(s / (actual.shape[0] * actual.shape[1]))
    print('actual.shape[0]:{}, actual.shape[1]:{}'.format(actual.shape[0], actual.shape[1]))
    return score, scores

def summarize_scores(name, score, scores):
    s_scores = ', '.join(['%.1f' % s for s in scores])
    print('%s: [%.3f] %s\n' % (name, score, s_scores))
    
def sliding_window(train, sw_width=7, n_out=7, in_start=0):
    '''
    Extract input/output samples with a sliding window of width sw_width
    (default 7) and a stride of 1.
    '''
    data = train.reshape((train.shape[0] * train.shape[1], train.shape[2])) # flatten the weekly samples into one daily sequence
    X, y = [], []
    
    for _ in range(len(data)):
        in_end = in_start + sw_width
        out_end = in_end + n_out
        
        # Keep a sample only if the whole window fits inside the sequence; otherwise drop it
        if out_end <= len(data):
            # slide over the training data with a stride of 1
            train_seq = data[in_start:in_end, 0]
            train_seq = train_seq.reshape((len(train_seq), 1))
            X.append(train_seq)
            y.append(data[in_end:out_end, 0])
        in_start += 1
        
    return np.array(X), np.array(y)

def conv_lstm_model(train, sw_width, n_steps, n_length, in_start=0, verbose_set=0, epochs_num=20, batch_size_set=4):
    '''
    Define and train the ConvLSTM Encoder-Decoder model.
    '''
    train_x, train_y = sliding_window(train, sw_width, in_start=in_start)
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
    
    # reshape to [samples, timesteps, rows, cols, channels]
    train_x = train_x.reshape((train_x.shape[0], n_steps, 1, n_length, n_features))
    train_y = train_y.reshape((train_y.shape[0], train_y.shape[1], 1))
    
    model = Sequential()
    model.add(ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu',
                         input_shape=(n_steps, 1, n_length, n_features)))
    model.add(Flatten())
    model.add(RepeatVector(n_outputs))
    model.add(LSTM(200, activation='relu', return_sequences=True))
    model.add(TimeDistributed(Dense(100, activation='relu')))
    model.add(TimeDistributed(Dense(1)))
    
    # regression task: track the MSE loss only (accuracy is not meaningful here)
    model.compile(loss='mse', optimizer='adam')
    print(model.summary())
    
    model.fit(train_x, train_y,
              epochs=epochs_num, batch_size=batch_size_set, verbose=verbose_set)
    return model

def forecast(model, pred_seq, sw_width, n_length, n_steps):
    '''
    Forecast the next week from the history in pred_seq.
    '''
    data = np.array(pred_seq)
    data = data.reshape((data.shape[0]*data.shape[1], data.shape[2]))
    
    input_x = data[-sw_width:, 0] # take the last sw_width days of the history as input
    input_x = input_x.reshape((1, n_steps, 1, n_length, 1))
    
    yhat = model.predict(input_x, verbose=0) # predict next week's data
    yhat = yhat[0] # extract the prediction vector
    return yhat

def evaluate_model(model, train, test, sw_width, n_length, n_steps):
    '''
    Evaluate the model with walk-forward validation.
    '''
    history_fore = [x for x in train]
    predictions = list() # stores the walk-forward validation result for each week
    for i in range(len(test)):
        yhat_sequence = forecast(model, history_fore, sw_width, n_length, n_steps) # predict next week's data
        predictions.append(yhat_sequence) # store the prediction
        history_fore.append(test[i, :]) # add the real observation to the history before predicting the following week
    
    predictions = np.array(predictions) # evaluate the prediction for each day of the week
    score, scores = evaluate_forecasts(test[:, :, 0], predictions)
    return score, scores

def model_plot(score, scores, days, name):
    '''
    Plot the per-day RMSE curve.
    '''
    plt.figure(figsize=(8,6), dpi=150)
    plt.plot(days, scores, marker='o', label=name)
    plt.grid(linestyle='--', alpha=0.5)
    plt.ylabel(r'$RMSE$', size=15)
    plt.title('Conv-LSTM model forecast results',  size=18)
    plt.legend()
    plt.show()
    
def main_run(dataset, sw_width, days, name, in_start, verbose, epochs, batch_size, n_steps, n_length):
    '''
    Main routine: data preparation and model training workflow.
    '''
    # split into training and test sets
    train, test = split_dataset(dataset.values)
    # train the model, passing the caller's settings through
    model = conv_lstm_model(train, sw_width,  n_steps, n_length, in_start, verbose_set=verbose, epochs_num=epochs, batch_size_set=batch_size)
    # compute the RMSE
    score, scores = evaluate_model(model, train, test, sw_width, n_length, n_steps)
    # print the scores
    summarize_scores(name, score, scores)
    # plot
    model_plot(score, scores, days, name)
    
    print('------ Not enough hair? A hat will do ------')
    
    
if __name__ == '__main__':
    
    dataset = pd.read_csv('household_power_consumption_days.csv', header=0, 
                   infer_datetime_format=True, engine='c',
                   parse_dates=['datetime'], index_col=['datetime'])
    
    days = ['sun', 'mon', 'tue', 'wed', 'thu', 'fri', 'sat']
    name = 'Conv-LSTM'
    
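    # Define the number and length of the subsequences
    '''
    n_steps: number of subsequences; here 2, splitting the 14 days into two 7-day subsequences.
    n_length: number of elements in each subsequence row, i.e. the number of columns.
    '''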
    n_steps, n_length = 2, 7
    
    sliding_window_width= n_length * n_steps
    input_sequence_start=0
    
    epochs_num=20
    batch_size_set=16
    verbose_set=0
    
    
    main_run(dataset, sliding_window_width, days, name, input_sequence_start,
             verbose_set, epochs_num, batch_size_set, n_steps, n_length)

Output:

Model: "sequential_12"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv_lst_m2d_2 (ConvLSTM2D)  (None, 1, 5, 64)          50176     
_________________________________________________________________
flatten_3 (Flatten)          (None, 320)               0         
_________________________________________________________________
repeat_vector_8 (RepeatVecto (None, 7, 320)            0         
_________________________________________________________________
lstm_16 (LSTM)               (None, 7, 200)            416800    
_________________________________________________________________
time_distributed_16 (TimeDis (None, 7, 100)            20100     
_________________________________________________________________
time_distributed_17 (TimeDis (None, 7, 1)              101       
=================================================================
Total params: 487,177
Trainable params: 487,177
Non-trainable params: 0
_________________________________________________________________
None
actual.shape[0]:46, actual.shape[1]:7
Conv-LSTM: [382.156] 391.3, 386.4, 340.5, 388.9, 364.4, 309.1, 473.6

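Running the example summarizes the model's performance on the test set. According to the experiments, using two convolutional layers made the model more stable than using a single layer; note that the listing above uses one ConvLSTM2D layer. In this case the model performs reasonably well, with an overall RMSE of about 382 kilowatts.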
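The shapes and parameter counts in the model summary can be reproduced by hand, which is a useful sanity check when changing `n_length`, the kernel size or the number of filters. The sketch below is plain Python with hypothetical helper names (`conv_lstm2d_params`, `lstm_params`, `dense_params`). It assumes the Keras defaults used in the listing: 'valid' padding, a bias term, and four gates in each recurrent layer.

```python
def conv_lstm2d_params(filters, kernel_size, in_channels):
    """Weights of a ConvLSTM2D layer: 4 gates, each with an input kernel,
    a recurrent kernel and a bias."""
    kh, kw = kernel_size
    return 4 * filters * (kh * kw * (in_channels + filters) + 1)

def lstm_params(units, input_dim):
    """Weights of an LSTM layer: 4 gates over [input, hidden state, bias]."""
    return 4 * units * (input_dim + units + 1)

def dense_params(units, input_dim):
    """Weights of a Dense layer: one weight per input plus a bias, per unit."""
    return units * (input_dim + 1)

n_length, kernel_w, filters = 7, 3, 64

# 'valid' padding shrinks the 7 columns to 7 - 3 + 1 = 5
conv_cols = n_length - kernel_w + 1
# Flatten turns (1, 5, 64) into a vector of 320 values
flat = 1 * conv_cols * filters

counts = [
    conv_lstm2d_params(filters, (1, kernel_w), in_channels=1),  # ConvLSTM2D
    lstm_params(200, flat),                                     # decoder LSTM
    dense_params(100, 200),                                     # TimeDistributed(Dense(100))
    dense_params(1, 100),                                       # TimeDistributed(Dense(1))
]
total = sum(counts)
print(conv_cols, flat, counts, total)
```

The per-layer values match the Param # column in the summary above (50176, 416800, 20100, 101), and they add up to the reported total of 487,177.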

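Extensions

Input size: explore the number of input days used by the model, for example 3, 21 or 30 days.
Model tuning: tune the model structure and hyperparameters to further improve performance.
Data scaling: explore whether data scaling (such as standardization and normalization) can improve the performance of the LSTM model.
Learning diagnostics: use diagnostics (such as learning curves of training and validation loss, and mean squared error) to help tune the structure and hyperparameters of the LSTM model.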
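As one possible starting point for the data scaling extension, the sketch below standardizes a series using statistics computed from the training portion only, then maps scaled values back to the original units. It is an assumption about how scaling might be wired in, not part of the original code: the helper names (`fit_standardizer`, `transform`, `inverse_transform`) are hypothetical and the power values are made up.

```python
import numpy as np

def fit_standardizer(train_values):
    """Compute mean and standard deviation from the training data only."""
    mean = float(np.mean(train_values))
    std = float(np.std(train_values))
    if std == 0.0:
        std = 1.0  # avoid division by zero for a constant series
    return mean, std

def transform(values, mean, std):
    """Scale values to zero mean and unit variance (relative to the training set)."""
    return (np.asarray(values, dtype=float) - mean) / std

def inverse_transform(scaled, mean, std):
    """Map scaled values (e.g. model predictions) back to the original units."""
    return np.asarray(scaled, dtype=float) * std + mean

# Made-up daily totals, for illustration only
train_power = np.array([1200.0, 1500.0, 900.0, 1800.0, 1600.0])
test_power = np.array([1400.0, 1000.0])

mean, std = fit_standardizer(train_power)
train_scaled = transform(train_power, mean, std)
test_scaled = transform(test_power, mean, std)   # reuse the training statistics
restored = inverse_transform(test_scaled, mean, std)
print(train_scaled, test_scaled, restored)
```

Fitting on the training data alone keeps information from the test period out of the model inputs. Predictions would need to be passed through `inverse_transform` before computing RMSE so that the scores stay in kilowatts and remain comparable with the results above.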

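Summary

These three articles showed how to develop LSTM models for multi-step time series forecasting of household electricity consumption. They covered:

How to develop and evaluate univariate and multivariate Encoder-Decoder LSTM models for multi-step time series forecasting.
How to develop and evaluate a CNN-LSTM Encoder-Decoder model for multi-step time series forecasting.
How to develop and evaluate a ConvLSTM Encoder-Decoder model for multi-step time series forecasting.

That wraps up electricity consumption forecasting as a time series forecasting task for now. The next article begins a series on time series classification tasks, such as human activity recognition and vehicle driving behaviour recognition.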

Time Series Forecasting 18: ConvLSTM for Electricity Consumption/Generation Forecasting
