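Convolutional neural network (CNN) models can be applied to time series forecasting. Different kinds of CNN model suit different kinds of forecasting problem. This article shows how to develop several CNN models for time series forecasting in Keras with TensorFlow 2.1 as the backend. The models are demonstrated on small, contrived time series problems, and their configurations are arbitrary and untuned; hyperparameter tuning will be covered in later articles.
Start with the mind map. This article covers the first two topics in developing CNN models for time series forecasting: univariate CNN models and multivariate CNN models (items 1 and 2 in the mind map):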
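1. Univariate CNN Models
Although CNNs were traditionally developed for two-dimensional image data, they can also model univariate time series forecasting problems. A univariate time series is a dataset made up of a single sequence of observations in temporal order; a model must learn from past observations to predict the next value in the sequence. This section has two parts: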
- Data Preparation
- CNN Model
1.1 Data Preparation
Suppose we have the following sequence:
[10, 20, 30, 40, 50, 60, 70, 80, 90]
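We can split the sequence into multiple input/output patterns, called samples, in which three time steps serve as the input and one time step serves as the output y that the model learns to predict.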
X, y
10, 20, 30, 40
20, 30, 40, 50
30, 40, 50, 60
...
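We can do this with a split_sequence() function, which splits a given univariate sequence into samples, each with a specified number of input time steps and a single-time-step output. Because these functions were covered in an earlier article, they are not explained line by line here, to keep this article readable; the complete code is given at the end of each section, and the earlier article covers anything that is unclear. After this function is applied, the univariate sequence becomes: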
[10 20 30] 40
[20 30 40] 50
[30 40 50] 60
[40 50 60] 70
[50 60 70] 80
[60 70 80] 90
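Each row is one sample: the three time-step values are the input, and the single value at the end of the row is the output (y). Because the series is univariate, the number of features is 1.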
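As a minimal sketch of the windowing described above, the following plain-Python helper builds the same input/output pairs. The name `make_windows` is introduced here for illustration only; the article's own split_sequence() appears in the complete code later in this section and additionally reshapes the result for Keras.

```python
def make_windows(sequence, n_steps):
    """Split a univariate sequence into (input window, next value) pairs."""
    samples = []
    # The last usable start index leaves room for n_steps inputs plus one output.
    for start in range(len(sequence) - n_steps):
        window = sequence[start:start + n_steps]
        target = sequence[start + n_steps]
        samples.append((window, target))
    return samples


series = [10, 20, 30, 40, 50, 60, 70, 80, 90]
pairs = make_windows(series, n_steps=3)
for window, target in pairs:
    print(window, target)
```

Nine observations with a window of three yield six samples, because the last three values can only serve as outputs or as the tail of a window.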
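1.2 CNN Model
A 1D CNN is a CNN model with a convolutional hidden layer that operates over a one-dimensional sequence. In some cases, such as very long input sequences, this may be followed by a second convolutional layer, and then by a pooling layer whose job is to distill the output of the convolutional layer to its most salient elements. The convolutional and pooling layers are followed by a fully connected layer that interprets the features extracted by the convolutional part of the model. A Flatten layer sits between the convolutional layers and the fully connected layer to reduce the feature maps to a single one-dimensional vector. The code: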
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mse')
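To make the convolution and pooling steps concrete, here is a rough plain-Python sketch of what a single filter computes on one sample. The helper names and the kernel weights `[0.5, 0.5]` are assumptions chosen for illustration; a real Conv1D layer learns 64 such kernels (plus biases) during training.

```python
def conv1d_valid(sequence, kernel, bias=0.0):
    """Slide one kernel over the sequence (stride 1, no padding) and apply ReLU."""
    width = len(kernel)
    outputs = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        value = sum(w * x for w, x in zip(kernel, window)) + bias
        outputs.append(max(0.0, value))  # ReLU activation
    return outputs


def max_pool1d(values, pool_size):
    """Keep the largest value in each non-overlapping block of pool_size values."""
    return [max(values[i:i + pool_size])
            for i in range(0, len(values) - pool_size + 1, pool_size)]


# Hypothetical kernel weights; three time steps in, two convolution outputs out.
feature_map = conv1d_valid([10, 20, 30], kernel=[0.5, 0.5])
pooled = max_pool1d(feature_map, pool_size=2)
print(feature_map)  # [15.0, 25.0]
print(pooled)       # [25.0]
```

With three time steps and a kernel size of 2, each filter produces two values, and pooling with a pool size of 2 reduces them to one.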
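The key to the model is the input_shape argument: this is what the model expects as input for each sample, in terms of the number of time steps and the number of features. We are working with a univariate series, so the number of features is 1. The number of time steps is the value passed to split_sequence() when splitting the dataset.
The input shape for each sample is specified by the input_shape argument of the first hidden layer. Because there are many samples, the model expects the training data to have the shape [samples, timesteps, features]. The training data X returned by split_sequence() has the shape [samples, timesteps], so X must be reshaped to add a feature dimension to meet the CNN's input requirement. The code: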
n_features = 1
X = X.reshape((X.shape[0], X.shape[1], n_features))
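A CNN does not actually view the data as having time steps; it treats it as a sequence over which convolutional read operations can be performed, like a one-dimensional image. In the example above, we define a convolutional layer with 64 filters and a kernel size of 2. It is followed by a max pooling layer and a fully connected (Dense) layer that interprets the extracted features. Finally, the output layer predicts a single value. The model is fit with Adam, an efficient variant of stochastic gradient descent, and optimized against the mean squared error (mse) loss. With the training data prepared and the model defined, training comes next. The code: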
model.fit(X, y, epochs=1000, verbose=0)
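Once the model is fit, it can be used to make predictions. Suppose we input [70, 80, 90] to predict the next value in the sequence, expecting the model to predict something close to [100]. The CNN expects a three-dimensional input of shape [samples, timesteps, features], so the single input sample must be reshaped to three dimensions before making the prediction. The code: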
x_input = array([70, 80, 90])
x_input = x_input.reshape((1, n_steps, n_features))
yhat = model.predict(x_input, verbose=0)
Complete code:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv1D, MaxPooling1D

# Split the sequence data into samples.
def split_sequence(sequence, sw_width, n_features):
    '''
    This simple example uses a for loop to take overlapping windows with a stride of 1
    and a sliding-window width of sw_width.
    A later article will show how to use yield to implement a sliding window with a chosen stride.
    '''
    X, y = [], []

    for i in range(len(sequence)):
        # Index one past the end of the window, which is also the index of the output value.
        # Python slices exclude the end index and indexing starts at 0, so no -1 is needed.
        end_element_index = i + sw_width
        # If that index is beyond the last valid index, the window has no output value left, so stop.
        if end_element_index > len(sequence) - 1:
            break
        # Slicing gives a stride-1 sliding window over the data.
        seq_x, seq_y = sequence[i:end_element_index], sequence[end_element_index]
        X.append(seq_x)
        y.append(seq_y)

    process_X, process_y = np.array(X), np.array(y)
    process_X = process_X.reshape((process_X.shape[0], process_X.shape[1], n_features))

    print('split_sequence:\nX:\n{}\ny:\n{}\n'.format(np.array(X), np.array(y)))
    print('X_shape:{},y_shape:{}\n'.format(np.array(X).shape, np.array(y).shape))
    print('train_X:\n{}\ntrain_y:\n{}\n'.format(process_X, process_y))
    print('train_X.shape:{},trian_y.shape:{}\n'.format(process_X.shape, process_y.shape))
    return process_X, process_y

def oned_cnn_model(sw_width, n_features, X, y, test_X, epoch_num, verbose_set):
    model = Sequential()

    # For 1D convolution, data_format='channels_last' is the default. The rules are:
    # input shape (batch, steps, channels); output shape (batch, new_steps, filters);
    # new_steps changes with padding and strides.
    # With data_format='channels_first', the input shape must be (batch, channels, steps).
    model.add(Conv1D(filters=64, kernel_size=2, activation='relu',
                     strides=1, padding='valid', data_format='channels_last',
                     input_shape=(sw_width, n_features)))

    # For 1D pooling, data_format='channels_last' is also the default. The rules are:
    # 3D input tensor (batch_size, steps, features); 3D output tensor (batch_size, downsampled_steps, features).
    # With data_format='channels_first', the input shape must be (batch_size, features, steps).
    model.add(MaxPooling1D(pool_size=2, strides=None, padding='valid',
                           data_format='channels_last'))

    # Flatten's data_format argument preserves weight ordering when a model is switched from
    # one data format to another; the default is channels_last.
    # With channels_last the input shape is (batch, ..., channels); with channels_first it is (batch, channels, ...).
    # The output shape is (batch, product of the remaining dimensions).
    model.add(Flatten())

    # Dense computes output = activation(dot(input, kernel) + bias),
    # where activation is the activation function, kernel is the weight matrix created by the layer,
    # and bias is the bias vector created by the layer (only when use_bias is True).
    # 2D input (batch_size, input_dim) gives 2D output (batch_size, units).
    model.add(Dense(units=50, activation='relu',
                    use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros'))

    # We predict the value at the next time step, so units is 1.
    model.add(Dense(units=1))

    # Configure the model. Note that accuracy is not a meaningful metric for regression;
    # it is tracked here only so that the value can be printed at the end.
    model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

    # summary() prints the table itself and returns None, hence the "None" in the output.
    print('\n', model.summary())

    # X is the input data and y the targets; batch_size is the number of samples per gradient update (default 32).
    # verbose: 0 = silent, 1 = progress bar, 2 = one line per epoch.
    history = model.fit(X, y, batch_size=32, epochs=epoch_num, verbose=verbose_set)

    yhat = model.predict(test_X, verbose=0)
    print('\nyhat:', yhat)

    return model, history

if __name__ == '__main__':
    train_seq = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    sw_width = 3
    n_features = 1
    epoch_num = 1000
    verbose_set = 0

    train_X, train_y = split_sequence(train_seq, sw_width, n_features)

    # Prediction input
    x_input = np.array([70, 80, 90])
    x_input = x_input.reshape((1, sw_width, n_features))

    model, history = oned_cnn_model(sw_width, n_features, train_X, train_y, x_input, epoch_num, verbose_set)

    print('\ntrain_acc:%s'%np.mean(history.history['accuracy']), '\ntrain_loss:%s'%np.mean(history.history['loss']))
Output:
split_sequence:
X:
[[10 20 30]
[20 30 40]
[30 40 50]
[40 50 60]
[50 60 70]
[60 70 80]]
y:
[40 50 60 70 80 90]
X_shape:(6, 3),y_shape:(6,)
train_X:
[[[10]
[20]
[30]]
[[20]
[30]
[40]]
[[30]
[40]
[50]]
[[40]
[50]
[60]]
[[50]
[60]
[70]]
[[60]
[70]
[80]]]
train_y:
[40 50 60 70 80 90]
train_X.shape:(6, 3, 1),trian_y.shape:(6,)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d (Conv1D) (None, 2, 64) 192
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 1, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 64) 0
_________________________________________________________________
dense (Dense) (None, 50) 3250
_________________________________________________________________
dense_1 (Dense) (None, 1) 51
=================================================================
Total params: 3,493
Trainable params: 3,493
Non-trainable params: 0
_________________________________________________________________
None
yhat: [[101.824524]]
train_acc:0.0
train_loss:83.09048912930488
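The output shapes and parameter counts in the summary above can be reproduced by hand. The sketch below does the arithmetic in plain Python; the helper names are introduced here for illustration and are not part of Keras.

```python
def conv1d_output_steps(n_steps, kernel_size, stride=1):
    """Output length of a 1D convolution with 'valid' padding."""
    return (n_steps - kernel_size) // stride + 1


def conv1d_params(kernel_size, in_channels, filters):
    """Each filter has kernel_size * in_channels weights plus one bias."""
    return (kernel_size * in_channels + 1) * filters


def dense_params(inputs, units):
    """Each unit has one weight per input plus one bias."""
    return (inputs + 1) * units


n_steps, n_features, filters = 3, 1, 64
conv_steps = conv1d_output_steps(n_steps, kernel_size=2)  # 2 steps after the convolution
pooled_steps = conv_steps // 2                            # 1 step after pooling with pool_size=2
flat = pooled_steps * filters                             # 64 values after Flatten

counts = [
    conv1d_params(2, n_features, filters),  # conv1d
    dense_params(flat, 50),                 # dense
    dense_params(50, 1),                    # dense_1
]
print(counts, sum(counts))  # [192, 3250, 51] 3493
```

Only the convolutional layer depends on the number of input features: with two features the same formula gives 320 parameters, and with three it gives 448.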
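2. Multivariate CNN Models
Multivariate time series data has more than one observation at each time step. There are two main models for multivariate time series data:
- Multiple input series;
- Multiple parallel series
2.1 Multiple Input Series
A problem may have two or more parallel input time series and an output time series that depends on them. The input series are parallel because each has an observation at the same time steps. We can demonstrate this with a simple example of two parallel input series where the output series is the simple sum of the inputs. The code: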
in_seq1 = array([10, 20, 30, 40, 50, 60, 70, 80, 90])
in_seq2 = array([15, 25, 35, 45, 55, 65, 75, 85, 95])
out_seq = array([in_seq1[i]+in_seq2[i] for i in range(len(in_seq1))])
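We can reshape these three arrays into a single dataset in which each row is a time step and each column is a separate time series. This is the standard way of storing parallel time series in a CSV file. The code: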
in_seq1 = in_seq1.reshape((len(in_seq1), 1))
in_seq2 = in_seq2.reshape((len(in_seq2), 1))
out_seq = out_seq.reshape((len(out_seq), 1))
# For 2D arrays, hstack stacks along the second axis, i.e. column-wise, so the number of columns grows.
dataset = hstack((in_seq1, in_seq2, out_seq))
The result:
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
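As with univariate time series, we must structure this data into samples with input and output elements. A 1D CNN needs enough context to learn a mapping from an input sequence to an output value. CNNs can support parallel input time series as separate channels, analogous to the red, green, and blue components of an image. We therefore need to split the data into samples that preserve the order of observations across the two input series. If we choose three input time steps, the first sample looks like this:
Input: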
10, 15
20, 25
30, 35
Output:
65
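This may be hard to follow at first; it was covered in an earlier article, but it is worth repeating. Look at the 9×3 array above: the values of the first three time steps (rows) of each parallel input series (column), a 3×2 block of three rows and two columns, are given to the model as the input sample, and the value of the output series (the third column) at the third time step (row), 65 in this case, is given as the sample's label. Note also that in transforming the time series into input/output samples, some values of the output series have to be discarded (25 and 45 in the third column of the first two rows go unused), because the input series have no values at earlier time steps to fill the window, so those outputs cannot be predicted. This can be done with a function similar to split_sequence() from the previous section; to keep the article readable, only the result is shown here, and the code is in the complete listing. The split result: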
[[10 15]
[20 25]
[30 35]] 65
[[20 25]
[30 35]
[40 45]] 85
[[30 35]
[40 45]
[50 55]] 105
[[40 45]
[50 55]
[60 65]] 125
[[50 55]
[60 65]
[70 75]] 145
[[60 65]
[70 75]
[80 85]] 165
[[70 75]
[80 85]
[90 95]] 185
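Seen this way, it is easy to understand. In each sample, the three rows correspond to a sliding-window width of 3; the split uses a stride of 1, so consecutive samples overlap; the two columns correspond to two features. These data are then used to predict the output at the window's last time step, which is a single value. The same approach can be used for time series classification tasks such as human activity recognition, which later articles will cover.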
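The split listed above can be sketched with plain Python lists as follows. The function name `make_multivariate_windows` is an assumption made for this illustration; the article's NumPy version, split_sequences(), is in the complete code for this section.

```python
def make_multivariate_windows(rows, n_steps):
    """rows: list of [input_1, ..., input_k, output]; returns (X, y) as lists."""
    X, y = [], []
    for start in range(len(rows) - n_steps + 1):
        window = rows[start:start + n_steps]
        X.append([row[:-1] for row in window])  # input columns only
        y.append(window[-1][-1])                # output column at the window's last row
    return X, y


seq1 = [10, 20, 30, 40, 50, 60, 70, 80, 90]
seq2 = [15, 25, 35, 45, 55, 65, 75, 85, 95]
rows = [[a, b, a + b] for a, b in zip(seq1, seq2)]

X, y = make_multivariate_windows(rows, n_steps=3)
print(X[0], y[0])
print(len(X), y)
```

Because the label comes from the same row as the end of the window, nine rows give seven samples here, one more than in the univariate case.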
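I was stuck on sliding-window sampling for a while myself; after some study it clicked, so here is a figure to help others past the same hurdle. It makes things clear at a glance, using the same data as above:
For a time series classification task, just replace the last column with labels, one label per sampling point, and take the label of the row at the window's end index as the sample's label. To add features, add columns. In rainfall forecasting, for example, each column holds the sampled data of one feature, such as temperature, humidity, or wind speed, and each row holds the values of those features at one timestamp.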
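A rough sketch of that labeling scheme is shown below. The readings, the label names, and the function `make_labeled_windows` are all hypothetical and exist only to illustrate taking the label from the row where each window ends.

```python
def make_labeled_windows(rows, n_steps, stride=1):
    """rows: list of [feature_1, ..., feature_k, label]; one label per window."""
    X, labels = [], []
    for start in range(0, len(rows) - n_steps + 1, stride):
        window = rows[start:start + n_steps]
        X.append([row[:-1] for row in window])  # feature columns only
        labels.append(window[-1][-1])           # label of the row where the window ends
    return X, labels


# Hypothetical readings: [temperature, humidity, label]
readings = [
    [21.0, 0.40, "dry"],
    [21.5, 0.45, "dry"],
    [20.0, 0.80, "rain"],
    [19.5, 0.85, "rain"],
    [22.0, 0.50, "dry"],
]

X, labels = make_labeled_windows(readings, n_steps=3)
print(labels)  # ['rain', 'rain', 'dry']
```

Adding another feature only means adding another column before the label; the windowing code does not change.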
2.1.1 CNN Model
Complete code:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv1D, MaxPooling1D

def split_sequences(first_seq, secend_seq, sw_width):
    '''
    Split the sequence data into samples.
    '''
    input_seq1 = np.array(first_seq).reshape(len(first_seq), 1)
    input_seq2 = np.array(secend_seq).reshape(len(secend_seq), 1)
    out_seq = np.array([first_seq[i]+secend_seq[i] for i in range(len(first_seq))])
    out_seq = out_seq.reshape(len(out_seq), 1)

    dataset = np.hstack((input_seq1, input_seq2, out_seq))
    print('dataset:\n',dataset)

    X, y = [], []

    for i in range(len(dataset)):
        # Slice indices start at 0 and exclude the end index, so no -1 is needed.
        end_element_index = i + sw_width
        # The last sample must be able to reach the last row of data, so the bound is len(dataset)
        # rather than len(dataset) - 1; with -1 the last sample would be lost.
        if end_element_index > len(dataset):
            break

        # Stride-1 sliding window over the data.
        # In the slices below, :-1 takes every column except the last; -1 takes the last column.
        seq_x, seq_y = dataset[i:end_element_index, :-1], dataset[end_element_index-1, -1]

        X.append(seq_x)
        y.append(seq_y)

    process_X, process_y = np.array(X), np.array(y)

    n_features = process_X.shape[2]
    print('train_X:\n{}\ntrain_y:\n{}\n'.format(process_X, process_y))
    print('train_X.shape:{},trian_y.shape:{}\n'.format(process_X.shape, process_y.shape))
    print('n_features:',n_features)
    return process_X, process_y, n_features

def oned_cnn_model(sw_width, n_features, X, y, test_X, epoch_num, verbose_set):
    model = Sequential()

    model.add(Conv1D(filters=64, kernel_size=2, activation='relu',
                     strides=1, padding='valid', data_format='channels_last',
                     input_shape=(sw_width, n_features)))

    model.add(MaxPooling1D(pool_size=2, strides=None, padding='valid',
                           data_format='channels_last'))

    model.add(Flatten())

    model.add(Dense(units=50, activation='relu',
                    use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros'))

    model.add(Dense(units=1))

    # Accuracy is not a meaningful metric for regression; it is tracked only so it can be printed below.
    model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

    # summary() prints the table itself and returns None, hence the "None" in the output.
    print('\n', model.summary())

    history = model.fit(X, y, batch_size=32, epochs=epoch_num, verbose=verbose_set)

    yhat = model.predict(test_X, verbose=0)
    print('\nyhat:', yhat)

    return model, history

if __name__ == '__main__':
    train_seq1 = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    train_seq2 = [15, 25, 35, 45, 55, 65, 75, 85, 95]
    sw_width = 3
    epoch_num = 1000
    verbose_set = 0

    train_X, train_y, n_features = split_sequences(train_seq1, train_seq2, sw_width)

    # Prediction input
    x_input = np.array([[80, 85], [90, 95], [100, 105]])
    x_input = x_input.reshape((1, sw_width, n_features))

    model, history = oned_cnn_model(sw_width, n_features, train_X, train_y, x_input, epoch_num, verbose_set)

    print('\ntrain_acc:%s'%np.mean(history.history['accuracy']), '\ntrain_loss:%s'%np.mean(history.history['loss']))
Output:
dataset:
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
train_X:
[[[10 15]
[20 25]
[30 35]]
[[20 25]
[30 35]
[40 45]]
[[30 35]
[40 45]
[50 55]]
[[40 45]
[50 55]
[60 65]]
[[50 55]
[60 65]
[70 75]]
[[60 65]
[70 75]
[80 85]]
[[70 75]
[80 85]
[90 95]]]
train_y:
[ 65 85 105 125 145 165 185]
train_X.shape:(7, 3, 2),trian_y.shape:(7,)
n_features: 2
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d_2 (Conv1D) (None, 2, 64) 320
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 1, 64) 0
_________________________________________________________________
flatten_2 (Flatten) (None, 64) 0
_________________________________________________________________
dense_4 (Dense) (None, 50) 3250
_________________________________________________________________
dense_5 (Dense) (None, 1) 51
=================================================================
Total params: 3,621
Trainable params: 3,621
Non-trainable params: 0
_________________________________________________________________
None
yhat: [[205.84216]]
train_acc:0.0
train_loss:290.3450777206163
2.1.2 Multi-headed CNN Model
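There is another, more elaborate way to model the multivariate input series problem. Each input series can be handled by a separate CNN, and the outputs of these submodels can be combined before a prediction is made for the output series. We can call this a multi-headed CNN model. It may offer more flexibility or better performance, depending on the specifics of the problem being modeled. For example, it allows each submodel to be configured differently for its input series, such as the number of filter maps and the kernel size. This type of model can be defined with the Keras functional API. First, the first input model is defined as a 1D CNN whose input layer expects n_steps time steps and 1 feature; the second input submodel is defined the same way. The code: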
visible1 = Input(shape=(n_steps, n_features))
cnn1 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible1)
cnn1 = MaxPooling1D(pool_size=2)(cnn1)
cnn1 = Flatten()(cnn1)
visible2 = Input(shape=(n_steps, n_features))
cnn2 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible2)
cnn2 = MaxPooling1D(pool_size=2)(cnn2)
cnn2 = Flatten()(cnn2)
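With both input submodels defined, we can merge the output of each into one long vector, which is interpreted before a prediction is made for the output series. Finally, the inputs and output are tied together. The code: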
merge = concatenate([cnn1, cnn2])
dense = Dense(50, activation='relu')(merge)
output = Dense(1)(dense)
model = Model(inputs=[visible1, visible2], outputs=output)
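This model requires the input to be provided as a list of two elements, each containing the data for one of the submodels. To do this, we split the 3D input data into two separate arrays, that is, from one array of shape [7, 3, 2] into two 3D arrays of shape [7, 3, 1]. The code: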
n_features = 1
# separate input data
X1 = X[:, :, 0].reshape(X.shape[0], X.shape[1], n_features)
X2 = X[:, :, 1].reshape(X.shape[0], X.shape[1], n_features)
Train the model:
model.fit([X1, X2], y, epochs=1000, verbose=0)
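To see what the per-head split does to the data, here is a small plain-Python equivalent of the NumPy slicing above, using nested lists. The helper name `split_heads` is an assumption for this sketch.

```python
def split_heads(X):
    """Split [samples][steps][features] into one [samples][steps][1] list per feature."""
    n_features = len(X[0][0])
    return [
        [[[step[f]] for step in sample] for sample in X]
        for f in range(n_features)
    ]


# First two samples of the training data shown earlier.
X = [
    [[10, 15], [20, 25], [30, 35]],
    [[20, 25], [30, 35], [40, 45]],
]

X1, X2 = split_heads(X)
print(X1[0])  # [[10], [20], [30]]
print(X2[0])  # [[15], [25], [35]]
```

Each head keeps the sample and time-step dimensions and receives a single feature, which is why both Input layers are declared with one feature.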
Complete code:
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten, Conv1D, MaxPooling1D, Input, concatenate
from tensorflow.keras.utils import plot_model

def split_sequences2(first_seq, secend_seq, sw_width, n_features):
    '''
    Split the sequence data into samples.
    '''
    input_seq1 = np.array(first_seq).reshape(len(first_seq), 1)
    input_seq2 = np.array(secend_seq).reshape(len(secend_seq), 1)
    out_seq = np.array([first_seq[i]+secend_seq[i] for i in range(len(first_seq))])
    out_seq = out_seq.reshape(len(out_seq), 1)

    dataset = np.hstack((input_seq1, input_seq2, out_seq))
    print('dataset:\n',dataset)

    X, y = [], []

    for i in range(len(dataset)):
        # Slice indices start at 0 and exclude the end index, so no -1 is needed.
        end_element_index = i + sw_width
        # The last sample must be able to reach the last row of data, so the bound is len(dataset)
        # rather than len(dataset) - 1; with -1 the last sample would be lost.
        if end_element_index > len(dataset):
            break

        # Stride-1 sliding window over the data.
        # In the slices below, :-1 takes every column except the last; -1 takes the last column.
        seq_x, seq_y = dataset[i:end_element_index, :-1], dataset[end_element_index-1, -1]

        X.append(seq_x)
        y.append(seq_y)

    process_X, process_y = np.array(X), np.array(y)

    # [:, :, 0] takes everything along the first two axes and index 0 along the third,
    # like pulling one slice out of a stack of biscuits.
    # Here process_X has shape (7, 3, 2), so the two lines below follow directly.
    X1 = process_X[:,:,0].reshape(process_X.shape[0], process_X.shape[1], n_features)
    X2 = process_X[:,:,1].reshape(process_X.shape[0], process_X.shape[1], n_features)

    print('train_X:\n{}\ntrain_y:\n{}\n'.format(process_X, process_y))
    print('train_X.shape:{},trian_y.shape:{}\n'.format(process_X.shape, process_y.shape))
    print('X1.shape:{},X2.shape:{}\n'.format(X1.shape, X2.shape))

    return X1, X2, process_y

def oned_cnn_model(n_steps, n_features, X_1, X_2, y, x1, x2, epoch_num, verbose_set):

    # First head
    visible1 = Input(shape=(n_steps, n_features))
    cnn1 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible1)
    cnn1 = MaxPooling1D(pool_size=2)(cnn1)
    cnn1 = Flatten()(cnn1)

    # Second head
    visible2 = Input(shape=(n_steps, n_features))
    cnn2 = Conv1D(filters=64, kernel_size=2, activation='relu')(visible2)
    cnn2 = MaxPooling1D(pool_size=2)(cnn2)
    cnn2 = Flatten()(cnn2)

    # Merge the two heads and interpret the combined features
    merge = concatenate([cnn1, cnn2])
    dense = Dense(50, activation='relu')(merge)
    output = Dense(1)(dense)

    model = Model(inputs=[visible1, visible2], outputs=output)

    # Accuracy is not a meaningful metric for regression; it is tracked only so it can be printed below.
    model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

    # summary() prints the table itself and returns None, hence the "None" in the output.
    print('\n', model.summary())
    # plot_model needs the pydot package and Graphviz to be installed.
    plot_model(model, to_file='multi_head_cnn_model.png', show_shapes=True, show_layer_names=True, rankdir='TB', dpi=200)

    history = model.fit([X_1, X_2], y, batch_size=32, epochs=epoch_num, verbose=verbose_set)

    yhat = model.predict([x1,x2], verbose=0)
    print('\nyhat:', yhat)

    return model, history

if __name__ == '__main__':
    train_seq1 = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    train_seq2 = [15, 25, 35, 45, 55, 65, 75, 85, 95]
    sw_width = 3
    n_features = 1
    epoch_num = 1000
    verbose_set = 0

    train_X1, train_X2, train_y = split_sequences2(train_seq1, train_seq2, sw_width, n_features)

    # Prediction input, split into one array per head
    x_input = np.array([[80, 85], [90, 95], [100, 105]])
    x_1 = x_input[:, 0].reshape((1, sw_width, n_features))
    x_2 = x_input[:, 1].reshape((1, sw_width, n_features))

    model, history = oned_cnn_model(sw_width, n_features, train_X1, train_X2, train_y, x_1, x_2, epoch_num, verbose_set)

    print('\ntrain_acc:%s'%np.mean(history.history['accuracy']), '\ntrain_loss:%s'%np.mean(history.history['loss']))
Output:
dataset:
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
train_X:
[[[10 15]
[20 25]
[30 35]]
[[20 25]
[30 35]
[40 45]]
[[30 35]
[40 45]
[50 55]]
[[40 45]
[50 55]
[60 65]]
[[50 55]
[60 65]
[70 75]]
[[60 65]
[70 75]
[80 85]]
[[70 75]
[80 85]
[90 95]]]
train_y:
[ 65 85 105 125 145 165 185]
train_X.shape:(7, 3, 2),trian_y.shape:(7,)
X1.shape:(7, 3, 1),X2.shape:(7, 3, 1)
Model: "model_8"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_17 (InputLayer) [(None, 3, 1)] 0
__________________________________________________________________________________________________
input_18 (InputLayer) [(None, 3, 1)] 0
__________________________________________________________________________________________________
conv1d_16 (Conv1D) (None, 2, 64) 192 input_17[0][0]
__________________________________________________________________________________________________
conv1d_17 (Conv1D) (None, 2, 64) 192 input_18[0][0]
__________________________________________________________________________________________________
max_pooling1d_16 (MaxPooling1D) (None, 1, 64) 0 conv1d_16[0][0]
__________________________________________________________________________________________________
max_pooling1d_17 (MaxPooling1D) (None, 1, 64) 0 conv1d_17[0][0]
__________________________________________________________________________________________________
flatten_16 (Flatten) (None, 64) 0 max_pooling1d_16[0][0]
__________________________________________________________________________________________________
flatten_17 (Flatten) (None, 64) 0 max_pooling1d_17[0][0]
__________________________________________________________________________________________________
concatenate_8 (Concatenate) (None, 128) 0 flatten_16[0][0]
flatten_17[0][0]
__________________________________________________________________________________________________
dense_16 (Dense) (None, 50) 6450 concatenate_8[0][0]
__________________________________________________________________________________________________
dense_17 (Dense) (None, 1) 51 dense_16[0][0]
==================================================================================================
Total params: 6,885
Trainable params: 6,885
Non-trainable params: 0
__________________________________________________________________________________________________
None
yhat: [[205.66142]]
train_acc:0.0
train_loss:201.12432714579907
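The saved network structure diagram:
2.2 Multiple Parallel Series
Suppose we have the following series, and we want to predict the next value of every series from the previous three time steps: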
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
Input:
10, 15, 25
20, 25, 45
30, 35, 65
Output:
40, 45, 85
For the whole dataset:
[[10 15 25]
[20 25 45]
[30 35 65]] [40 45 85]
[[20 25 45]
[30 35 65]
[40 45 85]] [ 50 55 105]
[[ 30 35 65]
[ 40 45 85]
[ 50 55 105]] [ 60 65 125]
[[ 40 45 85]
[ 50 55 105]
[ 60 65 125]] [ 70 75 145]
[[ 50 55 105]
[ 60 65 125]
[ 70 75 145]] [ 80 85 165]
[[ 60 65 125]
[ 70 75 145]
[ 80 85 165]] [ 90 95 185]
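The samples listed above can be generated with a few lines of plain Python. This is only a sketch: `make_parallel_windows` is a name assumed here, and the article's NumPy implementation is in the complete code for the next subsection.

```python
def make_parallel_windows(rows, n_steps):
    """Every column is both an input and an output; the target is the row after the window."""
    X, y = [], []
    for start in range(len(rows) - n_steps):
        X.append(rows[start:start + n_steps])  # n_steps rows, all columns
        y.append(rows[start + n_steps])        # the next row is the output vector
    return X, y


seq1 = [10, 20, 30, 40, 50, 60, 70, 80, 90]
seq2 = [15, 25, 35, 45, 55, 65, 75, 85, 95]
rows = [[a, b, a + b] for a, b in zip(seq1, seq2)]

X, y = make_parallel_windows(rows, n_steps=3)
print(X[0], y[0])
print(len(X))
```

Unlike the multiple-input case, the target is the following row rather than the last row of the window, so nine rows give six samples.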
2.2.1 Vector-Output CNN Model
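We now build a 1D CNN on this dataset. In this model, the number of time steps and the number of parallel series (features) are specified for the input layer through the input_shape argument. The output layer has one unit per series, so the model predicts a vector of n_features values. The code: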
model = Sequential()
model.add(Conv1D(filters=64, kernel_size=2, activation='relu', input_shape=(n_steps, n_features)))
model.add(MaxPooling1D(pool_size=2))
model.add(Flatten())
model.add(Dense(50, activation='relu'))
model.add(Dense(n_features))
model.compile(optimizer='adam', loss='mse')
Complete code:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, Conv1D, MaxPooling1D

def split_sequences(first_seq, secend_seq, sw_width):
    '''
    Split the sequence data into samples.
    '''
    input_seq1 = np.array(first_seq).reshape(len(first_seq), 1)
    input_seq2 = np.array(secend_seq).reshape(len(secend_seq), 1)
    out_seq = np.array([first_seq[i]+secend_seq[i] for i in range(len(first_seq))])
    out_seq = out_seq.reshape(len(out_seq), 1)

    dataset = np.hstack((input_seq1, input_seq2, out_seq))
    print('dataset:\n',dataset)

    X, y = [], []

    for i in range(len(dataset)):
        end_element_index = i + sw_width
        # The target is the row after the window, so stop once that row no longer exists.
        if end_element_index > len(dataset) - 1:
            break

        # Stride-1 sliding window: all columns of the window are inputs, the next row is the output.
        seq_x, seq_y = dataset[i:end_element_index, :], dataset[end_element_index, :]

        X.append(seq_x)
        y.append(seq_y)

    process_X, process_y = np.array(X), np.array(y)

    n_features = process_X.shape[2]
    print('train_X:\n{}\ntrain_y:\n{}\n'.format(process_X, process_y))
    print('train_X.shape:{},trian_y.shape:{}\n'.format(process_X.shape, process_y.shape))
    print('n_features:',n_features)
    return process_X, process_y, n_features

def oned_cnn_model(sw_width, n_features, X, y, test_X, epoch_num, verbose_set):
    model = Sequential()

    model.add(Conv1D(filters=64, kernel_size=2, activation='relu',
                     strides=1, padding='valid', data_format='channels_last',
                     input_shape=(sw_width, n_features)))

    model.add(MaxPooling1D(pool_size=2, strides=None, padding='valid',
                           data_format='channels_last'))

    model.add(Flatten())

    model.add(Dense(units=50, activation='relu',
                    use_bias=True, kernel_initializer='glorot_uniform', bias_initializer='zeros'))

    # One output unit per parallel series.
    model.add(Dense(units=n_features))

    # Accuracy is not a meaningful metric for regression; it is tracked only so it can be printed below.
    model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

    # summary() prints the table itself and returns None, hence the "None" in the output.
    print('\n', model.summary())

    history = model.fit(X, y, batch_size=32, epochs=epoch_num, verbose=verbose_set)

    yhat = model.predict(test_X, verbose=0)
    print('\nyhat:', yhat)

    return model, history

if __name__ == '__main__':
    train_seq1 = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    train_seq2 = [15, 25, 35, 45, 55, 65, 75, 85, 95]
    sw_width = 3
    epoch_num = 3000
    verbose_set = 0

    train_X, train_y, n_features = split_sequences(train_seq1, train_seq2, sw_width)

    # Prediction input
    x_input = np.array([[70,75,145], [80,85,165], [90,95,185]])
    x_input = x_input.reshape((1, sw_width, n_features))

    model, history = oned_cnn_model(sw_width, n_features, train_X, train_y, x_input, epoch_num, verbose_set)

    print('\ntrain_acc:%s'%np.mean(history.history['accuracy']), '\ntrain_loss:%s'%np.mean(history.history['loss']))
Output:
dataset:
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
train_X:
[[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]]
[[ 20 25 45]
[ 30 35 65]
[ 40 45 85]]
[[ 30 35 65]
[ 40 45 85]
[ 50 55 105]]
[[ 40 45 85]
[ 50 55 105]
[ 60 65 125]]
[[ 50 55 105]
[ 60 65 125]
[ 70 75 145]]
[[ 60 65 125]
[ 70 75 145]
[ 80 85 165]]]
train_y:
[[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
train_X.shape:(6, 3, 3),trian_y.shape:(6, 3)
n_features: 3
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv1d (Conv1D) (None, 2, 64) 448
_________________________________________________________________
max_pooling1d (MaxPooling1D) (None, 1, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 64) 0
_________________________________________________________________
dense (Dense) (None, 50) 3250
_________________________________________________________________
dense_1 (Dense) (None, 3) 153
=================================================================
Total params: 3,851
Trainable params: 3,851
Non-trainable params: 0
_________________________________________________________________
None
yhat: [[100.58862 106.20969 207.11055]]
train_acc:0.9979444
train_loss:36.841995810692595
2.2.2 Multi-output CNN Model
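As with multiple input series, there is another, more elaborate way to model the problem. Each output series can be handled by a separate output CNN model. We can call this a multi-output CNN model. It may offer more flexibility or better performance, depending on the specifics of the problem being modeled. The code: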
visible = Input(shape=(n_steps, n_features))
cnn = Conv1D(filters=64, kernel_size=2, activation='relu')(visible)
cnn = MaxPooling1D(pool_size=2)(cnn)
cnn = Flatten()(cnn)
cnn = Dense(50, activation='relu')(cnn)
We can then define one output layer for each of the three series we want to predict, where each output submodel predicts a single time step.
output1 = Dense(1)(cnn)
output2 = Dense(1)(cnn)
output3 = Dense(1)(cnn)
Tie the model together and compile it:
model = Model(inputs=visible, outputs=[output1, output2, output3])
model.compile(optimizer='adam', loss='mse')
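When training the model, each sample needs three separate output arrays. We can get them by converting the output training data of shape [6, 3] into three arrays of shape [6, 1].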
y1 = y[:, 0].reshape((y.shape[0], 1))
y2 = y[:, 1].reshape((y.shape[0], 1))
y3 = y[:, 2].reshape((y.shape[0], 1))
Train the model:
model.fit(X, [y1,y2,y3], epochs=2000, verbose=0)
Complete code:
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten, Conv1D, MaxPooling1D, Input
from tensorflow.keras.utils import plot_model

def split_sequences(first_seq, secend_seq, sw_width):
    '''
    Split the sequence data into samples.
    '''
    input_seq1 = np.array(first_seq).reshape(len(first_seq), 1)
    input_seq2 = np.array(secend_seq).reshape(len(secend_seq), 1)
    out_seq = np.array([first_seq[i]+secend_seq[i] for i in range(len(first_seq))])
    out_seq = out_seq.reshape(len(out_seq), 1)

    dataset = np.hstack((input_seq1, input_seq2, out_seq))
    print('dataset:\n',dataset)

    X, y = [], []

    for i in range(len(dataset)):
        end_element_index = i + sw_width
        # The target is the row after the window, so stop once that row no longer exists.
        if end_element_index > len(dataset) - 1:
            break

        # Stride-1 sliding window: all columns of the window are inputs, the next row is the output.
        seq_x, seq_y = dataset[i:end_element_index, :], dataset[end_element_index, :]

        X.append(seq_x)
        y.append(seq_y)

    process_X, process_y = np.array(X), np.array(y)

    n_features = process_X.shape[2]
    # One target array per output head, each of shape (samples, 1).
    y1 = process_y[:, 0].reshape((process_y.shape[0], 1))
    y2 = process_y[:, 1].reshape((process_y.shape[0], 1))
    y3 = process_y[:, 2].reshape((process_y.shape[0], 1))

    print('train_X:\n{}\ntrain_y:\n{}\n'.format(process_X, process_y))
    print('train_X.shape:{},trian_y.shape:{}\n'.format(process_X.shape, process_y.shape))
    print('n_features:',n_features)
    return process_X, process_y, n_features, y1, y2, y3

def oned_cnn_model(n_steps, n_features, X, y, test_X, epoch_num, verbose_set):
    # Shared convolutional trunk
    visible = Input(shape=(n_steps, n_features))
    cnn = Conv1D(filters=64, kernel_size=2, activation='relu')(visible)
    cnn = MaxPooling1D(pool_size=2)(cnn)
    cnn = Flatten()(cnn)
    cnn = Dense(50, activation='relu')(cnn)

    # One output layer per series
    output1 = Dense(1)(cnn)
    output2 = Dense(1)(cnn)
    output3 = Dense(1)(cnn)

    model = Model(inputs=visible, outputs=[output1, output2, output3])

    model.compile(optimizer='adam', loss='mse')

    # summary() prints the table itself and returns None, hence the "None" in the output.
    print('\n', model.summary())
    # plot_model needs the pydot package and Graphviz to be installed.
    plot_model(model, to_file='vector_output_cnn_model.png', show_shapes=True, show_layer_names=True, rankdir='TB', dpi=200)

    # y is a list of three target arrays, one per output head.
    model.fit(X, y, batch_size=32, epochs=epoch_num, verbose=verbose_set)

    yhat = model.predict(test_X, verbose=0)
    print('\nyhat:', yhat)

    return model

if __name__ == '__main__':
    train_seq1 = [10, 20, 30, 40, 50, 60, 70, 80, 90]
    train_seq2 = [15, 25, 35, 45, 55, 65, 75, 85, 95]
    sw_width = 3
    epoch_num = 2000
    verbose_set = 0

    train_X, train_y, n_features, y1, y2, y3 = split_sequences(train_seq1, train_seq2, sw_width)

    # Prediction input
    x_input = np.array([[70,75,145], [80,85,165], [90,95,185]])
    x_input = x_input.reshape((1, sw_width, n_features))

    model = oned_cnn_model(sw_width, n_features, train_X, [y1, y2, y3], x_input, epoch_num, verbose_set)
Output:
dataset:
[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]
[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
train_X:
[[[ 10 15 25]
[ 20 25 45]
[ 30 35 65]]
[[ 20 25 45]
[ 30 35 65]
[ 40 45 85]]
[[ 30 35 65]
[ 40 45 85]
[ 50 55 105]]
[[ 40 45 85]
[ 50 55 105]
[ 60 65 125]]
[[ 50 55 105]
[ 60 65 125]
[ 70 75 145]]
[[ 60 65 125]
[ 70 75 145]
[ 80 85 165]]]
train_y:
[[ 40 45 85]
[ 50 55 105]
[ 60 65 125]
[ 70 75 145]
[ 80 85 165]
[ 90 95 185]]
train_X.shape:(6, 3, 3),trian_y.shape:(6, 3)
n_features: 3
Model: "model_2"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_4 (InputLayer) [(None, 3, 3)] 0
__________________________________________________________________________________________________
conv1d_4 (Conv1D) (None, 2, 64) 448 input_4[0][0]
__________________________________________________________________________________________________
max_pooling1d_4 (MaxPooling1D) (None, 1, 64) 0 conv1d_4[0][0]
__________________________________________________________________________________________________
flatten_4 (Flatten) (None, 64) 0 max_pooling1d_4[0][0]
__________________________________________________________________________________________________
dense_14 (Dense) (None, 50) 3250 flatten_4[0][0]
__________________________________________________________________________________________________
dense_15 (Dense) (None, 1) 51 dense_14[0][0]
__________________________________________________________________________________________________
dense_16 (Dense) (None, 1) 51 dense_14[0][0]
__________________________________________________________________________________________________
dense_17 (Dense) (None, 1) 51 dense_14[0][0]
==================================================================================================
Total params: 3,851
Trainable params: 3,851
Non-trainable params: 0
__________________________________________________________________________________________________
None
yhat: [array([[101.1264]], dtype=float32), array([[106.274635]], dtype=float32), array([[207.64928]], dtype=float32)]
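The multi-output model returns one array per output head rather than a single vector. If a single row per sample is more convenient, the heads can be stitched back together, as in the following sketch; `merge_outputs` is a hypothetical helper, and the numbers are copied from the printed yhat above and written as nested lists.

```python
def merge_outputs(predictions):
    """predictions: one [[value], ...] batch per output head; returns one row per sample."""
    n_samples = len(predictions[0])
    return [[float(head[i][0]) for head in predictions] for i in range(n_samples)]


# Values copied from the printed yhat, as nested lists instead of NumPy arrays.
yhat = [[[101.1264]], [[106.274635]], [[207.64928]]]

merged = merge_outputs(yhat)
print(merged)  # [[101.1264, 106.274635, 207.64928]]
```

The merged row has the same layout as the vector-output model's prediction, which makes the two approaches easy to compare.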
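Network structure diagram:
Because the article has grown too long (already more than 28,000 characters) and the page has become sluggish, the remaining two types of model will be covered in the next article.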