[PyTorch] Understanding LSTM Inputs and Outputs for Time Series

nn.LSTM in PyTorch

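nn.LSTM in PyTorch is configured through the seven parameters listed here. Only the first two, input_size and hidden_size, must be supplied; num_layers and all the others have default values. (Recent PyTorch versions also accept proj_size, which is not covered in this post.)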

input_size – The number of expected features in the input x
hidden_size – The number of features in the hidden state h
num_layers – Number of recurrent layers. E.g., setting num_layers=2 would mean stacking two LSTMs together to form a stacked LSTM, with the second LSTM taking in outputs of the first LSTM and computing the final results. Default: 1
bias – If False, then the layer does not use bias weights b_ih and b_hh. Default: True
batch_first – If True, then the input and output tensors are provided as (batch, seq, feature). Default: False
dropout – If non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer, with dropout probability equal to dropout. Default: 0
bidirectional – If True, becomes a bidirectional LSTM. Default: False
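
To make the defaults concrete, here is a minimal sketch that builds an nn.LSTM from only input_size and hidden_size and then inspects the remaining settings. The sizes 9 and 16 are arbitrary values chosen for illustration.

```python
import torch
from torch import nn

# Only input_size and hidden_size are passed; everything else keeps its default.
lstm = nn.LSTM(input_size=9, hidden_size=16)

# The constructor arguments are stored as attributes on the module.
print(lstm.num_layers, lstm.bias, lstm.batch_first, lstm.dropout, lstm.bidirectional)

# The learnable weights stack the four LSTM gates, so the first
# dimension of each tensor is 4 * hidden_size.
weight_shapes = {name: tuple(p.shape) for name, p in lstm.named_parameters()}
print(weight_shapes)
```

The weight shapes show that input_size only affects weight_ih_l0, while hidden_size sets the size of everything else.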

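1: input_size: the number of input features, i.e. the number of elements in the input at each time step. The input at each step is a 1-D vector. For example, for [1,2,3,4,5,6,7,8,9], input_size is 9.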

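2: hidden_size: the dimension of the hidden state, i.e. the number of hidden units, much like the hidden layer of a simple perceptron-style network. You choose this value yourself; it does not depend on the input dimension.

In the accompanying figure, input_size is the input layer, the blue squares on the left [i0,i1,i2,i3,i4]; hidden_size is the hidden layer, the yellow circles in the middle [h0,h1,h2,h3,h4]. The blue circles on the right [o0,o1,o2] are the output layer, whose number of nodes is likewise determined by the needs of the application.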

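3: num_layers: the number of stacked LSTM layers; the default is 1. With num_layers=2, the second LSTM takes the results of the first LSTM as its input. That is, the first layer takes [ X0 X1 X2 ... Xt] and computes [ h0 h1 h2 ... ht ]; the second layer treats that [ h0 h1 h2 ... ht ] as its own [ X0 X1 X2 ... Xt], runs the computation again, and produces the final [ h0 h1 h2 ... ht ].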

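4: bias: whether the layer uses the bias weights b_ih and b_hh; the default is True. A bias is an offset added to the weighted sum. Without it, the weighted sum is anchored at 0, so a zero input always gives a zero pre-activation. For the role of the bias, see the structure of a single-layer perceptron.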

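5: batch_first: whether the first dimension of the input and output tensors is batch_size; the default is False. In PyTorch, data is usually fed to a model through the built-in Dataset and DataLoader, and DataLoader has a batch_size parameter that sets how many samples are delivered at once, with the batch as the first dimension. nn.LSTM also expects its input as a batch, but by default it puts the batch in the second dimension. This parameter reconciles the two layouts: set it to True if your tensors are laid out as (batch, seq, feature), and leave it False if they are (seq, batch, feature). For example, an input that is (4,1,5) by default, where the middle 1 is batch_size, becomes (1,4,5) once batch_first=True is set. The flag applies only to the input and output tensors; h0, c0, hn and cn keep the batch in the second dimension either way. So if your batches come straight from a DataLoader with the batch dimension first, set batch_first to True.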

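6: dropout: default 0. If non-zero, a dropout layer is added after every LSTM layer except the last one. The value is a probability between 0 and 1. Because the last layer is excluded, dropout only has an effect when num_layers is greater than 1.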

7: bidirectional: whether the LSTM is bidirectional; the default is False. If True, num_directions=2; otherwise it is 1.
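
The effect of these parameters on tensor shapes can be checked directly. The sketch below is only an illustration: lstm_shapes is a hypothetical helper, not part of PyTorch, and the sizes (seq_len=4, batch=1, input_size=5, hidden_size=8) are arbitrary, chosen to match the (4,1,5) example above.

```python
import torch
from torch import nn

def lstm_shapes(**kwargs):
    """Hypothetical helper: run random data through nn.LSTM, return output shapes."""
    seq_len, batch, input_size, hidden_size = 4, 1, 5, 8
    lstm = nn.LSTM(input_size, hidden_size, **kwargs)
    if kwargs.get("batch_first", False):
        x = torch.randn(batch, seq_len, input_size)   # (1, 4, 5)
    else:
        x = torch.randn(seq_len, batch, input_size)   # (4, 1, 5)
    output, (h_n, c_n) = lstm(x)                      # h0 and c0 default to zeros
    return tuple(output.shape), tuple(h_n.shape), tuple(c_n.shape)

default = lstm_shapes()
stacked = lstm_shapes(num_layers=2, dropout=0.2)      # dropout needs num_layers > 1
batch_first = lstm_shapes(batch_first=True)
bidir = lstm_shapes(bidirectional=True)

for name, shapes in [("default", default), ("num_layers=2", stacked),
                     ("batch_first=True", batch_first), ("bidirectional=True", bidir)]:
    print(name, shapes)

# With bias=False the module registers no b_ih / b_hh parameters at all.
no_bias = [name for name, _ in nn.LSTM(5, 8, bias=False).named_parameters()]
print(no_bias)
```

Stacking layers changes only the first dimension of h_n and c_n, batch_first swaps the first two dimensions of the output but not of h_n, and bidirectional doubles the last dimension of the output.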

Input and output formats of nn.LSTM in PyTorch

Input format (with the default batch_first=False):
input(seq_len, batch, input_size)
h0(num_layers * num_directions, batch, hidden_size)
c0(num_layers * num_directions, batch, hidden_size)

Output format:
output(seq_len, batch, hidden_size * num_directions)
hn(num_layers * num_directions, batch, hidden_size)
cn(num_layers * num_directions, batch, hidden_size)
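
As a rough check of the formats above, the following sketch passes explicit h0 and c0 to a two-layer bidirectional LSTM and prints the resulting shapes. All sizes are arbitrary assumptions made for the example.

```python
import torch
from torch import nn

seq_len, batch, input_size, hidden_size = 7, 2, 3, 4
num_layers, num_directions = 2, 2   # bidirectional=True gives 2 directions

lstm = nn.LSTM(input_size, hidden_size, num_layers=num_layers, bidirectional=True)

x = torch.randn(seq_len, batch, input_size)
h0 = torch.zeros(num_layers * num_directions, batch, hidden_size)
c0 = torch.zeros(num_layers * num_directions, batch, hidden_size)

output, (hn, cn) = lstm(x, (h0, c0))

print(output.shape)   # (seq_len, batch, hidden_size * num_directions)
print(hn.shape)       # (num_layers * num_directions, batch, hidden_size)
print(cn.shape)       # (num_layers * num_directions, batch, hidden_size)

# output holds the last layer's hidden state at every time step, so its
# ends line up with the last two entries of hn.
fwd_last = output[-1, :, :hidden_size]   # forward direction, final time step
bwd_last = output[0, :, hidden_size:]    # backward direction finishes at t = 0
print(torch.allclose(fwd_last, hn[-2]), torch.allclose(bwd_last, hn[-1]))
```

If h0 and c0 are omitted, nn.LSTM uses zeros, so passing zero tensors as done here is equivalent to calling lstm(x).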

Examples

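1. In NLP, suppose there are 3 sentences of 5 words each, and each word is represented by a 10-dimensional vector. The corresponding shape is then (5,3,10), that is seq_len=5, batch=3, input_size=10.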
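
The NLP case might look like the sketch below. The vocabulary size of 100, the hidden size of 16, the random token ids, and the use of nn.Embedding to obtain the 10-dimensional word vectors are all assumptions made for illustration; they are not taken from the original example.

```python
import torch
from torch import nn

vocab_size, embed_dim, hidden_size = 100, 10, 16   # vocab_size, hidden_size assumed

# 3 sentences x 5 word ids; random ids stand in for real tokenized text.
token_ids = torch.randint(0, vocab_size, (3, 5))

embedding = nn.Embedding(vocab_size, embed_dim)
lstm = nn.LSTM(input_size=embed_dim, hidden_size=hidden_size)

embedded = embedding(token_ids)               # (batch=3, seq_len=5, 10)
x = embedded.transpose(0, 1).contiguous()     # (seq_len=5, batch=3, 10)
output, (hn, cn) = lstm(x)

print(x.shape, output.shape, hn.shape)
```

The transpose is needed only because batch_first is left at its default of False; with batch_first=True the (3,5,10) tensor could be passed in directly.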

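2. In time-series forecasting, suppose there are 128 time points (rows), each with 10 features. One way to read this is seq_len=1 with the 128 rows forming the batch, so that each row is processed as an independent one-step sequence. The corresponding shape is then

(1,128,10)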
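
A minimal sketch of this layout follows. The random data, the hidden size of 32, and the nn.Linear head that produces one prediction per row are hypothetical choices for illustration only.

```python
import torch
from torch import nn

data = torch.randn(128, 10)     # 128 time points x 10 features (random stand-in)
x = data.unsqueeze(0)           # (seq_len=1, batch=128, input_size=10)

lstm = nn.LSTM(input_size=10, hidden_size=32)   # hidden_size=32 is an assumption
head = nn.Linear(32, 1)                         # hypothetical head: one value per row

output, (hn, cn) = lstm(x)
pred = head(output[0])          # (128, 1)

print(x.shape, output.shape, pred.shape)

# With a single time step, the per-step output and the final hidden state coincide.
print(torch.allclose(output[0], hn[0]))
```

Because seq_len is 1 here, the LSTM starts every row from a zero state and carries nothing from one row to the next.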
