In PyTorch, the dropout argument of LSTM/RNN does not apply dropout at every time step; it only applies dropout to the output of each layer. In recent PyTorch versions, the dropout parameter has no effect on a single-layer LSTM (and triggers a warning), which confirms that no per-time-step dropout is performed.
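A quick check (a minimal sketch; the layer sizes are arbitrary): constructing a single-layer LSTM with non-zero dropout triggers a UserWarning in recent PyTorch versions.

import torch.nn as nn

# With num_layers=1 there is no "between layers" position for dropout,
# so PyTorch warns along the lines of:
# UserWarning: dropout option adds dropout after all but last recurrent
# layer, so non-zero dropout expects num_layers greater than 1
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=1, dropout=0.5)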
An excerpt from PyTorch's stacked-RNN implementation shows the same thing: dropout is added only after a layer's full sequence output has been computed, and is skipped for the last layer.
for i in range(num_layers):
    all_output = []
    for j, inner in enumerate(inners):
        l = i * num_directions + j
        # Run one direction of layer i over the entire sequence at once.
        hy, output = inner(input, hidden[l], weight[l], batch_sizes)
        next_hidden.append(hy)
        all_output.append(output)

    # Concatenate the directions along the feature dimension;
    # this becomes the input to the next layer.
    input = torch.cat(all_output, input.dim() - 1)

    # Dropout is applied once to the layer's whole output (all time
    # steps together), and only between layers, never after the last one.
    if dropout != 0 and i < num_layers - 1:
        input = F.dropout(input, p=dropout, training=train, inplace=False)
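To observe this behavior from the outside, here is a small sketch (shapes chosen arbitrarily): with two layers, the between-layer dropout makes repeated forward passes nondeterministic in train() mode, while eval() mode disables it.

import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=10, hidden_size=20, num_layers=2, dropout=0.5)
x = torch.randn(5, 3, 10)  # (seq_len, batch, input_size)

lstm.train()
out1, _ = lstm(x)
out2, _ = lstm(x)
print(torch.allclose(out1, out2))  # False: dropout between the two layers is active

lstm.eval()
out3, _ = lstm(x)
out4, _ = lstm(x)
print(torch.allclose(out3, out4))  # True: dropout is disabled in eval mode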