Problem
When an RNN iteratively applies the input-to-hidden state-transition operation to build a fixed-length representation of an arbitrarily long sequence, it runs into a difficulty: it is overly sensitive to perturbations of the hidden state. Because the same transition is reused at every time step, noise injected into the hidden state compounds across time; this is why naively applying dropout to the recurrent connections tends to destroy long-term memory, and it motivates the RNN-specific dropout variants below.
dropout
Mathematical formalization of dropout:
$$y = f(W \cdot d(x)), \qquad d(x) = \begin{cases} \text{mask} \odot x, & \text{training phase} \\ (1-p)\, x, & \text{otherwise} \end{cases}$$

where $p$ is the dropout rate and $\text{mask}$ is a binary vector sampled from a Bernoulli distribution with keep probability $1-p$.
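A minimal NumPy sketch of this formalization (the function names `dropout` and `layer` are illustrative, not from any library):

```python
import numpy as np

def dropout(x, p, train):
    """d(x) from the formula above; p is the dropout rate."""
    if train:
        # Each unit is kept with probability 1 - p.
        mask = np.random.binomial(1, 1 - p, size=x.shape)
        return mask * x
    # Test phase: no masking, scale by the keep probability instead.
    return (1 - p) * x

def layer(x, W, p, train, f=np.tanh):
    """y = f(W . d(x))"""
    return f(W @ dropout(x, p, train))
```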
rnnDrop
Departing from the conventional practice of sampling a different mask at every time step to drop hidden units, this work proposes a new strategy (illustrated in the paper's figure, and sketched below) with two features: 1) it generates the dropout mask only at the beginning of each training sequence and fixes it through the sequence; 2) it drops both the non-recurrent and recurrent connections.
Reference: Moon T, Choi H, Lee H, et al. RnnDrop: A Novel Dropout for RNNs in ASR. IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 2016: 65-70.
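A minimal sketch of the per-sequence mask idea, assuming a simple tanh RNN rather than the paper's ASR models; all names here are illustrative:

```python
import numpy as np

def rnn_forward_rnndrop(xs, h0, Wx, Wh, b, p):
    """Run one training sequence with a per-sequence dropout mask.
    The mask is sampled once at the start of the sequence and fixed
    through it; applying it to the hidden state drops both the
    recurrent and the non-recurrent connections leaving that state."""
    mask = np.random.binomial(1, 1 - p, size=h0.shape)  # one mask per sequence
    h, hs = h0, []
    for x in xs:
        h = mask * np.tanh(Wx @ x + Wh @ h + b)  # same mask at every step
        hs.append(h)
    return hs
```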
recurrent dropout
- Idea: apply dropout to the input/update gates of the LSTM/GRU, which prevents the loss of long-term memories built up in the states/cells.
A simple RNN and its dropout variants (equations following Semeniuta et al.):

RNN: $h_t = f(W [x_t, h_{t-1}] + b)$

dropout: masking the recurrent input gives $h_t = f(W [x_t, d(h_{t-1})] + b)$; since a vanilla RNN rewrites its entire state at every step, there is no separate memory path to protect.

LSTM: dropout is applied only to the candidate cell update, $c_t = f_t \odot c_{t-1} + i_t \odot d(g_t)$, $h_t = o_t \odot \tanh(c_t)$, so the additive path $f_t \odot c_{t-1}$ that carries long-term memory is never dropped.

GRU: analogously, $h_t = (1 - z_t) \odot h_{t-1} + z_t \odot d(g_t)$.
In principle, masks can be applied to any subset of the gates, cells, and states.
Reference: Semeniuta S, Severyn A, Barth E. Recurrent Dropout without Memory Loss. COLING 2016.
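A sketch of one LSTM step with recurrent dropout on the cell update $g_t$, matching the LSTM equation above; the fused-weight layout and all names are assumptions of this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_recurrent_dropout(x, h, c, W, b, p, train=True):
    """One LSTM step where dropout hits only the candidate update g_t,
    so the additive memory path f_t * c_{t-1} is never dropped."""
    z = W @ np.concatenate([x, h]) + b
    H = h.size
    i = sigmoid(z[:H])            # input gate
    f = sigmoid(z[H:2*H])         # forget gate
    o = sigmoid(z[2*H:3*H])       # output gate
    g = np.tanh(z[3*H:])          # candidate cell update
    if train:
        g = g * np.random.binomial(1, 1 - p, size=g.shape)
    else:
        g = (1 - p) * g           # test-time scaling, as in d(x) above
    c = f * c + i * g             # long-term memory path stays intact
    h = o * np.tanh(c)
    return h, c
```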
Dropout on vertical connections
For multi-layer LSTM networks, random dropout is applied to the vertical (layer-to-layer) connections only, i.e., it randomly decides whether the output of layer $l-1$ at time $t$ is allowed to flow into layer $l$, while the recurrent (horizontal) connections are left untouched.
In the figure, the dashed lines mark the connections on which random dropout operates.

(Figure: information flow after the dropout operation.)
Reference: Zaremba W, Sutskever I, Vinyals O. Recurrent Neural Network Regularization. arXiv:1409.2329, 2014.
Source code: https://github.com/wojzaremba/lstm
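A sketch of one time step of a stacked LSTM with dropout on the vertical connections only, in the spirit of Zaremba et al.; the helper names and fused-weight layout are assumptions of this sketch, not the repository's code:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    """Plain LSTM step (no dropout inside the cell)."""
    z = W @ np.concatenate([x, h]) + b
    H = h.size
    i, f, o = sigmoid(z[:H]), sigmoid(z[H:2*H]), sigmoid(z[2*H:3*H])
    g = np.tanh(z[3*H:])
    c = f * c + i * g
    return o * np.tanh(c), c

def stacked_step(x, states, params, p, train=True):
    """One time step of a multi-layer LSTM. Dropout is applied only to
    the vertical input of each layer (the dashed lines in the figure);
    the recurrent connections h_{t-1} -> h_t are never dropped."""
    inp, new_states = x, []
    for (h, c), (W, b) in zip(states, params):
        if train:
            # Fresh mask per layer and per time step.
            inp = inp * np.random.binomial(1, 1 - p, size=inp.shape)
        else:
            inp = (1 - p) * inp
        h, c = lstm_step(inp, h, c, W, b)
        new_states.append((h, c))
        inp = h  # vertical connection into the next layer
    return new_states
```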
Dropout based on variational inference
In the figure, dashed lines represent connections without dropout, while solid lines of different colors represent different dropout masks.
Conventional dropout for RNNs: different masks are sampled at different time steps, and no dropout is applied to the recurrent connections.

Dropout based on variational inference: the same dropout mask is used at every time step, including on the recurrent layers.
Concrete implementation of variational-inference-based dropout (as the solid-line colors in panel (b) of the figure show): sample a Bernoulli mask once for each weight matrix, then reuse that same mask at every subsequent time step, as sketched below.
Reference: Gal Y, Ghahramani Z. A Theoretically Grounded Application of Dropout in Recurrent Neural Networks. NIPS 2016.
Source code: http://yarin.co/BRNN
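A sketch of this implementation for a simple tanh RNN: Bernoulli masks are sampled once per sequence, one for the input connections and one for the recurrent connections, and reused at every time step (all names here are illustrative):

```python
import numpy as np

def rnn_forward_variational(xs, h0, Wx, Wh, b, p):
    """Variational dropout for a simple tanh RNN: the same two masks
    are applied at every time step of the sequence."""
    mx = np.random.binomial(1, 1 - p, size=xs[0].shape)  # input mask
    mh = np.random.binomial(1, 1 - p, size=h0.shape)     # recurrent mask
    h, hs = h0, []
    for x in xs:
        # Identical masks at every step, including the recurrent path.
        h = np.tanh(Wx @ (mx * x) + Wh @ (mh * h) + b)
        hs.append(h)
    return hs
```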
Zoneout
$$c_t = d_t^c \odot c_{t-1} + (1 - d_t^c) \odot \left( f_t \odot c_{t-1} + i_t \odot g_t \right)$$
$$h_t = d_t^h \odot h_{t-1} + (1 - d_t^h) \odot \left( o_t \odot \tanh( f_t \odot c_{t-1} + i_t \odot g_t ) \right)$$
where $d_t^c$ and $d_t^h$ are binary (0/1) random vectors; a unit whose mask value is 1 keeps its previous activation instead of being updated.
Reference: Krueger D, Maharaj T, Kramár J, et al. Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations. ICLR 2017.
Source code: http://github.com/teganmaharaj/zoneout
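A sketch of one LSTM step with zoneout, following the equations above; `zc` and `zh` stand for the cell and hidden zoneout probabilities, and all names are assumptions of this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step_zoneout(x, h, c, W, b, zc, zh, train=True):
    """One LSTM step with zoneout: with probability zc (resp. zh) a
    cell (resp. hidden) unit keeps its previous value instead of
    taking the new one, implementing the d_t^c / d_t^h masks above."""
    z = W @ np.concatenate([x, h]) + b
    H = h.size
    i, f = sigmoid(z[:H]), sigmoid(z[H:2*H])
    o, g = sigmoid(z[2*H:3*H]), np.tanh(z[3*H:])
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    if train:
        dc = np.random.binomial(1, zc, size=c.shape)  # d_t^c
        dh = np.random.binomial(1, zh, size=h.shape)  # d_t^h
        c_out = dc * c + (1 - dc) * c_new
        h_out = dh * h + (1 - dh) * h_new
    else:
        # At test time, use the expected value of the zoneout masks.
        c_out = zc * c + (1 - zc) * c_new
        h_out = zh * h + (1 - zh) * h_new
    return h_out, c_out
```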