Dropout (continued)

Dropout vs. L2 vs. ensemble learning

  • Ensemble learning with a different set of hidden units in every iteration (i.e., dropout) performs better than ensemble learning with a fixed set of hidden units throughout training.
    Notably, even though dropout training used more hidden units than the ensemble baseline, overfitting did not occur.
  • L2 and dropout provide comparable regularization. However, the SGD+L2 configuration requires repeatedly tuning the learning rate α, whereas dropout has no corresponding parameter to fine-tune.
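To make the second bullet concrete, here is a minimal inverted-dropout sketch in NumPy; the only knob is the keep probability `keep_prob`, with no learning-rate-style schedule to tune. The function name and shapes are illustrative, not from any cited source:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(x, keep_prob=0.8, train=True):
    """Inverted dropout: zero each unit with probability 1 - keep_prob,
    and rescale survivors by 1/keep_prob so the expected activation is unchanged."""
    if not train:
        return x  # inverted dropout needs no rescaling at test time
    mask = rng.random(x.shape) < keep_prob
    return x * mask / keep_prob

h = np.ones((4, 5))       # pretend hidden activations
out = dropout_forward(h)  # dropped entries become 0, survivors become 1/0.8
```

Because survivors are rescaled during training, the test-time forward pass is just the identity, which is part of why dropout needs so little per-configuration tuning.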

Selective Dropout

Reference: Barrow E, Eastwood M, Jayne C. Selective Dropout for Deep Neural Networks. In: Neural Information Processing. Springer International Publishing, 2016.
Method: the dropout rate determines how many units to drop in each layer; each of the following three quantities is used to produce a neuron-selection probability:

  • Weight change magnitude: $\mathrm{avg}_k = \frac{1}{n}\sum_{j=1}^{n}\left|W^{(i)}_{jk} - W^{(i-1)}_{jk}\right|$. A larger change means the unit is still actively learning, so its dropout probability should be lower.

  • Average weight: $\mathrm{avg}_k = \frac{1}{n}\sum_{j=1}^{n} W^{(i)}_{jk}$. A larger value means the corresponding neuron has largely finished learning, so its dropout probability should be higher.

  • Output variance: $\mathrm{N\_Variance}_k = \mathrm{variance}\left(X^{(i-1)}_k\right)$. A larger value means the unit is largely stable, so its dropout probability should be higher.
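The three selection rules above can be sketched as follows. This is a rough NumPy sketch under my own assumptions, not the exact procedure from Barrow et al.: the softmax scoring, the function names, and the sampling-without-replacement step are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def selection_probs(scores):
    """Turn per-unit scores into selection probabilities (softmax is an assumption)."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def selective_dropout_mask(W_prev, W_curr, X, n_drop, metric="delta"):
    """Drop `n_drop` units of one layer according to one of the three metrics.

    W_prev, W_curr : (n_in, n_units) weights at iterations i-1 and i
    X              : (batch, n_units) unit outputs from the previous iteration
    """
    if metric == "delta":
        # Small average weight change -> unit no longer learning -> drop more.
        scores = -np.mean(np.abs(W_curr - W_prev), axis=0)
    elif metric == "avg_weight":
        # Large average weight -> unit has largely learned -> drop more.
        scores = np.mean(W_curr, axis=0)
    else:  # "variance"
        # Large output variance -> per the notes above, unit is stable -> drop more.
        scores = np.var(X, axis=0)
    p = selection_probs(scores)
    dropped = rng.choice(p.size, size=n_drop, replace=False, p=p)
    mask = np.ones(p.size)
    mask[dropped] = 0.0
    return mask
```

The mask would then multiply the layer's activations in place of a uniform random dropout mask, so units that appear "done learning" are silenced more often than units that are still moving.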
