Dropout
Randomly set the activations of each layer to zero with probability 1−p: r = m ∘ a(Wv), with m_j ∼ Bernoulli(p).
Since many activation functions satisfy a(0) = 0, this is equivalent to r = a(m ∘ Wv).
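A minimal NumPy sketch of the training-time dropout forward pass described above (the function name `dropout_forward` and the choice of ReLU for a are illustrative, not from the source):

```python
import numpy as np

def dropout_forward(W, v, p, rng):
    """Training-time dropout: keep each activation with probability p.
    r = m ∘ a(Wv), with m_j ~ Bernoulli(p); here a = ReLU, so a(0) = 0
    and a(m ∘ Wv) would give the same result.
    """
    m = rng.binomial(1, p, size=W.shape[0])  # one mask entry per output unit
    pre = W @ v                              # pre-activations Wv
    return m * np.maximum(pre, 0.0)          # m ∘ a(Wv)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
v = rng.standard_normal(3)
r = dropout_forward(W, v, p=0.5, rng=rng)
```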
DropConnect
Randomly set each weight of the layer to zero with probability 1−p: r = a((M ∘ W)v), with M_ij ∼ Bernoulli(p).
Each M_ij is drawn independently for each example during training.
The memory required for the masks M grows with the mini-batch size, so the implementation needs to be designed carefully.
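A sketch of the DropConnect forward pass with an independent weight mask per example; the mask tensor has shape (batch, out, in), which is exactly the memory growth noted above (the function name and ReLU choice are illustrative):

```python
import numpy as np

def dropconnect_forward(W, V, p, rng):
    """Training-time DropConnect: r = a((M ∘ W) v), M_ij ~ Bernoulli(p).
    V has shape (batch, in); one mask M per example, so the mask tensor
    has shape (batch, out, in) -- memory grows with the mini-batch size.
    """
    batch = V.shape[0]
    M = rng.binomial(1, p, size=(batch,) + W.shape)  # independent mask per example
    pre = np.einsum('boi,bi->bo', M * W, V)          # (M ∘ W) v for each example
    return np.maximum(pre, 0.0)                      # a = ReLU

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
V = rng.standard_normal((8, 3))                      # mini-batch of 8 examples
r = dropconnect_forward(W, V, p=0.5, rng=rng)
```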
Overall model: f(x;θ,M), where θ = {W_g, W, W_s}.
o = E_M[f(x;θ,M)] = Σ_M p(M) f(x;θ,M) = (1/|M|) Σ_M s(a((M ∘ W)v); W_s) if p = 0.5,
where s(·; W_s) denotes the final (softmax) layer; with p = 0.5 every mask is equally likely, so the expectation becomes a uniform average over all |M| masks.
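For a toy-sized layer this expectation can be computed exactly by enumerating all masks; a sketch with p = 0.5, omitting the softmax layer s(·; W_s) for brevity (enumeration is exponential in the number of weights, so it is only feasible at toy scale):

```python
import itertools
import numpy as np

def exact_expectation(W, v, a=lambda x: np.maximum(x, 0.0)):
    """E_M[a((M ∘ W) v)] with p = 0.5: every mask M is equally likely,
    so the expectation is the plain average over all 2**W.size masks."""
    out = np.zeros(W.shape[0])
    masks = list(itertools.product([0, 1], repeat=W.size))
    for bits in masks:
        M = np.array(bits, dtype=float).reshape(W.shape)
        out += a((M * W) @ v)
    return out / len(masks)

rng = np.random.default_rng(0)
W = rng.standard_normal((2, 3))  # kept tiny: 2**6 = 64 masks
v = rng.standard_normal(3)
o = exact_expectation(W, v)
```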
Inference (test stage): r = (1/|M|) Σ_M a((M ∘ W)v) ≈ (1/Z) Σ_{z=1}^{Z} r_z ≈ (1/Z) Σ_{z=1}^{Z} a(u_z),
where u_z ∼ N(pWv, p(1−p)(W ∘ W)(v ∘ v)); Z denotes the number of random samples drawn from the Gaussian distribution.
Idea: each pre-activation (M ∘ W)v is a sum of weighted Bernoulli random variables, which is approximated by a Gaussian random variable; this is partially justified by the central limit theorem.
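A sketch of this Gaussian moment-matching inference step: sample u_z from the stated normal distribution and average a(u_z) over Z draws (assuming a diagonal covariance, as the elementwise formula above suggests; names are illustrative):

```python
import numpy as np

def dropconnect_inference(W, v, p, Z, rng, a=lambda x: np.maximum(x, 0.0)):
    """Approximate r = E_M[a((M ∘ W) v)] by moment matching:
    u_z ~ N(p·Wv, p(1−p)·(W ∘ W)(v ∘ v)), then r ≈ (1/Z) Σ_z a(u_z)."""
    mean = p * (W @ v)
    var = p * (1 - p) * ((W * W) @ (v * v))  # elementwise, per output unit
    samples = rng.normal(mean, np.sqrt(var), size=(Z, W.shape[0]))
    return a(samples).mean(axis=0)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
v = rng.standard_normal(3)
r = dropconnect_inference(W, v, p=0.5, Z=1000, rng=rng)
```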
Limitations:
Both techniques are suitable for fully connected layers only.