[NAS Toolbox] Drop Path Introduction + Dropout Review

[Preface] Drop Path is a regularization method commonly used in NAS. Since the network being trained often changes dynamically during the search, Drop Path makes a good regularization tool; it is widely used in FractalNet, NASNet, and elsewhere.

Dropout

Dropout is the earliest method for combating overfitting and the forebear of all drop-style methods. It was proposed by Hinton in 2012 and used in AlexNet, in the paper ImageNet Classification with Deep Convolutional Neural Networks.

Principle: during the forward pass, each neuron's activation is shut off with probability 1 - keep_prob (where 0 < keep_prob < 1).

Function: this makes the model generalize better, because it cannot rely too heavily on any particular local nodes. During training, each neuron is kept with probability keep_prob and shut off with probability 1 - keep_prob; at test time no neurons are shut off, but the outputs of neurons that had dropout applied during training must be multiplied by keep_prob.

Concretely:

Suppose a neuron's output activation is a. Without dropout, its expected output is a. With dropout, the neuron has two possible states, kept or dropped; treating this as a discrete random variable, it follows the Bernoulli (0-1) distribution from probability theory, with keep probability p. Its expected activation then becomes p*a + (1-p)*0 = pa. To keep the expectation equal to the no-dropout case, the output must be divided by p.
(Source: 種子_fe, https://www.imooc.com/article/30129)
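
As a quick sanity check of the expectation argument above, here is a minimal Monte Carlo sketch (the variable names and values are ours, purely for illustration):

import torch

p = 0.6            # keep probability
a = 2.0            # the neuron's activation value
n = 1_000_000      # number of Monte Carlo samples

# Bernoulli mask: 1 with probability p (kept), 0 with probability 1 - p (dropped)
mask = torch.bernoulli(torch.full((n,), p))
print((mask * a).mean())      # ~= p * a = 1.2
print((mask * a / p).mean())  # ~= a = 2.0 after rescaling by 1 / p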

Implementation: the PyTorch implementation is as follows.

# Excerpt from torch/nn/modules/dropout.py
from torch import Tensor
from torch.nn import Module
from torch.nn import functional as F

class _DropoutNd(Module):
    __constants__ = ['p', 'inplace']
    p: float
    inplace: bool

    def __init__(self, p: float = 0.5, inplace: bool = False) -> None:
        super(_DropoutNd, self).__init__()
        if p < 0 or p > 1:
            raise ValueError("dropout probability has to be between 0 and 1, "
                             "but got {}".format(p))
        self.p = p
        self.inplace = inplace

    def extra_repr(self) -> str:
        return 'p={}, inplace={}'.format(self.p, self.inplace)
    
class Dropout(_DropoutNd):
    def forward(self, input: Tensor) -> Tensor:
        return F.dropout(input, self.p, self.training, self.inplace)
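
The zeroing is active only in training mode. A short usage sketch (which entries get zeroed is random):

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(2, 4)

drop.train()
print(drop(x))  # roughly half the entries zeroed; survivors become 2.0 (= 1 / keep_prob)

drop.eval()
print(drop(x))  # at inference dropout is a no-op; the output equals x

Note that PyTorch implements "inverted dropout": the rescaling by 1/keep_prob happens during training, so no adjustment is needed at inference time.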

The dropout implementation in functional.py:

def dropout(input: Tensor, p: float = 0.5, training: bool = True, inplace: bool = False) -> Tensor:
    r"""
    During training, randomly zeroes some of the elements of the input
    tensor with probability :attr:`p` using samples from a Bernoulli
    distribution.
    See :class:`~torch.nn.Dropout` for details.
    Args:
        p: probability of an element to be zeroed. Default: 0.5
        training: apply dropout if is ``True``. Default: ``True``
        inplace: If set to ``True``, will do this operation in-place. Default: ``False``
    """
    if has_torch_function_unary(input):
        return handle_torch_function(dropout, (input,), input, p=p, training=training, inplace=inplace)
    if p < 0.0 or p > 1.0:
        raise ValueError("dropout probability has to be between 0 and 1, " "but got {}".format(p))
    return _VF.dropout_(input, p, training) if inplace else _VF.dropout(input, p, training)

The concrete implementation is ultimately found in Dropout.cpp:

template<bool feature_dropout, bool alpha_dropout, bool inplace, typename T>
Ctype<inplace> _dropout_impl(T& input, double p, bool train) {
  TORCH_CHECK(p >= 0 && p <= 1, "dropout probability has to be between 0 and 1, but got ", p);
  if (p == 0 || !train || input.numel() == 0) {
    return input;
  }

  if (p == 1) {
    return multiply<inplace>(input, at::zeros({}, input.options()));
  }

  at::Tensor b; // used for alpha_dropout only
  auto noise = feature_dropout ? make_feature_noise(input) : at::empty_like(input, LEGACY_CONTIGUOUS_MEMORY_FORMAT);
  noise.bernoulli_(1 - p);
  if (alpha_dropout) {
    constexpr double alpha = 1.7580993408473766;
    double a = 1. / std::sqrt((alpha * alpha * p + 1) * (1 - p));
    b = noise.add(-1).mul_(alpha * a).add_(alpha * a * p);
    noise.mul_(a);
  } else {
    noise.div_(1 - p);
  }  

  if (!alpha_dropout) {
    return multiply<inplace>(input, noise);
  } else {
    return multiply<inplace>(input, noise).add_(b);
  }
}
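
The alpha_dropout branch implements Alpha Dropout for SELU networks; 1.7580993408473766 is alpha * scale from the SELU paper. For reference, a rough Python transcription of that branch (our sketch, not the actual PyTorch code path):

import math
import torch

def alpha_dropout_sketch(input: torch.Tensor, p: float) -> torch.Tensor:
    alpha = 1.7580993408473766                          # alpha * scale from the SELU paper
    a = 1.0 / math.sqrt((alpha * alpha * p + 1) * (1 - p))
    noise = torch.empty_like(input).bernoulli_(1 - p)   # 1 = keep, 0 = drop
    b = (noise - 1) * (alpha * a) + alpha * a * p       # affine shift restoring the mean
    return input * (noise * a) + b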

Flow (reproduced as a minimal sketch after this list):

  • Check that p is in range and whether we are in training mode
  • Draw a Bernoulli (0-1) mask with keep probability 1 - p
  • Compute (input / (1 - p)) * mask
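
The sketch (manual_dropout is our name, for illustration only):

import torch

def manual_dropout(x: torch.Tensor, p: float = 0.5, training: bool = True) -> torch.Tensor:
    if p < 0.0 or p > 1.0:
        raise ValueError("dropout probability has to be between 0 and 1")
    if p == 0.0 or not training:
        return x
    if p == 1.0:
        return torch.zeros_like(x)
    # Bernoulli mask: each element is kept with probability 1 - p
    mask = torch.bernoulli(torch.full_like(x, 1 - p))
    # Inverted dropout: rescale by 1 / (1 - p) so the expectation matches the input
    return x / (1 - p) * mask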

Drop Path

Principle: as the name suggests, Drop Path randomly drops branches (entire paths) in a multi-branch network structure.

Function: it can generally be added to a network as a regularizer, but it makes the network harder to train. In NAS in particular, if drop_prob is set too high, the model may even fail to converge.

Implementation

import torch
import torch.nn as nn

def drop_path(x, drop_prob: float = 0., training: bool = False):
    if drop_prob == 0. or not training:
        return x
    keep_prob = 1 - drop_prob
    # One Bernoulli draw per sample in the batch, broadcast over all remaining dims
    shape = (x.shape[0],) + (1,) * (x.ndim - 1)  # work with diff dim tensors, not just 2D ConvNets
    random_tensor = keep_prob + torch.rand(shape, dtype=x.dtype, device=x.device)
    random_tensor.floor_()  # binarize: floor(keep_prob + U[0, 1)) is 1 with probability keep_prob
    output = x.div(keep_prob) * random_tensor  # rescale survivors by 1 / keep_prob
    return output


class DropPath(nn.Module):
    """Drop paths (Stochastic Depth) per sample  (when applied in main path of residual blocks).
    """
    def __init__(self, drop_prob=None):
        super(DropPath, self).__init__()
        self.drop_prob = drop_prob

    def forward(self, x):
        return drop_path(x, self.drop_prob, self.training)

With the theoretical groundwork from Dropout, this implementation is fairly self-explanatory. In practice it is typically used like this:

x = x + self.drop_path(self.conv(x))

Drop Path must not be used directly like this, because it zeroes a dropped sample's entire tensor; without the identity shortcut x +, that sample loses all of its information:

x = self.drop_path(x)
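
To see what goes wrong, a small demonstration using the drop_path function above (shapes and values are ours, for illustration):

import torch

x = torch.randn(8, 3, 4, 4)
out = drop_path(x, drop_prob=0.5, training=True)
# Each sample is either zeroed out entirely or rescaled by 1 / keep_prob,
# so without the residual "x +" some samples carry no signal at all:
print(out.view(8, -1).abs().sum(dim=1))  # several rows are exactly 0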

Reference

https://www.cnblogs.com/dan-baishucaizi/p/14703263.html

https://www.imooc.com/article/30129

https://www.github.com/pytorch/pytorch
