ACNet Paper Reading Notes

  Paper: ACNet
  Code: github

1 Research Idea

  The paper proposes a new, architecture-neutral CNN building block to improve CNN performance. It introduces the Asymmetric Convolution Block (ACB), which replaces the original $d\times d$ convolution kernel with three parallel kernels of shapes $d\times d$, $1\times d$, and $d\times 1$ for feature extraction.

2 Architecture

2.1 Asymmetric Convolution

  Asymmetric convolution is defined in contrast to symmetric (square) convolution: a symmetric kernel is typically $3\times 3$, while asymmetric kernels are typically $1\times 3$ or $3\times 1$. Inception v3 essentially showed that factorizing a square convolution into a vertical and a horizontal asymmetric convolution is, to some extent, equivalent to the single symmetric convolution while using fewer parameters, and VGG showed that stacking several small kernels matches a single large kernel with fewer parameters. By the same reasoning, stacking multiple asymmetric convolutions yields a wider receptive field.
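
  As a quick check of the parameter savings claimed above, here is a minimal sketch (the 64-channel layer is just an illustrative choice, not from the paper):

```python
import torch.nn as nn

# A square 3x3 conv vs. the factorized 1x3 + 3x1 pair over the same channels.
square = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)
factorized = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=(1, 3), padding=(0, 1), bias=False),
    nn.Conv2d(64, 64, kernel_size=(3, 1), padding=(1, 0), bias=False),
)

def count(m):
    return sum(p.numel() for p in m.parameters())

print(count(square))      # 36864 = 64*64*3*3
print(count(factorized))  # 24576 = 64*64*(3+3), one third fewer
```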

2.2 Formula Derivation

  For a standard convolution:
$$O_{:,:,j}=\sum_{k=1}^{C}M_{:,:,k}\ast F^{(j)}_{:,:,k}$$

  • $M\in R^{U\times V\times C}$ is the input;
  • $F\in R^{H\times W\times C}$ is a convolution kernel (the $j$-th of $D$ filters);
  • $O\in R^{R\times T\times D}$ is the output feature map;
  • $\ast$ denotes the convolution operation.

  Applying batch normalization (BN) to the above output gives:
$$O_{:,:,j}=\left(\sum_{k=1}^{C}M_{:,:,k}\ast F^{(j)}_{:,:,k} - \mu_j\right)\frac{\gamma_j}{\sigma_j}+\beta_j$$

  • $\mu_j$ is the batch mean of channel $j$;
  • $\sigma_j$ is the standard deviation;
  • $\gamma_j$ is the scaling factor;
  • $\beta_j$ is the offset.
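
  Since BN is a per-channel affine transform, at inference time it can be folded into the preceding convolution: scale the kernel by $\gamma_j/\sigma_j$ and absorb the rest into a bias $\beta_j-\mu_j\gamma_j/\sigma_j$. A minimal sketch of this folding (the helper name `fuse_conv_bn` is mine, not from the repo; shapes follow PyTorch conventions):

```python
import torch
import torch.nn as nn

@torch.no_grad()
def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d):
    """Return (weight, bias) of a single conv equivalent to conv followed by bn.
    Assumes conv itself has no bias, as in the ACB branches below."""
    std = torch.sqrt(bn.running_var + bn.eps)        # sigma_j
    scale = bn.weight / std                          # gamma_j / sigma_j
    fused_weight = conv.weight * scale.reshape(-1, 1, 1, 1)
    fused_bias = bn.bias - bn.running_mean * scale   # beta_j - mu_j * gamma_j / sigma_j
    return fused_weight, fused_bias
```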

  Additivity of convolution:
$$I\ast K^{(1)}+I\ast K^{(2)}=I\ast\left(K^{(1)}\oplus K^{(2)}\right)$$

  • $K^{(1)}, K^{(2)}$ are two 2D kernels of compatible sizes;
  • $I$ is the input matrix;
  • $\oplus$ is element-wise addition of the kernels at corresponding positions.
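  This additivity is easy to verify numerically. In the sketch below, the $1\times 3$ and $3\times 1$ kernels are added onto the center row and column of the $3\times 3$ kernel so that the kernel centers align:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 8, 8)
k_square = torch.randn(1, 1, 3, 3)
k_hor = torch.randn(1, 1, 1, 3)
k_ver = torch.randn(1, 1, 3, 1)

# Sum of the three branch outputs (paddings chosen so all outputs are 8x8).
y_branches = (F.conv2d(x, k_square, padding=1)
              + F.conv2d(x, k_hor, padding=(0, 1))
              + F.conv2d(x, k_ver, padding=(1, 0)))

# Fuse: add the asymmetric kernels onto the center row/column of the square one.
k_fused = k_square.clone()
k_fused[:, :, 1:2, :] += k_hor      # 1x3 onto the middle row
k_fused[:, :, :, 1:2] += k_ver      # 3x1 onto the middle column
y_fused = F.conv2d(x, k_fused, padding=1)

print(torch.allclose(y_branches, y_fused, atol=1e-5))  # True
```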

2.3 Code

```python
import torch
import torch.nn as nn


class CropLayer(nn.Module):

    # E.g., (-1, 0) means this layer should crop the first and last rows of the
    # feature map, and (0, -1) crops the first and last columns.
    def __init__(self, crop_set):
        super(CropLayer, self).__init__()
        self.rows_to_crop = -crop_set[0]
        self.cols_to_crop = -crop_set[1]
        assert self.rows_to_crop >= 0
        assert self.cols_to_crop >= 0

    def forward(self, input):
        # Slice only when the crop is nonzero: input[..., 0:-0] would be empty.
        if self.rows_to_crop > 0:
            input = input[:, :, self.rows_to_crop:-self.rows_to_crop, :]
        if self.cols_to_crop > 0:
            input = input[:, :, :, self.cols_to_crop:-self.cols_to_crop]
        return input


class ACBlock(nn.Module):

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0,
                 dilation=1, groups=1, padding_mode='zeros', deploy=False):
        super(ACBlock, self).__init__()
        self.deploy = deploy
        if deploy:
            # Deploy mode: the three branches have already been fused into one d x d conv.
            self.fused_conv = nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
                                        kernel_size=(kernel_size, kernel_size), stride=stride,
                                        padding=padding, dilation=dilation, groups=groups,
                                        bias=True, padding_mode=padding_mode)
        else:
            # Square (d x d) branch.
            self.square_conv = nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
                                         kernel_size=(kernel_size, kernel_size), stride=stride,
                                         padding=padding, dilation=dilation, groups=groups,
                                         bias=False, padding_mode=padding_mode)
            self.square_bn = nn.BatchNorm2d(num_features=out_channels)

            # Choose padding (or cropping) so that the asymmetric branches produce
            # outputs spatially aligned with the square branch.
            center_offset_from_origin_border = padding - kernel_size // 2
            ver_pad_or_crop = (center_offset_from_origin_border + 1, center_offset_from_origin_border)
            hor_pad_or_crop = (center_offset_from_origin_border, center_offset_from_origin_border + 1)
            if center_offset_from_origin_border >= 0:
                self.ver_conv_crop_layer = nn.Identity()
                ver_conv_padding = ver_pad_or_crop
                self.hor_conv_crop_layer = nn.Identity()
                hor_conv_padding = hor_pad_or_crop
            else:
                self.ver_conv_crop_layer = CropLayer(crop_set=ver_pad_or_crop)
                ver_conv_padding = (0, 0)
                self.hor_conv_crop_layer = CropLayer(crop_set=hor_pad_or_crop)
                hor_conv_padding = (0, 0)

            # Vertical (d x 1) branch.
            self.ver_conv = nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
                                      kernel_size=(kernel_size, 1), stride=stride,
                                      padding=ver_conv_padding, dilation=dilation, groups=groups,
                                      bias=False, padding_mode=padding_mode)
            # Horizontal (1 x d) branch.
            self.hor_conv = nn.Conv2d(in_channels=in_channels, out_channels=out_channels,
                                      kernel_size=(1, kernel_size), stride=stride,
                                      padding=hor_conv_padding, dilation=dilation, groups=groups,
                                      bias=False, padding_mode=padding_mode)
            self.ver_bn = nn.BatchNorm2d(num_features=out_channels)
            self.hor_bn = nn.BatchNorm2d(num_features=out_channels)

    def forward(self, input):
        if self.deploy:
            return self.fused_conv(input)
        else:
            # Three parallel branches, each conv -> BN, summed element-wise.
            square_outputs = self.square_bn(self.square_conv(input))
            vertical_outputs = self.ver_bn(self.ver_conv(self.ver_conv_crop_layer(input)))
            horizontal_outputs = self.hor_bn(self.hor_conv(self.hor_conv_crop_layer(input)))
            return square_outputs + vertical_outputs + horizontal_outputs
```
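
  For deployment, a trained ACBlock is converted into the single `fused_conv` by combining the BN folding from Section 2.2 with the kernel additivity. A minimal sketch of that conversion (the function name `fuse_acblock` is mine, not from the repo; it assumes `padding == kernel_size // 2`, so the CropLayer path is not used):

```python
import torch

@torch.no_grad()
def fuse_acblock(acb):
    """Fold each branch's BN into its conv, then add the three kernels
    into one square kernel plus a bias (sketch; assumes no CropLayer)."""
    d = acb.square_conv.kernel_size[0]
    c = d // 2  # index of the kernel center

    def fold(conv, bn):
        std = torch.sqrt(bn.running_var + bn.eps)
        scale = bn.weight / std
        return conv.weight * scale.reshape(-1, 1, 1, 1), bn.bias - bn.running_mean * scale

    w_sq, b_sq = fold(acb.square_conv, acb.square_bn)
    w_ver, b_ver = fold(acb.ver_conv, acb.ver_bn)  # shape (out, in, d, 1)
    w_hor, b_hor = fold(acb.hor_conv, acb.hor_bn)  # shape (out, in, 1, d)

    fused_w = w_sq.clone()
    fused_w[:, :, :, c:c + 1] += w_ver  # add the d x 1 kernel onto the center column
    fused_w[:, :, c:c + 1, :] += w_hor  # add the 1 x d kernel onto the center row
    return fused_w, b_sq + b_ver + b_hor
```

  The fused weight and bias can be loaded into the `fused_conv` of a deploy-mode ACBlock, which then produces the same outputs (up to floating-point error) as the three-branch training-time block in eval mode.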

3 Results

  Ablation experiments:

[Figures: ablation study results from the paper]
