1. Group Convolution
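Group convolution first appeared in AlexNet, as shown in the figure below. In the early days of CNNs, a single GPU did not have enough resources for the training task, so Hinton's group trained on multiple GPUs: each GPU computed part of the convolutions, and the results from the GPUs were merged at the end.
Here is a small question to think about: as shown in the figure above, there are 12 input feature maps, divided into 3 groups of 4 feature maps each. Can the number of output feature maps then be arbitrary? Could it be 1?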
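Readers who want to check their answer can try the small sketch below. It is plain Python and not part of the original post; the helper name group_channel_map is hypothetical. It assumes the rule that nn.Conv2d documents for its groups parameter: both channel counts must be divisible by the number of groups, and each output channel only sees the input channels of its own group.

```python
def group_channel_map(in_channels, out_channels, groups):
    """Map each output channel to the input channels it is connected to.

    Hypothetical helper for illustration only; it mirrors the rule that
    both channel counts must be divisible by `groups`.
    """
    if in_channels % groups != 0 or out_channels % groups != 0:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    in_per_group = in_channels // groups
    out_per_group = out_channels // groups
    mapping = {}
    for out_idx in range(out_channels):
        group = out_idx // out_per_group  # group this output channel belongs to
        start = group * in_per_group      # first input channel of that group
        mapping[out_idx] = list(range(start, start + in_per_group))
    return mapping


# 12 input feature maps in 3 groups: 3 outputs work, one per group.
print(group_channel_map(12, 3, 3))

# A single output feature map is rejected because 1 is not divisible by 3.
try:
    group_channel_map(12, 1, 3)
except ValueError as err:
    print("invalid:", err)
```

Under this assumption, the number of output feature maps has to be a multiple of the number of groups, so 3, 6, 9 and so on are valid here, while 1 is not.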
2. Depthwise Separable Convolution
3. PyTorch Implementation
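PyTorch is a deep learning framework released in 2017. Unlike TensorFlow, which (in its 1.x versions) builds models on static graphs, PyTorch is fully dynamic, and since its release it has quickly become a popular and well-regarded choice among AI researchers. (That is all for the introduction.)
In PyTorch, two-dimensional convolution is implemented by nn.Conv2d. It is a powerful module: beyond standard convolution, choosing its parameters appropriately also gives group convolution and dilated (atrous) convolution. The official API description is as follows: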
CLASS torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)
- stride: controls the stride for the cross-correlation, a single number or a tuple.
- padding: controls the amount of implicit zero-paddings on both sides for padding number of points for each dimension.
- dilation: controls the spacing between the kernel points; also known as the à trous algorithm. It is harder to describe, but this link has a nice visualization of what dilation does.
- groups: controls the connections between inputs and outputs. in_channels and out_channels must both be divisible by groups. For example,
  - At groups=1, all inputs are convolved to all outputs.
  - At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.
  - At groups=in_channels, each input channel is convolved with its own set of filters.
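To make the effect of stride, padding and dilation concrete, here is a minimal sketch of the output-size formula given in the nn.Conv2d documentation, written in plain Python for one spatial dimension. The function name conv2d_output_size is hypothetical and not a PyTorch API.

```python
def conv2d_output_size(size, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a 2D convolution.

    Hypothetical helper: implements
    floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride) + 1.
    """
    effective = dilation * (kernel_size - 1) + 1  # span covered by the dilated kernel
    return (size + 2 * padding - effective) // stride + 1


# 3x3 kernel, stride 1, padding 1: the spatial size is preserved.
print(conv2d_output_size(128, 3, stride=1, padding=1))    # 128

# Same kernel with stride 2 halves the size.
print(conv2d_output_size(128, 3, stride=2, padding=1))    # 64

# Dilation 2 widens the kernel span to 5, so padding 2 is needed to keep 128.
print(conv2d_output_size(128, 3, padding=2, dilation=2))  # 128
```

This is why the examples in this post, which use kernel_size=3 with padding=1 and stride=1, produce 128 x 128 outputs from 128 x 128 inputs.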
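3.1 Group Convolution
Group convolution only requires setting the groups parameter of nn.Conv2d, which specifies the number of groups. groups defaults to 1, which gives a standard convolution. The following code implements a group convolution: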
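import torch.nn as nn


class CSDN_Tem(nn.Module):
    """A single 3x3 group convolution layer."""

    def __init__(self, in_ch, out_ch, groups):
        super(CSDN_Tem, self).__init__()
        # in_ch and out_ch must both be divisible by groups
        self.conv = nn.Conv2d(
            in_channels=in_ch,
            out_channels=out_ch,
            kernel_size=3,
            stride=1,
            padding=1,  # keeps the spatial size unchanged for a 3x3 kernel
            groups=groups,
        )

    def forward(self, input):
        out = self.conv(input)
        return out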
The model is tested with the following code, using 16 input channels, 64 output channels, and 4 groups:
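from torchsummary import summary  # assumption: summary comes from the third-party torchsummary package

conv = CSDN_Tem(16, 64, 4)
summary(conv, (16, 128, 128), batch_size=1)  # summary() prints the table itself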
The console output is:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [1, 64, 128, 128]           2,368
================================================================
Total params: 2,368
Trainable params: 2,368
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 1.00
Forward/backward pass size (MB): 8.00
Params size (MB): 0.01
Estimated Total Size (MB): 9.01
----------------------------------------------------------------
This group convolution needs 2,368 parameters, bias included: 64 × (16 / 4) × 3 × 3 = 2,304 weights plus 64 bias terms.
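The parameter count can be reproduced without PyTorch. The sketch below is plain Python under the assumption of a square kernel; the helper name conv2d_param_count is hypothetical. Each output channel holds one kernel_size x kernel_size filter per input channel of its own group, plus one optional bias term.

```python
def conv2d_param_count(in_channels, out_channels, kernel_size, groups=1, bias=True):
    """Number of learnable parameters of a 2D convolution with a square kernel.

    Hypothetical helper for illustration; assumes both channel counts are
    divisible by `groups`.
    """
    # Each output channel only connects to in_channels / groups input channels.
    weights = out_channels * (in_channels // groups) * kernel_size * kernel_size
    biases = out_channels if bias else 0
    return weights + biases


# Group convolution from the test above: 16 -> 64 channels, 3x3 kernel, 4 groups.
print(conv2d_param_count(16, 64, 3, groups=4))  # 2368

# The same layer as a standard convolution (groups=1) for comparison.
print(conv2d_param_count(16, 64, 3, groups=1))  # 9280
```

With 4 groups the weight count drops by a factor of 4 (from 9,216 to 2,304), while the 64 bias terms stay the same.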
3.2 Depthwise Separable Convolution
The PyTorch code for depthwise separable convolution is as follows:
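class CSDN_Tem(nn.Module):
    """Depthwise separable convolution: a depthwise 3x3 conv followed by a pointwise 1x1 conv."""

    def __init__(self, in_ch, out_ch):
        super(CSDN_Tem, self).__init__()
        # Depthwise step: groups=in_ch gives each input channel its own 3x3 filter
        self.depth_conv = nn.Conv2d(
            in_channels=in_ch,
            out_channels=in_ch,
            kernel_size=3,
            stride=1,
            padding=1,
            groups=in_ch,
        )
        # Pointwise step: a standard 1x1 convolution mixes channels and sets out_ch
        self.point_conv = nn.Conv2d(
            in_channels=in_ch,
            out_channels=out_ch,
            kernel_size=1,
            stride=1,
            padding=0,
            groups=1,
        )

    def forward(self, input):
        out = self.depth_conv(input)
        out = self.point_conv(out)
        return out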
Using the same numbers of input and output channels as for the group convolution, the test code is:
conv = CSDN_Tem(16, 64)
summary(conv, (16, 128, 128), batch_size=1)
The console output is:
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1          [1, 16, 128, 128]             160
            Conv2d-2          [1, 64, 128, 128]           1,088
================================================================
Total params: 1,248
Trainable params: 1,248
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 1.00
Forward/backward pass size (MB): 10.00
Params size (MB): 0.00
Estimated Total Size (MB): 11.00
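With the same input and output channels, the depthwise separable convolution needs only 1,248 parameters (160 for the depthwise layer plus 1,088 for the pointwise layer, biases included), compared with 2,368 for the group convolution above.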
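The per-layer counts in the table above (160 and 1,088) can be checked by hand. The following plain-Python sketch assumes a square depthwise kernel and a 1x1 pointwise kernel, as in the module above; the helper name depthwise_separable_param_count is hypothetical.

```python
def depthwise_separable_param_count(in_channels, out_channels, kernel_size, bias=True):
    """Return (depthwise, pointwise, total) parameter counts.

    Hypothetical helper for illustration; assumes a square depthwise kernel
    followed by a 1x1 pointwise convolution.
    """
    # Depthwise: one kernel_size x kernel_size filter per input channel.
    depthwise = in_channels * kernel_size * kernel_size
    # Pointwise: a 1x1 filter over all input channels for every output channel.
    pointwise = in_channels * out_channels
    if bias:
        depthwise += in_channels   # one bias per depthwise output channel
        pointwise += out_channels  # one bias per pointwise output channel
    return depthwise, pointwise, depthwise + pointwise


# 16 -> 64 channels with a 3x3 depthwise kernel, as in the test above.
print(depthwise_separable_param_count(16, 64, 3))              # (160, 1088, 1248)

# Without bias terms, only the weights remain.
print(depthwise_separable_param_count(16, 64, 3, bias=False))  # (144, 1024, 1168)
```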
If you have any questions, feel free to leave a comment!