1. VGG Model
2. Replace one 5*5 convolution with two stacked 3*3 convolutions, and one 7*7 convolution with three stacked 3*3 convolutions.
(1) Receptive field of a convolution output with respect to the original image:
(F(i) denotes the receptive field of layer i: the topmost layer has a receptive field of 1, and for stacked convolutions the relation F(i) = (F(i+1) - 1) * stride + kernel_size is applied iteratively, layer by layer.)
Example: three 3*3 convolutions replacing one 7*7 convolution:
F = 1, F3 = (1-1)*1+3 = 3, F2 = (3-1)*1+3 = 5, F1 = (5-1)*1+3 = 7
(2) Parameter count comparison (input and output channel count both C)
① one 7*7 conv: 7*7*channels*channels = 49*C*C
② three 3*3 convs: 3*3*channels*channels*3 = 27*C*C
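The two calculations above can be checked with a minimal plain-Python sketch (the function names here are illustrative, not from the source):

```python
# Receptive field, computed backwards from the top layer (RF = 1):
# F(i) = (F(i+1) - 1) * stride + kernel_size
def receptive_field(kernels, strides):
    rf = 1
    for k, s in zip(reversed(kernels), reversed(strides)):
        rf = (rf - 1) * s + k
    return rf

# Weight count of stacked convolutions with in = out = C channels (bias ignored):
def conv_weights(kernel, channels, num_layers=1):
    return kernel * kernel * channels * channels * num_layers

C = 512
print(receptive_field([3, 3, 3], [1, 1, 1]))   # 7: three 3*3 convs cover a 7*7 region
print(conv_weights(7, C))                      # 49*C*C
print(conv_weights(3, C, num_layers=3))        # 27*C*C, roughly 45% fewer parameters
```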
3. Model Code
import torch.nn as nn
import torch


class VGG(nn.Module):
    def __init__(self, features, num_classes=1000, init_weights=False):
        # features: feature-extraction network produced by make_features
        super(VGG, self).__init__()
        self.features = features
        self.classifier = nn.Sequential(  # final three fully connected layers (classification network)
            nn.Dropout(p=0.5),  # input is flattened to 1-D beforehand; dropout (p=0.5, randomly zeroing neurons) reduces overfitting before the fully connected layer
            nn.Linear(512*7*7, 2048),  # the original paper uses 4096 nodes; simplified to 2048 here
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(2048, 2048),
            nn.ReLU(True),
            nn.Linear(2048, num_classes)
        )
        if init_weights:
            self._initialize_weights()

    def forward(self, x):
        # N x 3 x 224 x 224
        x = self.features(x)  # extract features with the convolutional layers
        # N x 512 x 7 x 7
        x = torch.flatten(x, start_dim=1)  # flatten (dim 0 is the batch, so start from dim 1)
        # N x 512*7*7
        x = self.classifier(x)  # classify with the fully connected layers
        return x

    def _initialize_weights(self):  # initialize the weights
        for m in self.modules():  # iterate over every layer of the network
            if isinstance(m, nn.Conv2d):  # current layer is a convolutional layer
                # nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
                nn.init.xavier_uniform_(m.weight)  # initialize the kernel weights
                if m.bias is not None:  # if a bias is used, initialize it to 0
                    nn.init.constant_(m.bias, 0)
            elif isinstance(m, nn.Linear):  # current layer is a fully connected layer
                nn.init.xavier_uniform_(m.weight)  # initialize the fully connected weights
                # nn.init.normal_(m.weight, 0, 0.01)
                nn.init.constant_(m.bias, 0)


# build the feature-extraction network
def make_features(cfg: list):  # cfg: list describing the network configuration
    layers = []
    in_channels = 3  # R G B
    for v in cfg:
        if v == "M":
            layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
        else:
            conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1)
            layers += [conv2d, nn.ReLU(True)]
            in_channels = v
    return nn.Sequential(*layers)  # unpack the list as positional arguments


cfgs = {
    # kernel size is 3*3
    # numbers are kernel counts; 'M' marks a max-pooling layer
    'vgg11': [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg13': [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512, 'M'],
    'vgg16': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512, 'M'],
    'vgg19': [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512, 'M'],
}


# instantiate a VGG network
def vgg(model_name="vgg16", **kwargs):
    try:
        cfg = cfgs[model_name]
    except KeyError:
        print("Warning: model name {} not in cfgs dict!".format(model_name))
        exit(-1)
    model = VGG(make_features(cfg), **kwargs)
    return model
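As a quick sanity check on the cfg format, a framework-free sketch can trace the spatial size of a 224*224 input through the 'vgg16' configuration: each 3*3 convolution with padding 1 keeps the size, each 'M' halves it, so the classifier receives 512*7*7 features, matching nn.Linear(512*7*7, 2048) above.

```python
# 'vgg16' entry copied from cfgs above
cfg = [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M',
       512, 512, 512, 'M', 512, 512, 512, 'M']

size, channels = 224, 3
for v in cfg:
    if v == 'M':
        size //= 2      # 2x2 max pool with stride 2 halves the spatial size
    else:
        channels = v    # 3x3 conv with padding 1 keeps the spatial size

print(channels, size, size)  # 512 7 7 -> flattened length 512*7*7 = 25088
```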
4. VGG Experiment Source Code
Note: when using transfer learning with a pretrained VGG, subtract the per-channel means [123.68, 116.78, 103.94] from the R, G, B channels; when training from scratch this can be skipped.
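A minimal sketch of that per-channel mean subtraction, applied here to a single RGB pixel in plain Python (in practice it is applied to the whole image array before it is fed to the network):

```python
MEANS = [123.68, 116.78, 103.94]  # per-channel means for R, G, B, as quoted above

def subtract_means(pixel):
    """pixel: [R, G, B] values in 0..255; returns the mean-centred pixel."""
    return [p - m for p, m in zip(pixel, MEANS)]

print(subtract_means([255.0, 255.0, 255.0]))
```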