Faster R-CNN Code Walkthrough (1): Feature Extraction

This Faster R-CNN code walkthrough is based on the following repositories:

https://github.com/adityaarun1/pytorch_fast-er_rcnn

https://github.com/jwyang/faster-rcnn.pytorch

In practice, I reorganized the code and modified it as I went along.

The feature extractor here borrows VGG16 through transfer learning, or more precisely, fine-tuning.

1. Loading and freezing the VGG16 network parameters

Here is the VGG16 wrapper printed out directly:

VGG16(
  (vgg): VGG(
    (features): Sequential(
      (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (3): ReLU(inplace=True)
      (4): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (5): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (6): ReLU(inplace=True)
      (7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (8): ReLU(inplace=True)
      (9): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (10): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (11): ReLU(inplace=True)
      (12): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (13): ReLU(inplace=True)
      (14): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (15): ReLU(inplace=True)
      (16): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (17): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (18): ReLU(inplace=True)
      (19): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (20): ReLU(inplace=True)
      (21): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (22): ReLU(inplace=True)
      (23): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
      (24): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (25): ReLU(inplace=True)
      (26): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (27): ReLU(inplace=True)
      (28): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (29): ReLU(inplace=True)
    )
    (avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
    (classifier): Sequential(
      (0): Linear(in_features=25088, out_features=4096, bias=True)
      (1): ReLU(inplace=True)
      (2): Dropout(p=0.5, inplace=False)
      (3): Linear(in_features=4096, out_features=4096, bias=True)
      (4): ReLU(inplace=True)
      (5): Dropout(p=0.5, inplace=False)
    )
  )
)

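As you can see, VGG16 consists of three parts: features for feature extraction, avgpool, which pools the feature map down to a fixed 7x7 size before the fully connected layers, and classifier for classification. Note that this printout is taken after trimming: the last MaxPool2d of features and the last Linear layer of classifier have already been removed.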

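import torch
import torch.nn as nn
from torchvision import models

# cfg is the project's global config dict (keys used here: 'device', 'net_mode').

class VGG16(nn.Module):
    def __init__(self, model_path):
        super(VGG16, self).__init__()
        self.vgg = models.vgg16().to(cfg['device'])
        if cfg['net_mode'] == 'train':
            print("Loading pretrained weights from %s" % (model_path))
            # map_location avoids errors when the checkpoint was saved on another device.
            state_dict = torch.load(model_path, map_location=cfg['device'])
            # Keep only the keys that exist in this model.
            own_keys = self.vgg.state_dict()
            self.vgg.load_state_dict({k: v for k, v in state_dict.items() if k in own_keys})
        # Drop the last Linear layer (the 1000-way ImageNet output) of the classifier.
        self.vgg.classifier = nn.Sequential(*list(self.vgg.classifier._modules.values())[:-1])
        # Drop the last MaxPool2d of features.
        self.vgg.features = nn.Sequential(*list(self.vgg.features._modules.values())[:-1])
        # Freeze the first 10 layers of features (the first two conv blocks).
        for layer in range(10):
            for p in self.vgg.features[layer].parameters():
                p.requires_grad = False

    def forward(self, x):
        # Return the feature map together with the trimmed classifier module.
        out = self.vgg.features(x)
        return out, self.vgg.classifier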
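
The filtered `load_state_dict` call is worth a closer look. The sketch below reproduces the same pattern on a tiny hypothetical model instead of VGG16, with a made-up checkpoint that contains one extra key. Filtering drops keys the model does not have; it does not help if the checkpoint is missing keys, in which case `strict=False` would be needed.

```python
import torch
import torch.nn as nn

# Hypothetical tiny model standing in for vgg16.
model = nn.Sequential(nn.Linear(4, 3), nn.ReLU(), nn.Linear(3, 2))

# Hypothetical checkpoint: every key of the model, plus one unexpected key.
checkpoint = {k: torch.ones_like(v) for k, v in model.state_dict().items()}
checkpoint["extra.weight"] = torch.zeros(1)

# Same filtering idea as in the VGG16 class: keep only keys the model knows.
own_keys = model.state_dict().keys()
filtered = {k: v for k, v in checkpoint.items() if k in own_keys}

# strict=True is the default; this succeeds because no model key is missing.
model.load_state_dict(filtered)

dropped = sorted(set(checkpoint) - set(filtered))
print(dropped)  # ['extra.weight']
```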

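This is the usual recipe: take the features part as the feature extractor as-is, load the pretrained weights, and freeze them (in this code only the first 10 layers of features are frozen). The classifier is kept only because it saves a few lines of code later when doing the R-CNN classification; you can ignore it, and writing those layers by hand would work just as well.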

2. I then wrapped VGG16 in one more layer:

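class FeatureNet(nn.Module):
    def __init__(self):
        super(FeatureNet, self).__init__()
        model_path = cfg['pretrained_model']
        # Select the backbone by name from the config.
        if cfg['feature_net'] == 'vgg16':
            self.feature_net = VGG16(model_path)
        else:
            # Fail early instead of hitting an AttributeError in forward().
            raise ValueError("Unsupported feature_net: %s" % cfg['feature_net'])

    def forward(self, inputs):
        # Pass through the backbone's feature map and its classifier module.
        features, classifier = self.feature_net(inputs)
        return features, classifier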

There is no other purpose here than to give Faster R-CNN a cleaner layered structure, since the features can be extracted with VGG16 or with some other network.
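
One possible way to make the backbone swappable is a small name-to-class registry. This is only a sketch under assumptions: `TinyBackbone` and `BACKBONES` are hypothetical names invented for illustration, and the wrapper takes the backbone name as an argument instead of reading it from `cfg` so that the example stays self-contained.

```python
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Hypothetical stand-in backbone; returns (features, classifier) like VGG16."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(nn.Linear(8 * 7 * 7, 16), nn.ReLU(inplace=True))

    def forward(self, x):
        return self.features(x), self.classifier

# Hypothetical registry: add further backbones here under their own names.
BACKBONES = {"tiny": TinyBackbone}

class FeatureNet(nn.Module):
    def __init__(self, name):
        super().__init__()
        if name not in BACKBONES:
            raise ValueError("Unsupported feature_net: %s" % name)
        self.feature_net = BACKBONES[name]()

    def forward(self, inputs):
        # Every backbone is expected to return (features, classifier).
        return self.feature_net(inputs)

net = FeatureNet("tiny")
features, classifier = net(torch.zeros(1, 3, 32, 32))
print(features.shape)  # torch.Size([1, 8, 16, 16])
```

The rest of the detector then depends only on the `(features, classifier)` contract, not on which backbone produced it.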
