The difference between model and model.named_modules() in a neural network model

model vs. model.named_modules():

For a trained network model, to make later use easier and to allow extracting the parameters of a particular layer, we give every layer of the network a name and store its parameter data when saving the model. This lets us operate on each layer individually, for example to prune it.
To identify the current layer, use:

for name, m0 in model.named_modules():
	if isinstance(m0, nn.Conv2d):  # match whatever layer class you need
		...  # operate on this layer here
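
A fuller, pruning-flavored version of the same loop, as a minimal sketch (nn.Conv2d stands in for whatever layer class you match on, and the torch.nn.utils.prune call and the 0.3 ratio are illustrative choices, not from the original code):

import torch.nn as nn
import torch.nn.utils.prune as prune

# Walk every submodule of the model; prune the weights of each Conv2d found.
for name, m0 in model.named_modules():
    if isinstance(m0, nn.Conv2d):
        prune.l1_unstructured(m0, name="weight", amount=0.3)  # 0.3 is illustrative
        print("pruned:", name)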
		

So what exactly is the difference between print(model) and iterating over model.named_modules()?
When we use print(model), the entire structure of model, with all of its layers, is printed. For example:
[The output below is the model structure the author printed while pruning ShuffleNetV2.]

ShuffleNetV2_2(
  (conv1): Conv2d(3, 24, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
  (stage2): Sequential(
    (ShuffleUnit_Stage2_0): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(24, 50, kernel_size=(1, 1), stride=(1, 1))
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 176, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(176, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_1): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_2): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_3): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
)
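
As a quick sanity check, print(model) behaves the same way on any nn.Module, since it simply prints the module's nested repr. A minimal sketch (TinyNet is a made-up example, not the ShuffleNetV2_2 above):

import torch.nn as nn

class TinyNet(nn.Module):
    # A made-up two-layer model, just to demonstrate print(model).
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.block = nn.Sequential(
            nn.BatchNorm2d(8),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.block(self.conv1(x))

print(TinyNet())  # one line per layer, nested exactly like the output above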

But when we iterate over model.named_modules() and print each entry (printing the generator itself would only show a generator object):

For readability, only a small part of the output is shown and explained here.

  1. model.named_modules() first yields the whole model, which prints exactly like the print(model) result above.
  2. Having printed everything once, it then recurses, printing every layer of every block in the structure:
    1) first the current block, 2) then each layer inside that block,
    3) and so on, until the entire structure has been printed (a sketch of the loop follows this list).
    To take the code in this post as an example: the structure has 16 blocks, yet 166 entries are printed in the end; exactly how many depends on the number of layers in each block!
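
The numbered output below can be produced with a loop of this shape (a minimal sketch; the enumerate counter supplies the leading index, and model is assumed to be the ShuffleNetV2_2 instance above):

for i, (name, module) in enumerate(model.named_modules()):
    # i: running index; name: dotted path (an empty string for the root model);
    # module: the nn.Module object living at that path
    print(i, name, module)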
0  ShuffleNetV2_2(
  (conv1): Conv2d(3, 24, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
  (stage2): Sequential(
    (ShuffleUnit_Stage2_0): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(24, 50, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50, bias=False)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 176, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(176, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_1): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50, bias=False)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_2): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50, bias=False)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (ShuffleUnit_Stage2_3): ShuffleUnit(
      (g_conv_1x1_compress): Sequential(
        (conv1x1): Conv2d(200, 50, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU()
      )
      (depthwise_conv3x3): Conv2d(50, 50, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=50, bias=False)
      (bn_after_depthwise): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (g_conv_1x1_expand): Sequential(
        (conv1x1): Conv2d(50, 200, kernel_size=(1, 1), stride=(1, 1), groups=2, bias=False)
        (batch_norm): BatchNorm2d(200, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )

The output below illustrates what iterative output means:

Entry 4 is the current block, which contains three layers: conv, bn, relu.
Entries 5-7 iterate over the contents of block 4, printing in turn: 5 conv, 6 bn, 7 relu.

4 stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress Sequential(
  (conv1x1): Conv2d(24, 50, kernel_size=(1, 1), stride=(1, 1), bias=False)
  (batch_norm): BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU()
)
5 stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress.conv1x1 Conv2d(24, 50, kernel_size=(1, 1), stride=(1, 1), bias=False)
6 stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress.batch_norm BatchNorm2d(50, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
7 stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress.relu ReLU()
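
Once you know the dotted names from this output, you can fetch a specific layer directly, as in this minimal sketch (building a dict over named_modules() is plain PyTorch; Module.get_submodule is an equivalent shortcut in recent PyTorch releases):

# Look up a single layer by the dotted name printed above.
modules = dict(model.named_modules())
conv = modules["stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress.conv1x1"]
print(conv.weight.shape)  # torch.Size([50, 24, 1, 1])

# Equivalent in recent PyTorch versions:
# conv = model.get_submodule("stage2.ShuffleUnit_Stage2_0.g_conv_1x1_compress.conv1x1")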

So when you work with the blocks inside model, be sure you understand the difference between the two,
and pay special attention to the content and the order of the latter's output!!

Best wishes
