【CV07】How to Develop VGG, Inception, and ResNet Modules with Keras

The previous article gave a brief overview of classic convolutional neural network models; this section shows how to implement each of these modules with TensorFlow Keras.



1. VGG Blocks

The VGG convolutional neural network architecture, named after the Visual Geometry Group at the University of Oxford, is an important milestone in applying deep learning to computer vision. The model's key innovation is the repeated stacking of VGG blocks: convolutional layers with small filters (e.g. 3×3), followed by a 2×2 max pooling layer with a stride of 2.

When developing a new model, a convolutional neural network built from VGG blocks is a good starting point, because it is easy to implement and very effective at extracting features from images.

The example below stacks several convolutional layers into a VGG block. The layers share the same number of filters, a 3×3 kernel size, a 1×1 stride, 'same' padding so that the output feature map has the same size as the input, and a ReLU activation; the block ends with a max pooling layer whose size and stride are both 2×2. The input shape is defined as 256×256×3, and the network diagram is plotted. The complete example is as follows:

from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D
from keras.utils import plot_model

# VGG block
def vgg_block(layer_in, n_filters, n_conv):
    
    # add convolutional layers
    for _ in range(n_conv):
        layer_in = Conv2D(n_filters, (3,3), padding='same', activation='relu', name=f'conv_{_}')(layer_in)
    
    # add max pooling layer
    layer_in = MaxPooling2D((2,2), strides=(2,2), name='maxpool')(layer_in)
    
    return layer_in

# define model input
visible = Input(shape=(256, 256, 3), name='input')

# add vgg module
layer = vgg_block(visible, 64, 2)

# create model
model = Model(inputs=visible, outputs=layer, name='VGG Block')
model.summary()

# plot model architecture
plot_model(model, show_shapes=True, show_layer_names=False, to_file='vgg_block.png', dpi=200)

Output:

Model: "VGG Block"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           (None, 256, 256, 3)       0         
_________________________________________________________________
conv_0 (Conv2D)              (None, 256, 256, 64)      1792      
_________________________________________________________________
conv_1 (Conv2D)              (None, 256, 256, 64)      36928     
_________________________________________________________________
maxpool (MaxPooling2D)       (None, 128, 128, 64)      0         
=================================================================
Total params: 38,720
Trainable params: 38,720
Non-trainable params: 0
_________________________________________________________________

[Figure: VGG block architecture diagram (vgg_block.png)]
【Parameter count】:

  • conv_0: (3×3×3 + 1) × 64 = 1792
  • conv_1: (3×3×64 + 1) × 64 = 36928
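As a quick sanity check, these counts follow the standard Conv2D parameter formula, (kernel height × kernel width × input channels + 1 bias) × number of filters; a minimal sketch (the helper name conv2d_params is just for illustration):

# Conv2D parameter count: (kernel_h * kernel_w * channels_in + 1) * n_filters
def conv2d_params(kernel_h, kernel_w, channels_in, n_filters):
    return (kernel_h * kernel_w * channels_in + 1) * n_filters

print(conv2d_params(3, 3, 3, 64))    # conv_0: 1792
print(conv2d_params(3, 3, 64, 64))   # conv_1: 36928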

Going further, this can be extended to a model with three VGG blocks: the first block has three convolutional layers with 64 filters, the second has two convolutional layers with 128 filters, and the third has two convolutional layers with 256 filters. This is a common usage pattern for VGG blocks, where the number of filters increases with the depth of the model. Three Dense layers are appended after the last block for illustration.

from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Dense
from keras.utils import plot_model

# VGG block
def vgg_block(layer_in, n_filters, n_conv_start, n_conv_end, pool_index):
    '''
    layer_in: input tensor
    n_filters: number of filters
    n_conv_start: starting index used to name the conv layers
    n_conv_end: end index (exclusive), i.e. the half-open interval [start, end)
    pool_index: index used to name the pooling layer
    '''
    # add convolutional layers
    for _ in range(n_conv_start, n_conv_end):
        layer_in = Conv2D(n_filters, (3,3), padding='same', activation='relu', name=f'conv_{_}')(layer_in)
    
    # add max pooling layer
    layer_in = MaxPooling2D((2,2), strides=(2,2), name=f'maxpool_{pool_index}')(layer_in)
    
    return layer_in

# define model input
visible = Input(shape=(256, 256, 3), name='input')

# add vgg module
layer = vgg_block(visible, 64, 1, 4, 1)
layer = vgg_block(layer, 128, 4, 6, 2)
layer = vgg_block(layer, 256, 6, 8, 3)


# add dense(fc) layer
dense = Dense(1024, name='dense_1')(layer)
dense = Dense(1024, name='dense_2')(dense)
dense = Dense(512, name='dense_3')(dense)

# create model
model = Model(inputs=visible, outputs=dense, name='VGG Naive Model')
model.summary()

# plot model architecture
plot_model(model, show_shapes=True, show_layer_names=True, to_file='vgg_block.png', dpi=200)

Output:

Model: "VGG Navie Model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           (None, 256, 256, 3)       0         
_________________________________________________________________
conv_1 (Conv2D)              (None, 256, 256, 64)      1792      
_________________________________________________________________
conv_2 (Conv2D)              (None, 256, 256, 64)      36928     
_________________________________________________________________
conv_3 (Conv2D)              (None, 256, 256, 64)      36928     
_________________________________________________________________
maxpool_1 (MaxPooling2D)     (None, 128, 128, 64)      0         
_________________________________________________________________
conv_4 (Conv2D)              (None, 128, 128, 128)     73856     
_________________________________________________________________
conv_5 (Conv2D)              (None, 128, 128, 128)     147584    
_________________________________________________________________
maxpool_2 (MaxPooling2D)     (None, 64, 64, 128)       0         
_________________________________________________________________
conv_6 (Conv2D)              (None, 64, 64, 256)       295168    
_________________________________________________________________
conv_7 (Conv2D)              (None, 64, 64, 256)       590080    
_________________________________________________________________
maxpool_3 (MaxPooling2D)     (None, 32, 32, 256)       0         
_________________________________________________________________
dense_1 (Dense)              (None, 32, 32, 1024)      263168    
_________________________________________________________________
dense_2 (Dense)              (None, 32, 32, 1024)      1049600   
_________________________________________________________________
dense_3 (Dense)              (None, 32, 32, 512)       524800    
=================================================================
Total params: 3,019,904
Trainable params: 3,019,904
Non-trainable params: 0
_________________________________________________________________

[Figure: three-block VGG model architecture diagram (vgg_block.png)]
【Parameter count】:

  • dense_1: (256 + 1) × 1024 = 263168
  • dense_2: (1024 + 1) × 1024 = 1049600
  • dense_3: (1024 + 1) × 512 = 524800
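Note that these Dense layers receive a 4D tensor, so Keras applies them position-wise along the last (channel) axis; that is why dense_1 sees an input size of 256 rather than a flattened 32×32×256 vector. A minimal sketch reproducing the counts with the usual Dense formula, (inputs + 1 bias) × units (the helper name dense_params is just for illustration):

# Dense parameter count: (n_inputs + 1) * n_units, bias included
def dense_params(n_inputs, n_units):
    return (n_inputs + 1) * n_units

print(dense_params(256, 1024))    # dense_1: 263168
print(dense_params(1024, 1024))   # dense_2: 1049600
print(dense_params(1024, 512))    # dense_3: 524800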

2. Inception Module

The key innovation of this model is the Inception module: a block of parallel convolutional layers with filters of different sizes (e.g. 1×1, 3×3, 5×5), together with a parallel 3×3 max pooling path, whose outputs are then concatenated.

This is a simple yet powerful structural unit that lets the model learn parallel filters not only of the same size but also of different sizes, enabling learning at multiple scales.

# Inception module
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, concatenate
from keras.utils import plot_model

def inception_module(layer_in, f1, f2_in, f2_out, f3_in, f3_out, f4_out):
    '''
    layer_in: input tensor
    f1: number of filters for the 1x1 conv branch
    f2_in: number of 1x1 filters before the 3x3 conv (bottleneck)
    f2_out: number of 3x3 filters
    f3_in: number of 1x1 filters before the 5x5 conv (bottleneck)
    f3_out: number of 5x5 filters
    f4_out: number of 1x1 filters after the max pooling branch
    '''
    # 1x1 conv
    conv1 = Conv2D(f1, (1,1), padding='same', activation='relu')(layer_in)
    
    # 3x3 conv
    conv3 = Conv2D(f2_in, (1,1), padding='same', activation='relu')(layer_in)
    conv3 = Conv2D(f2_out, (3,3), padding='same', activation='relu')(conv3)
    
    # 5x5 conv
    conv5 = Conv2D(f3_in, (1,1), padding='same', activation='relu')(layer_in)
    conv5 = Conv2D(f3_out, (5,5), padding='same', activation='relu')(conv5)
    
    # 3x3 max pooling
    pool = MaxPooling2D((3,3), strides=(1,1), padding='same')(layer_in)
    pool = Conv2D(f4_out, (1,1), padding='same', activation='relu')(pool)
    
    # concatenate filters, assumes filters/channels last
    layer_out = concatenate([conv1, conv3, conv5, pool], axis=-1)
    
    return layer_out

# define model input
visible = Input(shape=(256, 256, 3), name='input')

# add two inception modules
layer = inception_module(visible, 64, 96, 128, 16, 32, 32)
layer = inception_module(layer, 128, 128, 192, 32, 96, 64)

# create model
model = Model(inputs=visible, outputs=layer, name='Inception Model')
model.summary()

# plot model architecture
plot_model(model, show_shapes=True, show_layer_names=False, to_file='inception_module.png', dpi=200)

Output:

Model: "Inception Model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input (InputLayer)              (None, 256, 256, 3)  0                                            
__________________________________________________________________________________________________
conv2d_20 (Conv2D)              (None, 256, 256, 96) 384         input[0][0]                      
__________________________________________________________________________________________________
conv2d_22 (Conv2D)              (None, 256, 256, 16) 64          input[0][0]                      
__________________________________________________________________________________________________
max_pooling2d_6 (MaxPooling2D)  (None, 256, 256, 3)  0           input[0][0]                      
__________________________________________________________________________________________________
conv2d_19 (Conv2D)              (None, 256, 256, 64) 256         input[0][0]                      
__________________________________________________________________________________________________
conv2d_21 (Conv2D)              (None, 256, 256, 128 110720      conv2d_20[0][0]                  
__________________________________________________________________________________________________
conv2d_23 (Conv2D)              (None, 256, 256, 32) 12832       conv2d_22[0][0]                  
__________________________________________________________________________________________________
conv2d_24 (Conv2D)              (None, 256, 256, 32) 128         max_pooling2d_6[0][0]            
__________________________________________________________________________________________________
concatenate_3 (Concatenate)     (None, 256, 256, 256 0           conv2d_19[0][0]                  
                                                                 conv2d_21[0][0]                  
                                                                 conv2d_23[0][0]                  
                                                                 conv2d_24[0][0]                  
__________________________________________________________________________________________________
conv2d_26 (Conv2D)              (None, 256, 256, 128 32896       concatenate_3[0][0]              
__________________________________________________________________________________________________
conv2d_28 (Conv2D)              (None, 256, 256, 32) 8224        concatenate_3[0][0]              
__________________________________________________________________________________________________
max_pooling2d_7 (MaxPooling2D)  (None, 256, 256, 256 0           concatenate_3[0][0]              
__________________________________________________________________________________________________
conv2d_25 (Conv2D)              (None, 256, 256, 128 32896       concatenate_3[0][0]              
__________________________________________________________________________________________________
conv2d_27 (Conv2D)              (None, 256, 256, 192 221376      conv2d_26[0][0]                  
__________________________________________________________________________________________________
conv2d_29 (Conv2D)              (None, 256, 256, 96) 76896       conv2d_28[0][0]                  
__________________________________________________________________________________________________
conv2d_30 (Conv2D)              (None, 256, 256, 64) 16448       max_pooling2d_7[0][0]            
__________________________________________________________________________________________________
concatenate_4 (Concatenate)     (None, 256, 256, 480 0           conv2d_25[0][0]                  
                                                                 conv2d_27[0][0]                  
                                                                 conv2d_29[0][0]                  
                                                                 conv2d_30[0][0]                  
==================================================================================================
Total params: 513,120
Trainable params: 513,120
Non-trainable params: 0
__________________________________________________________________________________________________

[Figure: Inception module architecture diagram (inception_module.png)]
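Because the four branches are concatenated along the channel axis, the depth of each module's output is simply the sum of the branch filter counts, matching the Concatenate layers in the summary above; a minimal check:

# output channels = f1 + f2_out + f3_out + f4_out
print(64 + 128 + 32 + 32)    # first module: 256
print(128 + 192 + 96 + 64)   # second module: 480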


3. Residual Module

A key innovation in ResNet is the residual module. A residual module, in particular the identity residual module, is a block of two convolutional layers with the same number of filters and a small filter size, where the output of the second layer is added to the input of the first convolutional layer. Drawn graphically, the input of the module is added to the output of the module, which is known as a shortcut connection.

If the number of filters in the input does not match the number of filters in the module's last convolutional layer, the addition will raise an error. One solution is to use a 1×1 convolution, often called a projection layer, to either increase the number of filters of the input or reduce the number of filters of the module's last convolutional layer. The former makes more sense and is the approach proposed in the paper, referred to as a projection shortcut.

# identity or projection residual module
from keras.models import Model
from keras.layers import Input, Activation, Conv2D, MaxPooling2D
from keras.layers import add
from keras.utils import plot_model

def residual_module(layer_in, n_filters):
    merge_input = layer_in
    
    # check if the number of filters needs to be increased (assumes channels-last format)
    if layer_in.shape[-1] != n_filters:
        merge_input = Conv2D(n_filters, (1,1), padding='same', activation='relu', kernel_initializer='he_normal')(layer_in)
    
    # conv layer
    conv1 = Conv2D(n_filters, (3,3), padding='same', activation='relu', kernel_initializer='he_normal')(layer_in)
    conv2 = Conv2D(n_filters, (3,3), padding='same', activation='linear', kernel_initializer='he_normal')(conv1)
    
    # add filters, assumes filters(channels last)
    layer_out = add([conv2, merge_input])
    
    # activation function
    layer_out = Activation('relu')(layer_out)
    
    return layer_out

# define model input
visible = Input(shape=(256, 256, 3), name='input')

# add residual module
layer = residual_module(visible, 64)

# create model
model = Model(inputs=visible, outputs=layer, name='Residual Model')
model.summary()

# plot model architecture
plot_model(model, show_shapes=True, show_layer_names=False, to_file='residual_module.png', dpi=200)

Output:

Model: "Residual Model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input (InputLayer)              (None, 256, 256, 3)  0                                            
__________________________________________________________________________________________________
conv2d_41 (Conv2D)              (None, 256, 256, 64) 1792        input[0][0]                      
__________________________________________________________________________________________________
conv2d_42 (Conv2D)              (None, 256, 256, 64) 36928       conv2d_41[0][0]                  
__________________________________________________________________________________________________
conv2d_40 (Conv2D)              (None, 256, 256, 64) 256         input[0][0]                      
__________________________________________________________________________________________________
add_4 (Add)                     (None, 256, 256, 64) 0           conv2d_42[0][0]                  
                                                                 conv2d_40[0][0]                  
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 256, 256, 64) 0           add_4[0][0]                      
==================================================================================================
Total params: 38,976
Trainable params: 38,976
Non-trainable params: 0
__________________________________________________________________________________________________

[Figure: residual module architecture diagram (residual_module.png)]
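For the parameter count, the two 3×3 convolutions contribute the same 1792 and 36928 parameters as in the VGG example, and the 1×1 projection on the shortcut adds (1×1×3 + 1) × 64 = 256, giving the total of 38,976 shown above; a minimal check:

# main path: two 3x3 convs; shortcut: one 1x1 projection conv
print((3 * 3 * 3 + 1) * 64)     # first 3x3 conv: 1792
print((3 * 3 * 64 + 1) * 64)    # second 3x3 conv: 36928
print((1 * 1 * 3 + 1) * 64)     # 1x1 projection conv: 256
print(1792 + 36928 + 256)       # total: 38976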


