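Yi AI - Implementing ResNet with TensorFlow 2 Keras

Preface

In the previous post I read the ResNet paper using the approach described in "How to Read Deep Learning Papers". To deepen that understanding, this post walks through implementing ResNet with TensorFlow 2 Keras.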

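Network Architecture

The paper shows that different ResNet variants can be built by varying the network depth, for example ResNet-34, ResNet-50, and ResNet-152, and the depth can even be customized to fit a particular need. ResNet is therefore closer to a paradigm than to a single network, and "ResNets" might be the more fitting name.

ResNet can be viewed as an upgraded VGG; the difference lies in the shortcut connections (shortcuts) that ResNet uses. The figure below shows the VGG architecture alongside the 34-layer ResNet.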

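Figure 3. Example network architectures for ImageNet. Left: the VGG-19 model [41] as a reference. Middle: a plain network with 34 parameter layers (3.6 billion FLOPs). Right: a residual network with 34 parameter layers (3.6 billion FLOPs). The dotted shortcuts increase dimensions. Table 1 shows more details and other variants.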
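
For reference, the two kinds of shortcut in Figure 3 correspond to Eq. (1) and Eq. (2) of the paper. Here F is the residual function learned by the stacked layers, and W_s is a linear projection that is only needed when the input and output dimensions differ (the dotted shortcuts):

```latex
\begin{aligned}
y &= \mathcal{F}(x, \{W_i\}) + x      && \text{(1) identity shortcut, same dimensions} \\
y &= \mathcal{F}(x, \{W_i\}) + W_s\,x && \text{(2) projection shortcut, dimensions change}
\end{aligned}
```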

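The building block also differs between network types, as shown below:

Figure 5. A deeper residual function F for ImageNet. Left: a building block (on 56×56 feature maps) as in Figure 3 for ResNet-34. Right: a "bottleneck" building block for ResNet-50/101/152.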
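
To see why the bottleneck design keeps deeper networks affordable, the weights of the two blocks in Figure 5 can be counted by hand. The sketch below is plain Python arithmetic; the helper name `conv_weights` is made up for this illustration, and biases and batch normalization are ignored:

```python
def conv_weights(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


# Basic block (Figure 5, left): two 3x3 convolutions on 64 channels.
basic = conv_weights(3, 64, 64) + conv_weights(3, 64, 64)

# Bottleneck block (Figure 5, right): 1x1 reduce to 64 channels,
# 3x3 on 64 channels, 1x1 restore to 256 channels.
bottleneck = (conv_weights(1, 256, 64)
              + conv_weights(3, 64, 64)
              + conv_weights(1, 64, 256))

print(basic, bottleneck)  # 73728 69632
```

The bottleneck block works on a 256-channel input, four times wider than the basic block's 64 channels, yet needs slightly fewer weights.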

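The paper also provides an architecture table for ResNets of different depths:

Table 1. Architectures for ImageNet. Building blocks are shown in brackets (see also Figure 5), along with the number of blocks stacked. Downsampling is performed by conv3_1, conv4_1, and conv5_1 with a stride of 2.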
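
The number in each model name can be recovered from the block counts in Table 1. The sketch below is plain Python; the names `CONFIGS` and `weighted_layers` are made up for this illustration. It counts conv1, every convolution inside the stacked blocks, and the final fully connected layer; projection shortcuts are not counted:

```python
# (convolutions per block, blocks in conv2_x .. conv5_x), as listed in Table 1.
CONFIGS = {
    'ResNet-18': (2, [2, 2, 2, 2]),
    'ResNet-34': (2, [3, 4, 6, 3]),
    'ResNet-50': (3, [3, 4, 6, 3]),
    'ResNet-101': (3, [3, 4, 23, 3]),
    'ResNet-152': (3, [3, 8, 36, 3]),
}


def weighted_layers(convs_per_block, blocks):
    # conv1 + convolutions in all stacked blocks + final fully connected layer
    return 1 + convs_per_block * sum(blocks) + 1


for name, (convs, blocks) in CONFIGS.items():
    print(name, weighted_layers(convs, blocks))
```

ResNet-34 and ResNet-50 stack the same number of blocks (3, 4, 6, 3); the extra depth of ResNet-50 comes entirely from using three-layer bottleneck blocks instead of two-layer basic blocks.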

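Implementation

The following uses ResNet-50 as the example; the other variants are built the same way.

Note: the source code is available at https://github.com/CatchZeng/YiAI-examples/blob/master/papers/ResNet/ResNet.py for reference.

First, implement the stacked residual structure (the part in the red box in the figure above).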

from tensorflow.keras import Model, backend, layers


def ResNet50(include_top=True,
             input_shape=None,
             pooling=None,
             classes=1000):
    """Instantiates the ResNet50 architecture."""

    # Stacked residual structure
    def stack_fn(x):
        x = stack(x, 64, 3, stride1=1, name='conv2')
        x = stack(x, 128, 4, name='conv3')
        x = stack(x, 256, 6, name='conv4')
        return stack(x, 512, 3, name='conv5')

    return ResNet(stack_fn, 'resnet50', include_top, input_shape, pooling, classes)

def stack(x, filters, blocks, stride1=2, name=None):
    """A set of stacked residual blocks.

    Args:
      x: input tensor.
      filters: integer, filters of the bottleneck layer in a block.
      blocks: integer, blocks in the stacked blocks.
      stride1: default 2, stride of the first layer in the first block.
      name: string, stack label.

    Returns:
      Output tensor for the stacked blocks.
    """
    x = block(x, filters, stride=stride1, name=name + '_block1')
    # When the dimensions do not increase, an identity shortcut is enough
    # and no conv_shortcut is needed; see Figure 3.
    for i in range(2, blocks + 1):
        x = block(x, filters, conv_shortcut=False,
                  name=name + '_block' + str(i))
    return x

def block(x, filters, kernel_size=3, stride=1, conv_shortcut=True, name=None):
    """A residual block.

    Args:
      x: input tensor.
      filters: integer, filters of the bottleneck layer.
      kernel_size: default 3, kernel size of the bottleneck layer.
      stride: default 1, stride of the first layer.
      conv_shortcut: default True, use convolution shortcut if True,
          otherwise identity shortcut.
      name: string, block label.

    Returns:
      Output tensor for the residual block.
    """
    bn_axis = 3 if backend.image_data_format() == 'channels_last' else 1

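    # When the dimensions increase (dotted shortcuts in Figure 3), the paper
    # considers two options: (A) the shortcut still performs identity mapping,
    # with extra zero entries padded to increase dimensions, which introduces
    # no extra parameters; (B) the projection shortcut in Eq. (2) is used to
    # match dimensions (done by 1x1 convolutions). For both options, when the
    # shortcut crosses feature maps of two sizes, it is applied with stride 2.
    # The code below uses option (B).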
    if conv_shortcut:
        shortcut = layers.Conv2D(
            4 * filters, 1, strides=stride, name=name + '_0_conv')(x)
        shortcut = layers.BatchNormalization(
            axis=bn_axis, epsilon=1.001e-5, name=name + '_0_bn')(shortcut)
    # When the input and output have the same dimensions (solid shortcuts in
    # Figure 3), the identity shortcut can be used directly.
    else:
        shortcut = x

    # 1x1xfilters
    x = layers.Conv2D(filters, 1, strides=stride, name=name + '_1_conv')(x)
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + '_1_bn')(x)
    x = layers.Activation('relu', name=name + '_1_relu')(x)

    # 3x3xfilters
    x = layers.Conv2D(
        filters, kernel_size, padding='SAME', name=name + '_2_conv')(x)
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + '_2_bn')(x)
    x = layers.Activation('relu', name=name + '_2_relu')(x)

    # 1x1x(4 * filters)
    x = layers.Conv2D(4 * filters, 1, name=name + '_3_conv')(x)
    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name=name + '_3_bn')(x)

    x = layers.Add(name=name + '_add')([shortcut, x])
    x = layers.Activation('relu', name=name + '_out')(x)
    return x
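
To check the stacks against Table 1, the feature-map size and channel count after each stack can be traced by hand. The sketch below is plain Python and does not call Keras; `STACKS` and `trace_stacks` are names made up for this illustration, and the 56x56x64 starting point assumes a 224x224 input that has already passed through conv1 and the max pool:

```python
# (name, filters, blocks, stride1), copied from stack_fn in ResNet50.
STACKS = [
    ('conv2', 64, 3, 1),
    ('conv3', 128, 4, 2),
    ('conv4', 256, 6, 2),
    ('conv5', 512, 3, 2),
]


def trace_stacks(size, channels):
    """Return (name, spatial size, channels) after each stack."""
    trace = []
    for name, filters, _blocks, stride1 in STACKS:
        # Only the first block of a stack uses stride1; a 1x1 convolution
        # with stride s and no padding gives ceil(size / s).
        size = -(-size // stride1)
        # Every bottleneck block ends with a 1x1 convolution of 4 * filters.
        channels = 4 * filters
        trace.append((name, size, channels))
    return trace


for row in trace_stacks(56, 64):
    print(row)
# ('conv2', 56, 256)
# ('conv3', 28, 512)
# ('conv4', 14, 1024)
# ('conv5', 7, 2048)
```

The channel count changes at the start of every stack, including conv2 (64 to 256) where the stride is 1. That is why `stack` always builds the first block with the default `conv_shortcut=True` and only the remaining blocks with identity shortcuts.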

Next, implement the input and output parts (the green and blue boxes in the figure above):

def ResNet(stack_fn,
           model_name='resnet',
           include_top=True,
           input_shape=None,
           pooling=None,
           classes=1000,
           classifier_activation='softmax'):
    """Instantiates the ResNet, ResNetV2, and ResNeXt architecture.

    Args:
      stack_fn: a function that returns output tensor for the
        stacked residual blocks.
      model_name: string, model name.
      include_top: whether to include the fully-connected
        layer at the top of the network.
      input_shape: optional shape tuple, `(224, 224, 3)` (with `channels_last` data format)
        or `(3, 224, 224)` (with `channels_first` data format).
        It should have exactly 3 inputs channels.
      pooling: optional pooling mode for feature extraction
        when `include_top` is `False`.
        - `None` means that the output of the model will be
            the 4D tensor output of the
            last convolutional layer.
        - `avg` means that global average pooling
            will be applied to the output of the
            last convolutional layer, and thus
            the output of the model will be a 2D tensor.
        - `max` means that global max pooling will
            be applied.
      classes: optional number of classes to classify images
        into, only used if `include_top` is True.
      classifier_activation: A `str` or callable. The activation function to use
        on the "top" layer. Ignored unless `include_top=True`. Set
        `classifier_activation=None` to return the logits of the "top" layer.

    Returns:
      A `keras.Model` instance.
    """

    img_input = layers.Input(shape=input_shape)

    bn_axis = 3 if backend.image_data_format() == 'channels_last' else 1

    x = layers.ZeroPadding2D(
        padding=((3, 3), (3, 3)), name='conv1_pad')(img_input)
    # conv 1 7x7, 64, stride 2
    x = layers.Conv2D(64, 7, strides=2, use_bias=True,
                      name='conv1_conv')(x)

    x = layers.BatchNormalization(
        axis=bn_axis, epsilon=1.001e-5, name='conv1_bn')(x)
    x = layers.Activation('relu', name='conv1_relu')(x)

    x = layers.ZeroPadding2D(padding=((1, 1), (1, 1)), name='pool1_pad')(x)
    # 3x3 max pool, stride 2
    x = layers.MaxPooling2D(3, strides=2, name='pool1_pool')(x)

    x = stack_fn(x)

    if include_top:
        # average pool, 1000-d fc, softmax
        x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
        x = layers.Dense(classes, activation=classifier_activation,
                         name='predictions')(x)
    else:
        if pooling == 'avg':
            x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
        elif pooling == 'max':
            x = layers.GlobalMaxPooling2D(name='max_pool')(x)

    inputs = img_input

    model = Model(inputs, x, name=model_name)

    return model
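
The explicit `ZeroPadding2D` layers in the stem exist so that the strided layers land on the sizes listed in Table 1. The sketch below is plain Python; the helper `conv_output_size` is made up for this illustration and assumes the Keras default of no implicit padding in `Conv2D` and `MaxPooling2D`. It reproduces the arithmetic for a 224x224 input:

```python
def conv_output_size(size, kernel, stride, padding):
    """Output size of a convolution or pooling layer with explicit zero
    padding on each side and no implicit padding."""
    return (size + 2 * padding - kernel) // stride + 1


# conv1_pad (3 pixels per side) followed by conv1_conv (7x7, stride 2)
after_conv1 = conv_output_size(224, kernel=7, stride=2, padding=3)

# pool1_pad (1 pixel per side) followed by pool1_pool (3x3, stride 2)
after_pool1 = conv_output_size(after_conv1, kernel=3, stride=2, padding=1)

print(after_conv1, after_pool1)  # 112 56
```

The 56x56 result is the feature-map size that conv2_x operates on in Table 1.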

Testing

Comparing the result against the official TensorFlow implementation verifies that the model is written correctly.

if __name__ == '__main__':
    model = ResNet50(include_top=True, input_shape=(224, 224, 3), classes=10)
    model.summary()

    print("----------------------------------------")

    from tensorflow.keras.applications import resnet
    model2 = resnet.ResNet50(
        include_top=True, weights=None, input_shape=(224, 224, 3), classes=10)
    model2.summary()
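
One quick number to compare in the two summaries is the total parameter count. The sketch below derives it by hand in plain Python, without calling Keras. The helper names are made up for this illustration, and it assumes every convolution has a bias (the Keras default) and every `BatchNormalization` layer holds four values per channel:

```python
def conv_params(kernel, c_in, c_out):
    # weights plus one bias per output channel
    return kernel * kernel * c_in * c_out + c_out


def bn_params(channels):
    # gamma, beta, moving mean, moving variance
    return 4 * channels


def block_params(c_in, filters, conv_shortcut):
    total = 0
    if conv_shortcut:
        total += conv_params(1, c_in, 4 * filters) + bn_params(4 * filters)
    total += conv_params(1, c_in, filters) + bn_params(filters)
    total += conv_params(3, filters, filters) + bn_params(filters)
    total += conv_params(1, filters, 4 * filters) + bn_params(4 * filters)
    return total


def resnet50_params(classes):
    total = conv_params(7, 3, 64) + bn_params(64)  # conv1_conv + conv1_bn
    c_in = 64
    for filters, blocks in [(64, 3), (128, 4), (256, 6), (512, 3)]:
        # first block of each stack uses a projection shortcut
        total += block_params(c_in, filters, conv_shortcut=True)
        c_in = 4 * filters
        # remaining blocks use identity shortcuts
        total += (blocks - 1) * block_params(c_in, filters, conv_shortcut=False)
    return total + c_in * classes + classes  # Dense 'predictions' layer


print(resnet50_params(10))    # 23608202
print(resnet50_params(1000))  # 25636712
```

If both models are built as described, the total parameter count in each `summary()` output for `classes=10` would be expected to match the first number.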

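Summary

Real understanding comes from practice, and moving from reading a paper to implementing it is a step up. Implementing the network reveals its details and builds familiarity with the TensorFlow ecosystem. I strongly recommend reading more papers and implementing them.

References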

https://www.analyticsvidhya.com/blog/2021/08/how-to-code-your-resnet-from-scratch-in-tensorflow/
