YOLOv3 Source Code Reading, Part 4: layer_utils.py

1. Introduction to YOLO

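  YOLO (You Only Look Once) is an efficient object detection algorithm in the one-stage family. Two-stage detectors are generally slow, and YOLO's innovation was the one-stage approach: object classification and object localization are completed in a single step. YOLO regresses the bounding box positions and the class of each bounding box directly at the output layer, which is what makes it one-stage.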

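  After two iterations, the latest version of YOLO at the time of writing is YOLOv3. Building on the previous two versions, YOLOv3 makes a number of fairly detailed changes and improves the results.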

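  This article annotates the source code, both to help my own study and to share with others who are learning. I am still a student, so if anything here is wrong, please do not hesitate to point it out.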

  The source code referenced in this article is at: https://github.com/wizyoung/YOLOv3_TensorFlow

2. Code and Comments

  File path: YOUR_PATH\YOLOv3_TensorFlow-master\utils\layer_utils.py

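  The functions in this file wrap convolution and related operations so that the rest of the code is easier to write. They mainly cover:

  • A wrapper around convolution
  • The definition of the Darknet network structure
  • The resize definition, which uses the nearest-neighbor method by default
  • Additional YOLO convolution layers on top of the backbone network, in preparation for the later feature fusion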
# coding: utf-8

from __future__ import division, print_function

import numpy as np
import tensorflow as tf

slim = tf.contrib.slim  # tf.contrib is only available in TensorFlow 1.x


def conv2d(inputs, filters, kernel_size, strides=1):
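    # Thin wrapper around slim.conv2d that makes the code easier to write and read.
    # For strides > 1 the input is padded explicitly and a 'VALID' convolution is used,
    # so the padding depends only on kernel_size and not on the input size.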
    def _fixed_padding(inputs, kernel_size):
        pad_total = kernel_size - 1
        pad_beg = pad_total // 2
        pad_end = pad_total - pad_beg

        padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end],
                                        [pad_beg, pad_end], [0, 0]], mode='CONSTANT')
        return padded_inputs

    if strides > 1:
        inputs = _fixed_padding(inputs, kernel_size)
    inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides,
                         padding=('SAME' if strides == 1 else 'VALID'))
    return inputs
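
The explicit padding in conv2d above can be checked with plain arithmetic. The following is a minimal sketch and not part of layer_utils.py: the helper names `fixed_padding_output_size` and `same_padding_amounts` are hypothetical, and the second helper assumes the formula TensorFlow documents for its 'SAME' padding scheme. It suggests that both approaches produce the same output size, while 'SAME' changes where it pads depending on the input size and the fixed padding does not.

```python
def fixed_padding_output_size(size, kernel_size, stride):
    """Output size after padding by kernel_size - 1 and a 'VALID' convolution."""
    pad_total = kernel_size - 1
    padded = size + pad_total
    return (padded - kernel_size) // stride + 1


def same_padding_amounts(size, kernel_size, stride):
    """(pad_beg, pad_end) under the 'SAME' scheme (assumed formula, see lead-in)."""
    out = -(-size // stride)  # ceil(size / stride)
    pad_total = max((out - 1) * stride + kernel_size - size, 0)
    pad_beg = pad_total // 2
    return pad_beg, pad_total - pad_beg


for size in (416, 417):
    out = fixed_padding_output_size(size, kernel_size=3, stride=2)
    print(size, '->', out, "| 'SAME' would pad", same_padding_amounts(size, 3, 2))
```

With a 3x3 kernel, `_fixed_padding` always pads one pixel on each side, whereas the 'SAME' formula pads (0, 1) for an even input size and (1, 1) for an odd one.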


def darknet53_body(inputs):
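    """
    Backbone of the Darknet-53 network.
    :param inputs: input image tensor in NHWC layout
    :return: three feature maps at different scales (strides 8, 16 and 32)
    """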
    def res_block(inputs, filters):
        shortcut = inputs
        net = conv2d(inputs, filters * 1, 1)
        net = conv2d(net, filters * 2, 3)

        net = net + shortcut

        return net

    # first two conv2d layers
    net = conv2d(inputs, 32, 3, strides=1)
    net = conv2d(net, 64, 3, strides=2)

    # res_block * 1
    net = res_block(net, 32)

    net = conv2d(net, 128, 3, strides=2)

    # res_block * 2
    for i in range(2):
        net = res_block(net, 64)

    net = conv2d(net, 256, 3, strides=2)

    # res_block * 8
    for i in range(8):
        net = res_block(net, 128)

    route_1 = net
    net = conv2d(net, 512, 3, strides=2)

    # res_block * 8
    for i in range(8):
        net = res_block(net, 256)

    route_2 = net
    net = conv2d(net, 1024, 3, strides=2)

    # res_block * 4
    for i in range(4):
        net = res_block(net, 512)
    route_3 = net

    return route_1, route_2, route_3
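
To make the three scales concrete, the shapes of the returned feature maps can be traced by hand. This is only a sketch: `darknet53_route_shapes` is a hypothetical helper, and the 416x416 input is an assumed example size rather than something fixed by layer_utils.py. It relies on two facts visible in the code above: each conv2d with strides=2 halves the spatial size (rounding up), and each res_block keeps both the spatial size and the channel count.

```python
def darknet53_route_shapes(height, width):
    """Return the (H, W, C) shapes of route_1, route_2 and route_3."""
    def down(size):
        # A stride-2 conv2d with fixed padding yields ceil(size / 2).
        return (size + 1) // 2

    h, w = height, width
    shapes = []
    # Output channels of the five stride-2 convolutions in darknet53_body.
    for channels in (64, 128, 256, 512, 1024):
        h, w = down(h), down(w)
        shapes.append((h, w, channels))
    # The routes are taken after the 3rd, 4th and 5th downsampling stages.
    return shapes[2], shapes[3], shapes[4]


print(darknet53_route_shapes(416, 416))
```

For a 416x416 input this gives 52x52x256, 26x26x512 and 13x13x1024, corresponding to strides 8, 16 and 32.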


def yolo_block(inputs, filters):
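    """
    Extra convolution layers added on top of the features extracted by the Darknet backbone,
    in preparation for the later feature fusion.
    :param inputs: feature map in NHWC layout
    :param filters: base channel count; the 1x1 convolutions output `filters` channels
                    and the 3x3 convolutions output `filters * 2`
    :return: (route, net): the output of the fifth convolution (`filters` channels) and
             the output of the sixth convolution (`filters * 2` channels)
    """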
    net = conv2d(inputs, filters * 1, 1)
    net = conv2d(net, filters * 2, 3)
    net = conv2d(net, filters * 1, 1)
    net = conv2d(net, filters * 2, 3)
    net = conv2d(net, filters * 1, 1)
    route = net
    net = conv2d(net, filters * 2, 3)
    return route, net
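
The channel bookkeeping of yolo_block is easy to lose track of, so here is a small sketch of it. The helper `yolo_block_channels` is hypothetical and only mirrors the six conv2d calls above; the value 512 in the usage line is an assumed example, not something this file specifies.

```python
def yolo_block_channels(filters):
    """Channel counts of the six convolutions in yolo_block, plus both outputs."""
    # (output channels, kernel size) for each conv2d call, in order.
    layers = [
        (filters * 1, 1),
        (filters * 2, 3),
        (filters * 1, 1),
        (filters * 2, 3),
        (filters * 1, 1),  # `route` is taken here
        (filters * 2, 3),  # `net` is the final output
    ]
    route_channels = layers[4][0]
    net_channels = layers[5][0]
    return layers, route_channels, net_channels


layers, route_channels, net_channels = yolo_block_channels(512)
print(route_channels, net_channels)
```

The narrower `route` output is the one intended for fusion with another scale, while `net` carries on within the current scale.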


def upsample_layer(inputs, out_shape):
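    """
    Resizes a feature map, using the nearest-neighbor method.
    :param inputs: feature map in NHWC layout
    :param out_shape: target shape; only out_shape[1] (height) and out_shape[2] (width) are used
    :return: the resized feature map
    """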
    new_height, new_width = out_shape[1], out_shape[2]
    # NOTE: here height is the first
    # TODO: Do we need to set `align_corners` as True?
    inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled')
    return inputs
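
What nearest-neighbor resizing does to a feature map can be illustrated without TensorFlow. The NumPy sketch below is an illustration only: `upsample_nearest` is a hypothetical stand-in, and it is assumed to mirror the `align_corners=False` behavior of the TensorFlow op only for integer scale factors such as the 2x upsampling used between YOLO scales.

```python
import numpy as np


def upsample_nearest(inputs, out_shape):
    """Nearest-neighbor resize of an NHWC array to out_shape[1] x out_shape[2]."""
    in_h, in_w = inputs.shape[1], inputs.shape[2]
    new_h, new_w = out_shape[1], out_shape[2]
    # Map each output index back to a source index: floor(dst * in / out).
    rows = (np.arange(new_h) * in_h) // new_h
    cols = (np.arange(new_w) * in_w) // new_w
    # Select rows first, then columns, keeping batch and channel axes intact.
    return inputs[:, rows][:, :, cols]


x = np.arange(4, dtype=np.float32).reshape(1, 2, 2, 1)
y = upsample_nearest(x, (1, 4, 4, 1))
print(y[0, :, :, 0])
```

Each source pixel is simply repeated into a 2x2 patch, so no new values are interpolated.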
