YOLOv3 Source Code Reading, Part 4: layer_utils.py

Original article

2019-09-04 17:47

1. Introduction to YOLO

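YOLO (You Only Look Once) is an efficient object detection algorithm in the one-stage family. Two-stage detectors are generally slow, and YOLO addresses this by pioneering the one-stage approach, in which object classification and object localization happen in a single step. YOLO regresses the bounding box positions and the class of each bounding box directly at the output layer, which is what makes it one-stage.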

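After two rounds of iteration, the latest version of YOLO at the time of writing is YOLOv3. It builds on the previous two versions with a number of fairly fine-grained changes and improves the results.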

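This article annotates the source code, mainly to help my own study, and I am happy to share it so that we can learn together. I am still a student, so please do point out any mistakes.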

The source code referenced in this article is at: https://github.com/wizyoung/YOLOv3_TensorFlow

2. Code and Annotations

File path: YOUR_PATH\YOLOv3_TensorFlow-master\utils\layer_utils.py

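The functions in this file wrap convolution and related operations in project-specific helpers so that the rest of the code is easier to write. They include:

A wrapper around convolution
The definition of the darknet network structure
A resize function that uses nearest-neighbor interpolation
The extra YOLO convolutions stacked on top of the backbone network, which prepare for the later feature fusion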
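When the stride is greater than 1, the conv2d wrapper in this file pads the input explicitly by kernel_size - 1 and then runs a 'VALID' convolution instead of relying on 'SAME'. The sketch below uses plain Python with no TensorFlow dependency to compare that fixed padding with the input-size-dependent padding that TensorFlow's 'SAME' mode computes. The helper names fixed_padding, same_padding and valid_output_size are made up for this illustration and are not part of the repository.

```python
def fixed_padding(kernel_size):
    """Padding used by the wrapper: depends only on the kernel size."""
    pad_total = kernel_size - 1
    pad_beg = pad_total // 2
    pad_end = pad_total - pad_beg
    return pad_beg, pad_end


def same_padding(size, kernel_size, stride):
    """Padding that TensorFlow's 'SAME' mode applies along one spatial axis."""
    out_size = -(-size // stride)  # ceil(size / stride)
    pad_total = max((out_size - 1) * stride + kernel_size - size, 0)
    pad_beg = pad_total // 2
    pad_end = pad_total - pad_beg
    return pad_beg, pad_end


def valid_output_size(size, kernel_size, stride, pad_beg, pad_end):
    """Output length of a 'VALID' convolution applied after explicit padding."""
    return (size + pad_beg + pad_end - kernel_size) // stride + 1


if __name__ == "__main__":
    for size in (416, 415):
        fixed = fixed_padding(3)
        same = same_padding(size, 3, 2)
        out = valid_output_size(size, 3, 2, *fixed)
        print(size, "fixed:", fixed, "SAME:", same, "output:", out)
```

For a 3x3 kernel with stride 2, 'SAME' pads a 416-pixel side by (0, 1) but a 415-pixel side by (1, 1), whereas the fixed scheme always pads by (1, 1). Both give 208 outputs per side, so the explicit padding keeps the output size while making the padding independent of the input size.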

# coding: utf-8

from __future__ import division, print_function

import numpy as np
import tensorflow as tf

slim = tf.contrib.slim  # tf.contrib exists only in TensorFlow 1.x


def conv2d(inputs, filters, kernel_size, strides=1):
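    # Thin project-specific wrapper around conv2d that makes the code easier to write and read.
    # For strides > 1 the input is padded explicitly and the convolution uses 'VALID' padding.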
    def _fixed_padding(inputs, kernel_size):
        pad_total = kernel_size - 1
        pad_beg = pad_total // 2
        pad_end = pad_total - pad_beg

        padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end],
                                        [pad_beg, pad_end], [0, 0]], mode='CONSTANT')
        return padded_inputs

    if strides > 1:
        inputs = _fixed_padding(inputs, kernel_size)
    inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides,
                         padding=('SAME' if strides == 1 else 'VALID'))
    return inputs


def darknet53_body(inputs):
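    """
    Main body of the darknet backbone network.
    :param inputs: input tensor in NHWC layout
    :return: three feature maps at different scales (downsampled by 8, 16 and 32)
    """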
    def res_block(inputs, filters):
        shortcut = inputs
        net = conv2d(inputs, filters * 1, 1)
        net = conv2d(net, filters * 2, 3)

        net = net + shortcut

        return net

    # first two conv2d layers
    net = conv2d(inputs, 32, 3, strides=1)
    net = conv2d(net, 64, 3, strides=2)

    # res_block * 1
    net = res_block(net, 32)

    net = conv2d(net, 128, 3, strides=2)

    # res_block * 2
    for i in range(2):
        net = res_block(net, 64)

    net = conv2d(net, 256, 3, strides=2)

    # res_block * 8
    for i in range(8):
        net = res_block(net, 128)

    route_1 = net
    net = conv2d(net, 512, 3, strides=2)

    # res_block * 8
    for i in range(8):
        net = res_block(net, 256)

    route_2 = net
    net = conv2d(net, 1024, 3, strides=2)

    # res_block * 4
    for i in range(4):
        net = res_block(net, 512)
    route_3 = net

    return route_1, route_2, route_3


def yolo_block(inputs, filters):
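    """
    Extra convolution layers added on top of the features extracted by the
    darknet backbone, in preparation for the later feature fusion.
    :param inputs: input feature map
    :param filters: base channel count (1x1 convs use filters, 3x3 convs use filters * 2)
    :return: route (output of the fifth convolution) and net (output of the final 3x3 convolution)
    """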
    net = conv2d(inputs, filters * 1, 1)
    net = conv2d(net, filters * 2, 3)
    net = conv2d(net, filters * 1, 1)
    net = conv2d(net, filters * 2, 3)
    net = conv2d(net, filters * 1, 1)
    route = net
    net = conv2d(net, filters * 2, 3)
    return route, net


def upsample_layer(inputs, out_shape):
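    """
    Resizes a feature map using nearest-neighbor interpolation.
    :param inputs: feature map to resize
    :param out_shape: target shape; indices 1 and 2 are read as height and width
    :return: the resized feature map
    """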
    new_height, new_width = out_shape[1], out_shape[2]
    # NOTE: here height is the first
    # TODO: Do we need to set `align_corners` as True?
    inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled')
    return inputs
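
To check which scales darknet53_body returns, the layer sequence in the listing can be traced with plain integer arithmetic. The sketch below is an illustration only: trace_darknet53 and its stage table are hypothetical names transcribed by hand from the listing, and the 416 x 416 input size is an assumed example rather than something this file fixes.

```python
def trace_darknet53(height, width):
    """Trace output shapes of darknet53_body without building the network."""
    # (output channels of the stride-2 conv, number of res_blocks, route name)
    stages = [
        (64, 1, None),
        (128, 2, None),
        (256, 8, "route_1"),
        (512, 8, "route_2"),
        (1024, 4, "route_3"),
    ]
    n_convs = 1  # the first 3x3 conv with 32 filters and stride 1
    routes = {}
    h, w = height, width
    for channels, n_blocks, route_name in stages:
        # Each stage starts with a stride-2 conv: size becomes ceil(size / 2).
        h, w = -(-h // 2), -(-w // 2)
        # One downsampling conv plus two convs per residual block.
        n_convs += 1 + 2 * n_blocks
        if route_name is not None:
            routes[route_name] = (h, w, channels)
    return routes, n_convs


routes, n_convs = trace_darknet53(416, 416)
for name in sorted(routes):
    print(name, routes[name])
print("conv layers:", n_convs)
```

Under these assumptions the three routes come out at 52 x 52 x 256, 26 x 26 x 512 and 13 x 13 x 1024, that is, strides of 8, 16 and 32 relative to the input, and the body contains 52 convolution layers.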
