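I. Introduction to YOLO
YOLO (You Only Look Once) is an efficient object detection algorithm in the one-stage family. Two-stage detectors are generally slow at inference, and YOLO addresses this with a one-stage approach: object classification and object localization are done in a single step. YOLO regresses the position of each bounding box and the class it belongs to directly at the output layer, which is what makes it one-stage.
After two iterations, the latest version of YOLO at the time of writing is YOLOv3. It builds on the previous two versions with a number of fairly detailed changes that improve detection results.
This article annotates the source code, partly to help my own study and partly to share it so we can learn together. I am still a student, so if you find any mistakes, please point them out.
The source code this article refers to is at: https://github.com/wizyoung/YOLOv3_TensorFlow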
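II. Code and Comments
File: YOUR_PATH\YOLOv3_TensorFlow-master\utils\layer_utils.py
The functions in this file wrap convolution and related operations so that the rest of the code is easier to write. They cover:
- A wrapper around convolution
- The definition of the Darknet network structure
- A resize helper, which uses nearest-neighbor interpolation
- The extra YOLO convolution layers stacked on top of the backbone network, which prepare for the later feature fusion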
# coding: utf-8
from __future__ import division, print_function
import numpy as np
import tensorflow as tf
slim = tf.contrib.slim
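def conv2d(inputs, filters, kernel_size, strides=1):
    # A thin wrapper around slim.conv2d that makes the code easier to write and read.
    def _fixed_padding(inputs, kernel_size):
        # Pad height and width (NHWC layout) by kernel_size - 1 in total,
        # so the padding depends only on the kernel size, not on the input size.
        pad_total = kernel_size - 1
        pad_beg = pad_total // 2
        pad_end = pad_total - pad_beg
        padded_inputs = tf.pad(inputs, [[0, 0], [pad_beg, pad_end],
                                        [pad_beg, pad_end], [0, 0]], mode='CONSTANT')
        return padded_inputs
    # Strided convolutions are padded explicitly and then run with 'VALID';
    # stride-1 convolutions simply use 'SAME'.
    if strides > 1:
        inputs = _fixed_padding(inputs, kernel_size)
    inputs = slim.conv2d(inputs, filters, kernel_size, stride=strides,
                         padding=('SAME' if strides == 1 else 'VALID'))
    return inputs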
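
To make the padding logic of the conv2d wrapper above concrete, here is a small pure-Python sketch of the arithmetic along one spatial dimension. The helper names (`fixed_padding_amounts`, `same_padding_amounts`, `conv_output_size`) are my own and do not exist in the repository. `same_padding_amounts` follows the usual TensorFlow 'SAME' rule as I understand it, so treat it as an assumption used only for comparison.

```python
def fixed_padding_amounts(kernel_size):
    """(pad_beg, pad_end) computed the same way as _fixed_padding."""
    pad_total = kernel_size - 1
    pad_beg = pad_total // 2
    pad_end = pad_total - pad_beg
    return pad_beg, pad_end


def same_padding_amounts(in_size, kernel_size, strides):
    """(pad_beg, pad_end) under the usual 'SAME' rule (assumed here for comparison)."""
    out_size = -(-in_size // strides)  # ceil(in_size / strides)
    pad_total = max((out_size - 1) * strides + kernel_size - in_size, 0)
    pad_beg = pad_total // 2
    return pad_beg, pad_total - pad_beg


def conv_output_size(in_size, kernel_size, strides):
    """Output size of the wrapped conv2d along one spatial dimension."""
    if strides > 1:
        # Explicit padding followed by a 'VALID' convolution.
        pad_beg, pad_end = fixed_padding_amounts(kernel_size)
        return (in_size + pad_beg + pad_end - kernel_size) // strides + 1
    # strides == 1 uses 'SAME', which keeps the spatial size.
    return in_size


print(fixed_padding_amounts(3))         # (1, 1)
print(same_padding_amounts(416, 3, 2))  # (0, 1)
print(same_padding_amounts(415, 3, 2))  # (1, 1)
print(conv_output_size(416, 3, 2))      # 208
```

Under this rule, 'SAME' padding for a strided convolution changes with the input size, while the fixed padding is always symmetric for a 3x3 kernel. That is probably why the wrapper pads explicitly when strides > 1.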
def darknet53_body(inputs):
    """
    The main Darknet-53 backbone network.
    :param inputs: input tensor in NHWC layout
    :return: three feature maps at different scales
    """
    def res_block(inputs, filters):
        # Residual block: 1x1 conv (filters) -> 3x3 conv (filters * 2) -> add the shortcut
        shortcut = inputs
        net = conv2d(inputs, filters * 1, 1)
        net = conv2d(net, filters * 2, 3)
        net = net + shortcut
        return net

    # first two conv2d layers
    net = conv2d(inputs, 32, 3, strides=1)
    net = conv2d(net, 64, 3, strides=2)

    # res_block * 1
    net = res_block(net, 32)
    net = conv2d(net, 128, 3, strides=2)

    # res_block * 2
    for i in range(2):
        net = res_block(net, 64)
    net = conv2d(net, 256, 3, strides=2)

    # res_block * 8
    for i in range(8):
        net = res_block(net, 128)
    # route_1: downsampled 8x, 256 channels
    route_1 = net
    net = conv2d(net, 512, 3, strides=2)

    # res_block * 8
    for i in range(8):
        net = res_block(net, 256)
    # route_2: downsampled 16x, 512 channels
    route_2 = net
    net = conv2d(net, 1024, 3, strides=2)

    # res_block * 4
    for i in range(4):
        net = res_block(net, 512)
    # route_3: downsampled 32x, 1024 channels
    route_3 = net
    return route_1, route_2, route_3
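
To see what the three returned feature maps look like, the sketch below traces only the shapes through darknet53_body, without TensorFlow. The function names are hypothetical, and the 416x416 input is just an example size I picked. The sketch relies on two facts from the code above: each stride-2 3x3 convolution with the fixed padding maps a size n to ceil(n / 2), and the residual blocks leave the shape unchanged.

```python
def darknet53_route_shapes(height, width):
    """Return the (height, width, channels) of route_1, route_2, route_3."""
    def down(size):
        # 3x3 conv, stride 2, fixed padding of 1 on each side: ceil(size / 2).
        return (size + 1) // 2

    h, w = height, width  # the first conv (32 filters, stride 1) keeps the size
    routes = []
    # Output channels of the five stride-2 convolutions, in order.
    for channels in (64, 128, 256, 512, 1024):
        h, w = down(h), down(w)
        # The residual blocks that follow keep this shape;
        # the last three stages are the ones returned as routes.
        if channels >= 256:
            routes.append((h, w, channels))
    return routes


def count_conv_layers():
    """Count the conv2d calls made by darknet53_body."""
    res_blocks_per_stage = (1, 2, 8, 8, 4)
    # One initial conv, then per stage: one stride-2 conv + two convs per res_block.
    return 1 + sum(1 + 2 * n for n in res_blocks_per_stage)


print(darknet53_route_shapes(416, 416))
# [(52, 52, 256), (26, 26, 512), (13, 13, 1024)]
print(count_conv_layers())  # 52
```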
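def yolo_block(inputs, filters):
    """
    Extra convolution layers added on top of the features extracted by the
    Darknet backbone, in preparation for the later feature fusion.
    :param inputs: input feature map (NHWC)
    :param filters: base channel count; 1x1 convs output `filters` channels, 3x3 convs output `filters * 2`
    :return: (route, net), where route is the output of the fifth conv and net is the output of the final 3x3 conv
    """
    # Alternate 1x1 convs (reduce channels) and 3x3 convs (expand channels)
    net = conv2d(inputs, filters * 1, 1)
    net = conv2d(net, filters * 2, 3)
    net = conv2d(net, filters * 1, 1)
    net = conv2d(net, filters * 2, 3)
    net = conv2d(net, filters * 1, 1)
    # Branch point: route is kept for fusion, net goes through one more 3x3 conv
    route = net
    net = conv2d(net, filters * 2, 3)
    return route, net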
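
The channel pattern of yolo_block is easy to lose track of, so here is a tiny sketch that lists it. `yolo_block_channels` is a hypothetical helper, not part of the repository, and `filters=512` is only an example value.

```python
def yolo_block_channels(filters):
    """List (kernel_size, out_channels) for each conv in yolo_block."""
    kernels = (1, 3, 1, 3, 1, 3)
    multipliers = (1, 2, 1, 2, 1, 2)
    layers = [(k, filters * m) for k, m in zip(kernels, multipliers)]
    route_channels = layers[4][1]  # output of the fifth conv (1x1)
    net_channels = layers[5][1]    # output of the sixth conv (3x3)
    return layers, route_channels, net_channels


layers, route_channels, net_channels = yolo_block_channels(512)
print(layers)
# [(1, 512), (3, 1024), (1, 512), (3, 1024), (1, 512), (3, 1024)]
print(route_channels, net_channels)  # 512 1024
```

All six convolutions use stride 1, so the spatial size of both outputs equals that of the input. Only the channel count changes.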
def upsample_layer(inputs, out_shape):
    """
    Resizes a feature map to the target size using nearest-neighbor interpolation.
    :param inputs: input feature map (NHWC)
    :param out_shape: target shape; only out_shape[1] (height) and out_shape[2] (width) are used
    :return: the resized feature map
    """
    new_height, new_width = out_shape[1], out_shape[2]
    # NOTE: here height is the first
    # TODO: Do we need to set `align_corners` as True?
    inputs = tf.image.resize_nearest_neighbor(inputs, (new_height, new_width), name='upsampled')
    return inputs
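
For readers who have not met nearest-neighbor resizing before, the following is a minimal pure-Python sketch of the idea on a single 2-D grid. It is not the TensorFlow implementation. `nearest_neighbor_upsample` is a name I made up, and the floor-based index mapping is my assumption of how the default setting (without `align_corners`) behaves.

```python
def nearest_neighbor_upsample(grid, new_height, new_width):
    """Resize a 2-D grid (list of rows) by copying the nearest source cell."""
    in_height, in_width = len(grid), len(grid[0])
    out = []
    for y in range(new_height):
        # Map the output row back to a source row (floor of y * in / out).
        src_y = min(y * in_height // new_height, in_height - 1)
        row = []
        for x in range(new_width):
            src_x = min(x * in_width // new_width, in_width - 1)
            row.append(grid[src_y][src_x])
        out.append(row)
    return out


small = [[1, 2],
         [3, 4]]
for row in nearest_neighbor_upsample(small, 4, 4):
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```

With a 2x upsample, every source cell is simply repeated as a 2x2 patch. No new values are interpolated, which keeps the operation cheap.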