Related resources
Official links
YOLOv4 paper: "YOLOv4: Optimal Speed and Accuracy of Object Detection"
YOLOv4 GitHub code: https://github.com/AlexeyAB/darknet
YOLOv1 to v3 GitHub code: https://github.com/pjreddie/darknet
darknet website: https://pjreddie.com/darknet/
Weight files
These can be downloaded from the links below or from Baidu Netdisk:
(1) yolov4.weights (245 MB):
https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.weights
https://pan.baidu.com/s/1KBCYYZdJ1XshFUqSWy8EcA
(extraction code: htlf)
(2) YOLOv4 pre-trained weights: yolov4.conv.137
https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137
https://pan.baidu.com/s/1pEmlzaevA6SHBjQImqRUnw
(extraction code: ujy5)
(3) yolov4-tiny.weights (23.1 MB):
https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights
https://pan.baidu.com/s/1Ea9w4LoOXuwgIqFGhfG_dQ
(extraction code: da8k)
(4) Tiny YOLOv4 pre-trained weights: yolov4-tiny.conv.29
https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.conv.29
https://pan.baidu.com/s/1xkSUkvTwLe4PmH_v3vM3aQ
(extraction code: 7m7o)
Training steps
(1) Build
Download darknet:
git clone https://github.com/AlexeyAB/darknet.git
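There are two ways to build: Makefile and CMake (make is recommended). The Makefile options are explained below.
GPU and CUDNN enable GPU acceleration, CUDNN_HALF enables acceleration on specific hardware, OPENCV controls whether OpenCV is used, and AVX and OPENMP enable CPU acceleration.
cd darknet
make        # or: make -j8 (parallel build, faster)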
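The options above are switched on by editing the flag lines at the top of the Makefile before running make. A minimal sketch of doing that from a script, assuming each flag sits on its own line in the form NAME=0 or NAME=1; set_makefile_flags is a hypothetical helper, not part of darknet, and the sample text stands in for the real Makefile:

```python
import re


def set_makefile_flags(makefile_text, flags):
    """Return makefile_text with lines such as 'GPU=0' set to the requested values."""
    for name, value in flags.items():
        # Match the whole line so CUDNN does not also hit CUDNN_HALF.
        pattern = re.compile(r"^%s=\d+[ \t]*$" % re.escape(name), flags=re.MULTILINE)
        makefile_text, count = pattern.subn("%s=%d" % (name, value), makefile_text)
        if count == 0:
            raise KeyError("flag %s not found" % name)
    return makefile_text


# Stand-in for the first lines of the Makefile (assumed layout).
sample = "GPU=0\nCUDNN=0\nCUDNN_HALF=0\nOPENCV=0\nAVX=0\nOPENMP=0\n"
enabled = set_makefile_flags(sample, {"GPU": 1, "CUDNN": 1, "OPENCV": 1})
print(enabled)
```

To apply it to the real file, read Makefile into a string, pass it through the helper, and write the result back before building.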
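(2) Data preparation
Prepare the data in the VOC or COCO dataset format. The example below uses the VOC format (a custom format that generates the labels directly is also possible; to be added later):
----VOCdevkit\
|----VOC2020\ # dataset directory
| |----Annotations\
| | |----00000001.xml # annotation for the image
| |----ImageSets\
| | |----Main\ # train:val:test = 1:1:2
| | | |----test.txt # test set
| | | |----train.txt # training set
| | | |----val.txt # validation set
| |----JPEGImages\
| | |----00000001.jpg # the corresponding image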
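The three files under ImageSets/Main each list image IDs (file names without the extension). A minimal sketch of producing such a split, assuming the 1:1:2 ratio noted in the tree above; split_ids and its parameters are hypothetical names used only for illustration:

```python
import random


def split_ids(image_ids, ratios=(1, 1, 2), seed=0):
    """Shuffle image IDs deterministically and split them into train/val/test."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)
    total = sum(ratios)
    n_train = len(ids) * ratios[0] // total
    n_val = len(ids) * ratios[1] // total
    # Whatever remains after train and val goes to test.
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


# Example IDs; in practice, list JPEGImages/ and strip the ".jpg" extension.
example_ids = ["%08d" % i for i in range(1, 9)]
train_ids, val_ids, test_ids = split_ids(example_ids)
print(len(train_ids), len(val_ids), len(test_ids))  # 2 2 4
```

Each list can then be written to train.txt, val.txt, and test.txt with one ID per line. A larger training share is a matter of passing different ratios.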
Copy darknet/scripts/voc_label.py so that it sits next to the VOCdevkit folder, modify it as shown below, and run it:
import os
import xml.etree.ElementTree as ET
from os import getcwd

# (year, image set) pairs; must match VOC2020 and the lists under ImageSets/Main
sets = [('2020', 'train'), ('2020', 'val'), ('2020', 'test')]
# Classes to train on
classes = ["car", "bus", "motor"]


def convert(size, box):
    """Convert a VOC box (xmin, xmax, ymin, ymax) in pixels to normalized (x, y, w, h)."""
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (box[0] + box[1]) / 2.0 - 1
    y = (box[2] + box[3]) / 2.0 - 1
    w = box[1] - box[0]
    h = box[3] - box[2]
    return (x * dw, y * dh, w * dw, h * dh)


def convert_annotation(year, image_id):
    """Read Annotations/<image_id>.xml and write labels/<image_id>.txt."""
    in_path = 'VOCdevkit/VOC%s/Annotations/%s.xml' % (year, image_id)
    out_path = 'VOCdevkit/VOC%s/labels/%s.txt' % (year, image_id)
    with open(in_path) as in_file, open(out_path, 'w') as out_file:
        root = ET.parse(in_file).getroot()
        size = root.find('size')
        w = int(size.find('width').text)
        h = int(size.find('height').text)
        for obj in root.iter('object'):
            difficult = obj.find('difficult').text
            cls = obj.find('name').text
            # Skip classes we do not train on and objects marked difficult
            if cls not in classes or int(difficult) == 1:
                continue
            cls_id = classes.index(cls)
            xmlbox = obj.find('bndbox')
            b = (float(xmlbox.find('xmin').text), float(xmlbox.find('xmax').text),
                 float(xmlbox.find('ymin').text), float(xmlbox.find('ymax').text))
            bb = convert((w, h), b)
            out_file.write(str(cls_id) + " " + " ".join([str(a) for a in bb]) + '\n')


wd = getcwd()

for year, image_set in sets:
    os.makedirs('VOCdevkit/VOC%s/labels/' % year, exist_ok=True)
    with open('VOCdevkit/VOC%s/ImageSets/Main/%s.txt' % (year, image_set)) as f:
        image_ids = f.read().strip().split()
    # One absolute image path per line, e.g. 2020_train.txt
    with open('%s_%s.txt' % (year, image_set), 'w') as list_file:
        for image_id in image_ids:
            list_file.write('%s/VOCdevkit/VOC%s/JPEGImages/%s.jpg\n' % (wd, year, image_id))
            convert_annotation(year, image_id)

# Merge the train and val lists into train.txt (needs a Unix shell)
os.system("cat 2020_train.txt 2020_val.txt > train.txt")
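To make the coordinate conversion in the script concrete, the sketch below repeats the same arithmetic on one box and maps the result back to pixels. voc_to_yolo mirrors the script's convert (including its 1-pixel offset); yolo_to_voc, the image size, and the box values are hypothetical additions for illustration:

```python
def voc_to_yolo(size, box):
    """size = (width, height); box = (xmin, xmax, ymin, ymax) in pixels."""
    dw, dh = 1.0 / size[0], 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0 - 1   # box centre x, shifted by one pixel
    y = (box[2] + box[3]) / 2.0 - 1   # box centre y, shifted by one pixel
    w = box[1] - box[0]
    h = box[3] - box[2]
    return (x * dw, y * dh, w * dw, h * dh)


def yolo_to_voc(size, yolo_box):
    """Inverse of voc_to_yolo, undoing the same 1-pixel offset."""
    x, y, w, h = yolo_box
    cx = x * size[0] + 1
    cy = y * size[1] + 1
    bw = w * size[0]
    bh = h * size[1]
    return (cx - bw / 2.0, cx + bw / 2.0, cy - bh / 2.0, cy + bh / 2.0)


size = (640, 480)                  # example image width and height
box = (100.0, 300.0, 50.0, 250.0)  # example xmin, xmax, ymin, ymax
label = voc_to_yolo(size, box)
# One line of a labels/*.txt file: class id followed by x, y, w, h in [0, 1]
print("0 " + " ".join("%.6f" % v for v in label))
restored = yolo_to_voc(size, label)
print(restored)
```

Mapping a few label lines back to pixels like this is a cheap way to confirm that xmin/xmax/ymin/ymax were read in the right order.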
The generated YOLO training data is laid out as follows:
----train_data\
|----2020_test.txt
|----2020_train.txt
|----2020_val.txt
|----train.txt
|----VOCdevkit\
| |----VOC2020\
| | |----Annotations\
| | | |----00000001.xml
| | |----ImageSets\
| | | |----Main\
| | | | |----test.txt
| | | | |----train.txt
| | | | |----val.txt
| | |----JPEGImages\
| | | |----00000001.jpg
| | |----labels\
| | | |----00000001.txt
|----voc_label.py
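(3) Modify the configuration
(a) cfg/yolov4-custom.cfg (or cfg/yolov4-tiny-custom.cfg for the tiny model)
In the [net] section change batch, in each [yolo] section change classes, and in the [convolutional] section directly before each [yolo] section change filters.
(Search the config file for "yolo": the tiny config has two places to change, the full V4 config has three.)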
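Because filters must stay in step with classes (the config comments give filters = 3*(classes+5)), a quick consistency check can catch a missed edit. The sketch below is a hypothetical helper, not part of darknet; it assumes the factor 3 is the number of entries in each [yolo] section's mask and that the matching [convolutional] section comes directly before it:

```python
def parse_sections(cfg_text):
    """Parse darknet-style cfg text into a list of (section_name, options) pairs."""
    sections = []
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1], {}))
        elif "=" in line and sections:
            key, value = line.split("=", 1)
            sections[-1][1][key.strip()] = value.strip()
    return sections


def check_yolo_heads(cfg_text):
    """Return (actual_filters, expected_filters) for every [yolo] section."""
    sections = parse_sections(cfg_text)
    report = []
    for index, (name, options) in enumerate(sections):
        if name != "yolo" or index == 0:
            continue
        prev_name, prev_options = sections[index - 1]
        if prev_name != "convolutional":
            continue
        masks = len(options["mask"].split(","))
        expected = masks * (int(options["classes"]) + 5)
        report.append((int(prev_options["filters"]), expected))
    return report


sample_cfg = """
[convolutional]
size=1
filters=24   # 3*(classes+5)
activation=linear

[yolo]
mask = 3,4,5
classes=3
num=6
"""
print(check_yolo_heads(sample_cfg))  # [(24, 24)]
```

Any pair whose two numbers differ points at a [yolo] head where classes was changed but filters was not, or the other way round.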
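[net]
# Testing: enable these two lines when testing
#batch=1
#subdivisions=1
# Training: enable these when training, comment them out when testing
batch=256 # images per iteration; set according to your hardware
subdivisions=16 # each batch is processed as this many mini-batches of batch/subdivisions images; if GPU memory runs out, increase subdivisions or reduce batch
width=416
height=416
channels=3 # input width, height, channels; width and height must be multiples of 32 because the network downsamples by 32; minimum 320*320, maximum 608*608
momentum=0.9 # momentum, affects how fast gradient descent moves
decay=0.0005 # weight decay regularization, guards against overfitting
angle=0 # rotation augmentation
saturation = 1.5 # saturation augmentation
exposure = 1.5 # exposure augmentation
hue=.1 # hue augmentation
learning_rate=0.00261 # learning rate, i.e. how fast the weights are updated
burn_in=1000 # while the iteration count is below burn_in the learning rate ramps up; after that it follows policy
max_batches = 500200 # training stops once max_batches iterations are reached
policy=steps # learning rate policy: constant, steps, exp, poly, step, sig, random
steps=400000,450000 # iterations at which the learning rate is rescaled
scales=.1,.1 # factor applied to the learning rate at each step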
[convolutional]
batch_normalize=1
filters=32
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[route]
layers=-1
groups=2
group_id=1
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
[route]
layers = -1,-2
[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky
[route]
layers = -6,-1
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[route]
layers=-1
groups=2
group_id=1
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
[route]
layers = -1,-2
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[route]
layers = -6,-1
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[route]
layers=-1
groups=2
group_id=1
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
[route]
layers = -1,-2
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[route]
layers = -6,-1
[maxpool]
size=2
stride=2
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
##################################
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
[convolutional]
size=1
stride=1
pad=1
filters=24 # 3*(classes+5); change to match your class count
activation=linear
[yolo]
mask = 3,4,5
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319
classes=3 # number of classes; change to the number you actually need
num=6
jitter=.3
scale_x_y = 1.05
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
ignore_thresh = .7
truth_thresh = 1
random=0
resize=1.5
nms_kind=greedynms
beta_nms=0.6
[route]
layers = -4
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
[upsample]
stride=2
[route]
layers = -1, 23
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
[convolutional]
size=1
stride=1
pad=1
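filters=24 # 3*(5+classes); change to match your class count
activation=linear
[yolo]
mask = 1,2,3
anchors = 10,14, 23,27, 37,58, 81,82, 135,169, 344,319 # initial widths and heights of the predicted boxes; in each pair the first value is w and the second is h; num*2 values in total
classes=3 # number of classes; change to the number you actually need
num=6 # total number of anchors; each [yolo] layer uses the ones selected by mask
jitter=.3 # data jitter generates more training data to curb overfitting. YOLOv2 uses crop and flip plus the angle setting in [net]; flip is random and crop is controlled by jitter. tiny-yolo-voc.cfg uses jitter=.2, i.e. cropping by a random amount between 0 and 0.2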
scale_x_y = 1.05
cls_normalizer=1.0
iou_normalizer=0.07
iou_loss=ciou
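ignore_thresh = .7 # decides whether the IoU error is counted: when the IoU exceeds this threshold, the IoU error is not added to the cost function
truth_thresh = 1
random=0 # if 1, the input size is chosen at random from 320 to 608 in steps of 32 during training; if 0, training always uses the configured input size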
resize=1.5
nms_kind=greedynms
beta_nms=0.6
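The learning rate settings in the [net] section (learning_rate, burn_in, policy=steps, steps, scales) combine into a schedule that can be sketched as follows. This is an illustration rather than darknet's code: the warm-up exponent power=4 is an assumption about darknet's default, and learning_rate_at is a hypothetical name:

```python
def learning_rate_at(iteration, base_lr=0.00261, burn_in=1000,
                     steps=(400000, 450000), scales=(0.1, 0.1), power=4):
    """Learning rate at a given iteration for policy=steps with a burn-in warm-up."""
    if iteration < burn_in:
        # Warm-up: ramp from 0 to base_lr over the first burn_in iterations.
        return base_lr * (iteration / burn_in) ** power
    rate = base_lr
    for step, scale in zip(steps, scales):
        if iteration < step:
            break
        rate *= scale  # each step that has been reached rescales the rate
    return rate


for it in (0, 500, 1000, 400000, 450000):
    print(it, learning_rate_at(it))
```

With the values from the config, the rate climbs to 0.00261 by iteration 1000, drops by a factor of ten at iteration 400000, and by another factor of ten at 450000.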
(b) data/voc.names
car
bus
motor
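(c) cfg/voc.data
classes= 3 # number of classes
train = /home/keygo/darknet-master/traindata/train.txt # path to the training set list
valid = /home/keygo/darknet-master/traindata/2020_test.txt # path to the test set list
names = data/voc.names # class names
backup = /home/keygo/darknet-master/backup # where trained weights are saved; create the folder if it does not exist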
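Since the backup folder has to exist before training starts, it can be convenient to generate the .data file and create the folder in one step. A minimal sketch under assumed paths (a temporary directory stands in for the real project folder); render_data_file is a hypothetical helper:

```python
import os
import tempfile


def render_data_file(classes, train, valid, names, backup):
    """Build the text of a darknet .data file from its five entries."""
    lines = [
        "classes = %d" % classes,
        "train = %s" % train,
        "valid = %s" % valid,
        "names = %s" % names,
        "backup = %s" % backup,
    ]
    return "\n".join(lines) + "\n"


workdir = tempfile.mkdtemp()                 # stand-in for the project folder
backup_dir = os.path.join(workdir, "backup")
os.makedirs(backup_dir, exist_ok=True)       # weights are written here
data_text = render_data_file(
    3,
    os.path.join(workdir, "train.txt"),
    os.path.join(workdir, "2020_test.txt"),
    "data/voc.names",
    backup_dir,
)
data_path = os.path.join(workdir, "voc.data")
with open(data_path, "w") as handle:
    handle.write(data_text)
print(data_text)
```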
(4) Training
Set different flags as needed.
(a) Multi-GPU training:
(0,1 are GPU indices; check them with nvidia-smi)
./darknet detector train cfg/voc.data cfg/yolov4.cfg yolov4.conv.137 -gpus 0,1
(b) Training on a specific GPU:
(GPU 0)
./darknet detector train cfg/voc.data cfg/yolov4.cfg yolov4.conv.137 -gpus 0
(c) Redirect the output to a log file for later analysis (V4 already has built-in visualization):
./darknet detector train cfg/voc.data cfg/yolov3-tiny.cfg yolov3-tiny.conv.15 -gpus 0,1 > tiny.log 2>&1
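Once the output is in a log file, the average loss per iteration can be pulled out for plotting or inspection. The sketch below assumes iteration lines of the form "N: loss, avg_loss avg loss, ..."; that format and the sample lines are assumptions for illustration, so check them against your own log before relying on the pattern:

```python
import re

# Assumed line shape: " 12: 3.456789, 4.567890 avg loss, 0.001000 rate, ..."
LOSS_LINE = re.compile(r"^\s*(\d+):\s*([-+]?\d*\.?\d+),\s*([-+]?\d*\.?\d+)\s+avg")


def parse_avg_loss(log_text):
    """Return a list of (iteration, average_loss) pairs found in the log text."""
    points = []
    for line in log_text.splitlines():
        match = LOSS_LINE.match(line)
        if match:
            points.append((int(match.group(1)), float(match.group(3))))
    return points


# Illustrative sample, not real training output.
sample_log = """\
 1: 1358.394043, 1358.394043 avg loss, 0.000000 rate, 3.5 seconds, 64 images
Loaded: 0.000031 seconds
 2: 1350.102051, 1357.564819 avg loss, 0.000000 rate, 3.4 seconds, 128 images
"""
print(parse_avg_loss(sample_log))
```

For a real run, read tiny.log into a string and pass it to parse_avg_loss.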
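(d) Resume training after it was stopped:
./darknet detector train cfg/voc.data cfg/yolov4-tiny.cfg backup/yolov4-tiny_99000.weights -gpus 0,1
(e) mAP visualization
mAP (mean average precision over all classes) is computed once every 4 epochs (1 epoch = images_in_train_txt / batch iterations)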
./darknet detector train cfg/voc.data cfg/yolov4-tiny.cfg backup/yolov4-tiny_99000.weights -gpus 0,1 -map
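As a quick worked example of the epoch formula, the number of iterations between two mAP evaluations follows directly from the size of train.txt and the batch value. The image count and batch size below are made-up numbers, and iterations_between_map is a hypothetical helper:

```python
def iterations_between_map(images_in_train_txt, batch, epochs=4):
    """Iterations between mAP evaluations, with 1 epoch = images / batch iterations."""
    iterations_per_epoch = images_in_train_txt / batch
    return epochs * iterations_per_epoch


# Example: 6400 training images with batch=64 gives 100 iterations per epoch.
print(iterations_between_map(6400, 64))  # 400.0
```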
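When to stop training
As a rule of thumb, train for about (number of classes * 2000) iterations. Signs that training can stop:
(1) The average loss (shown as 0.XXXXXXX avg) no longer decreases. The final average loss can range from 0.05 (small model, simple dataset) to 3.0 (large model, complex dataset).
(2) If training was started with -map, the mAP printed to the console is a better indicator than the loss; stop when the precision no longer improves.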
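Both stopping rules can be turned into simple checks. The sketch below computes the iteration budget from the class count and flags a plateau when the mean of the most recent average-loss values has barely improved over the window before it. The window size and improvement threshold are arbitrary assumptions to tune, and both function names are hypothetical:

```python
def iteration_budget(num_classes, iters_per_class=2000):
    """Rule-of-thumb number of training iterations: classes * 2000."""
    return num_classes * iters_per_class


def has_plateaued(avg_losses, window=5, min_improvement=0.01):
    """True when the last `window` losses improved by less than min_improvement
    on average compared with the `window` losses before them."""
    if len(avg_losses) < 2 * window:
        return False  # not enough history to judge
    earlier = sum(avg_losses[-2 * window:-window]) / window
    recent = sum(avg_losses[-window:]) / window
    return earlier - recent < min_improvement


print(iteration_budget(3))  # 6000
print(has_plateaued([5, 4, 3, 2, 1, 0.9, 0.8, 0.7, 0.6, 0.5]))  # False
print(has_plateaued([1.0] * 10))  # True
```

The same plateau check works on a series of mAP values if the comparison is flipped to look for a lack of increase.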
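Choosing the best weights
After training, the darknet/backup folder contains many weight files. How do you tell which one is the best model?
./data/VOC.data holds the path to the validation set, so evaluate each weight file on the validation set in turn:
./darknet detector map data/VOC.data yolo-obj.cfg backup/yolo-obj_6000.weights
Compare the last line of each output and pick the weights with the highest mAP (mean average precision) or the best IoU (intersection over union).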
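Comparing many weight files by hand gets tedious, so the comparison can be scripted. The sketch below builds the same detector map command as above, extracts the mAP value from its output, and picks the best file. The output line format "mean average precision (mAP@0.50) = ..." and the sample texts are assumptions for illustration; the helper names are hypothetical:

```python
import re

# Assumed summary line: "mean average precision (mAP@0.50) = 0.823456, or 82.35 %"
MAP_LINE = re.compile(r"mean average precision \(mAP@[\d.]+\)\s*=\s*([\d.]+)")


def build_map_command(data_file, cfg_file, weights_file, darknet="./darknet"):
    """Argument list for evaluating one weights file with 'detector map'."""
    return [darknet, "detector", "map", data_file, cfg_file, weights_file]


def parse_map(output_text):
    """Return the mAP as a float, or None if no summary line is found."""
    match = MAP_LINE.search(output_text)
    return float(match.group(1)) if match else None


def pick_best(map_by_weights):
    """Return the weights file with the highest mAP, ignoring failed runs."""
    scored = {k: v for k, v in map_by_weights.items() if v is not None}
    return max(scored, key=scored.get) if scored else None


# Illustrative outputs, not real evaluation results.
sample_outputs = {
    "backup/yolo-obj_5000.weights": " mean average precision (mAP@0.50) = 0.801234, or 80.12 % ",
    "backup/yolo-obj_6000.weights": " mean average precision (mAP@0.50) = 0.823456, or 82.35 % ",
}
scores = {w: parse_map(text) for w, text in sample_outputs.items()}
best = pick_best(scores)
print(best, scores[best])
```

In a real run, the text for each weights file would come from something like subprocess.run(build_map_command(...), capture_output=True, text=True).stdout.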
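Testing
./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg
The darknet folder also contains the darknet.py and darknet_video.py examples.
Both take file paths as input, however; an interface that accepts an image directly will be added later.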