Up front:
To try out different algorithms I am running YOLO again. This post records converting a model based on darknet-efficientB0.cfg to Caffe.
Projects used:
Training tool, darknet: https://github.com/AlexeyAB/darknet
Caffe conversion tool, darknet to caffe: https://github.com/marvis/pytorch-caffe-darknet-convert
C++ project that runs YOLO on Caffe, caffe-yolov3: https://github.com/ChenYingpeng/caffe-yolov3
The earlier ghost-yolo modifications, if you want to see them: https://blog.csdn.net/weixin_38715903/article/details/105550619
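This builds on the previous ghost-yolo to Caffe conversion and continues to modify darknet2caffe.py. One difference: last time the caffemodel could be used without touching caffe-yolov3, but this time caffe-yolov3 has to be modified as well.
The modified files are at: https://github.com/hualuluu/efficientNetB0-yolo
1. Modifying the Caffe conversion tool
First, work out what the EfficientNet-B0 structure needs that the converter does not yet handle:
- the swish activation function
- a different kind of shortcut operation
- dropout layers
A. The swish activation function
I am using the SSD version of Caffe, which has no swish activation function; everything below assumes that.
Two things are needed: add a swish_layer to Caffe, and add swish layer handling to the conversion tool darknet2caffe.py. The convolutional branch of darknet2caffe.py, with the Sigmoid and swish handling added: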
elif block['type'] == 'convolutional':
    conv_layer = OrderedDict()
    conv_layer['bottom'] = bottom
    if block.has_key('name'):
        conv_layer['top'] = block['name']
        conv_layer['name'] = block['name']
    else:
        conv_layer['top'] = 'layer%d-conv' % layer_id
        conv_layer['name'] = 'layer%d-conv' % layer_id
    conv_layer['type'] = 'Convolution'
    convolution_param = OrderedDict()
    convolution_param['num_output'] = block['filters']
    prev_filters = block['filters']
    convolution_param['kernel_size'] = block['size']
    #print(block)
    # Grouped (depthwise) convolutions carry a 'groups' key in the cfg
    if 'groups' in block:
        convolution_param['group'] = block['groups']
    # "Same" padding: half the kernel size, using integer division
    # (the script targets Python 2.7, where / gives the same result)
    if 'pad' in block:
        if block['pad'] == '1':
            convolution_param['pad'] = str(int(convolution_param['kernel_size']) // 2)
    else:
        convolution_param['pad'] = str(int(convolution_param['kernel_size']) // 2)
    convolution_param['stride'] = block['stride']
    if block['batch_normalize'] == '1':
        convolution_param['bias_term'] = 'false'
    else:
        convolution_param['bias_term'] = 'true'
    conv_layer['convolution_param'] = convolution_param
    layers.append(conv_layer)
    bottom = conv_layer['top']
    if block['batch_normalize'] == '1':
        bn_layer = OrderedDict()
        bn_layer['bottom'] = bottom
        bn_layer['top'] = bottom
        if block.has_key('name'):
            bn_layer['name'] = '%s-bn' % block['name']
        else:
            bn_layer['name'] = 'layer%d-bn' % layer_id
        bn_layer['type'] = 'BatchNorm'
        batch_norm_param = OrderedDict()
        batch_norm_param['use_global_stats'] = 'true'
        bn_layer['batch_norm_param'] = batch_norm_param
        layers.append(bn_layer)
        scale_layer = OrderedDict()
        scale_layer['bottom'] = bottom
        scale_layer['top'] = bottom
        if block.has_key('name'):
            scale_layer['name'] = '%s-scale' % block['name']
        else:
            scale_layer['name'] = 'layer%d-scale' % layer_id
        scale_layer['type'] = 'Scale'
        scale_param = OrderedDict()
        scale_param['bias_term'] = 'true'
        scale_layer['scale_param'] = scale_param
        layers.append(scale_layer)
    # Added here: the Sigmoid layer (in-place on the conv output)
    if block['activation'] == 'logistic':
        sigmoid_layer = OrderedDict()
        sigmoid_layer['bottom'] = bottom
        sigmoid_layer['top'] = bottom
        if block.has_key('name'):
            sigmoid_layer['name'] = '%s-act' % block['name']
        else:
            sigmoid_layer['name'] = 'layer%d-act' % layer_id
        sigmoid_layer['type'] = 'Sigmoid'
        layers.append(sigmoid_layer)
    # Added here: the swish layer. This note must be a # comment:
    # a string literal between the if body and the elif is a syntax error.
    elif block['activation'] == 'swish':
        swish_layer = OrderedDict()
        swish_layer['bottom'] = bottom
        swish_layer['top'] = bottom
        if block.has_key('name'):
            swish_layer['name'] = '%s-swish' % block['name']
        else:
            swish_layer['name'] = 'layer%d-swish' % layer_id
        swish_layer['type'] = 'Swish'
        layers.append(swish_layer)
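For reference, swish is just the input multiplied by its own sigmoid. Below is a minimal sketch in plain Python, assuming the fixed form with beta = 1; the function names are mine and do not come from darknet or Caffe.

```python
import math

def sigmoid(x):
    # Logistic function, written to stay numerically stable for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def swish(x):
    # Swish with beta = 1: the input scaled by its own sigmoid.
    return x * sigmoid(x)

if __name__ == "__main__":
    for v in (-2.0, 0.0, 2.0):
        print(v, swish(v))
```

A quick way to sanity-check a custom swish_layer is to compare a few of its outputs against values computed this way.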
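B. The shortcut operation
As mentioned before, a shortcut is equivalent to an Eltwise operation. The catch here is that the EfficientNet-B0 structure adds features whose dimensions differ: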
 layer            filters  size/strd(dil)   input                 output
  90 conv           576    1 x 1/ 1         60 x 34 x 112   ->    60 x 34 x 576   0.263 BF
 ....
 139 upsample              2x               30 x 17 x 128   ->    60 x 34 x 128
 140 Shortcut Layer: 90, wt = 0, wn = 0, outputs: 60 x 34 x 128   0.000 BF
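For example, layer 140 above adds the features of layer 90 and layer 139, but the two have dimensions 60*34*576 and 60*34*128, and the output is 60*34*128. I checked the darknet source: it uses the output dimensions to decide which features to drop, so this layer performs an eltwise sum of the first 128 channels of layer 90 (60*34*128) with the 60*34*128 features of layer 139.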
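The truncating sum described above can be sketched in a few lines of Python. This is my reading of the behaviour, not darknet's code: tensors are modelled as lists of channels, and the case where the skip input has fewer channels than the previous layer is my own assumption.

```python
def shortcut_add(prev, skip):
    """Add `skip` onto `prev` channel by channel, over the shared channels only.

    Both arguments are lists of channels; each channel is a flat list of
    floats with the same spatial size. The result keeps the channel count
    of `prev`.
    """
    shared = min(len(prev), len(skip))
    out = [list(channel) for channel in prev]  # copy so the inputs stay untouched
    for c in range(shared):
        out[c] = [a + b for a, b in zip(prev[c], skip[c])]
    return out

if __name__ == "__main__":
    prev = [[1.0, 1.0], [2.0, 2.0]]                      # 2 channels, stands in for 128
    skip = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]    # 3 channels, stands in for 576
    print(shortcut_add(prev, skip))
```

The extra channels of the wider input never contribute, which is exactly what has to be reproduced on the Caffe side.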
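Caffe has no such layer and I did not want to write one, so instead a Slice layer splits the output of layer 90 along the channel axis into two parts: channels 0 to 127 (60*34*128) and channels 128 to 575 (60*34*448). [Note: only the first part of the slice is used and the second part is discarded. That affects how the Caffe model is used later, so detectnet.cpp in caffe-yolov3 has to be modified.]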
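To make the channel arithmetic concrete, here is a small sketch that computes the shapes a Slice layer produces for a given slice point. The helper name is hypothetical, and the blob shape is written in Caffe's N x C x H x W order, so layer 90's 60 x 34 x 576 output becomes (1, 576, 34, 60).

```python
def slice_shapes(shape, axis, slice_points):
    """Return the shapes of the tops produced by slicing `shape` along `axis`.

    A slice point p starts a new top at index p, so the tops cover
    [0, p1), [p1, p2), ..., [pk, size).
    """
    size = shape[axis]
    bounds = [0] + list(slice_points) + [size]
    if any(lo >= hi for lo, hi in zip(bounds, bounds[1:])):
        raise ValueError("slice points must be strictly increasing and inside the axis")
    tops = []
    for lo, hi in zip(bounds, bounds[1:]):
        top = list(shape)
        top[axis] = hi - lo
        tops.append(tuple(top))
    return tops

if __name__ == "__main__":
    # Layer 90 output, sliced on the channel axis at 128.
    print(slice_shapes((1, 576, 34, 60), axis=1, slice_points=[128]))
```

With a slice point of 128 the first top holds 128 channels and the second holds the remaining 448.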
The final modification to darknet2caffe.py:
elif block['type'] == 'shortcut':
    if int(block['from']) > 0:
        # How this works:
        # A darknet shortcut has a `from` parameter naming the layer whose
        # features it takes in addition to the previous layer's.
        # For layer 140 above, from is 90 [-50 could be used instead of 90].
        # Why slice whenever from > 0? It is a coincidence of effi_B0.cfg:
        # shortcuts whose two inputs have the same dimensions use a negative
        # from, and shortcuts whose inputs differ use a positive from.
        # That regularity is used here to tell the two cases apart;
        # if the cfg structure changes, this has to be revisited.

        # Add the Slice layer
        prev_layer_id1 = int(block['from']) + 1
        slice_layer = OrderedDict()
        slice_layer['bottom'] = topnames[prev_layer_id1]
        slice_layer['name'] = 'layer%d-slice' % layer_id
        top1 = slice_layer['name'] + '_1'
        top2 = slice_layer['name'] + '_2'
        slice_layer['top'] = [top1, top2]
        #slice_layer['top']=top1
        slice_layer['type'] = 'Slice'
        slice_param = OrderedDict()
        slice_param['axis'] = '1'           # channel axis
        slice_param['slice_point'] = '128'  # hard-coded: keep the first 128 channels
        slice_layer['slice_param'] = slice_param
        layers.append(slice_layer)
        bottom1 = top1
    else:
        prev_layer_id1 = layer_id + int(block['from'])
        bottom1 = topnames[prev_layer_id1]
    # The rest is the usual Eltwise handling, unchanged
    prev_layer_id2 = layer_id - 1
    #print('^^^^^^^^^^^^^^^^^^^^^^^^^^^')
    #print('topnames:',topnames)
    #print(layer_id,prev_layer_id1,prev_layer_id2)
    bottom2 = topnames[prev_layer_id2]
    shortcut_layer = OrderedDict()
    shortcut_layer['bottom'] = [bottom1, bottom2]
    if block.has_key('name'):
        shortcut_layer['top'] = block['name']
        shortcut_layer['name'] = block['name']
    else:
        shortcut_layer['top'] = 'layer%d-shortcut' % layer_id
        shortcut_layer['name'] = 'layer%d-shortcut' % layer_id
    shortcut_layer['type'] = 'Eltwise'
    eltwise_param = OrderedDict()
    eltwise_param['operation'] = 'SUM'
    shortcut_layer['eltwise_param'] = eltwise_param
    layers.append(shortcut_layer)
    bottom = shortcut_layer['top']
    if block['activation'] != 'linear':
        relu_layer = OrderedDict()
        relu_layer['bottom'] = bottom
        relu_layer['top'] = bottom
        if block.has_key('name'):
            relu_layer['name'] = '%s-act' % block['name']
        else:
            relu_layer['name'] = 'layer%d-act' % layer_id
        relu_layer['type'] = 'ReLU'
        if block['activation'] == 'leaky':
            relu_param = OrderedDict()
            relu_param['negative_slope'] = '0.1'
            relu_layer['relu_param'] = relu_param
        layers.append(relu_layer)
    topnames[layer_id] = bottom
    layer_id = layer_id + 1
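The index arithmetic in that branch is easy to get wrong, so here it is isolated as a sketch. It assumes, as the `+ 1` in the code suggests, that the converter's layer_id counts from 1 while darknet's layer numbers count from 0. Both helper names are hypothetical, and the second one shows a possible alternative to relying on the sign of `from`: it would require the converter to track channel counts per layer, which the script in this post does not do.

```python
def resolve_from(layer_id, from_value):
    """Map a darknet `from` value to a 1-based converter layer id."""
    if from_value > 0:
        # Absolute darknet index (0-based) -> converter index (1-based).
        return from_value + 1
    # Relative offset, counted back from the current layer.
    return layer_id + from_value

def slice_point_for(prev_channels, from_channels):
    """Return the slice point needed before the Eltwise sum, or None."""
    if from_channels == prev_channels:
        return None              # shapes already match, plain Eltwise works
    if from_channels > prev_channels:
        return prev_channels     # keep only the first prev_channels channels
    raise NotImplementedError("skip input narrower than the previous layer")

if __name__ == "__main__":
    # Darknet layer 140 is converter layer 141; from=90 and from=-50 agree.
    print(resolve_from(141, 90), resolve_from(141, -50))
    print(slice_point_for(128, 576))
```

For layer 140 both spellings of `from` resolve to converter layer 91, and the slice point comes out as the 128 that is hard-coded above.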
C. Adding the dropout layer
Nothing much to say here: it just adds a Dropout layer.
elif block['type'] == 'dropout':
    dropout_layer = OrderedDict()
    dropout_layer['bottom'] = bottom
    if block.has_key('name'):
        dropout_layer['top'] = block['name']
        dropout_layer['name'] = block['name']
    else:
        dropout_layer['top'] = 'layer%d-dropout' % layer_id
        dropout_layer['name'] = 'layer%d-dropout' % layer_id
    dropout_layer['type'] = 'Dropout'
    dropout_param = OrderedDict()
    # darknet's `probability` is passed straight through as the drop ratio
    dropout_param['dropout_ratio'] = block['probability']
    dropout_layer['dropout_param'] = dropout_param
    layers.append(dropout_layer)
    bottom = dropout_layer['top']
    topnames[layer_id] = bottom
    layer_id = layer_id+1
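To see what such an OrderedDict turns into, here is a rough sketch of a prototxt renderer. It is for illustration only and is not the writer the conversion tool actually uses; the layer names and the 0.2 ratio in the example are made up.

```python
from collections import OrderedDict

def render_layer(layer, indent=0):
    """Render a nested dict as prototxt-style text (illustrative only)."""
    pad = "  " * indent
    lines = []
    for key, value in layer.items():
        if isinstance(value, dict):
            lines.append("%s%s {" % (pad, key))
            lines.append(render_layer(value, indent + 1))
            lines.append("%s}" % pad)
        elif isinstance(value, list):
            # Repeated fields such as several bottoms or tops.
            for item in value:
                lines.append('%s%s: "%s"' % (pad, key, item))
        elif key in ("bottom", "top", "name", "type"):
            lines.append('%s%s: "%s"' % (pad, key, value))  # string fields are quoted
        else:
            lines.append("%s%s: %s" % (pad, key, value))    # numbers and enums are not
    return "\n".join(lines)

if __name__ == "__main__":
    dropout = OrderedDict()
    dropout["bottom"] = "layer10-conv"       # hypothetical layer names
    dropout["top"] = "layer11-dropout"
    dropout["name"] = "layer11-dropout"
    dropout["type"] = "Dropout"
    dropout["dropout_param"] = OrderedDict([("dropout_ratio", "0.2")])
    print("layer {\n" + render_layer(dropout, 1) + "\n}")
```

Printing a layer like this before running the full conversion is a cheap way to confirm that a new branch fills in the fields you expect.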
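2. Modifying the caffe-yolo files
First be clear about what the output layers of the network defined by the prototxt are. Are they only the two convolution layers that feed the yolo layers? No.
Besides those two convolution layers, there are the leftover parts from the Slice layers added for the shortcut. Only the first part of each slice is used; the second part has no consumer, so it is also emitted as a network output.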
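Why the unused slice halves show up as outputs can be sketched as follows: any blob that is produced but never consumed by a later layer is left over as a network output. The function below is my own approximation of that bookkeeping, not Caffe's code, and the sorted return order reflects my understanding that Caffe lists outputs by blob name; treat that ordering as an assumption and check it against your own log. The layer and blob names are made up.

```python
def find_net_outputs(layers):
    """Return the blobs that are produced but never consumed afterwards."""
    def as_list(value):
        return value if isinstance(value, list) else [value]

    available = []  # blobs produced so far that nothing has consumed yet
    for layer in layers:
        for bottom in as_list(layer.get("bottom", [])):
            if bottom in available:
                available.remove(bottom)
        for top in as_list(layer.get("top", [])):
            if top not in available:
                available.append(top)   # in-place layers re-offer their blob
    return sorted(available)

if __name__ == "__main__":
    layers = [
        {"name": "conv1", "bottom": "data", "top": "c1"},
        {"name": "bn1", "bottom": "c1", "top": "c1"},          # in-place
        {"name": "slice", "bottom": "c1", "top": ["s_1", "s_2"]},
        {"name": "conv2", "bottom": "s_1", "top": "c2"},
    ]
    print(find_net_outputs(layers))
```

In the toy network the second slice top is never used, so it is reported next to the real output of the last convolution.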
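Lines 98-101 of detectnet.cpp contain the following:
net->Forward();
for (int i = 0; i < net->num_outputs(); ++i) {
    blobs.push_back(net->output_blobs()[i]);
    // This passes every output blob to blobs, which then goes to
    // get_detections to produce the boxes and so on.
    // As analysed above there are 4 outputs in total: the two convs that
    // feed the yolo layers, plus the unused parts of the Slice layers
    // [the effi_B0 structure has two such Slice outputs],
    // so net->num_outputs() = 4.
    // get_detections only wants the outputs of the layers connected to
    // yolo and nothing extra, so the needed blobs must be picked out here.
}
dets = get_detections(blobs, im.w, im.h,
    net_input_data_blobs->width(), net_input_data_blobs->height(), &nboxes);

// Modified as follows; adapt it to your own case, there is surely a better way.
for (int i = 0; i < net->num_outputs(); ++i) {
    if (i == 0 || i == 3) {  // keep only net->output_blobs()[0] and net->output_blobs()[3]
        blobs.push_back(net->output_blobs()[i]);
        // Logs the name of each blob stored in blobs so you can confirm
        // the choice is right; this line can be commented out.
        LOG(INFO) << net->blob_names()[net->output_blob_indices()[i]];
    }
}
dets = get_detections(blobs, im.w, im.h,
    net_input_data_blobs->width(), net_input_data_blobs->height(), &nboxes);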
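One possibly sturdier alternative to hard-coding indices 0 and 3 is to filter by blob name, since the converter above names the discarded tops 'layer%d-slice_2'. The sketch below shows the idea in Python for brevity; the helper is hypothetical and the output names in the example are invented, so check the real ones in your log.

```python
def select_yolo_outputs(output_names, unused_suffix="-slice_2"):
    """Return indices of output blobs that are not discarded Slice parts."""
    return [i for i, name in enumerate(output_names)
            if not name.endswith(unused_suffix)]

if __name__ == "__main__":
    # Invented names in the converter's naming style.
    names = ["layer115-conv", "layer117-slice_2",
             "layer141-slice_2", "layer153-conv"]
    print(select_yolo_outputs(names))
```

In detectnet.cpp the same test could be applied to the name that the LOG(INFO) line already prints, which would keep working if the cfg changes the number or order of outputs.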
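3. The full procedure
Training is the same as always, so I will skip it and assume you know how; I used the official effi_B0.cfg file. [If not, see https://blog.csdn.net/weixin_38715903/article/details/103695844.] Download the cfg file and train following the usual darknet steps.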
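A. darknet to caffe
git clone https://github.com/marvis/pytorch-caffe-darknet-convert
Then download the modified darknet2caffe.py and replace the original darknet2caffe.py with it:
[https://github.com/hualuluu/efficientNetB0-yolo]
Note: the downloaded darknet2caffe.py contains some paths that must be changed to your own, such as your Caffe path [some Caffe header files are needed].
Then run:
sudo python2.7 darknet2caffe.py cfg/ghostnet-yolo.cfg ghostnet-yolo.weights ghostnet-yolo.prototxt ghostnet-yolo.caffemodel
The file names in this command are carried over from the ghost-yolo post; substitute your own cfg, weights, and output names.
This step produces the prototxt and the caffemodel.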
B. caffe-yolo
git clone https://github.com/ChenYingpeng/caffe-yolov3
cd caffe-yolov3
Modify detectnet.cpp:
Lines 98-101 contain the following:
net->Forward();
for (int i = 0; i < net->num_outputs(); ++i) {
    blobs.push_back(net->output_blobs()[i]);
}
dets = get_detections(blobs, im.w, im.h,
    net_input_data_blobs->width(), net_input_data_blobs->height(), &nboxes);

// Modified as follows; adapt it to your own case, there is surely a better way.
for (int i = 0; i < net->num_outputs(); ++i) {
    if (i == 0 || i == 3) {  // keep only net->output_blobs()[0] and net->output_blobs()[3]
        blobs.push_back(net->output_blobs()[i]);
        LOG(INFO) << net->blob_names()[net->output_blob_indices()[i]];
    }
}
dets = get_detections(blobs, im.w, im.h,
    net_input_data_blobs->width(), net_input_data_blobs->height(), &nboxes);
Put the generated caffemodel and prototxt in the ./caffemodel and ./prototxt directories [create them if they do not exist].
Modify CMakeLists.txt:
# Change every path below to your own Caffe installation
# build C/C++ interface
include_directories(${PROJECT_INCLUDE_DIR} ${GIE_PATH}/include)
include_directories(${PROJECT_INCLUDE_DIR}
    /home/ubuntu247/liliang/caffe-ssd/include
    /home/ubuntu247/liliang/caffe-ssd/build/include
)
file(GLOB inferenceSources *.cpp *.cu )
file(GLOB inferenceIncludes *.h )
cuda_add_library(yolov3-plugin SHARED ${inferenceSources})
target_link_libraries(yolov3-plugin
    /home/ubuntu247/liliang/caffe-ssd/build/lib/libcaffe.so
    /usr/lib/x86_64-linux-gnu/libglog.so
    /usr/lib/x86_64-linux-gnu/libgflags.so.2
    /usr/lib/x86_64-linux-gnu/libboost_system.so
    /usr/lib/x86_64-linux-gnu/libGLEW.so.1.13
)
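If you trained with your own anchors, change the anchor values (the biases array in yolo.cpp, which appears as yolo_layer.cpp in the listing below) and the number of classes (in yolo.h, i.e. yolo_layer.h) before building: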
/*
* Company: Synthesis
* Author: Chen
* Date: 2018/06/04
*/
#include "yolo_layer.h"
#include "blas.h"
#include "cuda.h"
#include "activations.h"
#include "box.h"
#include <stdio.h>
#include <math.h>
//yolov3
//float biases[18] = {10,13,16,30,33,23,30,61,62,45,59,119,116,90,156,198,373,326};
float biases[18] = {7, 15, 16, 18, 22, 32, 9, 40, 20, 71, 37, 39, 52, 65, 70, 110, 105, 208};
/*
* Company: Synthesis
* Author: Chen
* Date: 2018/06/04
*/
#ifndef __YOLO_LAYER_H_
#define __YOLO_LAYER_H_
#include <caffe/caffe.hpp>
#include <string>
#include <vector>
using namespace caffe;
const int classes = 3;
const float thresh = 0.5;
const float hier_thresh = 0.5;
const float nms_thresh = 0.5;
const int num_bboxes = 3;
const int relative = 1;
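Copying anchors by hand from the cfg into the biases array is error-prone, so a small helper can do it. This is only a sketch: the function name is mine, and it assumes the cfg line has the usual darknet form `anchors = w1,h1, w2,h2, ...`.

```python
def anchors_to_biases(anchors_line):
    """Turn a darknet `anchors = ...` line into a C array declaration."""
    values = anchors_line.split("=", 1)[-1]   # also accepts a bare value list
    numbers = [v.strip() for v in values.split(",") if v.strip()]
    if len(numbers) % 2 != 0:
        raise ValueError("anchors must come in (width, height) pairs")
    return "float biases[%d] = {%s};" % (len(numbers), ", ".join(numbers))

if __name__ == "__main__":
    line = ("anchors = 7,15, 16,18, 22,32, 9,40, 20,71, "
            "37,39, 52,65, 70,110, 105,208")
    print(anchors_to_biases(line))
```

For the nine anchor pairs used in this post it emits an 18-element declaration that can be pasted over the biases line shown above.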
Build:
mkdir build
cd build
cmake ..
make -j12
Run:
./x86_64/bin/detectnet ../prototxt/effi-yolo.prototxt ../caffemodel/effi-yolo.caffemodel ../images/bicycle.jpg
I do not think I have left anything out; if something comes to mind I will add it. That is all, cheers~