Configuring and Running MobileNetSSD
GitHub resources:
Caffe for SSD: https://github.com/weiliu89/caffe/tree/ssd
MobileNet-SSD: https://github.com/chuanqi305/MobileNet-SSD
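If you want to train MobileNetSSD on your own dataset, you may also need to read this additional reference: http://www.cnblogs.com/EstherLjy/p/6863890.html. Skip any steps you have already completed.
Comparing the model structures of MobileNet and MobileNet-SSD shows that conv13 is the last layer of the backbone. Following the VGG-SSD design, the author appended 8 convolutional layers after MobileNet's conv13 and takes 6 layers in total for detection. The layer with 38*38 resolution does not appear to be used, possibly because it sits too early in the network.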
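The resolutions mentioned above can be checked with a little arithmetic. The sketch below assumes a 300*300 input and that every downsampling step is a 3x3 convolution with stride 2 and padding 1 (neither is stated in this post); the helper names are illustrative. Under those assumptions the 38*38 map does appear in the chain, and the six detection maps would be the ones from 19*19 down to 1*1.

```python
def conv_out(size, kernel=3, stride=2, pad=1):
    """Convolution output size: floor((in + 2*pad - kernel) / stride) + 1."""
    return (size + 2 * pad - kernel) // stride + 1


def stride2_sizes(input_size=300, steps=9):
    """Feature-map side lengths after each successive stride-2 convolution."""
    sizes = []
    size = input_size
    for _ in range(steps):
        size = conv_out(size)
        sizes.append(size)
    return sizes


sizes = stride2_sizes()
print(sizes)  # [150, 75, 38, 19, 10, 5, 3, 2, 1]

# Assumption: detection uses the last six maps, skipping the 38x38 one.
detection_sizes = sizes[3:]
print(detection_sizes)  # [19, 10, 5, 3, 2, 1]
```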
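The MobileNet-SSD folder contains the following important files:
- template: holds the shared templates for the 4 network definitions; they can be modified and generated by the gen.py script
- MobileNetSSD_deploy.prototxt: network definition used for deployment (inference)
- solver_train.prototxt: hyperparameter definitions for training
- solver_test.prototxt: hyperparameter definitions for testing
- train.sh: training script
- test.sh: testing script
- gen_model.sh: script that generates a custom network (uses the contents of the template folder)
- gen.py: script that generates the shared templates (not needed for now)
- demo.py: detection demo script (images are stored in the images folder)
- merge_bn.py: script that merges the BN layers, used to produce the final caffemodel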
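Before starting, it can help to confirm that the checkout really contains the files listed above. This is only a small sketch: the names come from the list, while `missing_entries` and the `MobileNet-SSD` directory name are hypothetical and should be adapted to your own clone.

```python
import os

# Names taken from the list above; the actual repository layout may differ.
EXPECTED = [
    "template",
    "MobileNetSSD_deploy.prototxt",
    "solver_train.prototxt",
    "solver_test.prototxt",
    "train.sh",
    "test.sh",
    "gen_model.sh",
    "gen.py",
    "demo.py",
    "merge_bn.py",
]


def missing_entries(repo_dir, expected=EXPECTED):
    """Return the expected names that do not exist under repo_dir."""
    return [name for name in expected
            if not os.path.exists(os.path.join(repo_dir, name))]


if __name__ == "__main__":
    # "MobileNet-SSD" is a placeholder for wherever the repository was cloned.
    for name in missing_entries("MobileNet-SSD"):
        print("missing:", name)
```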
Steps:
1 Convert your own dataset to an lmdb database.
2 Create the labelmap.prototxt file and put it into the current directory. This is the file generated when SSD was set up; mine is named labelmap_voc_my_test.prototxt and is located in /home/caffe/data/my_test/.
3 Use gen_model.sh to generate your own training prototxt.
No changes are needed for now.
4 Download the training weights from the link above and run train.sh. After about 30000 iterations, the loss should be 1.5 - 2.5.
5 Run test.sh to evaluate the result.
6 Run merge_bn.py to generate your own deploy caffemodel.
7 Test the trained model.
Modify ./demo.py so that it reads from a USB camera and shows the detected objects in real time:
import os
import sys
import time

import cv2
import numpy as np

# Path to the Caffe-SSD checkout; adjust it to your own installation
caffe_root = '/home/l/caffe-ssd/caffe/'
sys.path.insert(0, caffe_root + 'python')
import caffe

net_file = 'MobileNetSSD_deploy.prototxt'
caffe_model = 'MobileNetSSD_deploy.caffemodel'
test_dir = "images"

if not os.path.exists(caffe_model):
    print("MobileNetSSD_deploy.caffemodel does not exist,")
    print("use merge_bn.py to generate it.")
    sys.exit()

# Run inference on GPU 0
caffe.set_mode_gpu()
caffe.set_device(0)
net = caffe.Net(net_file, caffe_model, caffe.TEST)

CLASSES = ('background',
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat', 'chair',
           'cow', 'diningtable', 'dog', 'horse',
           'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor')

def preprocess(src):
    # Resize to the 300x300 network input and scale pixels to roughly [-1, 1]
    img = cv2.resize(src, (300, 300))
    img = img - 127.5
    img = img * 0.007843
    return img

def postprocess(img, out):
    # Columns 3:7 hold normalized box corners; scale them to pixel coordinates
    h = img.shape[0]
    w = img.shape[1]
    box = out['detection_out'][0, 0, :, 3:7] * np.array([w, h, w, h])
    cls = out['detection_out'][0, 0, :, 1]
    conf = out['detection_out'][0, 0, :, 2]
    return (box.astype(np.int32), conf, cls)

def detect(frame):
    # Run one forward pass on a BGR frame and draw the detections on it
    img = preprocess(frame)
    img = img.astype(np.float32)
    img = img.transpose((2, 0, 1))  # HWC -> CHW, the layout Caffe expects
    net.blobs['data'].data[...] = img

    # Time the forward pass only
    start = time.time()
    out = net.forward()
    use_time = time.time() - start
    print("time=" + str(use_time) + "s")
    if use_time > 0:
        print("FPS=" + str(1 / use_time))

    box, conf, cls = postprocess(frame, out)
    for i in range(len(box)):
        if cls[i] < 0:
            # Caffe SSD may emit a placeholder row with label -1 when nothing is detected
            continue
        p1 = (int(box[i][0]), int(box[i][1]))
        p2 = (int(box[i][2]), int(box[i][3]))
        cv2.rectangle(frame, p1, p2, (0, 255, 0))
        p3 = (max(p1[0], 15), max(p1[1], 15))
        title = "%s:%.2f" % (CLASSES[int(cls[i])], conf[i])
        cv2.putText(frame, title, p3, cv2.FONT_ITALIC, 0.6, (0, 255, 0), 1)
    return frame

# Open the USB camera (device 0)
cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()
    if not ret:
        # No frame was returned (camera unplugged or busy)
        break
    detect(frame)
    cv2.imshow("SSD", frame)
    # Press 'q' to quit
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
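
The two numeric steps in the script, pixel normalization and box scaling, can be tried without Caffe or a camera. The following is a minimal pure-Python sketch, assuming each detection_out row has the layout [image_id, label, confidence, xmin, ymin, xmax, ymax] with corners normalized to [0, 1]; the script only confirms columns 1 to 6. The function names and the `min_conf` parameter are hypothetical, added for illustration.

```python
def normalize_pixel(value):
    """Map an 8-bit pixel value to roughly [-1, 1], mirroring preprocess()."""
    return (value - 127.5) * 0.007843


def decode_detection(row, width, height, min_conf=0.0):
    """Convert one 7-value detection row to (label, conf, (x1, y1, x2, y2)).

    Returns None for rows with a negative label or with a confidence
    below min_conf (a hypothetical filter, not part of the script above).
    """
    _image_id, label, conf, xmin, ymin, xmax, ymax = row
    if label < 0 or conf < min_conf:
        return None
    box = (int(xmin * width), int(ymin * height),
           int(xmax * width), int(ymax * height))
    return int(label), float(conf), box


# A made-up row for a 640x480 frame: label 15 with confidence 0.9
row = [0, 15, 0.9, 0.25, 0.5, 0.75, 1.0]
print(decode_detection(row, 640, 480))  # (15, 0.9, (160, 240, 480, 480))
print(normalize_pixel(0), normalize_pixel(255))
```

Raising `min_conf` is one simple way to hide weak detections on screen; the deploy prototxt may already apply its own threshold inside the detection_out layer.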