Task 2: Caffe-SSD Face Detection

Original post

2019-07-05 17:43

0 Introduction

Steps for using Caffe:
(1) Convert data (run a script)
(2) Define net (edit prototxt): network structure
(3) Define solver (edit prototxt): hyperparameters
(4) Train (with pretrained weights) (run a script)

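1. Packaging the data: VOC format, wrapped in LMDB

1.1 Build a VOC-format dataset: images and XML annotations

        A VOC dataset contains three folders: Annotations, ImageSets, and JPEGImages. Annotations holds the annotation data in XML format. ImageSets/Main holds the file-name lists train.txt, val.txt, and test.txt. JPEGImages holds the images.
        The images only need to be copied into place. The annotations have to be parsed from the ground-truth file and converted to XML. The face-detection code calls: writexml(filename,saveimg,bboxes,xmlpath).
        The XML file uses a node/child-node (tree) data structure.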
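As a rough illustration of the layout described above, the sketch below creates the three folders and empty split lists. The function name `make_voc_skeleton` and the choice of a temporary root directory are assumptions made for this example, not part of caffe-ssd.

```python
import os
import tempfile


def make_voc_skeleton(root):
    """Create the VOC folders (Annotations, ImageSets/Main, JPEGImages) under root."""
    subdirs = ["Annotations", os.path.join("ImageSets", "Main"), "JPEGImages"]
    for sub in subdirs:
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    # Create empty split lists; each will later hold one image ID per line.
    for split in ("train", "val", "test"):
        list_path = os.path.join(root, "ImageSets", "Main", split + ".txt")
        open(list_path, "a").close()
    return sorted(os.listdir(root))


if __name__ == "__main__":
    demo_root = tempfile.mkdtemp()  # hypothetical location, for demonstration only
    print(make_voc_skeleton(demo_root))
```

After this, copying the images into JPEGImages and writing one XML file per image into Annotations completes the dataset.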

1.1.1 write_xml

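	from xml.dom.minidom import Document 
	
	(1)doc = Document()  # create the document object

	(2)doc.createElement()  # create an element from the document object

	(3)doc.createTextNode()  # create a text node from the document object

	(4)node.appendChild()  # append a child (element or text) under node
	
	(5)node.setAttribute('A','B') # set attribute A = "B" on the node

	(6)f.write(doc.toprettyxml())  # write the XML to a file

Whether you are writing a node or a piece of text, always follow the same order: create the object first, then append it to its parent.

For details, see: https://www.cnblogs.com/wcwnina/p/7222180.html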
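The calls listed above can be combined into a small VOC-style writer. This is a minimal sketch, not the author's writexml: the names `add_text_child` and `build_voc_xml`, the argument list, and the default class name "face" are assumptions for illustration. It returns the XML as a string instead of writing a file.

```python
from xml.dom.minidom import Document


def add_text_child(doc, parent, tag, text):
    """Create <tag>text</tag> and append it under parent."""
    node = doc.createElement(tag)                    # create the element first
    node.appendChild(doc.createTextNode(str(text)))  # then attach its text
    parent.appendChild(node)                         # then attach it to the tree
    return node


def build_voc_xml(filename, width, height, depth, bboxes, class_name="face"):
    """Return a VOC-style annotation string for one image (hypothetical helper)."""
    doc = Document()
    annotation = doc.createElement("annotation")
    doc.appendChild(annotation)
    add_text_child(doc, annotation, "filename", filename)

    size = doc.createElement("size")
    annotation.appendChild(size)
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        add_text_child(doc, size, tag, value)

    # One <object> per bounding box, given as (xmin, ymin, xmax, ymax) in pixels.
    for xmin, ymin, xmax, ymax in bboxes:
        obj = doc.createElement("object")
        annotation.appendChild(obj)
        add_text_child(doc, obj, "name", class_name)
        add_text_child(doc, obj, "difficult", 0)
        bndbox = doc.createElement("bndbox")
        obj.appendChild(bndbox)
        for tag, value in (("xmin", xmin), ("ymin", ymin),
                           ("xmax", xmax), ("ymax", ymax)):
            add_text_child(doc, bndbox, tag, value)

    return doc.toprettyxml(indent="  ")


if __name__ == "__main__":
    print(build_voc_xml("example.jpg", 1024, 768, 3, [(10, 20, 110, 220)]))
```

To save the result, write the returned string to a file under Annotations, as in step (6).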

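1.1.2 Parse the original annotations and generate the VOC format

Parse the annotation file that ships with the dataset to obtain the arguments that the write_xml function needs, then package them in VOC format.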
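The parsing step might look like the sketch below. It assumes the dataset is WIDER FACE and that the ground-truth text file uses the layout "image path, face count, then one `x y w h ...` row per face"; the post does not show the file, so treat both as assumptions. The function name `parse_wider_gt` is hypothetical.

```python
def parse_wider_gt(lines):
    """Return a list of (image_path, [(xmin, ymin, xmax, ymax), ...]) records."""
    lines = [ln.strip() for ln in lines if ln.strip()]
    records = []
    i = 0
    while i < len(lines):
        image_path = lines[i]
        num_faces = int(lines[i + 1])
        i += 2
        # Assumption: an image with 0 faces may still be followed by one
        # all-zero placeholder row, which is skipped here.
        if num_faces == 0 and i < len(lines) and not lines[i].endswith(".jpg"):
            i += 1
        bboxes = []
        for row in lines[i:i + num_faces]:
            x, y, w, h = [int(v) for v in row.split()[:4]]
            if w > 0 and h > 0:  # drop degenerate boxes
                bboxes.append((x, y, x + w, y + h))  # convert x, y, w, h to corners
        i += num_faces
        records.append((image_path, bboxes))
    return records


if __name__ == "__main__":
    sample = [
        "0--Parade/a.jpg",
        "1",
        "10 20 30 40 0 0 0 0 0 0",
    ]
    print(parse_wider_gt(sample))
```

Each record supplies the file name and the box list; the image itself still has to be opened to obtain its width, height, and depth for the XML.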

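1.2 Edit the paths in the data-packaging scripts and run them to obtain the LMDB

Under caffe-ssd/data/widerface:

(1) Edit the labelmap. VOC originally has 21 classes (20 object classes plus background); the face task has 2 classes, face and background.
(2) Edit create_list.sh and run it: point dataset_file at trainval.txt and test.txt under ImageSets/Main. The script shuffles and copies these two list files, and generates train.txt, test.txt, and test_name_size.txt in the widerface directory.
(3) Edit create_data.sh and run it: point mapfile at caffe-ssd/data/widerface/labelmap_voc.prototxt. This shell script runs caffe-ssd/scripts/create_annoset.py, which creates an lmdb folder, holding the packed data, inside the widerface folder that stores the original VOC-format data.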
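To make steps (1) and (2) concrete, here is a small sketch that generates a two-class labelmap and the "image annotation" pair lines that the list files contain. The helper names `make_labelmap` and `make_list_lines` are hypothetical, and the relative paths in the list lines are an assumed layout; the real create_list.sh builds its paths from the dataset directory.

```python
import random


def make_labelmap(class_names):
    """Return labelmap text: label 0 is background, then one item per class."""
    items = [("none_of_the_above", 0, "background")]
    items += [(name, idx, name) for idx, name in enumerate(class_names, start=1)]
    blocks = []
    for name, label, display in items:
        blocks.append(
            'item {\n  name: "%s"\n  label: %d\n  display_name: "%s"\n}\n'
            % (name, label, display)
        )
    return "".join(blocks)


def make_list_lines(image_ids, shuffle=False, seed=0):
    """Pair each image with its annotation file, one pair per line."""
    ids = list(image_ids)
    if shuffle:
        random.Random(seed).shuffle(ids)  # fixed seed keeps the order reproducible
    return ["JPEGImages/%s.jpg Annotations/%s.xml" % (i, i) for i in ids]


if __name__ == "__main__":
    print(make_labelmap(["face"]))
    print("\n".join(make_list_lines(["img_0001", "img_0002"], shuffle=True)))
```

Shuffling only the training list and leaving the test list in order keeps evaluation output aligned with test_name_size.txt.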

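2. Reading and optimizing the Caffe source

2.1 Backbone network: python/caffe/model_libs.py

This file provides several functions that define backbone network structures.

2.2 Training configuration: examples/ssd/ssd_pascal.py

After the "# Create train/test net" comment, choose the backbone by calling the corresponding function from model_libs.py.

2.3 Model optimization: speed-up

(1) Shrink the model: edit the network definition in model_libs.py to reduce the output size of each layer (for example, num_output of the convolution layers).
(2) Remove some convolution layers.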
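3. Testing caffe-ssd

(1) Testing the network requires two files: deploy.prototxt and face.caffemodel

(2) Instantiate the Caffe model:

import caffe
import cv2
import numpy as np

model_def = 'deploy.prototxt'      # network definition; adjust the path as needed
model_weight = 'face.caffemodel'   # trained weights; adjust the path as needed
net = caffe.Net(model_def,model_weight,caffe.TEST)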

(3) Prepare the input data layer:

image_data = caffe.io.load_image(img_path)  # RGB, float, values in [0, 1]

transformer = caffe.io.Transformer({'data':net.blobs['data'].data.shape})
transformer.set_transpose('data',(2,0,1))  # HxWxC -> CxHxW
transformer.set_channel_swap('data',(2,1,0))  # RGB -> BGR
transformer.set_mean('data',np.array([128,128,128]))  # per-channel mean
transformer.set_raw_scale('data',255)  # [0, 1] -> [0, 255]

transformer_image = transformer.preprocess('data',image_data)

net.blobs['data'].reshape(1,3,300,300)
net.blobs['data'].data[...] = transformer_image

(4) Parse and display the results:

detect_out = net.forward()['detection_out']  # shape (1, 1, N, 7)
# print(detect_out)

# Each row: [image_id, label, confidence, xmin, ymin, xmax, ymax]
det_label = detect_out[0,0,:,1]
det_conf = detect_out[0,0,:,2]
det_xmin = detect_out[0,0,:,3]
det_ymin = detect_out[0,0,:,4]
det_xmax = detect_out[0,0,:,5]
det_ymax = detect_out[0,0,:,6]

# conf >= 0 keeps every detection; raise the threshold to drop weak ones
top_indices = [i for i,conf in enumerate(det_conf) if conf >= 0]

top_conf = det_conf[top_indices]
top_xmin = det_xmin[top_indices]
top_ymin = det_ymin[top_indices]
top_xmax = det_xmax[top_indices]
top_ymax = det_ymax[top_indices]

[height,width,_] = image_data.shape

# caffe.io.load_image returns RGB; convert to BGR for OpenCV display
show_img = cv2.cvtColor(image_data, cv2.COLOR_RGB2BGR)

# Box coordinates are normalized to [0, 1]; scale them to pixels
for i in range(min(5,top_conf.shape[0])):
    xmin = int(top_xmin[i] * width)
    ymin = int(top_ymin[i] * height)
    xmax = int(top_xmax[i] * width)
    ymax = int(top_ymax[i] * height)

    cv2.rectangle(show_img,(xmin,ymin),(xmax,ymax),(255,0,0),2)

cv2.imshow("face",show_img)
cv2.waitKey(0)
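The post-processing in step (4) can also be packaged as a small function that is easy to test without Caffe. This is a sketch under assumptions: the name `select_boxes`, the 0.5 confidence threshold, and the clamping to image bounds are choices made here, not taken from the code above. It accepts any sequence of rows in the [image_id, label, confidence, xmin, ymin, xmax, ymax] layout, such as detect_out[0, 0].

```python
def select_boxes(detections, width, height, conf_thresh=0.5, top_k=5):
    """Keep the top_k most confident detections and convert them to pixel boxes."""
    kept = [row for row in detections if row[2] >= conf_thresh]
    kept.sort(key=lambda row: row[2], reverse=True)  # highest confidence first
    boxes = []
    for row in kept[:top_k]:
        # Coordinates are normalized to [0, 1]; scale and clamp to the image.
        xmin = max(0, int(row[3] * width))
        ymin = max(0, int(row[4] * height))
        xmax = min(width, int(row[5] * width))
        ymax = min(height, int(row[6] * height))
        boxes.append((xmin, ymin, xmax, ymax, float(row[2])))
    return boxes


if __name__ == "__main__":
    fake = [
        [0, 1, 0.90, 0.25, 0.5, 0.75, 1.0],
        [0, 1, 0.30, 0.0, 0.0, 1.0, 1.0],
    ]
    print(select_boxes(fake, width=200, height=100))
```

Each returned tuple can be passed straight to cv2.rectangle as two corner points, with the confidence available for labeling.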
