Detailed Steps for Configuring and Training MTCNN

The environment is Windows 7 64-bit. The task is face detection with MTCNN, i.e. drawing a detection box around each face in an image. The setup process is as follows:

1. Environment Setup

Install Anaconda

Go to the official site:
https://www.anaconda.com/download/
and download and install the Anaconda build that matches your Python version.

Install Microsoft Visual Studio 2013

Note: be sure to install the 2013 edition here; it makes the Caffe build later on much easier. The download address is:
https://msdn.itellyou.cn/

Configure OpenCV and OpenBLAS for the VS build environment

For the OpenCV setup, see:
https://blog.csdn.net/SherryD/article/details/51734334
For the OpenBLAS setup, see:
https://blog.csdn.net/yangyangyang20092010/article/details/45156881

Build Caffe under VS

Download the official Windows port of Caffe:
https://github.com/happynear/caffe-windows
then follow:
https://blog.csdn.net/xierhacker/article/details/51834563
to install it.
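Once the build finishes, a quick way to confirm that pycaffe works (assuming Caffe's python folder has been added to your PYTHONPATH) is:

python -c "import caffe; print 'pycaffe OK'"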

Install PyCharm and set up OpenCV in Anaconda's Python environment

1. For the PyCharm installation, see:
https://www.jianshu.com/p/042324342bf4

2. Set up OpenCV in Anaconda's Python environment
First download the OpenCV win pack from the official site: https://opencv.org/releases.html
and run the .exe directly. After installation, copy cv2.pyd from the OpenCV install path
E:\OpenCV2\opencv\build\python\2.7\x86 into the Anaconda install path D:\Anaconda\anaconda2\Lib\site-packages, then test it from a cmd prompt.
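A minimal check from cmd (the printed version depends on your OpenCV build):

python -c "import cv2; print cv2.__version__"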

3. With everything above configured, a common error you may hit in PyCharm is:

 ImportError: No module named google.protobuf.internal

To fix it, first clone protobuf-master from https://github.com/google/protobuf, then download protoc-3.5.1-win32.zip from https://github.com/google/protobuf/releases,
copy protoc.exe from protoc-3.5.1-win32\bin into the protobuf-master\src folder, and install it following:
http://sharley.iteye.com/blog/2375044
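Afterwards you can verify that the Python protobuf package is importable:

python -c "import google.protobuf; print google.protobuf.__version__"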

2. MTCNN Setup

There are many MTCNN implementations on GitHub; I will walk through them in order, from dataset preparation to the final test.
Training mainly follows: https://github.com/dlunion/mtcnn
Testing mainly follows: https://github.com/CongWeilin/mtcnn-caffe

Dataset Preparation

1. Put the collected images into one folder named samples (any other name works too, but it must match the dataset path used in the later steps).
2. Annotate the dataset. There are many annotation tools available online:
https://blog.csdn.net/chaipp0607/article/details/79036312
After annotation the tool produces a txt or xml document containing the coordinates of the top-left and bottom-right corners of each detection box.
3. From that document, extract the top-left and bottom-right corner coordinates and normalize them into the following form:
samples/filename.jpg xmin ymin xmax ymax
(that is: dataset folder/image name, x of the box's top-left corner, y of the top-left corner, x of the bottom-right corner, y of the bottom-right corner)
The annotation tool I used produces documents like this:

<?xml version='1.0' encoding='GB2312'?>
<info>
    <src width="480" height="640" depth="3">00ff0abc4818a309b51180264b830211.jpg</src>
    <object id="E68519DF-E8E1-4C55-9231-CB381DE1CC5A">
        <rect lefttopx="168" lefttopy="168" rightbottomx="313" rightbottomy="340"></rect>
        <type>21</type>
        <descriinfo></descriinfo>
        <modifydate>2018-05-08 17:04:07</modifydate>
    </object>
</info>

So I need to extract the box coordinates from this document and normalize them into the standard form above, producing a label.txt file.
For xml in the format above, the conversion script is:

# -*- coding:utf-8 -*-

import os
from lxml import etree

########## Read an xml file and return the box's top-left and bottom-right coordinates ##########
def read_xml(in_path):
    tree = etree.parse(in_path)
    return tree

def find_nodes(tree, path):
    return tree.findall(path)

def get_obj(xml_path):
    tree = read_xml(xml_path)
    nodes = find_nodes(tree, "src")
    objects = []

    for node in nodes:
        pic_struct = {}
        pic_struct['width'] = str(node.get('width'))
        pic_struct['height'] = str(node.get('height'))
        pic_struct['depth'] = str(node.get('depth'))
        # objects.append(pic_struct)
    nodes = find_nodes(tree, "object")

    for i in range(len(nodes)):
        # obj_struct = {}
        # obj_struct['name'] = str(find_nodes(nodes[i] , 'type')[0].text)
        cl_box = find_nodes(nodes[i], 'rect')
        for rec in cl_box:
            # note: if an image has several <object> nodes, only the last box is kept
            objects = [int(rec.get('lefttopx')), int(rec.get('lefttopy')),
                       int(rec.get('rightbottomx')), int(rec.get('rightbottomy'))]
    return objects

########## Normalize the xml info into the standard label form ##########
def listFile(data_dir, suffix):
    fs = os.listdir(data_dir)
    for i in range(len(fs)-1, -1, -1):
        # drop filenames that do not end with the given suffix (from the list only; no files are deleted)
        if not fs[i].endswith(suffix):
            del fs[i]
    return fs

def write_label(data_dir, xml_dir):
    images = listFile(data_dir, ".jpg")
    with open("label.txt", "w") as label:
        for i in range(len(images)):
            image_path = data_dir + "/" + images[i]
            # my annotation tool saves the xml content with a .txt extension; adjust if yours writes .xml
            xml_path = xml_dir + "/" + images[i][:-4] + ".txt"
            objects = get_obj(xml_path)
            line = image_path + " " + str(objects[0]) + " " + str(objects[1]) \
                   + " " + str(objects[2]) + " " + str(objects[3]) + "\n"
            label.write(line)

########## Main ##########
if __name__ == '__main__':
    data_dir = "E:/MTCNN/Train/samples"
    xml_dir = "E:/MTCNN/Train/samples/annotation"
    write_label(data_dir, xml_dir)

The finished label.txt looks like:

E:/MTCNN/Train/samples/0019c3f356ada6bcda0b695020e295e6.jpg 102 87 311 417
E:/MTCNN/Train/samples/0043e38f303b247e50b9a07cb5887b39.jpg 156 75 335 295
E:/MTCNN/Train/samples/004e26290d2290ca87e02b737a740aee.jpg 105 122 291 381
E:/MTCNN/Train/samples/00ff0abc4818a309b51180264b830211.jpg 168 168 313 340
E:/MTCNN/Train/samples/015a7137173f29e2cd4663c7cbcad1cb.jpg 127 60 332 398
E:/MTCNN/Train/samples/0166ceba53a4bfc4360e1d12b33ecb61.jpg 149 82 353 378
E:/MTCNN/Train/samples/01e6deccb55b377985d2c4d72006ee34.jpg 185 100 289 249
E:/MTCNN/Train/samples/021e34448c0ed051db501156cf2b6552.jpg 204 91 359 289
......
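Before generating training data, it is worth sanity-checking label.txt by drawing a few boxes back onto their images. A minimal sketch (press any key to step through the windows):

# -*- coding:utf-8 -*-
# Visual check: draw the first few boxes recorded in label.txt
import cv2

with open("label.txt") as f:
    lines = f.readlines()[:5]

for line in lines:
    parts = line.strip().split(' ')
    img = cv2.imread(parts[0])
    x1, y1, x2, y2 = [int(float(v)) for v in parts[1:5]]
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("check", img)
    cv2.waitKey(0)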

3. Generating MTCNN Training Data and Training

(1) Training P_Net

According to the MTCNN paper, the original dataset has to be split into four kinds of samples: Negative, Positive, Part faces, and Landmark faces. Since the task here is face detection only, three kinds suffice: Negative, Positive, and Part faces. In the code below, a random crop whose IoU with every ground-truth box is under 0.3 becomes a Negative, a crop with IoU >= 0.65 a Positive, and one with IoU in [0.4, 0.65) a Part face:

# -*- coding:utf-8 -*-
import sys
import numpy as np
import cv2
import os
import numpy.random as npr

stdsize = 12
anno_file = "label.txt"
im_dir = "samples"
pos_save_dir = str(stdsize) + "/positive"
part_save_dir = str(stdsize) + "/part"
neg_save_dir = str(stdsize) + '/negative'
save_dir = "./" + str(stdsize)

def IoU(box, boxes):
    """Compute IoU between detect box and gt boxes

    Parameters:
    ----------
    box: numpy array , shape (5, ): x1, y1, x2, y2, score
        input box
    boxes: numpy array, shape (n, 4): x1, y1, x2, y2
        input ground truth boxes

    Returns:
    -------
    ovr: numpy.array, shape (n, )
        IoU
    """
    box_area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
    area = (boxes[:, 2] - boxes[:, 0] + 1) * (boxes[:, 3] - boxes[:, 1] + 1)
    # boxes[:, 0] selects the first column (x1) of the n x 4 ground-truth matrix
    xx1 = np.maximum(box[0], boxes[:, 0])
    yy1 = np.maximum(box[1], boxes[:, 1])
    xx2 = np.minimum(box[2], boxes[:, 2])
    yy2 = np.minimum(box[3], boxes[:, 3])

    # compute the width and height of the bounding box
    w = np.maximum(0, xx2 - xx1 + 1)
    h = np.maximum(0, yy2 - yy1 + 1)

    inter = w * h
    ovr = inter / (box_area + area - inter)
    return ovr

# Create the folders that hold the three sample classes
def mkr(dr):
    if not os.path.exists(dr):
        os.mkdir(dr)

mkr(save_dir)
mkr(pos_save_dir)
mkr(part_save_dir)
mkr(neg_save_dir)

# Create the txt files that record the Positive, Negative and Part sample info
f1 = open(os.path.join(save_dir, 'pos_' + str(stdsize) + '.txt'), 'w')
f2 = open(os.path.join(save_dir, 'neg_' + str(stdsize) + '.txt'), 'w')
f3 = open(os.path.join(save_dir, 'part_' + str(stdsize) + '.txt'), 'w')

# Read label.txt
with open(anno_file, 'r') as f:
    annotations = f.readlines()
num = len(annotations)
print "%d pics in total" % num
p_idx = 0 # positive
n_idx = 0 # negative
d_idx = 0 # dont care
idx = 0
box_idx = 0


for annotation in annotations:
    annotation = annotation.strip().split(' ')
    im_path = annotation[0]
    bbox = map(float, annotation[1:])
    boxes = np.array(bbox, dtype=np.float32).reshape(-1, 4)
    print im_path
    img = cv2.imread(im_path)
    idx += 1
    if idx % 100 == 0:
        print idx, "images done"

    height, width, channel = img.shape

    neg_num = 0
    while neg_num < 50:
        # Draw random crops from each training image
        size = npr.randint(40, min(width, height) / 2)
        nx = npr.randint(0, width - size)
        ny = npr.randint(0, height - size)
        crop_box = np.array([nx, ny, nx + size, ny + size])
        # IoU between the random crop and the annotated ground-truth boxes
        Iou = IoU(crop_box, boxes)

        cropped_im = img[ny : ny + size, nx : nx + size, :]
        resized_im = cv2.resize(cropped_im, (stdsize, stdsize), interpolation=cv2.INTER_LINEAR)

        if np.max(Iou) < 0.3:
            # Iou with all gts must below 0.3
            save_file = os.path.join(neg_save_dir, "%s.jpg"%n_idx)
            f2.write(str(stdsize)+"/negative/%s"%n_idx + ' 0\n')
            cv2.imwrite(save_file, resized_im)
            n_idx += 1
            neg_num += 1


    for box in boxes:
        # box (x_left, y_top, x_right, y_bottom)
        x1, y1, x2, y2 = box
        w = x2 - x1 + 1
        h = y2 - y1 + 1

        # max(w, h) < 40: faces smaller than 40 px are ignored,
        # in case the ground truth boxes of small faces are not accurate
        if max(w, h) < 40 or x1 < 0 or y1 < 0:
            continue

        # generate positive examples and part faces
        for i in range(20):
            size = npr.randint(int(min(w, h) * 0.8), np.ceil(1.25 * max(w, h)))

            # delta here is the offset of box center
            delta_x = npr.randint(-w * 0.2, w * 0.2)
            delta_y = npr.randint(-h * 0.2, h * 0.2)

            nx1 = max(x1 + w / 2 + delta_x - size / 2, 0)
            ny1 = max(y1 + h / 2 + delta_y - size / 2, 0)
            nx2 = nx1 + size
            ny2 = ny1 + size

            if nx2 > width or ny2 > height:
                continue
            crop_box = np.array([nx1, ny1, nx2, ny2])

            offset_x1 = (x1 - nx1) / float(size)
            offset_y1 = (y1 - ny1) / float(size)
            offset_x2 = (x2 - nx2) / float(size)
            offset_y2 = (y2 - ny2) / float(size)

            cropped_im = img[int(ny1) : int(ny2), int(nx1) : int(nx2), :]
            resized_im = cv2.resize(cropped_im, (stdsize, stdsize), interpolation=cv2.INTER_LINEAR)

            box_ = box.reshape(1, -1)
            if IoU(crop_box, box_) >= 0.65:
                save_file = os.path.join(pos_save_dir, "%s.jpg"%p_idx)
                f1.write(str(stdsize)+"/positive/%s"%p_idx + ' 1 %.2f %.2f %.2f %.2f\n'%(offset_x1, offset_y1, offset_x2, offset_y2))
                cv2.imwrite(save_file, resized_im)
                p_idx += 1
            elif IoU(crop_box, box_) >= 0.4:
                save_file = os.path.join(part_save_dir, "%s.jpg"%d_idx)
                f3.write(str(stdsize)+"/part/%s"%d_idx + ' -1 %.2f %.2f %.2f %.2f\n'%(offset_x1, offset_y1, offset_x2, offset_y2))
                cv2.imwrite(save_file, resized_im)
                d_idx += 1
        box_idx += 1
        print "%s images done, pos: %s part: %s neg: %s"%(idx, p_idx, d_idx, n_idx)

f1.close()
f2.close()
f3.close()

This produces the training samples for the first network, P-Net. To produce the samples for R-Net and O-Net later on, simply change stdsize = 12 to 24 and 48; the other parameters can also be adjusted to your needs.

The step above produced the image paths of the Negative, Positive, and Part samples cropped from the originals, together with each crop's label and regression offsets. For training, this information still has to be merged into a single file in the same style as label.txt from step 3:

import os

save_dir = "./12"
if not os.path.exists(save_dir):
    os.mkdir(save_dir)
# the three label files written by the generation script above
f1 = open(os.path.join(save_dir, 'pos_12.txt'), 'r')
f2 = open(os.path.join(save_dir, 'neg_12.txt'), 'r')
f3 = open(os.path.join(save_dir, 'part_12.txt'), 'r')

pos = f1.readlines()
neg = f2.readlines()
part = f3.readlines()
f = open(os.path.join(save_dir, 'label-train.txt'), 'w')

# insert the ".jpg" extension after the path token; positives keep their four offsets
for i in range(int(len(pos))):
    p = pos[i].find(" ") + 1
    pos[i] = pos[i][:p-1] + ".jpg " + pos[i][p:-1] + "\n"
    f.write(pos[i])

# negatives have no offsets, so pad with -1 -1 -1 -1 to keep a uniform format
for i in range(int(len(neg))):
    p = neg[i].find(" ") + 1
    neg[i] = neg[i][:p-1] + ".jpg " + neg[i][p:-1] + " -1 -1 -1 -1\n"
    f.write(neg[i])

for i in range(int(len(part))):
    p = part[i].find(" ") + 1
    part[i] = part[i][:p-1] + ".jpg " + part[i][p:-1] + "\n"
    f.write(part[i])

f1.close()
f2.close()
f3.close()
f.close()
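After merging, label-train.txt contains lines in the following shape (the offset values here are just illustrative):

12/positive/0.jpg 1 0.05 -0.11 0.08 0.13
12/negative/0.jpg 0 -1 -1 -1 -1
12/part/0.jpg -1 0.22 0.17 -0.19 0.21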

Next, convert this into the lmdb format that Caffe consumes, using the convert_imageset tool that ships with the training repo (the --backend=mtcnn option comes from that modified tool; stock Caffe only offers lmdb and leveldb):

"caffe/convert_imageset.exe" "" 12/label-train.txt train_lmdb12 --backend=mtcnn --shuffle=true

Because splitting the originals into Negative, Positive, and Part samples multiplies the amount of data, the conversion can take quite a while.
At this point the P_Net training data is ready and training can begin.
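To quickly confirm that the lmdb was written, you can count its records. A small sketch, assuming the python lmdb package is installed (pip install lmdb):

import lmdb

env = lmdb.open("train_lmdb12", readonly=True)
print env.stat()['entries']  # number of records written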

Training needs the corresponding Caffe prototxt files:
det1-train.prototxt and solver-12.prototxt from the training reference linked above. Adjust the paths inside these two files, create a models-12 folder in the root directory to hold the snapshots, and finally run:

"caffe/caffe.exe" train --solver=solver-12.prototxt --weights=det1.caffemodel

to start training.
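For orientation, here is a minimal solver-12.prototxt sketch. Every value below is an illustrative placeholder, not the repo's actual setting, so tune it for your own data:

net: "det1-train.prototxt"
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
lr_policy: "step"
gamma: 0.1
stepsize: 30000
display: 100
max_iter: 200000
snapshot: 10000
snapshot_prefix: "models-12/det1"
solver_mode: GPU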

(2) Training R_Net

After P_Net is trained, generate R_Net's training data with the same kind of data-generation code as above. In addition, the paper stresses that mining hard samples improves the model's accuracy, so we use the code below to produce them. Note that this script reads the WIDER FACE annotations (wider_face_train.txt and WIDER_train/images/); adapt the paths if you trained on your own samples folder and label.txt:

import sys
import tools
import caffe
import cv2
import numpy as np
import os
from utils import *  # provides the IoU function (same as in the P_Net script)

deploy = 'det1.prototxt'
caffemodel = 'det1.caffemodel'
caffe.set_device(0)
caffe.set_mode_gpu()
net_12 = caffe.Net(deploy,caffemodel,caffe.TEST)

def view_bar(num, total):
    # simple console progress bar
    rate = float(num) / total
    rate_num = int(rate * 100)
    r = '\r[%s%s]%d%%  (%d/%d)' % ("#"*rate_num, " "*(100-rate_num), rate_num, num, total)
    sys.stdout.write(r)
    sys.stdout.flush()
def detectFace(img_path,threshold):
    # run P_Net over an image pyramid and collect candidate face boxes
    img = cv2.imread(img_path)
    caffe_img = img.copy()-128
    origin_h,origin_w,ch = caffe_img.shape
    scales = tools.calculateScales(img)
    out = []
    for scale in scales:
        hs = int(origin_h*scale)
        ws = int(origin_w*scale)
        scale_img = cv2.resize(caffe_img,(ws,hs))
        scale_img = np.swapaxes(scale_img, 0, 2)
        net_12.blobs['data'].reshape(1,3,ws,hs)
        net_12.blobs['data'].data[...]=scale_img
        out_ = net_12.forward()
        out.append(out_)
    image_num = len(scales)
    rectangles = []
    for i in range(image_num):
        # the output blob names must match your deploy prototxt
        # (the standard det1.prototxt outputs 'prob1' and 'conv4-2'; see the test code below)
        cls_prob = out[i]['cls_score'][0][1]
        roi      = out[i]['conv4-2'][0]
        out_h,out_w = cls_prob.shape
        out_side = max(out_h,out_w)
        rectangle = tools.detect_face_12net(cls_prob,roi,out_side,1/scales[i],origin_w,origin_h,threshold[0])
        rectangles.extend(rectangle)
    return rectangles
anno_file = 'wider_face_train.txt'
im_dir = "WIDER_train/images/"
neg_save_dir  = "24/negative"
pos_save_dir  = "24/positive"
part_save_dir = "24/part"
image_size = 24

# make sure the output folders exist before opening the label files
for d in [neg_save_dir, pos_save_dir, part_save_dir]:
    if not os.path.exists(d):
        os.makedirs(d)

f1 = open('24/pos_24.txt', 'w')
f2 = open('24/neg_24.txt', 'w')
f3 = open('24/part_24.txt', 'w')
threshold = [0.6,0.6,0.7]
with open(anno_file, 'r') as f:
    annotations = f.readlines()
num = len(annotations)
print "%d pics in total" % num

p_idx = 0 # positive
n_idx = 0 # negative
d_idx = 0 # dont care
image_idx = 0

for annotation in annotations:
    annotation = annotation.strip().split(' ')
    bbox = map(float, annotation[1:])
    gts = np.array(bbox, dtype=np.float32).reshape(-1, 4)
    img_path = im_dir + annotation[0] + '.jpg'
    rectangles = detectFace(img_path,threshold)
    img = cv2.imread(img_path)
    image_idx += 1
    view_bar(image_idx,num)
    for box in rectangles:
        x_left, y_top, x_right, y_bottom, _ = box
        # the detector returns float coordinates; crop with integer indices
        x_left, y_top, x_right, y_bottom = int(x_left), int(y_top), int(x_right), int(y_bottom)
        crop_w = x_right - x_left + 1
        crop_h = y_bottom - y_top + 1
        # ignore boxes that are too small or beyond the image border
        if crop_w < image_size or crop_h < image_size :
            continue

        # compute intersection over union (IoU) between the current box and all gt boxes
        Iou = IoU(box, gts)
        cropped_im = img[y_top:y_bottom + 1, x_left:x_right + 1]
        resized_im = cv2.resize(cropped_im, (image_size, image_size), interpolation=cv2.INTER_LINEAR)

        # save negative images and write label
        if np.max(Iou) < 0.3:
            # Iou with all gts must below 0.3
            save_file = os.path.join(neg_save_dir, "%s.jpg"%n_idx)
            f2.write("%s/negative/%s"%(image_size, n_idx) + ' 0\n')
            cv2.imwrite(save_file, resized_im)
            n_idx += 1
        else:
            # find gt_box with the highest iou
            idx = np.argmax(Iou)
            assigned_gt = gts[idx]
            x1, y1, x2, y2 = assigned_gt

            # compute bbox reg label
            offset_x1 = (x1 - x_left)   / float(crop_w)
            offset_y1 = (y1 - y_top)    / float(crop_h)
            offset_x2 = (x2 - x_right)  / float(crop_w)
            offset_y2 = (y2 - y_bottom )/ float(crop_h)

            # save positive and part-face images and write labels
            if np.max(Iou) >= 0.65:
                save_file = os.path.join(pos_save_dir, "%s.jpg"%p_idx)
                f1.write("%s/positive/%s"%(image_size, p_idx) + ' 1 %.2f %.2f %.2f %.2f\n'%(offset_x1, offset_y1, offset_x2, offset_y2))
                cv2.imwrite(save_file, resized_im)
                p_idx += 1

            elif np.max(Iou) >= 0.4:
                save_file = os.path.join(part_save_dir, "%s.jpg"%d_idx)
                f3.write("%s/part/%s"%(image_size, d_idx)     + ' -1 %.2f %.2f %.2f %.2f\n'%(offset_x1, offset_y1, offset_x2, offset_y2))
                cv2.imwrite(save_file, resized_im)
                d_idx += 1

f1.close()
f2.close()
f3.close()

Remember to adjust the paths in the code above, then turn this data into lmdb and train it exactly as for P_Net. O_Net works the same way. Once all of the above training is done, you can move on to testing.
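By analogy with P_Net, and assuming the same file-naming scheme (solver-24.prototxt and 24/label-train.txt here are placeholders for whatever names you actually use), the R_Net conversion and training commands would look like:

"caffe/convert_imageset.exe" "" 24/label-train.txt train_lmdb24 --backend=mtcnn --shuffle=true
"caffe/caffe.exe" train --solver=solver-24.prototxt --weights=det2.caffemodel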

4. Testing MTCNN

After the steps above, models-12, models-24 and models-48 each hold a trained caffemodel for the corresponding network. Together with det1.prototxt, det2.prototxt and det3.prototxt, the code below runs the test (mainly based on the code in https://github.com/CongWeilin/mtcnn-caffe/tree/master/demo):

import tools_matrix as tools
import caffe
import cv2
import numpy as np

caffe.set_device(0)
caffe.set_mode_gpu()
deploy = 'det1.prototxt'
caffemodel = 'det1.caffemodel'
net_12 = caffe.Net(deploy,caffemodel,caffe.TEST)

deploy = 'det2.prototxt'
caffemodel = 'det2.caffemodel'
net_24 = caffe.Net(deploy,caffemodel,caffe.TEST)

deploy = 'det3.prototxt'
caffemodel = 'det3.caffemodel'
net_48 = caffe.Net(deploy,caffemodel,caffe.TEST)


def detectFace(img_path,threshold):
    img = cv2.imread(img_path)
    caffe_img = (img.copy()-127.5)/128
    origin_h,origin_w,ch = caffe_img.shape
    scales = tools.calculateScales(img)
    out = []
    # stage 1: run P_Net over every level of the image pyramid
    for scale in scales:
        hs = int(origin_h*scale)
        ws = int(origin_w*scale)
        scale_img = cv2.resize(caffe_img,(ws,hs))
        scale_img = np.swapaxes(scale_img, 0, 2)
        net_12.blobs['data'].reshape(1,3,ws,hs)
        net_12.blobs['data'].data[...]=scale_img
        out_ = net_12.forward()
        out.append(out_)
    image_num = len(scales)
    rectangles = []
    for i in range(image_num):    
        cls_prob = out[i]['prob1'][0][1]
        roi      = out[i]['conv4-2'][0]
        out_h,out_w = cls_prob.shape
        out_side = max(out_h,out_w)
        rectangle = tools.detect_face_12net(cls_prob,roi,out_side,1/scales[i],origin_w,origin_h,threshold[0])
        rectangles.extend(rectangle)
    rectangles = tools.NMS(rectangles,0.7,'iou')

    if len(rectangles)==0:
        return rectangles
    net_24.blobs['data'].reshape(len(rectangles),3,24,24)
    crop_number = 0
    for rectangle in rectangles:
        crop_img = caffe_img[int(rectangle[1]):int(rectangle[3]), int(rectangle[0]):int(rectangle[2])]
        scale_img = cv2.resize(crop_img,(24,24))
        scale_img = np.swapaxes(scale_img, 0, 2)
        net_24.blobs['data'].data[crop_number] =scale_img 
        crop_number += 1
    out = net_24.forward()
    cls_prob = out['prob1']
    roi_prob = out['conv5-2']
    rectangles = tools.filter_face_24net(cls_prob,roi_prob,rectangles,origin_w,origin_h,threshold[1])

    if len(rectangles)==0:
        return rectangles
    net_48.blobs['data'].reshape(len(rectangles),3,48,48)
    crop_number = 0
    for rectangle in rectangles:
        crop_img = caffe_img[int(rectangle[1]):int(rectangle[3]), int(rectangle[0]):int(rectangle[2])]
        scale_img = cv2.resize(crop_img,(48,48))
        scale_img = np.swapaxes(scale_img, 0, 2)
        net_48.blobs['data'].data[crop_number] =scale_img 
        crop_number += 1
    out = net_48.forward()
    cls_prob = out['prob1']
    roi_prob = out['conv6-2']
    pts_prob = out['conv6-3']
    rectangles = tools.filter_face_48net(cls_prob,roi_prob,pts_prob,rectangles,origin_w,origin_h,threshold[2])

    return rectangles

threshold = [0.6,0.6,0.7]
imgpath = ""  # set this to the path of your test image
rectangles = detectFace(imgpath,threshold)
img = cv2.imread(imgpath)
draw = img.copy()
for rectangle in rectangles:
    # rectangle = [x1, y1, x2, y2, score, five landmark x/y pairs]
    cv2.putText(draw,str(rectangle[4]),(int(rectangle[0]),int(rectangle[1])),cv2.FONT_HERSHEY_SIMPLEX,1,(0,255,0))
    cv2.rectangle(draw,(int(rectangle[0]),int(rectangle[1])),(int(rectangle[2]),int(rectangle[3])),(255,0,0),1)
    for i in range(5,15,2):
        cv2.circle(draw,(int(rectangle[i+0]),int(rectangle[i+1])),2,(0,255,0))
cv2.imshow("test",draw)
cv2.waitKey()
cv2.imwrite('test.jpg',draw)

The above is only a bare-bones walkthrough of the pipeline; reproducing the results reported in the paper still requires considerably more data processing and parameter tuning.
