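1 Installing and building dlib locally
1.1 Downloading the dlib source
Download: https://github.com/davisking/dlib (latest release at the time of writing: dlib 19.15)
To keep versions apart, the download directory is named dlib-19-15, as shown in the figure above.
1.2 Building the dlib C++ example programs
1.2.1 Building the dlib library
The build requires Visual Studio. The version installed here is Visual Studio 15 2017, the latest at the time of writing.
Change into the dlib-master/dlib-19-15 directory and run: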
# mkdir build
# cd build
# cmake ..
# cmake --build .
To specify the generator and host toolset explicitly:
# cmake .. -G "Visual Studio 15 2017 Win64" -T host=x64
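If the generated .lib dependency is visible in the directory shown in the figure above, the dlib library was built successfully.
1.2.2 Configuring and running a C++ example program
examples/train_shape_predictor_ex.cpp is used as the example here; the other example programs are set up the same way.
1. Create ConsoleApplication1.cpp and add source.cpp
First, open VS and create a new C++ console project. Copy the code of train_shape_predictor_ex.cpp into ConsoleApplication1.cpp, then add source.cpp to the project via Add Existing Item. source.cpp is located in dlib-master/dlib-19-15/dlib/all.
2. Change the stdafx (precompiled header) setting
Open Project > Properties and make the following change to avoid errors caused by the precompiled header.
3. Add the include directory
4. Add the path of the generated .lib dependency
5. Configure image format support
Add the preprocessor definitions DLIB_JPG_SUPPORT, DLIB_JPEG_SUPPORT and DLIB_JPEG_STATIC.
Once the project is configured, click Build > Build Solution. ConsoleApplication1.exe is generated in the project directory. Run it from the command line, or click Debug > Start in VS. Programs that take input need it supplied as command-line arguments.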
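1.3 Building the dlib Python API
Method 1:
Change into the dlib source directory and run:
# python setup.py install
After that, the Python examples can be run from the python_examples directory.
Method 2:
# pip3 install dlib
On the local machine this method currently fails to install dlib 19.15 and only installs an older dlib, so some function calls in the Python examples may not work.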
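Since the two install routes can leave different dlib versions on the machine, it may help to check which one Python actually imports. The following is a minimal sketch, not part of the original setup: the helper name installed_version is hypothetical, and it relies on the common convention that a module exposes a __version__ attribute.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, "unknown" if the module has
    no such attribute, or None if the module cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


# Prints a version string when dlib is importable, or None when it is not.
print("dlib version:", installed_version("dlib"))
```

If the result is None, the build or install step above did not complete for the interpreter being used.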
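2 Main features of dlib and accuracy evaluation
The main dlib features covered here are face detection, facial landmark detection and face recognition. This part studies the sample code in python_examples; the C++ examples work similarly. The evaluation scripts are mostly adapted from the sample code listed in section 2.1.
2.1 Overview of the example code
The main code is in the python_examples directory of dlib. The model files it needs can be downloaded from http://dlib.net/files:
- face_detector.py
Frontal face detector; mainly uses dlib.get_frontal_face_detector().
- cnn_face_detector.py
Face detector; mainly uses dlib.cnn_face_detection_model_v1('mmod_human_face_detector.dat'). The official documentation states that it is more accurate than dlib.get_frontal_face_detector().
- face_landmark.py
Facial landmark detection; mainly uses dlib.get_frontal_face_detector() and dlib.shape_predictor('shape_predictor_68_face_landmarks.dat').
- face_recognition.py
Face recognition; mainly uses dlib.get_frontal_face_detector(), dlib.shape_predictor('shape_predictor_5_face_landmarks.dat') and dlib.face_recognition_model_v1('dlib_face_recognition_resnet_model_v1.dat').
- opencv_webcam_face_detection.py
Face detection on video; mainly uses dlib.get_frontal_face_detector() and cv2.VideoCapture().
- train_object_detector.py
Training of the frontal face detector; produces the file detector.svm.
- train_shape_predictor.py
Training of the facial landmark predictor; produces the file predictor.dat.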
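2.2 Face detection and facial landmarks
2.2.1 Dataset and code preparation
Reference code: python_examples/face_landmark_detection.py. To collect detection statistics it was rewritten and named face_landmark.py. For now it only handles images containing a single face (with several faces per image the overall accuracy is hard to tally).
Required model file: shape_predictor_68_face_landmarks.dat, a pre-trained facial landmark predictor.
Test images: the LFW dataset.
The code of face_landmark.py is as follows: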
import os
import dlib
from skimage import io

# Dataset of face images to test
faces_folder_path = "lfwdata"

# Step 1: load the face detector and the facial landmark predictor
# Face detector
detector = dlib.get_frontal_face_detector()
# Facial landmark predictor
predictor = dlib.shape_predictor("../shape_predictor_68_face_landmarks.dat")

# Step 2: walk the images, run both models and display the results
# Display window
win = dlib.image_window()
# tol counts the images processed, ans counts the images with a detection
tol = ans = 0
# Walk the folder for .jpg and .png images
for (path, dirnames, filenames) in os.walk(faces_folder_path):
    for filename in filenames:
        if filename.endswith('.jpg') or filename.endswith('.png'):
            tol += 1
            img_path = os.path.join(path, filename)
            print("Processing file: {}".format(img_path))
            # Read the image
            img = io.imread(img_path)
            win.clear_overlay()
            win.set_image(img)
            # Run the face detector
            dets = detector(img, 1)
            # An image counts as a success if at least one face is detected
            face_num = len(dets)
            print("Number of faces detected: {}".format(len(dets)))
            if face_num > 0:
                ans += 1
            else:
                print("fail")
            for k, d in enumerate(dets):
                print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
                    k, d.left(), d.top(), d.right(), d.bottom()))
                # Run the facial landmark predictor
                shape = predictor(img, d)
                print("Part 0: {}, Part 1: {} ...".format(shape.part(0), shape.part(1)))
                win.add_overlay(shape)
            win.add_overlay(dets)
            # Uncomment to step through the images with the Enter key
            # dlib.hit_enter_to_continue()

# Step 3: compute and print the accuracy
print("correct:{},total:{}".format(ans, tol))
print("accuracy:{}".format(ans/tol))
2.2.2 Test screenshots
2.2.3 Accuracy
Out of 13234 images tested, a face was detected in 13172, giving an accuracy of 99.53%.
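The percentage above follows directly from the two counts. A minimal sketch of the arithmetic is given below; the function name detection_rate is hypothetical. Note that the script counts an image as correct whenever any face is found, so this figure is a detection rate rather than a check that each box is in the right place.

```python
def detection_rate(detected, total):
    """Fraction of images in which at least one face was found."""
    if total <= 0:
        raise ValueError("total must be positive")
    return detected / total


# Counts reported in the text: 13172 of 13234 LFW images had a detection.
rate = detection_rate(13172, 13234)
print("detection rate: {:.2%}".format(rate))  # prints "detection rate: 99.53%"
```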
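Most of the failed images show half faces, profile faces, over-exposure or occlusion. This is largely because the code uses the frontal face detector dlib.get_frontal_face_detector(). Some of the failed images are shown below: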
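2.3 Face recognition
2.3.1 Dataset and code preparation
Reference code: face_recognition.py. To collect accuracy statistics it was rewritten and named face_recog.py.
Required model files: shape_predictor_68_face_landmarks.dat, a pre-trained facial landmark predictor, and dlib_face_recognition_resnet_model_v1.dat, a pre-trained ResNet face recognition model.
Dataset: from LFW, 398 frontal faces were selected as candidates (one image per person) and 525 frontal faces as test images (a person may appear in several images).
The code of face_recog.py is as follows: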
import os
import dlib
import glob
import numpy
from skimage import io

# Folder of candidate ("training") faces
faces_folder_path = "recog_train"
# Folder of faces to identify
img_folder_path = "recog_test"


# Step 2: build the candidate labels and descriptors used for recognition.
# For every face in the folder:
# 1. Face detection
# 2. Landmark detection
# 3. Descriptor extraction
# Returns the lists of candidate labels and descriptors
def train(faces_folder_path):
    trainlabel = []
    train_descriptors = []
    for file in glob.glob(os.path.join(faces_folder_path, "*.jpg")):
        # LFW files are named <person>_0001.jpg; keep the part before "_0"
        labelName = os.path.basename(file).split('_0')[0]
        # print("Processing file: {}".format(labelName))
        face = io.imread(file)
        # 1. Face detection
        dets = detector(face, 1)
        # print("Number of faces detected: {}".format(len(dets)))
        for k, d in enumerate(dets):
            # 2. Landmark detection
            shape = predictor(face, d)
            # 3. Descriptor extraction, a 128-D vector
            face_descriptor = facerec.compute_face_descriptor(face, shape)
            # Convert to a numpy array
            face_vector = numpy.array(face_descriptor)
            # Store the label with each descriptor so both lists stay aligned
            # even when an image yields zero or several detections
            trainlabel.append(labelName)
            train_descriptors.append(face_vector)
    return trainlabel, train_descriptors


# Step 3: decide which candidate each test face belongs to
def recognition(trainlabel, train_descriptors):
    ans_right = 0
    ans_wrong = 0
    # Process every test face the same way as the candidates
    for file in glob.glob(os.path.join(img_folder_path, "*.jpg")):
        img = io.imread(file)
        # Face detection
        dets = detector(img, 1)
        # Distances from the test face to every candidate face
        dists = []
        for k, d in enumerate(dets):
            # Landmark detection
            shape = predictor(img, d)
            # Descriptor extraction
            test_descriptor = facerec.compute_face_descriptor(img, shape)
            d_test = numpy.array(test_descriptor)
            # Euclidean distance to each candidate
            for d_train in train_descriptors:
                dist = numpy.linalg.norm(d_train-d_test)
                dists.append(dist)
        # Pair each candidate label with its distance and sort by distance.
        # zip() stops at the shorter list, so only the first detected face is used.
        cd_sorted = sorted(zip(trainlabel, dists), key=lambda item: item[1])
        nametest = os.path.basename(file).split('_0')[0]
        # If no face was detected there is nothing to compare against
        best_label, best_dist = cd_sorted[0] if cd_sorted else ("Unknown", float("inf"))
        print(best_dist)
        # Apply the threshold to decide who it is
        if best_dist < 0.6:
            namepredict = best_label
        else:
            namepredict = "Unknown"
        print(nametest, namepredict)
        # Check whether the prediction is correct
        if (namepredict == nametest) or (namepredict == "Unknown" and nametest not in trainlabel):
            print("right")
            ans_right += 1
        else:
            print("wrong")
            ans_wrong += 1
        # dlib.hit_enter_to_continue()
    print("total:", ans_right + ans_wrong, "\nright:", ans_right, "\nwrong:", ans_wrong)


if __name__ == '__main__':
    # Step 1: load the three models
    # 1. Frontal face detector
    detector = dlib.get_frontal_face_detector()
    # 2. Facial landmark predictor
    predictor = dlib.shape_predictor("../shape_predictor_68_face_landmarks.dat")
    # 3. Face recognition model
    facerec = dlib.face_recognition_model_v1("../dlib_face_recognition_resnet_model_v1.dat")
    # Step 2: build the candidate labels and descriptors
    trainlabel, train_descriptors = train(faces_folder_path)
    # Step 3: identify the test faces and report the accuracy
    recognition(trainlabel, train_descriptors)
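The script derives each person's name from the image path, which depends on the path separator and on the LFW naming pattern (for example George_W_Bush_0001.jpg). The sketch below shows one portable way this could be done with the standard library; the helper name label_from_path and the sample paths are hypothetical.

```python
import ntpath
import os


def label_from_path(file_path):
    """Extract the person name from an LFW-style file name,
    e.g. 'George_W_Bush_0001.jpg' -> 'George_W_Bush'."""
    # ntpath.basename splits on both '\\' and '/', whatever the host OS is.
    file_name = ntpath.basename(file_path)
    stem, _extension = os.path.splitext(file_name)
    # Drop the trailing image index after the last underscore.
    name, _sep, _index = stem.rpartition("_")
    return name if name else stem


print(label_from_path("recog_train\\George_W_Bush_0001.jpg"))  # George_W_Bush
print(label_from_path("recog_test/Aaron_Eckhart_0001.jpg"))    # Aaron_Eckhart
```

Splitting on the last underscore avoids cutting a name short when the name itself happens to contain "_0".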
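2.3.2 Face recognition steps
First, each face in the candidate folder goes through:
1. Face detection
2. Landmark detection, which marks the face region and the landmarks
3. Descriptor extraction: a 128-D vector, converted to a numpy array
4. Extraction of the candidate's name from the image file name, to build the candidate list
Then each test face is processed the same way:
1. Face detection, landmark detection, descriptor extraction
2. Compute the Euclidean distance between the test descriptor and every candidate descriptor
3. Collect the distances between the test face and all candidates into a dict
4. Sort by distance
5. The candidate with the smallest distance is accepted as the same person if that distance is below the threshold of 0.6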
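The matching steps above amount to a nearest-neighbour search with a distance cut-off. The sketch below illustrates that logic in plain Python under stated assumptions: the function name identify is hypothetical, and the toy 3-D vectors stand in for the 128-D descriptors that dlib produces.

```python
import math


def identify(test_descriptor, candidates, threshold=0.6):
    """Return (name, distance) for the closest candidate, or
    ("Unknown", distance) when the closest one is not below the threshold."""
    best_name, best_dist = "Unknown", float("inf")
    for name, descriptor in candidates:
        dist = math.dist(test_descriptor, descriptor)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist < threshold:
        return best_name, best_dist
    return "Unknown", best_dist


# Made-up toy descriptors; real dlib descriptors have 128 dimensions.
candidates = [("alice", [0.0, 0.0, 0.0]), ("bob", [1.0, 1.0, 1.0])]
print(identify([0.1, 0.0, 0.0], candidates)[0])  # alice
print(identify([0.5, 0.5, 0.5], candidates)[0])  # Unknown
```

The second query sits about 0.87 away from both candidates, so it is rejected even though a nearest candidate exists.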
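2.3.3 Accuracy
Of the 525 test images, 503 were identified correctly, an accuracy of 503/525 = 95.81%.
2.4 Face detection and recognition in video
2.4.1 Detection time with camera input
The code is named face_detector_video.py and is as follows: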
import cv2
import dlib
import time

# Initialize the dlib face detector
detector = dlib.get_frontal_face_detector()
# Initialize the display window
win = dlib.image_window()
# Open a video file with OpenCV
# cap = cv2.VideoCapture(r'../test.mp4')
cap = cv2.VideoCapture(0)  # open the camera
while True:
    start = time.time()
    ret, cv_img = cap.read()
    if cv_img is None:
        break
    # Shrink the frame to 1/4 of its width and height
    cv_img = cv2.resize(cv_img, (0, 0), fx=0.25, fy=0.25)
    # OpenCV reads frames as BGR while dlib expects RGB, so this conversion is required
    img = cv2.cvtColor(cv_img, cv2.COLOR_BGR2RGB)
    # Detect faces
    dets = detector(img, 1)
    print("Number of faces detected: {}".format(len(dets)))
    for i, d in enumerate(dets):
        print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
            i, d.left(), d.top(), d.right(), d.bottom()))
    print(time.time() - start)
    win.clear_overlay()
    win.set_image(img)
    win.add_overlay(dets)
cap.release()
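dlib's face detector is considerably more accurate than the one bundled with OpenCV, so the dlib detector is used here. Frames are read from the camera, OpenCV splits the video stream into image frames, and the frontal face detector dlib.get_frontal_face_detector() is run on each frame.
Test screenshot:
Output during the test:
Speed:
0.09 s to 0.11 s per frame.
2.4.2 Detection time with mp4 file input
In the code from section 2.4.1, replace the statement that opens the camera with one that opens an mp4 file. The video is again split into frames and the frontal face detector dlib.get_frontal_face_detector() is run on each one.
Test screenshot:
Output during the test:
Speed:
0.09 s to 0.11 s per frame.
Note: face detection speed on a video file depends heavily on the frame size (frame height and width).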
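To put the timings in perspective, the per-frame time can be converted to frames per second, and the effect of resizing can be expressed as the fraction of pixels the detector still has to scan. The sketch below is only an illustration of that arithmetic; both helper names are hypothetical.

```python
def frames_per_second(seconds_per_frame):
    """Convert a per-frame processing time into a frame rate."""
    if seconds_per_frame <= 0:
        raise ValueError("seconds_per_frame must be positive")
    return 1.0 / seconds_per_frame


def pixel_ratio(fx, fy):
    """Fraction of the original pixels left after scaling by fx and fy."""
    return fx * fy


for t in (0.09, 0.11):
    print("{:.2f} s/frame -> {:.1f} fps".format(t, frames_per_second(t)))
# With fx=fy=0.25, as in the scripts, 1/16 of the pixels remain.
print(pixel_ratio(0.25, 0.25))  # 0.0625
```

So the measured 0.09 s to 0.11 s per frame corresponds to roughly 9 to 11 frames per second on the downscaled frames.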
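2.4.3 Face recognition in video
Two implementations are used: dlib's own face recognition (code named face_recogn_video.py) and the face_recognition package built on top of dlib (code named face_recognition_video.py).
The code of face_recogn_video.py is as follows: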
import dlib
import numpy as np
import cv2
import os
import glob

# Candidate dataset and input video
faces_folder_path = r'../train_person'
video_path = r'../test.mp4'


# Build the candidate labels and face descriptors
def train(faces_folder_path):
    trainlabel = []
    train_descriptors = []
    for file in glob.glob(os.path.join(faces_folder_path, "*.jpg")):
        # The label is the file name without its extension
        labelName = os.path.splitext(os.path.basename(file))[0]
        print("Processing file: {}".format(labelName))
        face = cv2.imread(file)
        # 1. Face detection
        dets = detector(face, 1)
        # print("Number of faces detected: {}".format(len(dets)))
        for k, d in enumerate(dets):
            # 2. Landmark detection
            shape = predictor(face, d)
            # 3. Descriptor extraction, a 128-D vector
            face_descriptor = facerec.compute_face_descriptor(face, shape)
            # Convert to a numpy array
            face_vector = np.array(face_descriptor)
            # Keep labels and descriptors aligned, one label per descriptor
            trainlabel.append(labelName)
            train_descriptors.append(face_vector)
    return trainlabel, train_descriptors


# Decide which candidate a descriptor belongs to
def findNearestClassForImage(face_descriptor, trainlabel, train_descriptors):
    train_descriptors = np.array(train_descriptors)
    # Convert the dlib vector to numpy before broadcasting the subtraction
    dist = np.linalg.norm(np.array(face_descriptor) - train_descriptors, axis=1, keepdims=True)
    min_distance = dist.min()
    print('distance: ', min_distance)
    if min_distance > threshold:
        return 'Unknown'
    index = np.argmin(dist)
    return trainlabel[index]


# Face recognition on one frame
def recognition(img, trainlabel, train_descriptors):
    # Face detection
    dets = detector(img, 1)
    for k, d in enumerate(dets):
        print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
            k, d.left(), d.top(), d.right(), d.bottom()))
        # Facial landmark predictor
        shape = predictor(img, d)
        # Face descriptor
        face_descriptor = facerec.compute_face_descriptor(img, shape)
        # Decide who it is
        class_pre = findNearestClassForImage(face_descriptor, trainlabel, train_descriptors)
        print(class_pre)
        cv2.rectangle(img, (d.left(), d.top() + 10), (d.right(), d.bottom()), (0, 255, 0), 2)
        cv2.putText(img, class_pre, (d.left(), d.top()), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2, cv2.LINE_AA)
    # Show every frame, including frames without a detected face
    cv2.imshow('image', img)


if __name__ == '__main__':
    # Load the models
    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor('../shape_predictor_68_face_landmarks.dat')
    facerec = dlib.face_recognition_model_v1('../dlib_face_recognition_resnet_model_v1.dat')
    # Recognition threshold
    threshold = 0.6
    # Candidate labels and face descriptors
    trainlabel, train_descriptors = train(faces_folder_path)
    # cap = cv2.VideoCapture(0)
    cap = cv2.VideoCapture(video_path)
    # Optional: save the annotated video
    # fps = 10
    # size = (640, 480)
    # fourcc = cv2.VideoWriter_fourcc(*'XVID')
    # videoWriter = cv2.VideoWriter('video.MP4', fourcc, fps, size)
    while True:
        ret, frame = cap.read()
        # Stop at the end of the video or when a frame cannot be read
        if not ret:
            break
        # Shrink the frame to 1/4 of its width and height
        frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)
        # Face recognition
        recognition(frame, trainlabel, train_descriptors)
        # videoWriter.write(frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    cap.release()
    # videoWriter.release()  # only needed when the VideoWriter above is enabled
    cv2.destroyAllWindows()
The code of face_recognition_video.py is as follows:
import face_recognition
import cv2
import os
import glob

# Video path and folder of known faces
video_path = r'../test.mp4'
faces_folder_path = '../train_person'


# Read the names and face encodings of the known faces
def train(faces_folder_path):
    known_face_names = []
    known_face_encodings = []
    for file in glob.glob(os.path.join(faces_folder_path, "*.jpg")):
        # The label is the file name without its extension
        labelName = os.path.splitext(os.path.basename(file))[0]
        known_face_names.append(labelName)
        image = face_recognition.load_image_file(file)
        # Assumes every known image contains at least one detectable face
        face_encoding = face_recognition.face_encodings(image)[0]
        known_face_encodings.append(face_encoding)
    return known_face_names, known_face_encodings


def recognition(rgb_small_frame, known_face_names, known_face_encodings):
    face_locations = face_recognition.face_locations(rgb_small_frame)
    face_encodings = face_recognition.face_encodings(rgb_small_frame, face_locations)
    face_names = []
    for face_encoding in face_encodings:
        # compare_faces returns True for each known face judged to be the same
        # person and False otherwise
        matches = face_recognition.compare_faces(known_face_encodings, face_encoding)
        # Default to "Unknown"
        name = "Unknown"
        if True in matches:
            first_match_index = matches.index(True)
            name = known_face_names[first_match_index]
        face_names.append(name)
    return face_locations, face_names


def main():
    face_locations = []
    face_names = []
    # Set up the display window
    wnd = 'OpenCV Video'
    cv2.namedWindow(wnd, flags=0)
    cv2.resizeWindow(wnd, 1920, 1080)
    known_face_names, known_face_encodings = train(faces_folder_path)
    # Open the video
    # video_capture = cv2.VideoCapture(0)
    video_capture = cv2.VideoCapture(video_path)
    # Frame counter used to run recognition only every few frames
    process_this_frame = 0
    while True:
        # Read one frame
        ret, frame = video_capture.read()
        # Stop at the end of the video or when a frame cannot be read
        if not ret:
            break
        # Shrink the frame: a smaller image means less computation
        small_frame = cv2.resize(frame, (0, 0), fx=0.25, fy=0.25)
        # OpenCV images are BGR but RGB is needed here, so convert
        rgb_small_frame = cv2.cvtColor(small_frame, cv2.COLOR_BGR2RGB)
        process_this_frame += 1
        if process_this_frame % 5 == 0:
            # Locations and names
            face_locations, face_names = recognition(rgb_small_frame, known_face_names, known_face_encodings)
        # Draw the detected faces
        for (top, right, bottom, left), name in zip(face_locations, face_names):
            # Scale back up to full-frame coordinates
            top *= 4
            right *= 4
            bottom *= 4
            left *= 4
            # Bounding box
            cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), 2)
            # Name label
            cv2.rectangle(frame, (left, bottom - 35), (right, bottom), (0, 0, 255), cv2.FILLED)
            font = cv2.FONT_HERSHEY_DUPLEX
            cv2.putText(frame, name, (left + 6, bottom - 6), font, 1.0, (255, 255, 255), 1)
        # Display
        cv2.imshow(wnd, frame)
        # Press Q to quit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    video_capture.release()
    cv2.destroyAllWindows()


if __name__ == '__main__':
    main()
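Test screenshot:
Test result:
The face_recognition package recognizes faces faster than the direct dlib implementation.
2.4.4 Optimizations
The face_recognition version includes two optimizations:
1. The frame is shrunk to 1/4 size for recognition, and the results are scaled back to the original size for display;
2. Recognition runs only on every 5th frame.
With these, recognition speed approaches real time (that is, close to the normal playback speed of the video).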
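The two optimizations can be isolated from the video code as plain arithmetic. The sketch below assumes the same conventions as the script (boxes as (top, right, bottom, left), a scale factor of 4, recognition on every 5th frame); the helper names scale_box and frames_to_process are hypothetical.

```python
def scale_box(box, factor):
    """Scale a (top, right, bottom, left) box found on the small frame
    back to full-frame coordinates."""
    return tuple(int(round(v * factor)) for v in box)


def frames_to_process(num_frames, every_n=5):
    """1-based indices of the frames on which recognition would run."""
    return [i for i in range(1, num_frames + 1) if i % every_n == 0]


print(scale_box((10, 60, 50, 20), 4))  # (40, 240, 200, 80)
print(frames_to_process(12))           # [5, 10]
```

Between processed frames the previous boxes and names are simply redrawn, which is why skipping frames costs little visually when faces move slowly.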
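3 Training your own models with Python
Training your own model with the python_examples sample code is fairly simple. To speed up training, the sample code is run on a Linux server. For installing dlib on the Linux server see section 1.3; the method used here is pip3 install dlib.
3.1 Dataset annotation
3.1.1 About imglab
imglab is the tool dlib provides for building datasets. You label the images and it produces an XML file.
3.1.2 Using imglab
The tool ships with the official dlib source under tools/imglab.
cmake must be installed first.
- Steps:
- Open cmd
- Change into the tools/imglab directory
- Create a build folder and change into it
- Run: cmake ..
- Run: cmake --build . --config Release
- Change into Release
- Create an image folder and copy all training images into it
- In the Release directory, run: imglab -c mydataset.xml image, which creates the file mydataset.xml
- Run: imglab mydataset.xml
The imglab annotation window opens and you can start labeling.
- Annotation:
- Hold Shift and drag with the left mouse button to draw a box. Releasing the mouse button first places the box; releasing Shift first cancels it.
- Double-click a box and press Delete to remove it.
- Double-click a box and press i to mark the object as ignore, that is, an unclear object to be skipped.
- Press e to over-expose the image, with the effect shown below.
- Hold Ctrl and scroll the mouse wheel to zoom the image while labeling.
- After double-clicking to select a box, Shift + left click adds landmark points.
- When the face boxes and landmarks are done, choose File > Save and then Exit. The annotated dataset can then be seen in mydataset.xml.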
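Before training, it can be useful to sanity-check the saved annotations. The sketch below parses an XML string in the layout imglab writes (image elements containing box elements, which in turn contain part elements) and counts boxes and landmarks per image. The sample content, file names and the function name summarize are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the imglab layout; names and numbers are invented.
SAMPLE_XML = """<dataset>
<name>imglab dataset</name>
<images>
  <image file='image/face1.jpg'>
    <box top='40' left='60' width='100' height='100'>
      <part name='00' x='80' y='70'/>
      <part name='01' x='130' y='70'/>
    </box>
  </image>
  <image file='image/face2.jpg'>
  </image>
</images>
</dataset>
"""


def summarize(xml_text):
    """Return a list of (file, number_of_boxes, number_of_parts) tuples."""
    root = ET.fromstring(xml_text)
    summary = []
    for image in root.iter("image"):
        boxes = image.findall("box")
        parts = sum(len(box.findall("part")) for box in boxes)
        summary.append((image.get("file"), len(boxes), parts))
    return summary


for row in summarize(SAMPLE_XML):
    print(row)
```

An image that reports zero boxes, like the second one here, was probably skipped during labeling.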
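3.2 Training your own facial landmark predictor
3.2.1 Dataset
Using imglab, annotate the training and test images with face boxes and landmarks (5 landmarks: eyes, nose, mouth). There are 7 training images and 5 test images. This produces the annotation files train_landmarks.xml and test_landmarks.xml. The directory layout is shown below; the train and test folders hold the training and test images.
3.2.2 Training
The training code is adapted from python_examples/train_shape_predictor.py and is as follows: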
import os
import sys
import glob
import dlib
options = dlib.shape_predictor_training_options()
# Now make the object responsible for training the model.
# This algorithm has a bunch of parameters you can mess with. The
# documentation for the shape_predictor_trainer explains all of them.
# You should also read Kazemi's paper which explains all the parameters
# in great detail. However, here I'm just setting three of them
# differently than their default values. I'm doing this because we
# have a very small dataset. In particular, setting the oversampling
# to a high amount (300) effectively boosts the training set size, so
# that helps this example.
options.oversampling_amount = 300
# I'm also reducing the capacity of the model by explicitly increasing
# the regularization (making nu smaller) and by using trees with
# smaller depths.
options.nu = 0.05
options.tree_depth = 2
options.be_verbose = True
# dlib.train_shape_predictor() does the actual training. It will save the
# final predictor to predictor.dat. The input is an XML file that lists the
# images in the training dataset and also contains the positions of the face
# parts.
training_xml_path = '/home/users/chenzhuo/program/dlib-19-15/python_test/mytest/train_landmarks.xml'
dlib.train_shape_predictor(training_xml_path, "predictor.dat", options)
# Now that we have a model we can test it. dlib.test_shape_predictor()
# measures the average distance between a face landmark output by the
# shape_predictor and where it should be according to the truth data.
print("\nTraining accuracy: {}".format(
dlib.test_shape_predictor(training_xml_path, "predictor.dat")))
# The real test is to see how well it does on data it wasn't trained on. We
# trained it on a very small dataset so the accuracy is not extremely high, but
# it's still doing quite good. Moreover, if you train it on one of the large
# face landmarking datasets you will obtain state-of-the-art results, as shown
# in the Kazemi paper.
testing_xml_path = '/home/users/chenzhuo/program/dlib-19-15/python_test/mytest/test_landmarks.xml'
print("Testing accuracy: {}".format(
dlib.test_shape_predictor(testing_xml_path, "predictor.dat")))
Save the code above as shape_predictor_train.py, change training_xml_path to the path of your own dataset XML file, change into the directory containing the .py file and run:
# python3 shape_predictor_train.py
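The comments in the training script describe the value printed by dlib.test_shape_predictor() as the average distance between each predicted landmark and its ground-truth position. The sketch below illustrates that idea in plain Python on made-up points. It is not dlib's implementation, and the function name mean_landmark_error is hypothetical.

```python
import math


def mean_landmark_error(predicted, truth):
    """Average Euclidean distance between corresponding landmark points."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("need two non-empty point lists of equal length")
    total = sum(math.dist(p, t) for p, t in zip(predicted, truth))
    return total / len(truth)


# Made-up pixel coordinates for three landmarks.
truth = [(10, 10), (20, 10), (15, 20)]
predicted = [(13, 14), (20, 10), (15, 25)]
# Per-point distances are 5, 0 and 5, so the mean is 10/3.
print(mean_landmark_error(predicted, truth))
```

A lower value means the predictor places landmarks closer to the annotations; comparing the training and testing values gives a rough sense of overfitting on such a small dataset.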
3.2.3 Testing
The test code is adapted from python_examples/train_shape_predictor.py and is as follows:
import os
import sys
import glob
import cv2
import dlib

if len(sys.argv) != 2:
    print(
        "Give the path to the examples/faces directory as the argument to this "
        "program. For example, if you are in the python_examples folder then "
        "execute this program by running:\n"
        "    ./shape_predictor_test.py ../examples/faces")
    exit()
faces_folder = sys.argv[1]

# Now let's use it as you would in a normal application. First we will load it
# from disk. We also need to load a face detector to provide the initial
# estimate of the facial location.
predictor = dlib.shape_predictor("predictor.dat")
detector = dlib.get_frontal_face_detector()

# Now let's run the detector and shape_predictor over the images in the faces
# folder and display the results.
print("Showing detections and predictions on the images in the faces folder...")
win = dlib.image_window()
for f in glob.glob(os.path.join(faces_folder, "*.jpg")):
    print("Processing file: {}".format(f))
    # img = dlib.load_rgb_image(f)
    img = cv2.imread(f)
    win.clear_overlay()
    win.set_image(img)
    # Ask the detector to find the bounding boxes of each face. The 1 in the
    # second argument indicates that we should upsample the image 1 time. This
    # will make everything bigger and allow us to detect more faces.
    dets = detector(img, 1)
    print("Number of faces detected: {}".format(len(dets)))
    for k, d in enumerate(dets):
        print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
            k, d.left(), d.top(), d.right(), d.bottom()))
        # Get the landmarks/parts for the face in box d.
        shape = predictor(img, d)
        print("Part 0: {}, Part 1: {} ...".format(shape.part(0),
                                                  shape.part(1)))
        # Draw the face landmarks on the screen.
        win.add_overlay(shape)
    win.add_overlay(dets)
    dlib.hit_enter_to_continue()
Save the code above as shape_predictor_test.py, open a VNC client, change into the directory containing the .py file and run:
# python3 shape_predictor_test.py /home/users/chenzhuo/program/dlib-19-15/examples/faces
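3.2.4 Possible improvements
Training can use data covering multiple poses, for example annotated frontal, left-profile and right-profile faces.
3.3 Training your own face detector
3.3.1 Dataset
# wget http://dlib.net/files/data/dlib_face_detector_training_data.tar.gz
This is the dataset dlib itself uses for training. It contains annotations for several thousand faces; only frontal_faces.xml is used here, as shown in the figure below.
3.3.2 Training
The training code is adapted from python_examples/train_object_detector.py and is as follows: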
import os
import sys
import glob
import dlib
# Now let's do the training. The train_simple_object_detector() function has a
# bunch of options, all of which come with reasonable default values. The next
# few lines goes over some of these options.
# Hyperparameters
options = dlib.simple_object_detector_training_options()
# Since faces are left/right symmetric we can tell the trainer to train a
# symmetric detector. This helps it get the most value out of the training
# data.
# Train a symmetric detector
options.add_left_right_image_flips = True
# The trainer is a kind of support vector machine and therefore has the usual
# SVM C parameter. In general, a bigger C encourages it to fit the training
# data better but might lead to overfitting. You must find the best C value
# empirically by checking how well the trained detector works on a test set of
# images you haven't trained on. Don't just leave the value set at 5. Try a
# few different C values and see what works best for your data.
options.C = 5
# Tell the code how many CPU cores your computer has for the fastest training.
options.num_threads = 4
options.be_verbose = True
training_xml_path = '/home/users/chenzhuo/program/dlib-19-15/python_test/dlib_face_detector_training_data/frontal_faces.xml'
# testing_xml_path = '/home/users/chenzhuo/program/dlib-19-15/python_test/cats/cats_test/cat_test.xml'
# This function does the actual training. It will save the final detector to
# detector.svm. The input is an XML file that lists the images in the training
# dataset and also contains the positions of the face boxes. To create your
# own XML files you can use the imglab tool which can be found in the
# tools/imglab folder. It is a simple graphical tool for labeling objects in
# images with boxes. To see how to use it read the tools/imglab/README.txt
# file. But for this example, we just use the training.xml file included with
# dlib.
dlib.train_simple_object_detector(training_xml_path, "detector.svm", options)
# Now that we have a face detector we can test it. The first statement tests
# it on the training data. It will print the precision, recall, and then
# average precision.
print("") # Print blank line to create gap from previous output
print("Training accuracy: {}".format(
dlib.test_simple_object_detector(training_xml_path, "detector.svm")))
# However, to get an idea if it really worked without overfitting we need to
# run it on images it wasn't trained on. The next line does this. Happily, we
# see that the object detector works perfectly on the testing images.
# print("Testing accuracy: {}".format(
# dlib.test_simple_object_detector(testing_xml_path, "detector.svm")))
Save the code above as object_detection_train.py, change training_xml_path to the path of your own dataset XML file, change into the directory containing the .py file and run:
# python3 object_detection_train.py
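According to the comments in the script, dlib.test_simple_object_detector() reports precision, recall and average precision. The sketch below illustrates how the first two could be computed from boxes, assuming a detection counts as a hit when its intersection over union with an unmatched ground-truth box is at least 0.5. The matching rule, the box format (left, top, right, bottom) and the function names are assumptions for illustration, not dlib's exact procedure.

```python
def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, right - left) * max(0, bottom - top)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(detections, truths, min_iou=0.5):
    """Greedily match detections to ground-truth boxes, each used at most once."""
    unmatched = list(truths)
    hits = 0
    for det in detections:
        match = next((t for t in unmatched if iou(det, t) >= min_iou), None)
        if match is not None:
            unmatched.remove(match)
            hits += 1
    precision = hits / len(detections) if detections else 1.0
    recall = hits / len(truths) if truths else 1.0
    return precision, recall


# Made-up boxes: one detection overlaps a true face, the other is a false alarm.
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(1, 1, 10, 10), (50, 50, 60, 60)]
print(precision_recall(detections, truths))  # (0.5, 0.5)
```

Precision drops with false alarms and recall drops with missed faces, which is why the script's comments suggest tuning C against a held-out test set rather than the training data.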
3.3.3 Testing
The test code is adapted from python_examples/train_object_detector.py and is as follows:
import os
import sys
import glob
import dlib
import cv2

if len(sys.argv) != 2:
    print(
        "Give the path to the examples/faces directory as the argument to this "
        "program. For example, if you are in the python_examples folder then "
        "execute this program by running:\n"
        "    ./object_detection_test.py ../examples/faces")
    exit()
faces_folder = sys.argv[1]

# Now let's use the detector as you would in a normal application. First we
# will load it from disk.
detector = dlib.simple_object_detector("detector.svm")

# We can look at the HOG filter we learned. It should look like a face. Neat!
win_det = dlib.image_window()
win_det.set_image(detector)

# Now let's run the detector over the images in the faces folder and display the
# results.
print("Showing detections on the images in the faces folder...")
win = dlib.image_window()
for f in glob.glob(os.path.join(faces_folder, "*.jpg")):
    print("Processing file: {}".format(f))
    # img = dlib.load_rgb_image(f)
    img = cv2.imread(f)
    dets = detector(img)
    print("Number of faces detected: {}".format(len(dets)))
    for k, d in enumerate(dets):
        print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
            k, d.left(), d.top(), d.right(), d.bottom()))
    win.clear_overlay()
    win.set_image(img)
    win.add_overlay(dets)
    dlib.hit_enter_to_continue()
Save the code above as object_detection_test.py, open a VNC client, change into the directory containing the .py file and run:
# python3 object_detection_test.py /home/users/chenzhuo/program/dlib-19-15/examples/faces
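3.3.4 Possible improvements
The detector trained so far is a frontal face detector and performs poorly on profile faces.
To improve detection accuracy, several face detectors can be trained and combined for prediction, for example a frontal detector, a left-profile detector and a right-profile detector. The key calls are as follows: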
image = dlib.load_rgb_image(faces_folder + '/2008_002506.jpg')
detector1 = dlib.fhog_object_detector("detector.svm")
detector2 = dlib.fhog_object_detector("detector.svm")
detectors = [detector1, detector2]
[boxes, confidences, detector_idxs] = dlib.fhog_object_detector.run_multiple (detectors, image, upsample_num_times=1, adjust_threshold=0.0)
for i in range(len(boxes)):
    print("detector {} found box {} with confidence {}.".format(detector_idxs[i], boxes[i], confidences[i]))
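3.4 Summary
As the training workflow above shows, dlib can be used not only for face detection and recognition but also for detecting and recognizing other kinds of objects.
4 Training your own models with C++
4.1 Training your own facial landmark predictor
For the project configuration of each program see section 1.2.2. Configure and run the projects in Release mode for faster execution.
4.1.1 Dataset
Using imglab, annotate the training and test images with face boxes and landmarks (5 landmarks: eyes, nose, mouth). There are 7 training images and 5 test images. This produces the annotation files train_landmarks.xml and test_landmarks.xml. The directory layout is shown below; the train and test folders hold the training and test images.
4.1.2 Training
After configuring a project with examples/train_shape_predictor_ex.cpp, pass the directory containing the annotation XML files as the command argument and click Debug > Start to train the model. This produces the model file sp.dat.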
load_image_dataset(images_train, face_boxes_train,faces_directory+"\\***.xml");
load_image_dataset(images_test,face_boxes_test, faces_directory+"\\***.xml");
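Test error:
4.1.3 Testing
After configuring a project with examples/face_landmark_detection_ex.cpp, pass the path of the generated model file sp.dat and the path of the image to test as command arguments, then click Debug > Start. The result is shown below:
4.1.4 Possible improvements
Training can use data covering multiple poses, for example annotated frontal, left-profile and right-profile faces.
4.2 Training your own face detector
4.2.1 Dataset
Using imglab, annotate the training and test images with face boxes. There are 7 training images and 5 test images. This produces the annotation files train.xml and test.xml.
4.2.2 Training
After configuring a project with examples/fhog_object_detector_ex.cpp, edit the following statements in the code so that they use the names of your own annotation XML files.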
load_image_dataset(images_train, face_boxes_train,faces_directory+"\\***.xml");
load_image_dataset(images_test,face_boxes_test, faces_directory+"\\***.xml");
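Click Debug > Start. The training output is shown below, and the model file face_predictor.svm is generated:
4.2.3 Testing
The examples do not include test code, so the following was written for this purpose and named face_object_detection: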
/*
    Face detector test
*/
#include <dlib/svm_threaded.h>
#include <dlib/gui_widgets.h>
#include <dlib/image_processing.h>
#include <dlib/data_io.h>
#include <cstdlib>
#include <iostream>
#include <fstream>
using namespace std;
using namespace dlib;
// ----------------------------------------------------------------------------------------
int main(int argc, char** argv)
{
    try
    {
        // This program loads a trained face detector and runs it on a set of
        // images. Pass the model file first, followed by the image files,
        // as command line arguments.
        if (argc == 1)
        {
            cout << "Call this program like this:" << endl;
            cout << "./face_object_detection face_predictor.svm faces/*.jpg" << endl;
            return 0;
        }
        // Scanner type used to scan the image and extract HOG features
        typedef scan_fhog_pyramid<pyramid_down<6> > image_scanner_type;
        // Load the model
        object_detector<image_scanner_type> detector;
        deserialize(argv[1]) >> detector;
        // Show the learned HOG filter
        image_window hogwin(draw_fhog(detector), "Learned fHOG detector");
        // Window for the detection results on the test images
        image_window win;
        // Loop over all the images provided on the command line.
        for (int i = 2; i < argc; ++i)
        {
            cout << "processing image " << argv[i] << endl;
            array2d<rgb_pixel> img;
            // Read the image data
            load_image(img, argv[i]);
            // Make the image larger so we can detect small faces.
            pyramid_up(img);
            // Now tell the face detector to give us a list of bounding boxes
            // around all the faces in the image.
            std::vector<rectangle> dets = detector(img);
            cout << "Number of faces detected: " << dets.size() << endl;
            win.clear_overlay();
            win.set_image(img);
            win.add_overlay(dets, rgb_pixel(255, 0, 0));
            cout << "Hit enter to process the next image..." << endl;
            cin.get();
        }
    }
    catch (exception& e)
    {
        cout << "\nexception thrown!" << endl;
        cout << e.what() << endl;
    }
    system("pause");
}
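Pass the path of the generated model file face_predictor.svm and the paths of the images to test as command arguments, then click Debug > Start. The result is shown below:
4.2.4 Possible improvements
The detector trained so far is a frontal face detector and performs poorly on profile faces: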
frontal_face_detector detector = get_frontal_face_detector();
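Several face detectors can be trained and combined for prediction, for example a frontal detector, a left-profile detector and a right-profile detector. The key calls are as follows: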
std::vector<object_detector<image_scanner_type> > my_detectors;
my_detectors.push_back(detector);
std::vector<rectangle> dets = evaluate_detectors(my_detectors, image);