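Preface
This deep-learning-based face recognition system uses five open-source libraries: OpenCV (computer vision), Caffe (deep learning), Dlib (machine learning), libfacedetection (face detection), and cudnn (GPU acceleration).
It also uses one open-source deep learning model: the VGG model.
The final result is very good: recognizing one face takes 0.039 s, and most importantly, the accuracy is high.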
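CPU: Intel i5-4590
GPU: GTX 980
OS: Windows 10
OpenCV version: 3.1 (the exact version does not matter)
Caffe version: Microsoft caffe (the build maintained by Microsoft; it is easy to install, so I recommend it)
Dlib version: 19.0 (the exact version does not matter either)
CUDA version: 7.5
cudnn version: 4
libfacedetection: a release from June or later (this one does matter, because the 64-bit build only appeared after June)
This series is written entirely in C++. If you have questions, leave a comment under the blog post so we can learn from each other.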
====================================================================
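This is the third post in the series. It shows how to use the VGG network model together with Caffe's MemoryData layer to extract features from an OpenCV matrix of type Mat.
Approach
The VGG network model is a deep model proposed by the Visual Geometry Group at the University of Oxford; it reached 97% accuracy on the LFW database. The network consists of five groups of convolutional layers (13 convolutional layers in total), two fully connected layers that produce image features (fc6 and fc7), and one fully connected layer that produces classification scores (fc8). For the details, read its prototxt file. The model and configuration files can be downloaded here: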
http://www.robots.ox.ac.uk/~vgg/software/vgg_face/
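To make the layer arithmetic a bit more concrete, here is a small sketch that traces the spatial size of a 224x224 input through the network. It assumes what the prototxt listed at the end of this post shows: 3x3 convolutions with pad 1 (stride 1) and 2x2 max pooling with stride 2. The function names are made up for this illustration.

```cpp
#include <cassert>

// Output size of a convolution/pooling window along one dimension
// (floor variant; for the sizes used here the division is exact).
int OutputSize(int input, int kernel, int pad, int stride) {
    assert(stride > 0 && input + 2 * pad >= kernel);
    return (input + 2 * pad - kernel) / stride + 1;
}

// Spatial size after the five conv groups, each followed by one pooling layer.
int VggPool5Size(int input) {
    const int convs_per_group[5] = {2, 2, 3, 3, 3};  // 13 conv layers in total
    int size = input;
    for (int group = 0; group < 5; ++group) {
        for (int i = 0; i < convs_per_group[group]; ++i) {
            size = OutputSize(size, 3, 1, 1);  // 3x3, pad 1: size unchanged
        }
        size = OutputSize(size, 2, 0, 2);      // 2x2 pool, stride 2: halves
    }
    return size;
}

// Number of values fed into fc6: channels * height * width of pool5.
int Fc6InputCount(int input) {
    const int pool5_channels = 512;  // num_output of the conv5 layers
    const int s = VggPool5Size(input);
    return pool5_channels * s * s;
}
```

Under these assumptions VggPool5Size(224) is 7, so fc6 receives 512 * 7 * 7 = 25088 values for each image.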
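Back to Caffe. Extracting image features in Caffe is easy: it ships extract_features.exe for exactly this purpose, and the output is written in lmdb or leveldb format. For how to use it, see this post of mine: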
http://blog.csdn.net/mr_curry/article/details/52097529
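Inside our own program, however, we want something more flexible, and that approach is not really practical. Caffe's data layers include type: MemoryData, which we can use to extract features directly from a Mat.
Note: you first need to set up the Caffe property sheet as described in the first post of this series.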
http://blog.csdn.net/mr_curry/article/details/52443126
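Implementation
First, open VGG_FACE_deploy.prototxt and look at the structure of the VGG network.
Interestingly, the MemoryData layer needs the image mean, but the official site does not provide a mean file. We can supply the mean like this instead: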
mean_value:129.1863
mean_value:104.7624
mean_value:93.5940
We also need to change its data layer (you can replace the data layer of the downloaded prototxt file with the code below):
layer {
name: "data"
type: "MemoryData"
top: "data"
top: "label"
transform_param {
mirror: false
crop_size: 224
mean_value:129.1863
mean_value:104.7624
mean_value:93.5940
}
memory_data_param {
batch_size: 1
channels:3
height:224
width:224
}
}
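As a rough picture of what the three mean_value entries in transform_param do, here is a conceptual sketch: the value of mean_value number c is subtracted from every pixel of channel c, and the interleaved image is rearranged into planar channel-first order. This is not Caffe's code, it ignores cropping, mirroring and scaling, and the function name is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Converts an interleaved 8-bit image (height x width x channels, the layout
// of a continuous cv::Mat) into planar floats (channels x height x width),
// subtracting mean_values[c] from every pixel of channel c.
std::vector<float> SubtractChannelMeans(const std::vector<unsigned char>& pixels,
                                        int height, int width,
                                        const std::vector<float>& mean_values) {
    const int channels = static_cast<int>(mean_values.size());
    assert(pixels.size() == static_cast<std::size_t>(height) * width * channels);
    std::vector<float> planar(pixels.size());
    for (int h = 0; h < height; ++h) {
        for (int w = 0; w < width; ++w) {
            for (int c = 0; c < channels; ++c) {
                // Source index: pixel-major, channels interleaved.
                const std::size_t src =
                    (static_cast<std::size_t>(h) * width + w) * channels + c;
                // Destination index: one full plane per channel.
                const std::size_t dst =
                    (static_cast<std::size_t>(c) * height + h) * width + w;
                planar[dst] = static_cast<float>(pixels[src]) - mean_values[c];
            }
        }
    }
    return planar;
}
```

For a 224x224 three-channel image, the result holds 3 * 224 * 224 floats, which matches the channels, height and width given in memory_data_param.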
To keep the original file intact, save the result as vgg_extract_feature_memorydata.prototxt.
Now we can start coding. Add this property sheet to the project:
Then create caffe_net_memorylayer.h, ExtractFeature_.h, and ExtractFeature_.cpp.
caffe_net_memorylayer.h:
#include "caffe/layers/input_layer.hpp"
#include "caffe/layers/inner_product_layer.hpp"
#include "caffe/layers/dropout_layer.hpp"
#include "caffe/layers/conv_layer.hpp"
#include "caffe/layers/relu_layer.hpp"
#include <iostream>
#include "caffe/caffe.hpp"
#include <opencv.hpp>
#include <caffe/layers/memory_data_layer.hpp>
#include "caffe/layers/pooling_layer.hpp"
#include "caffe/layers/lrn_layer.hpp"
#include "caffe/layers/softmax_layer.hpp"
// These globals must be defined before use; Caffe_Predefine() initializes them.
caffe::MemoryDataLayer<float> *memory_layer;
caffe::Net<float>* net;
ExtractFeature_.h:
#include <opencv.hpp>
using namespace cv;
using namespace std;
std::vector<float> ExtractFeature(Mat FaceROI); // takes an image and returns its features as a vector<float>
void Caffe_Predefine();
ExtractFeature_.cpp:
#include <ExtractFeature_.h>
#include <caffe_net_memorylayer.h>
namespace caffe
{
extern INSTANTIATE_CLASS(InputLayer);
extern INSTANTIATE_CLASS(InnerProductLayer);
extern INSTANTIATE_CLASS(DropoutLayer);
extern INSTANTIATE_CLASS(ConvolutionLayer);
REGISTER_LAYER_CLASS(Convolution);
extern INSTANTIATE_CLASS(ReLULayer);
REGISTER_LAYER_CLASS(ReLU);
extern INSTANTIATE_CLASS(PoolingLayer);
REGISTER_LAYER_CLASS(Pooling);
extern INSTANTIATE_CLASS(LRNLayer);
REGISTER_LAYER_CLASS(LRN);
extern INSTANTIATE_CLASS(SoftmaxLayer);
REGISTER_LAYER_CLASS(Softmax);
extern INSTANTIATE_CLASS(MemoryDataLayer);
}
template <typename Dtype>
caffe::Net<Dtype>* Net_Init_Load(std::string param_file, std::string pretrained_param_file, caffe::Phase phase)
{
// Use the function arguments instead of hard-coded file names.
caffe::Net<Dtype>* net(new caffe::Net<Dtype>(param_file, phase));
net->CopyTrainedLayersFrom(pretrained_param_file);
return net;
}
void Caffe_Predefine() // must be called once at program start, before any feature extraction
{
caffe::Caffe::set_mode(caffe::Caffe::GPU);
net = Net_Init_Load<float>("vgg_extract_feature_memorydata.prototxt", "VGG_FACE.caffemodel", caffe::TEST);
memory_layer = (caffe::MemoryDataLayer<float> *)net->layers()[0].get();
}
std::vector<float> ExtractFeature(Mat FaceROI)
{
caffe::Caffe::set_mode(caffe::Caffe::GPU);
std::vector<Mat> test;
std::vector<int> testLabel;
std::vector<float> test_vector;
test.push_back(FaceROI);
testLabel.push_back(0);
memory_layer->AddMatVector(test, testLabel); // memory_layer and net must be defined as global variables.
test.clear(); testLabel.clear();
std::vector<caffe::Blob<float>*> input_vec;
net->Forward(input_vec);
boost::shared_ptr<caffe::Blob<float>> fc8 = net->blob_by_name("fc8");
int test_num = 0;
while (test_num < 2622)
{
// Blob indices are zero-based: (n, c, h, w) = (0, i, 0, 0) is the i-th fc8 output.
test_vector.push_back(fc8->data_at(0, test_num++, 0, 0));
}
return test_vector;
}
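=============Note: the part above can also be written like this:==============
(We can get the start and end addresses of the feature data directly and use them to construct the vector.)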
float* begin = nullptr;
float* end = nullptr;
begin = fc8->mutable_cpu_data();
end = begin + fc8->channels();
CHECK(begin != nullptr);
CHECK(end != nullptr);
std::vector<float> FaceVector{ begin,end };
return FaceVector; // returning by value already moves (or elides the copy); std::move is not needed
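The same pointer-range idea can be wrapped in a small helper. The sketch below is only an illustration: FakeBlob is a hypothetical stand-in that exposes the two accessors the helper relies on, and BlobToVector is a name I made up, not part of Caffe.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for caffe::Blob<float>, exposing only the two
// accessors the helper below relies on.
struct FakeBlob {
    std::vector<float> values;
    const float* cpu_data() const { return values.data(); }
    int count() const { return static_cast<int>(values.size()); }
};

// Copies every value of a blob-like object into a std::vector<float>
// using the iterator-range constructor.
template <typename BlobLike>
std::vector<float> BlobToVector(const BlobLike& blob) {
    const float* begin = blob.cpu_data();
    assert(begin != nullptr || blob.count() == 0);
    return std::vector<float>(begin, begin + blob.count());
}
```

Since caffe::Blob<float> also provides cpu_data() and count(), the helper could presumably be called as BlobToVector(*fc8) in the function above. With a batch size of 1, count() equals the number of channels.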
Pay special attention to this part:
namespace caffe
{
extern INSTANTIATE_CLASS(InputLayer);
extern INSTANTIATE_CLASS(InnerProductLayer);
extern INSTANTIATE_CLASS(DropoutLayer);
extern INSTANTIATE_CLASS(ConvolutionLayer);
REGISTER_LAYER_CLASS(Convolution);
extern INSTANTIATE_CLASS(ReLULayer);
REGISTER_LAYER_CLASS(ReLU);
extern INSTANTIATE_CLASS(PoolingLayer);
REGISTER_LAYER_CLASS(Pooling);
extern INSTANTIATE_CLASS(LRNLayer);
REGISTER_LAYER_CLASS(LRN);
extern INSTANTIATE_CLASS(SoftmaxLayer);
REGISTER_LAYER_CLASS(Softmax);
extern INSTANTIATE_CLASS(MemoryDataLayer);
}
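Why add these lines? Because during extraction I found that, without them, some layers end up unregistered. I helped someone solve exactly this problem on Microsoft/Caffe on GitHub. Here is what the problem looked like:
With the code above in place, these layers are registered, so the problem no longer occurs.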
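A plausible explanation for the missing layers: Caffe's REGISTER_LAYER_CLASS macro works by defining a static object whose constructor adds the layer to a registry, so if the object file holding that static object never gets linked into the program, the layer is never registered. The following is a heavily simplified illustration of such a self-registering factory. It is not Caffe's actual code, and every name in it is hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical miniature of a layer registry: maps a type name to a creator.
typedef int (*Creator)();

std::map<std::string, Creator>& Registry() {
    static std::map<std::string, Creator> registry;  // created on first use
    return registry;
}

// A registrar adds one entry to the registry when it is constructed.
struct Registrar {
    Registrar(const std::string& type, Creator creator) {
        Registry()[type] = creator;
    }
};

int CreateReLU() { return 1; }     // placeholder for creating a ReLU layer
int CreatePooling() { return 2; }  // placeholder for creating a Pooling layer

// Namespace-scope objects: their constructors run before main(), but only if
// the object file that contains them is actually linked into the program.
static Registrar g_relu_registrar("ReLU", &CreateReLU);
static Registrar g_pooling_registrar("Pooling", &CreatePooling);

// Looks a type up the way a net builder would; an unregistered type fails.
bool IsRegistered(const std::string& type) {
    const bool found = Registry().count(type) != 0;
    assert(!found || Registry()[type] != nullptr);
    return found;
}
```

In this toy version "ReLU" and "Pooling" are found, while a type with no registrar object, such as "Convolution", is not. That is the same kind of failure as a layer type that Caffe reports as unknown.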
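In this example I extract the features of the fc8 layer, which have 2622 dimensions. That last layer already holds classification features, though, so it is better to extract the 4096-dimensional features of the fc7 layer instead.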
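For a single image, both layers produce one value per channel, so their blobs behave like 1 x 2622 x 1 x 1 (fc8) and 1 x 4096 x 1 x 1 (fc7). As a sketch of how a four-index access such as data_at(n, c, h, w) maps onto that storage, assuming the usual row-major N x C x H x W layout (FlatIndex is a name made up for this illustration):

```cpp
#include <cassert>

// Flat index of element (n, c, h, w) in a row-major N x C x H x W array.
// All four indices are zero-based; C, H and W are the array dimensions.
int FlatIndex(int n, int c, int h, int w, int C, int H, int W) {
    assert(n >= 0 && c >= 0 && c < C && h >= 0 && h < H && w >= 0 && w < W);
    return ((n * C + c) * H + h) * W + w;
}
```

With H = W = 1 the only valid spatial indices are h = 0 and w = 0, and the flat index is simply the channel number. For example, FlatIndex(0, 5, 0, 0, 2622, 1, 1) is 5.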
The following function:
void Caffe_Predefine() // must be called once at program start, before any feature extraction
{
caffe::Caffe::set_mode(caffe::Caffe::GPU);
net = Net_Init_Load<float>("vgg_extract_feature_memorydata.prototxt", "VGG_FACE.caffemodel", caffe::TEST);
memory_layer = (caffe::MemoryDataLayer<float> *)net->layers()[0].get();
}
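is an initialization function. It loads the VGG network model and the feature-extraction configuration file, so before extracting any features we first need to call: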
Caffe_Predefine();
After this call, the global variables stay usable for the rest of the program.
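Since the setup is expensive and only needs to happen once, one option is to guard it so that repeated calls are harmless. This is a minimal sketch, not part of the original code: ExpensiveSetup stands in for a routine like Caffe_Predefine(), and all names here are hypothetical.

```cpp
#include <cassert>

// Counts how often the (hypothetical) expensive setup has run.
static int g_init_calls = 0;

// Stand-in for a setup routine such as Caffe_Predefine().
void ExpensiveSetup() { ++g_init_calls; }

// Runs ExpensiveSetup() on the first call only; later calls do nothing.
// A function-local static is initialized exactly once (thread-safe since C++11).
void EnsureInitialized() {
    static const bool done = (ExpensiveSetup(), true);
    (void)done;                  // silence unused-variable warnings
    assert(g_init_calls == 1);   // the setup ran exactly once
}
```

Calling EnsureInitialized() at the top of every function that needs the globals would then trigger the setup on first use and skip it afterwards.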
Let's try out this feature-extraction interface. Create a main.cpp and call it:
#include <ExtractFeature_.h>
int main()
{
Caffe_Predefine();
Mat lena = imread("lena.jpg");
if (!lena.empty())
{
ExtractFeature(lena);
}
}
Since the function returns a vector<float>, we can print its elements one by one. You could do that inside ExtractFeature() itself, but here we do it in main().
Let's take a look:
#include <ExtractFeature_.h>
int main()
{
Caffe_Predefine();
Mat lena = imread("lena.jpg");
if (!lena.empty())
{
int i = 0;
vector<float> print=ExtractFeature(lena);
while (i<print.size())
{
cout << print[i++] << endl;
}
}
imshow("Extract feature",lena);
waitKey(0);
}
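For this image, the extracted feature is a long list of numbers like these:
Extracting the features of one 224*224 image takes 0.019 s. The GPU speedup is clearly substantial, and my card is only a GTX 980. I wonder how fast a Titan X would be (sob).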
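The post does not show how the time was measured. One simple way to take such a measurement is a std::chrono stopwatch around the call, as in this sketch (SecondsTaken is a name made up for this illustration):

```cpp
#include <cassert>
#include <chrono>

// Runs `work` once and returns the elapsed wall-clock time in seconds.
template <typename Callable>
double SecondsTaken(Callable work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    const double seconds = std::chrono::duration<double>(stop - start).count();
    assert(seconds >= 0.0);  // steady_clock never goes backwards
    return seconds;
}
```

In main() this could be used as SecondsTaken([&] { ExtractFeature(lena); }). The first GPU call usually includes one-time startup costs, so it is probably fairer to time a later call or to average over several runs.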
Appendix: the net structure (prototxt). Note the difference between layer and layers:
name: "VGG_FACE_16_layer"
layer {
name: "data"
type: "MemoryData"
top: "data"
top: "label"
transform_param {
mirror: false
crop_size: 224
mean_value:129.1863
mean_value:104.7624
mean_value:93.5940
}
memory_data_param {
batch_size: 1
channels:3
height:224
width:224
}
}
layer {
bottom: "data"
top: "conv1_1"
name: "conv1_1"
type: "Convolution"
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv1_1"
top: "conv1_1"
name: "relu1_1"
type: "ReLU"
}
layer {
bottom: "conv1_1"
top: "conv1_2"
name: "conv1_2"
type: "Convolution"
convolution_param {
num_output: 64
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv1_2"
top: "conv1_2"
name: "relu1_2"
type: "ReLU"
}
layer {
bottom: "conv1_2"
top: "pool1"
name: "pool1"
type: "Pooling"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
bottom: "pool1"
top: "conv2_1"
name: "conv2_1"
type: "Convolution"
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv2_1"
top: "conv2_1"
name: "relu2_1"
type: "ReLU"
}
layer {
bottom: "conv2_1"
top: "conv2_2"
name: "conv2_2"
type: "Convolution"
convolution_param {
num_output: 128
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv2_2"
top: "conv2_2"
name: "relu2_2"
type: "ReLU"
}
layer {
bottom: "conv2_2"
top: "pool2"
name: "pool2"
type: "Pooling"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
bottom: "pool2"
top: "conv3_1"
name: "conv3_1"
type: "Convolution"
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv3_1"
top: "conv3_1"
name: "relu3_1"
type: "ReLU"
}
layer {
bottom: "conv3_1"
top: "conv3_2"
name: "conv3_2"
type: "Convolution"
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv3_2"
top: "conv3_2"
name: "relu3_2"
type: "ReLU"
}
layer {
bottom: "conv3_2"
top: "conv3_3"
name: "conv3_3"
type: "Convolution"
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv3_3"
top: "conv3_3"
name: "relu3_3"
type: "ReLU"
}
layer {
bottom: "conv3_3"
top: "pool3"
name: "pool3"
type: "Pooling"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
bottom: "pool3"
top: "conv4_1"
name: "conv4_1"
type: "Convolution"
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv4_1"
top: "conv4_1"
name: "relu4_1"
type: "ReLU"
}
layer {
bottom: "conv4_1"
top: "conv4_2"
name: "conv4_2"
type: "Convolution"
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv4_2"
top: "conv4_2"
name: "relu4_2"
type: "ReLU"
}
layer {
bottom: "conv4_2"
top: "conv4_3"
name: "conv4_3"
type: "Convolution"
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv4_3"
top: "conv4_3"
name: "relu4_3"
type: "ReLU"
}
layer {
bottom: "conv4_3"
top: "pool4"
name: "pool4"
type: "Pooling"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
bottom: "pool4"
top: "conv5_1"
name: "conv5_1"
type: "Convolution"
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv5_1"
top: "conv5_1"
name: "relu5_1"
type: "ReLU"
}
layer {
bottom: "conv5_1"
top: "conv5_2"
name: "conv5_2"
type: "Convolution"
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv5_2"
top: "conv5_2"
name: "relu5_2"
type: "ReLU"
}
layer {
bottom: "conv5_2"
top: "conv5_3"
name: "conv5_3"
type: "Convolution"
convolution_param {
num_output: 512
pad: 1
kernel_size: 3
}
}
layer {
bottom: "conv5_3"
top: "conv5_3"
name: "relu5_3"
type: "ReLU"
}
layer {
bottom: "conv5_3"
top: "pool5"
name: "pool5"
type: "Pooling"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
bottom: "pool5"
top: "fc6"
name: "fc6"
type: "InnerProduct"
inner_product_param {
num_output: 4096
}
}
layer {
bottom: "fc6"
top: "fc6"
name: "relu6"
type: "ReLU"
}
layer {
bottom: "fc6"
top: "fc6"
name: "drop6"
type: "Dropout"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
bottom: "fc6"
top: "fc7"
name: "fc7"
type: "InnerProduct"
inner_product_param {
num_output: 4096
}
}
layer {
bottom: "fc7"
top: "fc7"
name: "relu7"
type: "ReLU"
}
layer {
bottom: "fc7"
top: "fc7"
name: "drop7"
type: "Dropout"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
bottom: "fc7"
top: "fc8"
name: "fc8"
type: "InnerProduct"
inner_product_param {
num_output: 2622
}
}
layer {
bottom: "fc8"
top: "prob"
name: "prob"
type: "Softmax"
}
=================================================================