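Reposted from: https://www.jianshu.com/p/a76c18a3c6d5
1. Start with the official installation guide: http://caffe.berkeleyvision.org/installation.html
2. Once Caffe is installed, read the official examples carefully and run through them step by step. Links: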
1).http://caffe.berkeleyvision.org/gathered/examples/mnist.html
2).http://caffe.berkeleyvision.org/gathered/examples/cifar10.html
……
3. After that, study the following examples of driving Caffe from Python to learn how Caffe is used.
1). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/01-learning-lenet.ipynb
2). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
3).http://www.cnblogs.com/empty16/p/4878164.html
……
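4. The rest of this post works through a face recognition problem, training and testing with Caffe on the following database:
http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html
It contains 40 people with 10 face photos each, as shown below:
The official site links to the Model Zoo, and a search there shows that a pretrained face recognition model already exists and can be used directly:
Download link: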
http://www.robots.ox.ac.uk/%7Evgg/software/vgg_face/src/vgg_face_caffe.tar.gz
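The VGG Face Descriptor site provides the model and the source code; see its documentation for usage details. The basic workflow should be fairly simple:
- In the script, specify the path to the Caffe library, the .caffemodel file, and the input data, then call the network's test (forward) function to obtain the network output.
- Run the script.
If the usage notes that ship with the source are not clear enough, refer to the Jupyter Notebook Viewer examples. The basic workflow should be the same as for the ImageNet classification task. The dataset used to train the model is described in Section 3 of the VGG Face Descriptor paper.
Next, because the face images are grayscale, they must first be converted to RGB images with OpenCV before they can be used with VGG. The Python code is as follows: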
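The workflow in the two bullets above can be sketched roughly as follows. This is a minimal sketch, not the script shipped with the VGG Face package: the helper names `to_caffe_blob` and `run_network` are hypothetical, the mean values are left as a parameter because the post does not state them, and the BGR channel order is an assumption based on how `cv2.imread` loads images.

```python
import numpy as np


def to_caffe_blob(img_bgr, mean_bgr):
    """Turn an HxWx3 image (BGR order, as cv2.imread returns) into a 1x3xHxW float32 blob."""
    x = img_bgr.astype(np.float32) - np.asarray(mean_bgr, dtype=np.float32)
    # Caffe expects channels first, plus a leading batch dimension.
    return x.transpose(2, 0, 1)[np.newaxis, ...]


def run_network(prototxt, caffemodel, blob):
    """Load the network in test mode and return its first output for `blob`."""
    import caffe  # imported lazily so the helper above works without Caffe installed

    net = caffe.Net(prototxt, caffemodel, caffe.TEST)
    input_name = net.inputs[0]  # avoids hard-coding the input blob name
    net.blobs[input_name].reshape(*blob.shape)
    net.blobs[input_name].data[...] = blob
    outputs = net.forward()
    return outputs[net.outputs[0]][0].copy()
```

With Caffe installed, `run_network("VGG_FACE_deploy.prototxt", "VGG_FACE.caffemodel", blob)` would return the output vector for one 224x224 image; the file locations are whatever the unpacked archive uses on your machine.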
import os
import cv2


def convert_gray_img_to_rgb(base_dir, dir_pre_str, dir_range_list, dir_post_str,
                            file_format, partion_list):
    # partion_list[0] holds the training image ids, partion_list[1] the test ids.
    base_dir_str = base_dir + os.sep if base_dir else ""
    for i in dir_range_list:
        for index, partion_list_part in enumerate(partion_list):
            split = "train" if index == 0 else "tst"
            # Directory of one person within one split, e.g. <base_dir>/train/s1
            subject_dir = base_dir_str + split + os.sep + dir_pre_str + str(i) + dir_post_str
            for k in partion_list_part:
                file_input_path = subject_dir + os.sep + str(k) + file_format
                # Flag 0 loads the image as single-channel grayscale.
                img = cv2.imread(file_input_path, 0)
                if img is None:
                    raise IOError("cannot read " + file_input_path)
                # Replicate the gray channel into three identical channels.
                img = cv2.cvtColor(img, cv2.COLOR_GRAY2RGB)
                out_file = subject_dir + os.sep + str(k) + ".jpg"
                cv2.imwrite(out_file, img)


if __name__ == '__main__':
    source_dir = "/Users/Ren/Downloads/att_faces_back"
    dir_pre_str = "s"
    dir_range_list = range(1, 41)
    test_partion_list = [7, 8, 9, 10]
    train_partion_list = [1, 2, 3, 4, 5, 6]
    dir_post_str = ""
    file_format = ".pgm"
    convert_gray_img_to_rgb(source_dir, dir_pre_str, dir_range_list,
                            dir_post_str, file_format,
                            [train_partion_list, test_partion_list])
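For this database, the face data must first be split into a training set and a test set and then converted to lmdb format. For the procedure, see: http://www.cnblogs.com/dupuleng/articles/4370236.html. My script is below; I saved it as examples/att_faces/create_att_faces.sh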
#!/usr/bin/env sh
# Create the att_faces lmdb inputs
# N.B. set the path to the att_faces train + tst data dirs
EXAMPLE=examples/att_faces
DATA=data/att_faces
TOOLS=build/tools
DBTYPE=lmdb
TRAIN_DATA_ROOT=$DATA/train/
TEST_DATA_ROOT=$DATA/tst/
ROOT=./
# Set RESIZE=true to resize the images to 224x224. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=224
  RESIZE_WIDTH=224
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi
if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_att_faces.sh to the path" \
       "where the att_faces training data is stored."
  exit 1
fi
if [ ! -d "$TEST_DATA_ROOT" ]; then
  echo "Error: TEST_DATA_ROOT is not a path to a directory: $TEST_DATA_ROOT"
  echo "Set the TEST_DATA_ROOT variable in create_att_faces.sh to the path" \
       "where the att_faces test data is stored."
  exit 1
fi
echo "Creating train lmdb..."
rm -rf $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/att_faces_tst_$DBTYPE
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $ROOT \
    $DATA/train.txt \
    $EXAMPLE/att_faces_train_$DBTYPE
echo "Creating tst lmdb..."
rm -f $EXAMPLE/mean.binaryproto
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $ROOT \
    $DATA/tst.txt \
    $EXAMPLE/att_faces_tst_$DBTYPE
echo "Computing image mean..."
./build/tools/compute_image_mean -backend=$DBTYPE \
    $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/mean.binaryproto
echo "Done."
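The script above reads $DATA/train.txt and $DATA/tst.txt, which the post does not show. convert_imageset expects one "path label" pair per line, with the path taken relative to the root folder argument ($ROOT, here ./). The sketch below generates such lists under some assumptions that are mine, not the author's: the converted .jpg files sit under data/att_faces/train/sN/ and data/att_faces/tst/sN/, images 1-6 are training and 7-10 are test as in the conversion script, and person sN gets label N-1 so that labels run from 0 to 39. The helper names are hypothetical.

```python
def build_list(split, image_ids, root="data/att_faces", subjects=range(1, 41)):
    """Return 'path label' lines for one split ('train' or 'tst')."""
    lines = []
    for subject in subjects:
        label = subject - 1  # class labels must start at 0 for a 40-way softmax
        for image_id in image_ids:
            path = "/".join([root, split, "s%d" % subject, "%d.jpg" % image_id])
            lines.append("%s %d" % (path, label))
    return lines


def write_list(lines, list_path):
    """Write the lines to a list file, one entry per line."""
    with open(list_path, "w") as f:
        f.write("\n".join(lines) + "\n")


train_lines = build_list("train", range(1, 7))   # images 1..6 of each person
test_lines = build_list("tst", range(7, 11))     # images 7..10 of each person
```

Calling `write_list(train_lines, "data/att_faces/train.txt")` and `write_list(test_lines, "data/att_faces/tst.txt")` from the Caffe root would then produce the two files the script expects.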
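With the data ready, build the network definition by taking models/finetune_flickr_style/train_val.prototxt as the template and filling it with the contents of vgg_face_caffe/VGG_FACE_deploy.prototxt. That means adding the data input layers, changing the number of outputs of the last fully connected layer, and updating the old Caffe syntax. The revised file is: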
name: "VGG_FACE_16_Net"
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TRAIN
}
transform_param {
mirror: true
crop_size: 224
mean_file: "examples/att_faces/mean.binaryproto"
}
image_data_param {
source: "data/att_faces/train.txt"
batch_size: 1
new_height: 224
new_width: 224
}
}
layer {
name: "data"
type: "ImageData"
top: "data"
top: "label"
include {
phase: TEST
}
transform_param {
mirror: false
crop_size: 224
mean_file: "examples/att_faces/mean.binaryproto"
}
image_data_param {
source: "data/att_faces/tst.txt"
batch_size: 1
new_height: 224
new_width: 224
}
}
layer {
name: "conv1_1"
type: "Convolution"
bottom: "data"
top: "conv1_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_1"
type: "ReLU"
bottom: "conv1_1"
top: "conv1_1"
}
layer {
name: "conv1_2"
type: "Convolution"
bottom: "conv1_1"
top: "conv1_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 64
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_2"
type: "ReLU"
bottom: "conv1_2"
top: "conv1_2"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1_2"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv2_1"
type: "Convolution"
bottom: "pool1"
top: "conv2_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu2_1"
type: "ReLU"
bottom: "conv2_1"
top: "conv2_1"
}
layer {
name: "conv2_2"
type: "Convolution"
bottom: "conv2_1"
top: "conv2_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 128
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu2_2"
type: "ReLU"
bottom: "conv2_2"
top: "conv2_2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2_2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv3_1"
type: "Convolution"
bottom: "pool2"
top: "conv3_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_1"
type: "ReLU"
bottom: "conv3_1"
top: "conv3_1"
}
layer {
name: "conv3_2"
type: "Convolution"
bottom: "conv3_1"
top: "conv3_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_2"
type: "ReLU"
bottom: "conv3_2"
top: "conv3_2"
}
layer {
name: "conv3_3"
type: "Convolution"
bottom: "conv3_2"
top: "conv3_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 256
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_3"
type: "ReLU"
bottom: "conv3_3"
top: "conv3_3"
}
layer {
name: "pool3"
type: "Pooling"
bottom: "conv3_3"
top: "pool3"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv4_1"
type: "Convolution"
bottom: "pool3"
top: "conv4_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu4_1"
type: "ReLU"
bottom: "conv4_1"
top: "conv4_1"
}
layer {
name: "conv4_2"
type: "Convolution"
bottom: "conv4_1"
top: "conv4_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu4_2"
type: "ReLU"
bottom: "conv4_2"
top: "conv4_2"
}
layer {
name: "conv4_3"
type: "Convolution"
bottom: "conv4_2"
top: "conv4_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu4_3"
type: "ReLU"
bottom: "conv4_3"
top: "conv4_3"
}
layer {
name: "pool4"
type: "Pooling"
bottom: "conv4_3"
top: "pool4"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "conv5_1"
type: "Convolution"
bottom: "pool4"
top: "conv5_1"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu5_1"
type: "ReLU"
bottom: "conv5_1"
top: "conv5_1"
}
layer {
name: "conv5_2"
type: "Convolution"
bottom: "conv5_1"
top: "conv5_2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu5_2"
type: "ReLU"
bottom: "conv5_2"
top: "conv5_2"
}
layer {
name: "conv5_3"
type: "Convolution"
bottom: "conv5_2"
top: "conv5_3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
convolution_param {
num_output: 512
kernel_size: 3
pad: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu5_3"
type: "ReLU"
bottom: "conv5_3"
top: "conv5_3"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5_3"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
# Note that lr_mult can be set to 0 to disable any fine-tuning of this, and any other, layer
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 2
decay_mult: 0
}
inner_product_param {
num_output: 4096
weight_filler {
type: "gaussian"
std: 0.005
}
bias_filler {
type: "constant"
value: 1
}
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8_flickr"
type: "InnerProduct"
bottom: "fc7"
top: "fc8_flickr"
# This layer starts from random weights while the others are already trained. propagate_down: false stops backpropagation here, so the loss gradient does not reach the pretrained layers below
propagate_down: false
inner_product_param {
num_output: 40
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "accuracy"
type: "Accuracy"
bottom: "fc8_flickr"
bottom: "label"
top: "accuracy"
include {
phase: TEST
}
}
layer {
name: "loss"
type: "SoftmaxWithLoss"
bottom: "fc8_flickr"
bottom: "label"
top: "loss"
}
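When fine-tuning, Caffe copies pretrained parameters into layers whose names match and leaves the remaining layers at their filler initialization, which is why the last layer above carries a new name (fc8_flickr). A small check like the following can list which layers fall into each group. It is a rough sketch that assumes one field per line, as in the listing above; the function names and the two short fragments are hypothetical stand-ins for the real files.

```python
import re


def layer_names(prototxt_text):
    """Return the names of top-level layer/layers blocks, in file order."""
    names = []
    depth = 0
    in_layer = False
    for raw in prototxt_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if depth == 0 and re.match(r"layers?\s*\{", line):
            in_layer = True
        match = re.match(r'name:\s*"([^"]+)"', line)
        if in_layer and depth == 1 and match:
            names.append(match.group(1))
        depth += line.count("{") - line.count("}")
        if depth == 0:
            in_layer = False
    return names


def split_by_pretrained(train_val_text, pretrained_text):
    """Split train_val layer names into (found in pretrained, not found)."""
    pretrained = set(layer_names(pretrained_text))
    reused, fresh = [], []
    for name in layer_names(train_val_text):
        (reused if name in pretrained else fresh).append(name)
    return reused, fresh


# Hypothetical fragments, not the real files.
train_val_text = '''
name: "net"
layer {
name: "fc7"
type: "InnerProduct"
inner_product_param {
num_output: 4096
}
}
layer {
name: "fc8_flickr"
type: "InnerProduct"
}
'''
pretrained_text = '''
layers {
name: "fc7"
type: INNER_PRODUCT
}
layers {
name: "fc8"
type: INNER_PRODUCT
}
'''
reused, fresh = split_by_pretrained(train_val_text, pretrained_text)
```

On the real files, layers without parameters (data, accuracy, loss) would also show up in the second list; only parameterized layers there are actually trained from scratch.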
Copy models/finetune_flickr_style/solver.prototxt and adapt the copy to the current problem. The main settings to change are:
net: "models/finetune/train_val.prototxt"
test_iter: 100
test_interval: 100
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 2000
display: 20
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
snapshot_prefix: "models/finetune/finetune"
# uncomment the following to default to CPU mode solving
#solver_mode: CPU
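To see what these solver values imply for this dataset, the arithmetic can be worked through as below. This is only a sketch: the image counts assume the 6/4 split per person used earlier, batch_size is the value 1 from the data layers above, and `step_lr` is a hypothetical helper that mirrors Caffe's "step" policy, base_lr * gamma^floor(iter / stepsize).

```python
def step_lr(iteration, base_lr=0.001, gamma=0.1, stepsize=2000):
    """Learning rate at a given iteration under the 'step' policy."""
    return base_lr * gamma ** (iteration // stepsize)


batch_size = 1            # from the ImageData layers
train_images = 40 * 6     # 40 people, 6 training images each
test_images = 40 * 4      # 40 people, 4 test images each
max_iter = 10000
test_iter = 100

# Each test pass evaluates test_iter * batch_size images.
images_per_test_pass = test_iter * batch_size

# Number of passes over the training set during the whole run.
epochs = max_iter * batch_size / float(train_images)

# Learning rate during the last 2000 iterations.
final_lr = step_lr(max_iter - 1)
```

Under these assumptions one test pass scores 100 of the 160 test images, so raising test_iter to 160 would cover the whole test set, and the run makes roughly 42 passes over the training data.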
Finally, fine-tune the model on your own data. The command is:
./build/tools/caffe train -solver models/finetune/solver.prototxt -weights models/vgg_face_caffe/VGG_FACE.caffemodel -gpu 0