caffe新手教程

轉載:https://www.jianshu.com/p/a76c18a3c6d5

先貼個官網的安裝方法,http://caffe.berkeleyvision.org/installation.html

2.安裝好之後,仔細閱讀並照着流程跑一下官網給的例子,鏈接如下:
1).http://caffe.berkeleyvision.org/gathered/examples/mnist.html
2).http://caffe.berkeleyvision.org/gathered/examples/cifar10.html
……

3.看完之後,可以仔細研究以下通過python來使用caffe的例子,瞭解使用caffe的方法。
1). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/01-learning-lenet.ipynb
2). http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/00-classification.ipynb
3).http://www.cnblogs.com/empty16/p/4878164.html
……

4.以下以人臉識別問題使用以下庫使用caffe進行訓練和測試:
http://www.cl.cam.ac.uk/research/dtg/attarchive/facedatabase.html

裏面包括了40個人,每人10張人臉照片。如下圖:

att_face數據庫

由於官網上給出了Model_Zoo的鏈接,通過查詢得知,已經有訓練好的人臉識別模型,可以直接拿來使用,即:
下載地址:
http://www.robots.ox.ac.uk/%7Evgg/software/vgg_face/src/vgg_face_caffe.tar.gz

在網站VGG Face Descriptor中提供了模型和源碼,具體使用參考相關說明即可,基本的流程應該比較簡單:

  • 在腳本源碼中指定Caffe庫的路徑,指定.caffemodel模型,指定輸入數據,通過函數調用網絡的測試功能,獲取網絡輸出結果。
  • 執行腳本源碼。

如果源碼的使用說明不能夠充分理解,可以參考Jupyter Notebook Viewer的示例。基本流程與ImageNet的分類任務應該是相同的。另外,模型的數據集在VGG Face Descriptor相關論文的第三章有說明。pdf

其次因爲人臉圖片是灰度圖,需要首先用OpenCV將其轉化成RGB的圖片才能使用VGG。 python代碼如下:

import os

import cv2

import sysdef 

convert_gray_img_to_rgb(base_dir,dir_pre_str,dir_range_list,dir_post_str,file_format,partion_list):
    for i in dir_range_list:
        for index,partion_list_part in enumerate(partion_list): 
              for k in partion_list_part:
                  if base_dir=="":
                          base_dir_str=""
                  else:
                          base_dir_str=base_dir+os.sep 
                  type=""
                  if index==0:
                        type="train"
                  elif index==1:
                        type="tst"                                               
                  file_input_path=base_dir_str+type+os.sep+dir_pre_str+str(i)+\

                                            dir_post_str+os.sep+str(k)+file_format

                  img = cv2.imread( file_input_path,0 )
                  img = cv2.cvtColor( img, cv2.COLOR_GRAY2RGB )
                  out_file= base_dir_str+type+os.sep+dir_pre_str+\

                                  str(i)+dir_post_str+os.sep+str(k)+".jpg"
                  cv2.imwrite(out_file, img)
if __name__=='__main__':

source_dir="/Users/Ren/Downloads/att_faces_back"

dir_pre_str="s"

dir_range_list=range(1,41)

test_partion_list=[7,8,9,10]

train_partion_list=[1,2,3,4,5,6]

dir_post_str=""

file_format=".pgm"
   
convert_gray_img_to_rgb(source_dir,dir_pre_str,dir_range_list\

,dir_post_str,file_format,[train_partion_list,test_partion_list])

對於此數據庫,首先需要將人臉的數據進行劃分:訓練和測試集,並轉換成lmdb模型。過程請參考:http://www.cnblogs.com/dupuleng/articles/4370236.html。我的代碼如下,將其保存到了example/att_faces/create_att_faces.sh

#!/usr/bin/env sh
# Create the imagenet lmdb inputs
# N.B. set the path to the imagenet train + val data dirs
EXAMPLE=examples/att_faces
DATA=data/att_faces
TOOLS=build/tools
DBTYPE=lmdb
TRAIN_DATA_ROOT=$DATA/train/
TEST_DATA_ROOT=$DATA/tst/
ROOT=./
# Set RESIZE=true to resize the images to 256x256. Leave as false if images have
# already been resized using another tool.
RESIZE=true
if $RESIZE; then
  RESIZE_HEIGHT=224
  RESIZE_WIDTH=224
else
  RESIZE_HEIGHT=0
  RESIZE_WIDTH=0
fi

if [ ! -d "$TRAIN_DATA_ROOT" ]; then
  echo "Error: TRAIN_DATA_ROOT is not a path to a directory: $TRAIN_DATA_ROOT"
  echo "Set the TRAIN_DATA_ROOT variable in create_att_faces.sh to the path" \
       "where the ImageNet training data is stored."
  exit 1
fi

if [ ! -d "$TEST_DATA_ROOT" ]; then
  echo "Error: TEST_DATA_ROOT is not a path to a directory: $TEST_DATA_ROOT"
  echo "Set the TEST_DATA_ROOT variable in create_att_faces.sh to the path" \
       "where the ImageNet test data is stored."
  exit 1
fi

echo "Creating train lmdb..."
rm -rf $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/att_faces_tst_$DBTYPE

GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $ROOT \
    $DATA/train.txt \
    $EXAMPLE/att_faces_train_$DBTYPE

echo "Creating tst lmdb..."
rm -f $EXAMPLE/mean.binaryproto
GLOG_logtostderr=1 $TOOLS/convert_imageset \
    --resize_height=$RESIZE_HEIGHT \
    --resize_width=$RESIZE_WIDTH \
    --shuffle \
    $ROOT \
    $DATA/tst.txt \
    $EXAMPLE/att_faces_tst_$DBTYPE
echo "Computing image mean..."
./build/tools/compute_image_mean -backend=$DBTYPE \
  $EXAMPLE/att_faces_train_$DBTYPE $EXAMPLE/mean.binaryproto
echo "Done."

之後可以使用該數據通過以models/finetune_flickr_style/train_val.prototxt 爲模板,以vgg_face_caffe/VGG_FACE_deploy.prototxt 爲內容將網絡結構進行填充。即加入數據輸入層與改變最後一層的全連接層輸出數量,修正掉舊caffe的語法。修正後的內容如下:

name: "VGG_FACE_16_Net"
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mirror: true
    crop_size: 224
    mean_file: "examples/att_faces/mean.binaryproto"
  }
  image_data_param {
    source: "data/att_faces/train.txt"
    batch_size: 1
    new_height: 224
    new_width: 224
  }
}
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mirror: false
    crop_size: 224
    mean_file: "examples/att_faces/mean.binaryproto"
  }
  image_data_param {
    source: "data/att_faces/tst.txt"
    batch_size: 1
    new_height: 224
    new_width: 224
  }
}
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv1_2"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv1_2"
  param {
    lr_mult: 1 
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  } 
}
layer {
  name: "relu1_2"
  type: "ReLU"
  bottom: "conv1_2"
  top: "conv1_2"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1_2"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "pool1"
  top: "conv2_1"
  param {
    lr_mult: 1
    decay_mult: 1
  } 
  param {
    lr_mult: 2
    decay_mult: 0
  } 
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    } 
    bias_filler {
      type: "constant"
      value: 0
    } 
  } 
}
layer {
  name: "relu2_1"
  type: "ReLU"
  bottom: "conv2_1"
  top: "conv2_1"
}
layer { 
  name: "conv2_2"
  type: "Convolution"
  bottom: "conv2_1"
  top: "conv2_2"
  param {
    lr_mult: 1
    decay_mult: 1
  } 
  param {
    lr_mult: 2
    decay_mult: 0
  } 
  convolution_param {
    num_output: 128
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01 
    } 
    bias_filler {
      type: "constant"
      value: 0
    }
  } 
}
layer {
  name: "relu2_2"
  type: "ReLU"
  bottom: "conv2_2"
  top: "conv2_2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2_2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv3_1"
  type: "Convolution"
  bottom: "pool2"
  top: "conv3_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3_1"
  type: "ReLU"
  bottom: "conv3_1"
  top: "conv3_1"
}
layer {
  name: "conv3_2"
  type: "Convolution"
  bottom: "conv3_1"
  top: "conv3_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3_2"
  type: "ReLU"
  bottom: "conv3_2"
  top: "conv3_2"
}
layer {
  name: "conv3_3"
  type: "Convolution"
  bottom: "conv3_2"
  top: "conv3_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 256
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu3_3"
  type: "ReLU"
  bottom: "conv3_3"
  top: "conv3_3"
}
layer {
  name: "pool3"
  type: "Pooling"
  bottom: "conv3_3"
  top: "pool3"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv4_1"
  type: "Convolution"
  bottom: "pool3"
  top: "conv4_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu4_1"
  type: "ReLU"
  bottom: "conv4_1"
  top: "conv4_1"
}
layer {
  name: "conv4_2"
  type: "Convolution"
  bottom: "conv4_1"
  top: "conv4_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu4_2"
  type: "ReLU"
  bottom: "conv4_2"
  top: "conv4_2"
}
layer {
  name: "conv4_3"
  type: "Convolution"
  bottom: "conv4_2"
  top: "conv4_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu4_3"
  type: "ReLU"
  bottom: "conv4_3"
  top: "conv4_3"
}
layer {
  name: "pool4"
  type: "Pooling"
  bottom: "conv4_3"
  top: "pool4"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "conv5_1"
  type: "Convolution"
  bottom: "pool4"
  top: "conv5_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu5_1"
  type: "ReLU"
  bottom: "conv5_1"
  top: "conv5_1"
}
layer {
  name: "conv5_2"
  type: "Convolution"
  bottom: "conv5_1"
  top: "conv5_2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu5_2"
  type: "ReLU"
  bottom: "conv5_2"
  top: "conv5_2"
}
layer {
  name: "conv5_3"
  type: "Convolution"
  bottom: "conv5_2"
  top: "conv5_3"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 512
    kernel_size: 3
    pad: 1
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu5_3"
  type: "ReLU"
  bottom: "conv5_3"
  top: "conv5_3"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5_3"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}

layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  # Note that lr_mult can be set to 0 to disable any fine-tuning of this, and any other, layer
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 4096
    weight_filler {
      type: "gaussian"
      std: 0.005
    }
    bias_filler {
      type: "constant"
      value: 1
    }
  }
}
layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}
layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc8_flickr"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8_flickr"
  # lr_mult is set to higher than for other layers, because this layer is starting from random while the others are already trained
  propagate_down: false
  inner_product_param {
    num_output: 40
    weight_filler {
      type: "gaussian"
      std: 0.01
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "fc8_flickr"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8_flickr"
  bottom: "label"
  top: "loss"
}

拷貝models/finetune_flickr_style/solver.prototxt,並將新的針對現問題進行修改,主要修改

net: "models/finetune/train_val.prototxt"
test_iter: 100
test_interval: 100
# lr for fine-tuning should be lower than when starting from scratch
base_lr: 0.001
lr_policy: "step"
gamma: 0.1
# stepsize should also be lower, as we're closer to being done
stepsize: 2000
display: 20
max_iter: 10000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
snapshot_prefix: "models/finetune/finetune"
# uncomment the following to default to CPU mode solving
#solver_mode: CPU

最後使用自己的數據對模型進行fine-tuning。代碼如下:

./build/tools/caffe train -solver models/finetune/solver.prototxt -weights models/vgg_face_caffe/VGG_FACE.caffemodel -gpu 0
發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章