Deep Learning in Practice: Training Faster R-CNN on the Caltech Pedestrian Detection Dataset (Preparation)

Preface

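Faster R-CNN, proposed by Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun, is a deep learning framework for object detection that builds on Fast R-CNN and is both faster and higher in mAP. Its main improvement over Fast R-CNN is in the region proposal stage: it introduces a Region Proposal Network (RPN) to generate the proposals, and the RPN shares the full-image convolutional features with the detection network, which makes region proposals nearly cost-free.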

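For a detailed introduction to Faster R-CNN, see my previous blog post.

The Faster R-CNN code is open source and comes in two versions: a MATLAB version (faster_rcnn) and a Python version (py-faster-rcnn).

I mainly use the Python version here. At test time it is about 10% slower than the MATLAB version because some operations in the Python layers run on the CPU, but its accuracy should be about the same.

Preparation 1: Building, installing, and testing py-faster-rcnn

Building and installing py-faster-rcnn

Clone the Faster R-CNN repository: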
```
git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git
```
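Be sure to include the --recursive flag, which also fetches the caffe-fast-rcnn submodule. From here on, assume the cloned folder is named py-faster-rcnn.
Build the Cython modules: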
```
cd py-faster-rcnn/lib
make
```

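Build Caffe and pycaffe:

# Run this from the directory that contains py-faster-rcnn.
cd py-faster-rcnn/caffe-fast-rcnn
# Build it the same way you would build stock Caffe.
# Edit Makefile.config first; installing Caffe itself is not covered again here.
# Build:
make -j8 && make pycaffe

Here is my Makefile.config; adjust it to match your own setup.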


## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
# You should not set this flag if you will be reading LMDBs with any
# possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3
OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
    -gencode arch=compute_20,code=sm_21 \
    -gencode arch=compute_30,code=sm_30 \
    -gencode arch=compute_35,code=sm_35 \
    -gencode arch=compute_50,code=sm_50 \
    -gencode arch=compute_50,code=compute_50

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := mkl
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
MATLAB_DIR := /usr/local/MATLAB/R2016b
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
# PYTHON_INCLUDE := /usr/include/python2.7 \
#     /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
ANACONDA_HOME := $(HOME)/anaconda
PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
    $(ANACONDA_HOME)/include/python2.7 \
    $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \
    /usr/include/python2.7

# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
#     /usr/lib/python3.5/dist-packages/numpy/core/include

# We need to be able to find libpythonX.X.so or .dylib.
# PYTHON_LIB := /usr/lib
PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
# INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
# LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

# N.B. both build and distribute dirs are cleared on `make clean`
BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
Q ?= @

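Running the demo

To check that py-faster-rcnn was installed correctly, the authors provide a demo that runs models pre-trained on the PASCAL VOC2007 dataset. The steps are as follows:

Download the pre-trained Faster R-CNN detectors: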
```
cd py-faster-rcnn
./data/scripts/fetch_faster_rcnn_models.sh
```
This command downloads a file named faster_rcnn_models.tgz. Extracting it creates the data/faster_rcnn_models folder, which contains two models:
- ZF_faster_rcnn_final.caffemodel: trained with the ZF network
- VGG16_faster_rcnn_final.caffemodel: trained with the VGG16 network
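Before running the demo it can help to confirm that both files actually arrived. The following is a minimal sketch in standalone Python 3, not part of py-faster-rcnn; the helper name `missing_models` is hypothetical, and the folder and file names are taken from the list above.

```python
import os

# File names as listed above; the folder is relative to the repository root.
EXPECTED_MODELS = (
    "ZF_faster_rcnn_final.caffemodel",
    "VGG16_faster_rcnn_final.caffemodel",
)


def missing_models(repo_root):
    """Return the expected model files not found under data/faster_rcnn_models."""
    model_dir = os.path.join(repo_root, "data", "faster_rcnn_models")
    return [
        name
        for name in EXPECTED_MODELS
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

Calling `missing_models("py-faster-rcnn")` returns an empty list when both models are in place.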
Run the demo:
```
cd py-faster-rcnn
./tools/demo.py
```
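The demo runs detection on five images stored in the data/demo/ folder.
(Figure: detection result for one of the demo images.)
If none of the steps above failed, py-faster-rcnn has been built and installed successfully.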

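Preparation 2: The Caltech dataset

Some of the Faster R-CNN experiments were run on the PASCAL VOC2007 dataset, so before training Faster R-CNN on our own data we should first understand the directory layout, images, and annotation format of PASCAL VOC2007. We can then build a dataset in the same format from our own data for Faster R-CNN to train and test on.

Getting the PASCAL VOC2007 dataset

This step is optional. If you need the PASCAL VOC2007 dataset, you can fetch it with the commands below; our main reason for downloading it is to inspect its file structure and file contents so that we can build our own dataset to match.

Create a dedicated place to store datasets; assume it is the $HOME/data folder.

Download the PASCAL VOC2007 training, validation, and test sets: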

cd $HOME/data
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar

After the download finishes, extract the archives:

tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar

This produces the following file structure:

$HOME/data/VOCdevkit/                        # root folder
$HOME/data/VOCdevkit/VOC2007                 # VOC2007 folder
$HOME/data/VOCdevkit/VOC2007/Annotations     # annotation folder
$HOME/data/VOCdevkit/VOC2007/ImageSets       # holds train.txt, test.txt, val.txt, and similar files
$HOME/data/VOCdevkit/VOC2007/JPEGImages      # image folder
# ... plus other folders and subfolders ...

Create a symlink that points to where the VOC dataset is stored:
```
cd py-faster-rcnn/data
ln -s $HOME/data/VOCdevkit/ VOCdevkit
```
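Replace $HOME/data/VOCdevkit/ with the path where you keep your VOCdevkit folder. Note that the stock py-faster-rcnn code looks for data/VOCdevkit2007 by default, so name the link VOCdevkit2007 instead if you plan to run the unmodified VOC experiments.

It is best to use symlinks so that projects share a single copy of the dataset instead of duplicating it and wasting disk space.
The VOC dataset is now set up.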

Analysis of the PASCAL VOC dataset

The PASCAL VOC dataset has the following file structure:

└── VOCdevkit
    └── VOC2007
        ├── Annotations
        ├── ImageSets
        │   ├── Layout
        │   ├── Main
        │   └── Segmentation
        ├── JPEGImages
        ├── SegmentationClass
        └── SegmentationObject

Annotations

This folder holds the image annotations (the ground truth) as .xml files, one per image. One of these files is annotated below:

<annotation>
    <folder>VOC2007</folder> <!-- required: name of the parent folder -->
    <filename>000005.jpg</filename> <!-- required -->
    <source> <!-- optional -->
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        <flickrid>325991873</flickrid>
    </source>
    <owner> <!-- optional -->
        <flickrid>archintent louisville</flickrid>
        <name>?</name>
    </owner>
    <size> <!-- image size -->
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented> <!-- used for segmentation -->
    <object> <!-- object info: class and bbox; one <object> element per object in the image -->
        <name>chair</name>
        <pose>Rear</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>263</xmin>
            <ymin>211</ymin>
            <xmax>324</xmax>
            <ymax>339</ymax>
        </bndbox>
    </object>
    <object>
        <name>chair</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>1</difficult>
        <bndbox>
            <xmin>5</xmin>
            <ymin>244</ymin>
            <xmax>67</xmax>
            <ymax>374</ymax>
        </bndbox>
    </object>
</annotation>
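To make the structure concrete, here is a minimal sketch of reading such a file with Python's standard xml.etree.ElementTree module. It is standalone Python 3 and not the loader used by py-faster-rcnn; the function name `parse_voc_annotation` and the shape of the returned dict are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def parse_voc_annotation(xml_text):
    """Parse one VOC-style annotation string into a plain dict."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    # findall("object") only matches direct children of <annotation>.
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "name": obj.findtext("name"),
            "difficult": int(obj.findtext("difficult", default="0")),
            "bbox": tuple(
                int(box.findtext(tag))
                for tag in ("xmin", "ymin", "xmax", "ymax")
            ),
        })
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }
```

Each entry in `objects` corresponds to one `<object>` element, with the box stored as `(xmin, ymin, xmax, ymax)`.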

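Note that in the .xml files you prepare yourself, the coordinates in the <xmin> and <ymin> tags of every <object> should be greater than 0 and must never be negative. Otherwise training fails with: AssertionError: assert (boxes[:, 2] >= boxes[:, 0]).all()

So to train without problems, check carefully that every top-left coordinate in your .xml files is positive. I was stuck on this bug for a day or two and could only train once I had tracked down every bad coordinate in my annotations.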
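A check like this is easy to automate. The sketch below is standalone Python 3 with a hypothetical helper name, `find_bad_boxes`. The default `min_coord=1` is an assumption: VOC coordinates are 1-based, and a likely cause of the assertion (not confirmed by the original post) is that the stock loader in lib/datasets/pascal_voc.py subtracts 1 from each coordinate and stores the result in an unsigned array, so a 0 or negative value wraps around. The extra checks on max versus min and on image size are additions beyond what the text describes.

```python
import xml.etree.ElementTree as ET


def find_bad_boxes(xml_text, min_coord=1):
    """Return (name, bbox, reason) for every suspicious box in one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    problems = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        # int(float(...)) also accepts values written as "263.0".
        xmin, ymin, xmax, ymax = (
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if xmin < min_coord or ymin < min_coord:
            reason = "top-left corner below %d" % min_coord
        elif xmax < xmin or ymax < ymin:
            reason = "max smaller than min"
        elif xmax > width or ymax > height:
            reason = "box exceeds image size"
        else:
            continue
        problems.append((obj.findtext("name"), (xmin, ymin, xmax, ymax), reason))
    return problems
```

Running it over every file in Annotations and printing the non-empty results gives a list of annotations to fix before training.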

ImageSets

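The ImageSets folder has three subfolders; here we only need the Main folder. The files that matter in Main are train.txt, val.txt, test.txt, and trainval.txt, each of which lists the names of the files used for training, validation, or testing.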
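Since these lists tie the other two folders together, a quick consistency check is useful. The sketch below is standalone Python 3 and assumes the usual VOC convention of one image identifier per line, without a file extension; the function names `read_image_set` and `missing_files` are hypothetical.

```python
import os


def read_image_set(path):
    """Return the identifiers listed in an ImageSets/Main file, one per line."""
    with open(path) as f:
        # Some VOC lists carry a second column (a label); keep the first field only.
        return [line.split()[0] for line in f if line.strip()]


def missing_files(dataset_root, split):
    """List (identifier, missing relative path) pairs for a split such as 'train'."""
    list_path = os.path.join(dataset_root, "ImageSets", "Main", split + ".txt")
    missing = []
    for image_id in read_image_set(list_path):
        for folder, ext in (("Annotations", ".xml"), ("JPEGImages", ".jpg")):
            rel = os.path.join(folder, image_id + ext)
            if not os.path.isfile(os.path.join(dataset_root, rel)):
                missing.append((image_id, rel))
    return missing
```

An empty result from `missing_files(root, "train")` means every listed identifier has both an annotation and an image.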

JPEGImages

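The JPEGImages folder holds all the input images in .jpg format, so there is nothing more to say about it.

Building a VOC-style Caltech dataset

Based on the analysis of the PASCAL VOC file structure above, we first create a similar file structure: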

└── VOCdevkit
    ├── VOC2007
    └── Caltech
        ├── Annotations
        ├── ImageSets
        │   └── Main
        └── JPEGImages
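The folder skeleton can be created by hand or with a few lines of code. This is a minimal sketch in standalone Python 3 (it uses `exist_ok`, which Python 2.7 lacks); the function name `make_voc_style_skeleton` and the `parent_dir` parameter are hypothetical.

```python
import os

# Subfolders from the tree above, relative to the dataset folder.
SUBDIRS = ("Annotations", os.path.join("ImageSets", "Main"), "JPEGImages")


def make_voc_style_skeleton(parent_dir, name="Caltech"):
    """Create the VOC-style folder skeleton and return the dataset folder path."""
    dataset_root = os.path.join(parent_dir, name)
    for sub in SUBDIRS:
        # exist_ok=True makes the call safe to repeat.
        os.makedirs(os.path.join(dataset_root, sub), exist_ok=True)
    return dataset_root
```

The parent folder can be VOCdevkit itself or any other location that is later linked into it.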

I recommend making the Caltech folder a symlink under the VOCdevkit folder, because that makes the later changes to the training code easier.
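The link can be made with `ln -s` as shown earlier for VOCdevkit, or from Python. The sketch below is standalone Python 3 for a POSIX system; `link_dataset` is a hypothetical helper, and both directory arguments are placeholders for your own paths.

```python
import os


def link_dataset(dataset_dir, devkit_dir, name="Caltech"):
    """Create (or refresh) a symlink devkit_dir/name that points at dataset_dir."""
    link_path = os.path.join(devkit_dir, name)
    # Replace a stale link, but never delete a real folder with the same name:
    # in that case os.symlink raises FileExistsError instead.
    if os.path.islink(link_path):
        os.unlink(link_path)
    os.symlink(os.path.abspath(dataset_dir), link_path)
    return link_path
```

Using an absolute target keeps the link valid no matter which directory the training scripts are started from.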

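For converting the Caltech .seq files into individual .jpg images, see this reference.
The individual .xml files in Annotations were given to me by a senior labmate. The method mentioned above can also produce annotations, but its output does not meet the format required here.
In ImageSets, train.txt was derived from the .xml files, and test.txt was built by taking one frame out of every 30 in each .seq file.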
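The two lists described above can be produced along these lines. This is a minimal sketch in standalone Python 3, not the script actually used for the post. All function names are hypothetical, and the `offset` parameter is an assumption: the text only says one frame in every 30 is kept, not which one.

```python
import os


def ids_from_annotations(annotation_dir):
    """Sorted identifiers (file names without .xml) of all annotation files."""
    return sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(annotation_dir)
        if name.endswith(".xml")
    )


def every_nth_frame(frame_ids, step=30, offset=0):
    """Keep one frame out of every `step` frames of a single sequence."""
    return list(frame_ids)[offset::step]


def write_image_set(path, ids):
    """Write one identifier per line, as in ImageSets/Main/*.txt."""
    with open(path, "w") as f:
        f.write("".join(image_id + "\n" for image_id in ids))
```

For train.txt, pass the result of `ids_from_annotations` straight to `write_image_set`; for test.txt, apply `every_nth_frame` to the frame identifiers of each sequence and concatenate the results.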
