Deep Learning in Practice: Training Faster R-CNN on the Caltech Pedestrian Detection Dataset (Preparation)

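Introduction

Faster R-CNN is an object detection framework that Ross Girshick and his co-authors built on top of Fast R-CNN; it is faster and reaches a higher mAP. Its main improvement over Fast R-CNN is in the region proposal stage: it introduces a Region Proposal Network (RPN) to generate proposals, and the RPN shares the full-image convolutional features with the detection network, which makes region proposals nearly cost-free.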

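For a detailed introduction to Faster R-CNN, see my previous blog post.

The Faster R-CNN code is open source and comes in two versions: a MATLAB version (faster_rcnn) and a Python version (py-faster-rcnn).

I mainly use the Python version here. At test time it is about 10% slower than the MATLAB version, because some operations in the Python layers run on the CPU, but the accuracy should be about the same.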

Preparation 1: Building, installing, and testing py-faster-rcnn

Building and installing py-faster-rcnn

  1. Clone the Faster R-CNN repository:

    git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git

    Be sure to include the --recursive flag. The steps below assume the cloned folder is named py-faster-rcnn.

  2. Build the Cython modules:

    cd py-faster-rcnn/lib
    make
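  3. Build the bundled Caffe and pycaffe:

    cd py-faster-rcnn/caffe-fast-rcnn
    # Build it the same way you would build stock Caffe.
    # Edit Makefile.config first; installing Caffe itself is not covered here.
    make -j8 && make pycaffe
  4. My Makefile.config is reproduced below; adjust it to your own setup: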

    
    ## Refer to http://caffe.berkeleyvision.org/installation.html
    # Contributions simplifying and improving our build system are welcome!

    # cuDNN acceleration switch (uncomment to build with cuDNN).
    USE_CUDNN := 1

    # CPU-only switch (uncomment to build without GPU support).
    # CPU_ONLY := 1

    # uncomment to disable IO dependencies and corresponding data layers
    # USE_OPENCV := 0
    # USE_LEVELDB := 0
    # USE_LMDB := 0

    # uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
    # You should not set this flag if you will be reading LMDBs with any
    # possibility of simultaneous read and write
    # ALLOW_LMDB_NOLOCK := 1

    # Uncomment if you're using OpenCV 3
    OPENCV_VERSION := 3

    # To customize your choice of compiler, uncomment and set the following.
    # N.B. the default for Linux is g++ and the default for OSX is clang++
    # CUSTOM_CXX := g++

    # CUDA directory contains bin/ and lib/ directories that we need.
    CUDA_DIR := /usr/local/cuda
    # On Ubuntu 14.04, if cuda tools are installed via
    # "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
    # CUDA_DIR := /usr

    # CUDA architecture setting: going with all of them.
    # For CUDA < 6.0, comment the *_50 lines for compatibility.
    CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
            -gencode arch=compute_20,code=sm_21 \
            -gencode arch=compute_30,code=sm_30 \
            -gencode arch=compute_35,code=sm_35 \
            -gencode arch=compute_50,code=sm_50 \
            -gencode arch=compute_50,code=compute_50

    # BLAS choice:
    # atlas for ATLAS (default)
    # mkl for MKL
    # open for OpenBlas
    BLAS := mkl
    # Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
    # Leave commented to accept the defaults for your choice of BLAS
    # (which should work)!
    # BLAS_INCLUDE := /path/to/your/blas
    # BLAS_LIB := /path/to/your/blas

    # Homebrew puts openblas in a directory that is not on the standard search path
    # BLAS_INCLUDE := $(shell brew --prefix openblas)/include
    # BLAS_LIB := $(shell brew --prefix openblas)/lib

    # This is required only if you will compile the matlab interface.
    # MATLAB directory should contain the mex binary in /bin.
    MATLAB_DIR := /usr/local/MATLAB/R2016b
    # MATLAB_DIR := /Applications/MATLAB_R2012b.app

    # NOTE: this is required only if you will compile the python interface.
    # We need to be able to find Python.h and numpy/arrayobject.h.
    # PYTHON_INCLUDE := /usr/include/python2.7 \
    #         /usr/lib/python2.7/dist-packages/numpy/core/include
    # Anaconda Python distribution is quite popular. Include path:
    # Verify anaconda location, sometimes it's in root.
    ANACONDA_HOME := $(HOME)/anaconda
    PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
            $(ANACONDA_HOME)/include/python2.7 \
            $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \
            /usr/include/python2.7

    # Uncomment to use Python 3 (default is Python 2)
    # PYTHON_LIBRARIES := boost_python3 python3.5m
    # PYTHON_INCLUDE := /usr/include/python3.5m \
    #         /usr/lib/python3.5/dist-packages/numpy/core/include

    # We need to be able to find libpythonX.X.so or .dylib.
    # PYTHON_LIB := /usr/lib
    PYTHON_LIB := $(ANACONDA_HOME)/lib

    # Homebrew installs numpy in a non standard path (keg only)
    # PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
    # PYTHON_LIB += $(shell brew --prefix numpy)/lib

    # Uncomment to support layers written in Python (will link against Python libs)
    WITH_PYTHON_LAYER := 1

    # Whatever else you find you need goes here.
    # INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
    # LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib
    INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial
    LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu /usr/lib/x86_64-linux-gnu/hdf5/serial

    # If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
    # INCLUDE_DIRS += $(shell brew --prefix)/include
    # LIBRARY_DIRS += $(shell brew --prefix)/lib

    # Uncomment to use `pkg-config` to specify OpenCV library paths.
    # (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
    # USE_PKG_CONFIG := 1

    # N.B. both build and distribute dirs are cleared on `make clean`
    BUILD_DIR := build
    DISTRIBUTE_DIR := distribute

    # Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
    # DEBUG := 1

    # The ID of the GPU that 'make runtest' will use to run unit tests.
    TEST_GPUID := 0

    # enable pretty build (comment to see full commands)
    Q ?= @

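Running the demo

To check that py-faster-rcnn was installed successfully, the author provides a demo that uses models pre-trained on the PASCAL VOC2007 dataset. The steps are:

  1. Download the pre-trained Faster R-CNN detectors: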

    cd py-faster-rcnn
    ./data/scripts/fetch_faster_rcnn_models.sh

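    This command downloads a file named faster_rcnn_models.tgz. Extracting it creates the data/faster_rcnn_models folder, which contains two models:

    • ZF_faster_rcnn_final.caffemodel: trained with the ZF network
    • VGG16_faster_rcnn_final.caffemodel: trained with the VGG16 network
  2. Run the demo: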

    cd py-faster-rcnn
    ./tools/demo.py
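  3. The demo runs detection on five images stored in the data/demo/ folder and shows the detections for each of them.

  4. If none of the steps above produced an error, py-faster-rcnn has been built and installed successfully.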
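
If the demo refuses to start, one thing worth ruling out is an incomplete model download. The sketch below only checks that the two model files named in step 1 exist under data/faster_rcnn_models; the function name and the repo_root argument are my own illustrative choices, not part of py-faster-rcnn.

```python
import os

# File names as listed in step 1 of the demo instructions.
EXPECTED_MODELS = (
    "ZF_faster_rcnn_final.caffemodel",
    "VGG16_faster_rcnn_final.caffemodel",
)


def missing_demo_models(repo_root):
    """Return the expected model files that are absent (hypothetical helper).

    repo_root is the path of the cloned py-faster-rcnn folder.
    """
    model_dir = os.path.join(repo_root, "data", "faster_rcnn_models")
    return [
        name
        for name in EXPECTED_MODELS
        if not os.path.isfile(os.path.join(model_dir, name))
    ]
```

Calling missing_demo_models("py-faster-rcnn") from the parent directory should return an empty list once the download script has finished.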

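Preparation 2: The Caltech dataset

Because part of the Faster R-CNN experiments were run on PASCAL VOC2007, training Faster R-CNN on our own dataset first requires understanding the directory layout, images, and annotation format of PASCAL VOC2007. We can then build a VOC2007-like dataset from our own data for Faster R-CNN to train and test on.

Getting the PASCAL VOC2007 dataset

This part is optional. If you need PASCAL VOC2007, the commands below will fetch it. Our main reason for downloading it is to inspect its file structure and file contents, so that we can build our own dataset in a compatible form.

  1. Choose a place to store datasets; here we assume the $HOME/data folder.

  2. Download the PASCAL VOC2007 training, validation, and test data: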

    cd $HOME/data
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
    wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
  3. Once the downloads finish, extract the archives:

    tar xvf VOCtrainval_06-Nov-2007.tar
    tar xvf VOCtest_06-Nov-2007.tar
  4. This yields the following file structure:

    $HOME/data/VOCdevkit/                        # root folder
    $HOME/data/VOCdevkit/VOC2007                 # VOC2007 folder
    $HOME/data/VOCdevkit/VOC2007/Annotations     # annotation files
    $HOME/data/VOCdevkit/VOC2007/ImageSets       # holds train.txt, test.txt, val.txt, etc.
    $HOME/data/VOCdevkit/VOC2007/JPEGImages      # image files
    
    # ... plus other folders and subfolders ...
    
  5. Create a symlink that points to where the VOC dataset is stored:

    cd py-faster-rcnn/data
    ln -s $HOME/data/VOCdevkit/ VOCdevkit

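    Replace $HOME/data/VOCdevkit/ with the path where you stored the VOCdevkit folder.

    Using a symlink is recommended so that several projects can share one copy of the dataset instead of duplicating it and wasting disk space.

  6. The VOC dataset is now set up.

Analysis of the PASCAL VOC dataset

The file structure of the PASCAL VOC dataset is as follows: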

└── VOCdevkit
    └── VOC2007 
        ├── Annotations  
        ├── ImageSets  
        │   ├── Layout  
        │   ├── Main  
        │   └── Segmentation  
        ├── JPEGImages  
        ├── SegmentationClass  
        └── SegmentationObject

Annotations

This folder stores the image annotations (the ground truth) as .xml files, one per image. One of them is analyzed below:

<annotation>
    <folder>VOC2007</folder> <!-- required: name of the parent folder -->
    <filename>000005.jpg</filename> <!-- required -->
    <source> <!-- optional -->
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        <flickrid>325991873</flickrid>
    </source>
    <owner> <!-- optional -->
        <flickrid>archintent louisville</flickrid>
        <name>?</name>
    </owner>
    <size> <!-- image size -->
        <width>500</width>
        <height>375</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented> <!-- used for segmentation -->
    <object> <!-- object info: class and bounding box; one <object> element per object in the image -->
        <name>chair</name>
        <pose>Rear</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>263</xmin>
            <ymin>211</ymin>
            <xmax>324</xmax>
            <ymax>339</ymax>
        </bndbox>
    </object>
    <object>
        <name>chair</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>1</difficult>
        <bndbox>
            <xmin>5</xmin>
            <ymin>244</ymin>
            <xmax>67</xmax>
            <ymax>374</ymax>
        </bndbox>
    </object>
</annotation>
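
For a custom dataset, files with the same layout can be generated programmatically. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the function name build_voc_annotation, the (class, xmin, ymin, xmax, ymax) tuple layout, and the sample file name and box are illustrative assumptions, not something taken from py-faster-rcnn or from my actual annotation files. It writes only the fields marked as required above plus the object entries.

```python
import xml.etree.ElementTree as ET


def build_voc_annotation(folder, filename, width, height, boxes, depth=3):
    """Build a VOC-style <annotation> element (illustrative helper).

    boxes is a list of (class_name, xmin, ymin, xmax, ymax) tuples in pixels.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = folder
    ET.SubElement(root, "filename").text = filename

    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = str(depth)

    ET.SubElement(root, "segmented").text = "0"

    for name, xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        # pose/truncated/difficult are filled with neutral defaults here.
        ET.SubElement(obj, "pose").text = "Unspecified"
        ET.SubElement(obj, "truncated").text = "0"
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"),
                              (xmin, ymin, xmax, ymax)):
            ET.SubElement(bndbox, tag).text = str(int(value))
    return root


if __name__ == "__main__":
    # Hypothetical frame name and box, for illustration only.
    ann = build_voc_annotation("Caltech", "example_frame.jpg", 640, 480,
                               [("person", 10, 20, 50, 120)])
    print(ET.tostring(ann).decode("utf-8"))
```

To save the result, wrap the element with ET.ElementTree(ann) and call its write() method with the target path inside the Annotations folder.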

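Note that in the xml annotation files you prepare yourself, the coordinates in the <xmin> and <ymin> tags of every <object> should be greater than 0 and must never be negative. Otherwise training fails with an AssertionError raised by assert (boxes[:, 2] >= boxes[:, 0]).all().

So to train without problems, carefully check that the top-left coordinates in all of your xml files are positive. This bug cost me a day or two; training only went through after I tracked down every bad coordinate in my annotations.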
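
This check is easy to automate. The sketch below scans a folder of VOC-style xml files and reports boxes whose top-left corner is below a minimum value or whose bottom-right corner does not lie beyond it. The function names are my own, and the default min_coord=1 is an assumption: VOC coordinates are 1-based and, as far as I understand it, the py-faster-rcnn VOC loader subtracts 1 when reading them, so a coordinate of 0 would also end up negative.

```python
import glob
import os
import xml.etree.ElementTree as ET


def find_bad_boxes(xml_path, min_coord=1):
    """Return [(object_index, (xmin, ymin, xmax, ymax)), ...] for suspicious boxes.

    min_coord=1 is an assumption based on VOC's 1-based pixel coordinates.
    """
    root = ET.parse(xml_path).getroot()
    bad = []
    for index, obj in enumerate(root.findall("object")):
        bndbox = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(float(bndbox.find(tag).text))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        ok = (xmin >= min_coord and ymin >= min_coord
              and xmax > xmin and ymax > ymin)
        if not ok:
            bad.append((index, (xmin, ymin, xmax, ymax)))
    return bad


def scan_annotations(annotations_dir, min_coord=1):
    """Map each xml file name with at least one bad box to its bad boxes."""
    report = {}
    pattern = os.path.join(annotations_dir, "*.xml")
    for path in sorted(glob.glob(pattern)):
        bad = find_bad_boxes(path, min_coord)
        if bad:
            report[os.path.basename(path)] = bad
    return report
```

Running scan_annotations on the Annotations folder before training gives a list of files to fix by hand; an empty dictionary means no box tripped these particular checks.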

ImageSets

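The ImageSets folder has three subfolders, of which only Main matters here. The files we use in Main are train.txt, val.txt, test.txt, and trainval.txt. Each one lists the names of the images used for training, validation, or testing; in VOC these are image IDs without the file extension, one per line (for example 000005).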

JPEGImages

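The JPEGImages folder holds all the input images in .jpg format; nothing more needs to be said about it.

Building a VOC-like Caltech dataset

Following the analysis of the PASCAL VOC file structure above, we first create a similar file structure: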

└── VOCdevkit
    ├── VOC2007 
    └── Caltech 
        ├── Annotations  
        ├── ImageSets   
        │   └── Main  
        └── JPEGImages

I recommend keeping the Caltech folder elsewhere and symlinking it into the VOCdevkit folder, because that makes the later changes to the training code easier.
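
The layout and the symlink can be created by hand with mkdir and ln -s, or with a few lines of Python as sketched below. Both function names and the example paths in the usage note are hypothetical; only the folder names (Caltech, Annotations, ImageSets/Main, JPEGImages) come from the tree above.

```python
import os


def make_caltech_skeleton(dataset_root):
    """Create Caltech/{Annotations,ImageSets/Main,JPEGImages} under dataset_root."""
    caltech = os.path.join(dataset_root, "Caltech")
    for sub in ("Annotations", os.path.join("ImageSets", "Main"), "JPEGImages"):
        path = os.path.join(caltech, sub)
        if not os.path.isdir(path):
            os.makedirs(path)
    return caltech


def link_into_devkit(caltech_dir, devkit_dir):
    """Symlink caltech_dir as devkit_dir/Caltech unless that name already exists."""
    link = os.path.join(devkit_dir, "Caltech")
    if not os.path.lexists(link):
        os.symlink(os.path.abspath(caltech_dir), link)
    return link
```

For example, make_caltech_skeleton(os.path.expanduser("~/data")) followed by link_into_devkit on the returned path and the VOCdevkit folder would reproduce the structure shown above, assuming the datasets live under $HOME/data as in the earlier steps.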

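  • For converting the Caltech dataset from .seq files into individual .jpg images, see here.
  • The .xml annotation files in Annotations were given to me by a senior labmate. The method mentioned above can also produce annotations, but its output does not meet the required format.
  • train.txt in ImageSets was derived from the .xml files, and test.txt was built by taking one frame out of every 30 in each seq.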
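
The two list files described in the last bullet can be produced with a short script. What follows is a rough sketch, not the exact script I used: it assumes image IDs of the hypothetical form <seq>_<frame> (for example set06_V000_29), where the trailing number is the frame index, and it starts sampling from the first frame of each sequence. Adjust the pattern and the starting offset to your own naming.

```python
import os
import re
from collections import defaultdict

# Assumed ID pattern: everything before the last underscore names the sequence,
# the digits after it are the frame index.
FRAME_RE = re.compile(r"^(?P<seq>.+)_(?P<frame>\d+)$")


def train_ids_from_annotations(annotations_dir):
    """One training ID per .xml file, with the extension stripped."""
    return sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(annotations_dir)
        if name.endswith(".xml")
    )


def sample_every_nth_frame(image_ids, step=30):
    """Keep every step-th frame of each sequence, in frame order."""
    by_seq = defaultdict(list)
    for image_id in image_ids:
        match = FRAME_RE.match(image_id)
        if match:  # IDs that do not fit the assumed pattern are skipped
            by_seq[match.group("seq")].append((int(match.group("frame")), image_id))
    sampled = []
    for seq in sorted(by_seq):
        frames = sorted(by_seq[seq])
        sampled.extend(image_id for _, image_id in frames[::step])
    return sampled


def write_image_set(path, image_ids):
    """Write one image ID per line, the format used in ImageSets/Main."""
    with open(path, "w") as handle:
        for image_id in image_ids:
            handle.write(image_id + "\n")
```

Typical use would be to write train_ids_from_annotations(...) to ImageSets/Main/train.txt and the output of sample_every_nth_frame(...) over the test sequences to ImageSets/Main/test.txt.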

