【Face Detection】《Face Detection using Deep Learning: An Improved Faster RCNN Approach》

Original

2020-06-28 04:40

Neurocomputing-2018

1 Background and Motivation

Better face detection benefits many subsequent face-related applications, such as face verification, face recognition, and face clustering.

Traditional face detection methods (e.g., Viola-Jones) rely on hand-crafted features, and each individual component is optimized separately (that is, not end-to-end), which often leaves the whole detection pipeline sub-optimal.

In recent years CNNs have excelled across the major CV tasks, and as they spread, many researchers turned their attention to face detection with deep learning.

Face detection can generally be viewed as a special type of object detection task, so existing methods are mostly built on the R-CNN pipeline.

The authors extend Faster R-CNN (the best method in the R-CNN family) with several strategies and take first place on the Face Detection Dataset and Benchmark (FDDB).

2 Advantages / Contributions

Proposes a new scheme for face detection by improving the Faster RCNN framework, taking first place on FDDB (the contribution is mostly on the engineering side).

3 Method

feature concatenation
hard negative mining
multi-scale training
Convert bbox to ellipses

The model is trained on the WIDER FACE dataset to generate hard negatives; the complete procedure is described in the Experiments section later on.

3.1 Feature Concatenation

In Faster R-CNN, RoI pooling is attached only to the last feature map, which may omit some features (deeper feature maps have larger receptive fields but coarser granularity).

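The authors apply RoI pooling to feature maps from multiple stages, concatenate the results (the pooled H and W should all be the same), and then apply a 1x1 conv to restore the original number of channels, in order to capture more fine-grained details of the RoIs.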
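The idea above can be sketched in NumPy as follows. This is only an illustrative sketch, not the authors' Caffe implementation: the function names, the 7x7 pooled size, the strides, and the random 1x1 conv weights are all assumptions made for the example.

```python
import numpy as np

def roi_max_pool(feature, roi, out_size=7):
    """Max-pool one RoI of a (C, H, W) feature map into (C, out_size, out_size).

    roi is (x1, y1, x2, y2) in feature-map coordinates.
    """
    channels, height, width = feature.shape
    x1, y1, x2, y2 = roi
    xs = np.linspace(x1, x2, out_size + 1)  # bin edges along x
    ys = np.linspace(y1, y2, out_size + 1)  # bin edges along y
    out = np.empty((channels, out_size, out_size), dtype=feature.dtype)
    for i in range(out_size):
        # Clamp each bin to the map and keep it at least one cell tall.
        y_lo = min(max(int(np.floor(ys[i])), 0), height - 1)
        y_hi = min(max(int(np.ceil(ys[i + 1])), y_lo + 1), height)
        for j in range(out_size):
            x_lo = min(max(int(np.floor(xs[j])), 0), width - 1)
            x_hi = min(max(int(np.ceil(xs[j + 1])), x_lo + 1), width)
            out[:, i, j] = feature[:, y_lo:y_hi, x_lo:x_hi].max(axis=(1, 2))
    return out

def concat_roi_features(feature_maps, strides, roi_image, weight, out_size=7):
    """Pool the same image-space RoI from several stages, concatenate along
    channels, then mix channels with a 1x1 conv (a per-pixel linear map)."""
    pooled = []
    for fmap, stride in zip(feature_maps, strides):
        roi = [v / stride for v in roi_image]  # image coords -> this stage's coords
        pooled.append(roi_max_pool(fmap, roi, out_size))
    stacked = np.concatenate(pooled, axis=0)   # (sum of C, out_size, out_size)
    return np.einsum("oc,chw->ohw", weight, stacked)

# Hypothetical two-stage example: 4 + 8 channels reduced back to 8.
rng = np.random.default_rng(0)
maps = [rng.standard_normal((4, 32, 32)), rng.standard_normal((8, 16, 16))]
w_1x1 = rng.standard_normal((8, 12))
fused = concat_roi_features(maps, [8, 16], (32, 32, 160, 160), w_1x1)
print(fused.shape)
```

Because every stage is pooled to the same spatial size, the concatenation only grows the channel dimension, which is why a 1x1 conv is enough to bring it back down.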

3.2 Hard Negative Mining

The authors mix hard negative samples into the negative samples.

Hard negatives are the regions where the network has failed to make a correct prediction.

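This happens when going from proposals to RoIs, i.e., when preparing to train the head (not when going from anchors to proposals, i.e., training the RPN). Positives and negatives are kept at a 1:3 ratio, with an IoU threshold of 0.5.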
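A minimal sketch of this sampling step, under some assumptions: each proposal's best IoU against the ground truth is already computed, the mini-batch size of 128 is the usual Faster R-CNN default rather than something stated here, and `sample_roi_indices` is a hypothetical helper name.

```python
import random

def sample_roi_indices(max_ious, hard_negative_ids=(), batch_size=128,
                       pos_fraction=0.25, iou_thresh=0.5, seed=None):
    """Pick RoI indices for one mini-batch at roughly 1 positive : 3 negatives.

    max_ious[i] is proposal i's best IoU with any ground-truth face.
    Hard negatives claim negative slots first; ordinary negatives fill the rest.
    """
    rng = random.Random(seed)
    # Keep only hard negatives that really are negatives under the threshold.
    hard = sorted({i for i in hard_negative_ids if max_ious[i] < iou_thresh})
    hard_set = set(hard)
    positives = [i for i, v in enumerate(max_ious) if v >= iou_thresh]
    easy = [i for i, v in enumerate(max_ious)
            if v < iou_thresh and i not in hard_set]

    num_pos = min(len(positives), int(batch_size * pos_fraction))
    num_neg = batch_size - num_pos
    pos = rng.sample(positives, num_pos)
    neg = hard[:num_neg]
    neg += rng.sample(easy, min(len(easy), num_neg - len(neg)))
    return pos, neg

# Toy example with 10 proposals and a batch of 8.
ious = [0.9, 0.7, 0.1, 0.2, 0.3, 0.0, 0.45, 0.05, 0.6, 0.15]
pos, neg = sample_roi_indices(ious, hard_negative_ids=[6, 3], batch_size=8, seed=0)
print(len(pos), len(neg))
```

When an image has fewer positives than a quarter of the batch, this sketch fills the remaining slots with negatives, so the 1:3 ratio is a target rather than a guarantee.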

3.3 Multi-Scale Training

One of three scales is randomly assigned to each image before it is fed into the network.

The shorter side is resized to one of 480, 600, or 750, and the longer side must not exceed 1200.
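The resize rule can be sketched as below. The helper names are hypothetical, and capping the scale when the longer side would exceed 1200 follows the common Faster R-CNN convention, which is assumed here rather than confirmed by the notes.

```python
import random

SHORT_SIDES = (480, 600, 750)  # candidate shorter-side lengths
MAX_LONG_SIDE = 1200           # upper bound on the longer side

def resize_dims(width, height, target_short, max_long=MAX_LONG_SIDE):
    """Return (new_width, new_height) for a given shorter-side target."""
    scale = target_short / min(width, height)
    # If the longer side would overflow, shrink the scale to fit the cap.
    if max(width, height) * scale > max_long:
        scale = max_long / max(width, height)
    return round(width * scale), round(height * scale)

def random_training_dims(width, height, rng=random):
    """Pick one of the three scales at random for this image."""
    return resize_dims(width, height, rng.choice(SHORT_SIDES))

print(resize_dims(1024, 768, 600))
print(resize_dims(2000, 500, 480))
```

For a very elongated image such as 2000x500, the cap on the longer side wins, so the shorter side ends up below the chosen target.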

Multi-scale training: unfortunately, I have not tried it in practice yet.

4 Experiments

Caffe, VGG-16, Faster R-CNN

4.1 Datasets

FDDB face detection benchmark: 5,171 faces in 2,845 images
WIDER FACE (larger-scale face data than FDDB)
It includes various detection challenges, such as occlusions, difficult poses, low resolution, and out-of-focus faces.

4.2 Experimental Setup

Step 1: train VGG16 + Faster RCNN using the WIDER FACE training and validation datasets as the training set.

Each face is scored according to the paper's scoring table (a normal image scores 0). Images scoring above 2 are discarded, as are images with more than 1000 annotations.

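Step 2: run the model over the WIDER FACE dataset once; proposals with a score above 0.8 and an IoU below 0.5 are treated as hard negatives. Then train for 100,000 iterations at a fixed learning rate as the hard negative mining procedure, each time making sure that the hard negatives selected in the previous round are sampled as RoIs.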
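The selection rule in step 2 can be sketched as follows. It assumes the IoU is measured against the best-matching ground-truth face and that boxes are (x1, y1, x2, y2) tuples; `mine_hard_negatives` is a hypothetical helper, not code from the paper.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def mine_hard_negatives(detections, gt_boxes, score_thresh=0.8, iou_thresh=0.5):
    """Return confident detections that overlap no ground-truth face enough.

    detections is a list of (box, score) pairs produced by the trained model.
    """
    hard = []
    for box, score in detections:
        best = max((iou(box, gt) for gt in gt_boxes), default=0.0)
        if score > score_thresh and best < iou_thresh:
            hard.append(box)
    return hard

gt = [(10, 10, 50, 50)]
dets = [
    ((12, 12, 50, 50), 0.95),      # confident and correct: not a hard negative
    ((100, 100, 140, 140), 0.90),  # confident but on background: hard negative
    ((100, 100, 140, 140), 0.50),  # low score: ignored
    ((30, 30, 70, 70), 0.85),      # confident, poorly localized: hard negative
]
print(mine_hard_negatives(dets, gt))
```

The boxes returned here are the ones that would then be forced into the RoI mini-batches during the 100,000 fine-tuning iterations.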