【Face Detection】《Face Detection using Deep Learning: An Improved Faster RCNN Approach》

Original

2020-06-28 04:40

Neurocomputing-2018

1 Background and Motivation

Better face detection benefits many subsequent face-related applications, such as face verification, face recognition, and face clustering!

Traditional face detection methods (e.g. Viola-Jones) rely on hand-crafted features, and each individual component is optimized separately (not end-to-end), making the whole detection pipeline often sub-optimal.

In recent years CNNs have burst onto the scene and excelled across all kinds of CV tasks. As CNNs spread, many researchers have turned their attention to doing face detection with deep learning!

Face detection can generally be viewed as a special type of object detection task, so existing methods are mostly built on the R-CNN pipeline!

The authors extend Faster R-CNN (the best method in the R-CNN family) with several strategies and take first place on the Face Detection Dataset and Benchmark (FDDB)!

2 Advantages / Contributions

Proposes a new scheme for face detection by improving the Faster RCNN framework, taking first place on the FDDB dataset (the contribution is mostly on the engineering side)

3 Method

feature concatenation
hard negative mining
multi-scale training
Convert bbox to ellipses

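The model is trained on the WIDER FACE dataset, which is also used to produce the hard negatives! See the experiments section below for the full procedure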

3.1 Feature Concatenation

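In Faster RCNN, RoI pooling is attached only to the last feature map, which may omit some features (deeper feature maps have a larger receptive field but a coarser granularity)

The authors apply RoI pooling to the feature maps of several stages, concatenate the results (the pooled H and W should all be the same), and then apply a 1x1 Conv to restore the original number of channels! This captures more fine-grained details of the RoIs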
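
To make the pool / concatenate / 1x1 Conv idea concrete, here is a minimal pure-Python sketch. It is not the authors' Caffe implementation: the function names, the nested-list `[C][H][W]` feature layout, the strides, and the toy weights are all assumptions for illustration, and any normalization the paper may apply before concatenation is left out.

```python
import math

def roi_max_pool(fmap, roi, stride, out_size):
    """Max-pool one RoI (x1, y1, x2, y2 in image pixels) into an
    out_size x out_size grid per channel. fmap is [C][H][W] nested lists."""
    h, w = len(fmap[0]), len(fmap[0][0])
    # Project the RoI onto this feature map and clamp it to the map.
    x1 = min(max(int(roi[0] // stride), 0), w - 1)
    y1 = min(max(int(roi[1] // stride), 0), h - 1)
    x2 = min(max(math.ceil(roi[2] / stride), x1 + 1), w)
    y2 = min(max(math.ceil(roi[3] / stride), y1 + 1), h)

    def bin_range(lo, hi, i):
        # Cell range [start, end) of the i-th bin; each bin covers >= 1 cell.
        size = (hi - lo) / out_size
        start = lo + math.floor(i * size)
        end = min(lo + math.ceil((i + 1) * size), hi)
        return start, max(end, start + 1)

    pooled = []
    for channel in fmap:
        grid = []
        for i in range(out_size):
            ys, ye = bin_range(y1, y2, i)
            row = []
            for j in range(out_size):
                xs, xe = bin_range(x1, x2, j)
                row.append(max(channel[y][x]
                               for y in range(ys, ye) for x in range(xs, xe)))
            grid.append(row)
        pooled.append(grid)
    return pooled

def conv1x1(x, weights):
    """1x1 convolution without bias: weights is [C_out][C_in]."""
    size_h, size_w = len(x[0]), len(x[0][0])
    return [[[sum(wc * x[c][i][j] for c, wc in enumerate(w_out))
              for j in range(size_w)] for i in range(size_h)]
            for w_out in weights]

def fuse_roi_features(fmaps_with_strides, roi, out_size, weights):
    """Pool the same RoI on every feature map, concatenate along the
    channel axis, then mix channels back down with a 1x1 convolution."""
    stacked = []
    for fmap, stride in fmaps_with_strides:
        stacked += roi_max_pool(fmap, roi, stride, out_size)
    return conv1x1(stacked, weights)

# Toy example: a shallow 4x4 map (stride 8) and a deep 2x2 map (stride 16).
shallow = [[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]]
deep = [[[1, 2], [3, 4]]]
fused = fuse_roi_features([(shallow, 8), (deep, 16)], (0, 0, 32, 32), 2, [[1, 1]])
```

Because every map is pooled to the same `out_size`, the channel-wise concatenation is well defined no matter how different the original resolutions are, which is exactly the "H, W should all be the same" remark above.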

3.2 Hard Negative Mining

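The authors mix hard negative samples into the negative samples!

hard negatives are the regions where the network has failed to make correct prediction

This happens when going from proposals to RoIs, i.e. when preparing to train the head (not when going from anchors to proposals, i.e. training the RPN). The positive-to-negative ratio is 1:3 and the IoU threshold is 0.5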
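
A rough sketch of what mixing hard negatives into the RoI mini-batch could look like. The function name, the batch size of 128, and the rule "take hard negatives first, then fill up with ordinary negatives" are my assumptions; the notes only fix the 1:3 ratio.

```python
import random

def sample_rois(positives, negatives, hard_negatives, batch_size=128, seed=None):
    """Build one RoI mini-batch with a 1:3 positive-to-negative target ratio.

    Hard negatives are placed into the negative part first; the remaining
    negative slots are filled with randomly drawn ordinary negatives.
    All RoIs are assumed to be hashable (e.g. tuples).
    """
    rng = random.Random(seed)
    num_pos = min(batch_size // 4, len(positives))   # 1 part positives
    num_neg = batch_size - num_pos                   # 3 parts negatives
    pos = rng.sample(positives, num_pos)

    hard = rng.sample(hard_negatives, min(num_neg, len(hard_negatives)))
    hard_set = set(hard_negatives)
    ordinary_pool = [r for r in negatives if r not in hard_set]
    ordinary = rng.sample(ordinary_pool,
                          min(num_neg - len(hard), len(ordinary_pool)))
    return pos, hard + ordinary
```

With 100 positives, 500 ordinary negatives, and 10 hard negatives, this yields 32 positives and 96 negatives, and all 10 hard negatives end up in the batch.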

3.3 Multi-Scale Training

randomly assign one of three scales for each image before it is fed into the network

The shorter side will be one of 480, 600, 750, and the longer side does not exceed 1200

Multi-scale training; sadly, I have not tried it in practice myself!
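
The resize rule can be sketched in a few lines. The names `SHORT_SIDES`, `MAX_LONG_SIDE`, `scale_factor`, and `pick_scale` are hypothetical; only the numbers 480 / 600 / 750 / 1200 come from the notes above.

```python
import random

SHORT_SIDES = (480, 600, 750)   # candidate lengths for the shorter side
MAX_LONG_SIDE = 1200            # cap for the longer side

def scale_factor(width, height, target_short, max_long=MAX_LONG_SIDE):
    """Resize factor that brings the shorter side to target_short,
    reduced if the longer side would otherwise exceed max_long."""
    short, long_side = min(width, height), max(width, height)
    scale = target_short / short
    if scale * long_side > max_long:
        scale = max_long / long_side
    return scale

def pick_scale(width, height, rng=random):
    """Randomly choose one of the three scales for this image."""
    target = rng.choice(SHORT_SIDES)
    return target, scale_factor(width, height, target)
```

For a 1000x500 image with target 750, the naive factor 1.5 would stretch the longer side to 1500, so the cap kicks in and the factor becomes 1.2.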

4 Experiments

Caffe, VGG-16, Faster R-CNN

4.1 Datasets

FDDB face detection benchmark, 5,171 faces in 2,845 images
WIDER FACE (larger-scale face data compared with FDDB)
including various detection challenges, such as occlusions, difficult poses, and low resolution and out-of-focus faces.

4.2 Experimental Setup

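Step 1: use the WIDER FACE training and validation datasets as the training set to train VGG16 + Faster RCNN

Each face is scored following the scoring system in the paper's table (a normal face scores 0). Images with a score above 2 are discarded, and so are images with more than 1,000 annotations

Step 2: run inference with the model over the WIDER FACE dataset; proposals with a score above 0.8 and an IoU below 0.5 with the ground truth are treated as hard negatives! Then train for 100,000 iterations at a fixed learning rate as the hard negative mining procedure, making sure each time that the hard negatives selected in the previous round are sampled as RoIs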
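
The selection rule in step 2 (confident score, poor overlap) can be sketched as below. Boxes are assumed to be `(x1, y1, x2, y2)` tuples and the function names are hypothetical; the thresholds 0.8 and 0.5 are the ones quoted above.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def find_hard_negatives(detections, gt_boxes, score_thresh=0.8, iou_thresh=0.5):
    """Return boxes the model scored confidently as faces even though
    they overlap no ground-truth face well enough.

    detections is a list of (box, score) pairs for one image.
    """
    hard = []
    for box, score in detections:
        best_iou = max((iou(box, gt) for gt in gt_boxes), default=0.0)
        if score > score_thresh and best_iou < iou_thresh:
            hard.append(box)
    return hard
```

Note that a box with IoU exactly 0.5 is not counted as a hard negative here, since the rule says "below 0.5".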

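Finally, fine-tune on the FDDB dataset with horizontal flipping plus multi-scale training (three scales). 100 RoIs are fed to the head, the NMS threshold is set to 0.3, and the classification threshold is set to 0.8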
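
For reference, this is how the two thresholds could interact in post-processing, as a sketch of standard greedy NMS. Applying the score threshold before NMS is my assumption; the notes only give the two values.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def filter_and_nms(detections, score_thresh=0.8, iou_thresh=0.3):
    """Drop low-scoring (box, score) pairs, then run greedy NMS:
    keep the best-scoring box and suppress any later box whose IoU
    with an already kept box exceeds iou_thresh."""
    candidates = sorted((d for d in detections if d[1] >= score_thresh),
                        key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in candidates:
        if all(iou(box, kept_box) <= iou_thresh for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```

A threshold as low as 0.3 makes sense for faces, which rarely overlap heavily, so near-duplicate boxes on the same face are suppressed aggressively.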

4.3 Experimental Results on FDDB Benchmark

The improvement is more pronounced on the continuous ROC score

4.4 Ablation Experiments

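Zooming in on a local region at the lower right of the plot is well worth borrowing! It would be nicer to compute the area under each ROC curve and put it in a table; cross-referencing the numbering between the figure and the table is rather tedious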

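5 Conclusion (own)

Applied innovation: the object detection toolkit is carried over to face detection. The routine in the introduction is all too familiar, hahaha
What feels somewhat novel is the feature concatenation, and the way hard negatives are mined (using a separate, larger dataset)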
