Face Detection: BlazeFace

Original post

watersink

2020-06-16 08:52

Paper: BlazeFace: Sub-millisecond Neural Face Detection on Mobile GPUs

MNN-github: https://github.com/xindongzhang/MNN-APPLICATIONS

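This work from Google proposes a face detection framework (face detection plus keypoint detection) that runs far faster than real time on mobile devices. It is adapted from MobileNetV1/V2 and SSD and reaches 200-1000 FPS on mobile GPUs. With the MNN framework, CPU inference on an RK3399 board takes about 10 ms.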

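Main contributions:

A very compact feature extraction network adapted from MobileNetV1/V2, lightweight and designed specifically for mobile devices.
A GPU-friendly anchor scheme adapted from SSD.
An improved tie-resolution strategy used in place of non-maximum suppression (NMS), which gives more stable, smoother results among overlapping predictions.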

Network architecture:

BlazeFace predicts a bounding box plus 6 facial keypoints: eye centers, ear tragions, mouth center, and nose tip.

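1. Enlarging the receptive field sizes

In a depthwise separable convolution on an s×s feature map with c input channels, d output channels, and kernel size k, the depthwise part costs s×s×c×k×k multiply-adds and the pointwise part costs s×s×c×d, a ratio of k×k : d. The total cost is therefore dominated by d. In most cases k = 3, while d is much larger, with typical values of 24, 32, 64, 96, 160, 320, and 1280. So the 1×1 pointwise convolution costs more than the 3×3 depthwise convolution.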

The paper backs this up with a measurement: "a 3×3 depthwise convolution in 16-bit floating point arithmetic takes 0.07 ms for a 56×56×128 tensor, while the subsequent 1×1 convolution from 128 to 128 channels is 4.3× slower at 0.3 ms".

Replacing the 3×3 kernel with a 5×5 kernel therefore adds little overhead while enlarging the receptive field.
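To make the cost argument concrete, here is a minimal sketch that counts multiply-adds using the two formulas above. The function names are made up for this post, and the counts are theoretical operation counts, not measured latencies.

```cpp
#include <cstdint>

// Multiply-add count of a k x k depthwise convolution on an
// s x s feature map with c channels: s*s*c*k*k.
std::int64_t depthwise_macs(std::int64_t s, std::int64_t c, std::int64_t k) {
    return s * s * c * k * k;
}

// Multiply-add count of a 1 x 1 pointwise convolution mapping
// c input channels to d output channels: s*s*c*d.
std::int64_t pointwise_macs(std::int64_t s, std::int64_t c, std::int64_t d) {
    return s * s * c * d;
}
```

For the 56×56×128 tensor quoted from the paper, `depthwise_macs(56, 128, 3)` is 3,612,672, `depthwise_macs(56, 128, 5)` is 10,035,200, and `pointwise_macs(56, 128, 128)` is 51,380,224. Even with a 5×5 kernel, the depthwise part stays about 5× cheaper than the pointwise part in operation count.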

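Starting from the original MobileNet residual structure, the 3×3 convolution is replaced with a 5×5 convolution to enlarge the receptive field, which gives the BlazeBlock module. Stacking two BlazeBlock modules gives the double BlazeBlock module.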
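The structure of a single block can be sketched in plain C++. This is a simplified illustration, not the paper's or MNN's implementation: it assumes stride 1, equal input and output channel counts, no bias or normalization, and ReLU as the activation. The `Tensor` type and all function names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Feature map stored as height x width x channels (HWC), row-major.
struct Tensor {
    int h, w, c;
    std::vector<float> data;
    Tensor(int h_, int w_, int c_)
        : h(h_), w(w_), c(c_), data(static_cast<std::size_t>(h_) * w_ * c_, 0.0f) {}
    float& at(int y, int x, int ch) { return data[(static_cast<std::size_t>(y) * w + x) * c + ch]; }
    float at(int y, int x, int ch) const { return data[(static_cast<std::size_t>(y) * w + x) * c + ch]; }
};

// 5x5 depthwise convolution, stride 1, zero padding 2.
// dw holds one 5x5 kernel per channel: dw[ch * 25 + ky * 5 + kx].
Tensor depthwise5x5(const Tensor& in, const std::vector<float>& dw) {
    Tensor out(in.h, in.w, in.c);
    for (int y = 0; y < in.h; ++y)
        for (int x = 0; x < in.w; ++x)
            for (int ch = 0; ch < in.c; ++ch) {
                float acc = 0.0f;
                for (int ky = 0; ky < 5; ++ky)
                    for (int kx = 0; kx < 5; ++kx) {
                        int yy = y + ky - 2, xx = x + kx - 2;
                        if (yy < 0 || yy >= in.h || xx < 0 || xx >= in.w) continue;
                        acc += in.at(yy, xx, ch) * dw[ch * 25 + ky * 5 + kx];
                    }
                out.at(y, x, ch) = acc;
            }
    return out;
}

// 1x1 pointwise convolution that keeps the channel count.
// pw[o * c + i] is the weight from input channel i to output channel o.
Tensor pointwise1x1(const Tensor& in, const std::vector<float>& pw) {
    Tensor out(in.h, in.w, in.c);
    for (int y = 0; y < in.h; ++y)
        for (int x = 0; x < in.w; ++x)
            for (int o = 0; o < in.c; ++o) {
                float acc = 0.0f;
                for (int i = 0; i < in.c; ++i)
                    acc += in.at(y, x, i) * pw[o * in.c + i];
                out.at(y, x, o) = acc;
            }
    return out;
}

// Single block: ReLU(x + pointwise(depthwise5x5(x))).
Tensor blaze_block(const Tensor& x, const std::vector<float>& dw, const std::vector<float>& pw) {
    Tensor y = pointwise1x1(depthwise5x5(x, dw), pw);
    for (std::size_t i = 0; i < y.data.size(); ++i)
        y.data[i] = std::max(0.0f, y.data[i] + x.data[i]);
    return y;
}
```

Under the same assumptions, a double block would chain two depthwise/pointwise pairs before the residual addition. Downsampling blocks also need a shortcut that matches the new resolution and channel count, which this sketch leaves out.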

2. Feature extractor

The input image is 128×128×3 and the final feature map is 8×8. The network contains 5 BlazeBlock modules and 6 double BlazeBlock modules.
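Going from 128×128 to 8×8 implies a total stride of 16, i.e. four stride-2 stages. The small helper below (a hypothetical name, for illustration only) does that arithmetic. It says nothing about which layers actually carry the stride.

```cpp
// Number of stride-2 stages needed to shrink a square input of side
// `input` down to side `output`. Returns -1 when `output` cannot be
// reached by exact halving.
int stride2_stages(int input, int output) {
    if (input <= 0 || output <= 0) return -1;
    int stages = 0;
    while (input > output && input % 2 == 0) {
        input /= 2;
        ++stages;
    }
    return input == output ? stages : -1;
}
```

`stride2_stages(128, 8)` gives 4, and `stride2_stages(128, 16)` gives 3 for the intermediate 16×16 map.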

3. Anchor scheme

Where SSD places anchors at 6 scales, BlazeFace uses only 2, and the anchor aspect ratio is fixed at 1.
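A minimal sketch of such an anchor layout follows. The grid sizes and anchors per cell used in the usage note (2 per cell on a 16×16 map, 6 per cell on an 8×8 map) are my reading of the paper, not something stated in this post. The type and function names are hypothetical. Only anchor centers are generated, since the anchor sizes are not given here.

```cpp
#include <vector>

// Anchor center in normalized [0, 1] image coordinates.
// Every anchor is square (aspect ratio 1), so no width/height ratio is stored.
struct Anchor { float cx, cy; };

// One entry per feature map: grid side length and anchors per cell.
struct AnchorLayer { int grid; int per_cell; };

// Places `per_cell` anchors at the center of every cell of every layer.
std::vector<Anchor> make_anchors(const std::vector<AnchorLayer>& layers) {
    std::vector<Anchor> anchors;
    for (const AnchorLayer& layer : layers)
        for (int y = 0; y < layer.grid; ++y)
            for (int x = 0; x < layer.grid; ++x)
                for (int k = 0; k < layer.per_cell; ++k)
                    anchors.push_back({(x + 0.5f) / layer.grid, (y + 0.5f) / layer.grid});
    return anchors;
}
```

With the assumed layout, `make_anchors({{16, 2}, {8, 6}})` produces 16×16×2 + 8×8×6 = 896 anchors.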

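4. Post-processing

To reduce the jitter seen in phone photos and video, post-processing uses blending NMS in place of standard NMS: overlapping predictions are merged into a weighted average instead of all but one being discarded. According to the paper, this improves accuracy by 10%.

Blending NMS code: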

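#include <math.h>    // exp
#include <stdio.h>   // printf
#include <stdlib.h>  // exit
#include <string.h>  // memset
#include <algorithm> // std::sort
#include <vector>

// Axis-aligned box (top-left x1,y1; bottom-right x2,y2) with a confidence score.
typedef struct ObjectInfo {
    float x1;
    float y1;
    float x2;
    float y2;
    float score;
} ObjectInfo;

// The listing uses these names without defining them.
// The values below are assumptions, not taken from the source.
enum { hard_nms = 1, blending_nms = 2 };
static const float iou_threshold = 0.3f;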

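// Groups boxes around the highest-scoring unmerged box by IoU, then either
// keeps that box (hard_nms) or blends the whole group (blending_nms).
// Note: `input` is sorted in place. The +1 in the width/height terms follows
// the inclusive-pixel convention, so coordinates are expected in pixels
// rather than normalized to [0, 1].
void nms(std::vector<ObjectInfo> &input, std::vector<ObjectInfo> &output, int type) {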
    std::sort(input.begin(), input.end(), [](const ObjectInfo &a, const ObjectInfo &b) { return a.score > b.score; });

    int box_num = input.size();

    std::vector<int> merged(box_num, 0);

    for (int i = 0; i < box_num; i++) {
        if (merged[i])
            continue;
        std::vector<ObjectInfo> buf;

        buf.push_back(input[i]);
        merged[i] = 1;

        float h0 = input[i].y2 - input[i].y1 + 1;
        float w0 = input[i].x2 - input[i].x1 + 1;

        float area0 = h0 * w0;

        for (int j = i + 1; j < box_num; j++) {
            if (merged[j])
                continue;

            float inner_x0 = input[i].x1 > input[j].x1 ? input[i].x1 : input[j].x1;
            float inner_y0 = input[i].y1 > input[j].y1 ? input[i].y1 : input[j].y1;

            float inner_x1 = input[i].x2 < input[j].x2 ? input[i].x2 : input[j].x2;
            float inner_y1 = input[i].y2 < input[j].y2 ? input[i].y2 : input[j].y2;

            float inner_h = inner_y1 - inner_y0 + 1;
            float inner_w = inner_x1 - inner_x0 + 1;

            if (inner_h <= 0 || inner_w <= 0)
                continue;

            float inner_area = inner_h * inner_w;

            float h1 = input[j].y2 - input[j].y1 + 1;
            float w1 = input[j].x2 - input[j].x1 + 1;

            float area1 = h1 * w1;

            float score;

            score = inner_area / (area0 + area1 - inner_area);

            if (score > iou_threshold) {
                merged[j] = 1;
                buf.push_back(input[j]);
            }
        }
        switch (type) {
            case hard_nms: {
                output.push_back(buf[0]);
                break;
            }
            case blending_nms: {
                // Softmax over the scores of the group gives the blend weights.
                float total = 0;
                for (size_t k = 0; k < buf.size(); k++) {
                    total += exp(buf[k].score);
                }
                ObjectInfo rects;
                memset(&rects, 0, sizeof(rects));
                // Weighted average of coordinates and score.
                for (size_t k = 0; k < buf.size(); k++) {
                    float rate = exp(buf[k].score) / total;
                    rects.x1 += buf[k].x1 * rate;
                    rects.y1 += buf[k].y1 * rate;
                    rects.x2 += buf[k].x2 * rate;
                    rects.y2 += buf[k].y2 * rate;
                    rects.score += buf[k].score * rate;
                }
                output.push_back(rects);
                break;
            }
            default: {
                printf("wrong type of nms.");
                exit(-1);
            }
        }
    }
}
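The blending step on its own is easy to try out. The sketch below isolates it for one group of overlapping boxes and lets you switch between the exp(score) weights used in the listing above and plain score weights. The paper only describes a weighted mean, so the second option is an assumption about one reasonable alternative, not the paper's definition. `Box` and `blend` are hypothetical names.

```cpp
#include <cmath>
#include <vector>

struct Box { float x1, y1, x2, y2, score; };

// Weighted average of a group of overlapping boxes.
// softmax_weights = true  -> weights proportional to exp(score), as in the listing.
// softmax_weights = false -> weights proportional to the raw score.
Box blend(const std::vector<Box>& cluster, bool softmax_weights) {
    Box out = {0.0f, 0.0f, 0.0f, 0.0f, 0.0f};
    float total = 0.0f;
    for (const Box& b : cluster)
        total += softmax_weights ? std::exp(b.score) : b.score;
    if (total <= 0.0f) return out;  // empty group or all-zero scores
    for (const Box& b : cluster) {
        float w = (softmax_weights ? std::exp(b.score) : b.score) / total;
        out.x1 += b.x1 * w;
        out.y1 += b.y1 * w;
        out.x2 += b.x2 * w;
        out.y2 += b.y2 * w;
        out.score += b.score * w;
    }
    return out;
}
```

For two boxes with left edges at 0 and 10 and scores 0.9 and 0.6, raw-score weights (0.6 and 0.4) put the blended left edge at 4.0, while exp(score) weights put it at about 4.26. The exp weights are flatter, so lower-scoring boxes pull the result slightly more.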

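Experimental results:

Summary:

Designed for frontal face detection on mobile phones; performs detection and keypoint localization at the same time.
Extremely fast.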
