CVPR-2018

1 Background and Motivation

Object detection has to solve both a recognition problem and a localization problem. Both are hard because the detector faces many "close" false positives.

However, the IoU threshold that separates positives from negatives (0.5) is quite loose and often produces noisy bounding boxes, as shown in figure (a).

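The authors set out to study how this threshold affects the results, starting from one basic assumption: "The basic idea is that a single detector can only be optimal for a single quality level." (This is the idea behind cost-sensitive learning: a model performs best in the regime it was tuned for.)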

That sentence is a little hard to parse at first, but the two plots below make it clear.

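(c) The x-axis is the IoU between a proposal and the GT before bbox regression, and the y-axis is the IoU after bbox regression. Most points lie above the gray line, so bbox regression does improve localization. In addition, each model trained with a different positive/negative IoU threshold ($u$) gives its largest improvement inside the matching input IoU range. This is where the three colored curves bulge; the red and green curves show it most clearly.
(d) The x-axis is the IoU threshold used at test time, and the y-axis is the corresponding AP. The model with $u = 0.5$ (IoU > 0.5 counts as a positive; in Faster R-CNN, for example, this is the threshold at the detection head, not the one used when training the RPN) does better on low IoU examples (e.g. $AP_{0.5}$) and worse at higher IoU levels (e.g. $AP_{0.95}$).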

These observations suggest that higher quality detection (a model trained with a higher positive/negative IoU threshold, i.e. a larger $u$) requires a closer quality match (for example, $u = 0.5$ may give the best AP under the $AP_{0.5}$ criterion) between the detector and the hypotheses that it processes (read here as the IoU level at test time).

A detector performs best only when the IoU level of its input proposals is close to the threshold it was trained with (from the post "Cascade R-CNN 詳細解讀").

Simply raising $u$ is clearly not enough to improve detector quality, for two reasons:

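Plot (d) shows that a larger $u$ does not necessarily give a larger area under the curve (the COCO AP metric).
The IoU distribution of the proposals is also heavily imbalanced, so with a larger $u$ the number of positives drops exponentially and training becomes prone to over-fitting.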

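Building on the observation in (c) that the output of the regressor is usually better localized than its input, the authors propose a cascade structure that raises $u$ step by step: "each stage aims to find a good set of close false positives for training the next stage".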

2 Advantages / Contributions

surpasses all previous state-of-the-art single-model detectors on the challenging COCO detection task

3 Method

3.1 Regression and Classification

3.1.1 Regression

The bbox regression loss operates on the distance vector $\Delta = (\delta_x, \delta_y, \delta_w, \delta_h)$.

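The smaller $\Delta$ is, the more accurate the prediction and the smaller the loss; the larger $\Delta$ is, the less accurate the prediction and the larger the loss.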

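where

$(b_x, b_y, b_h, b_w)$ are the top-left coordinates (or the center point), height and width of the predicted bounding box

$(g_x, g_y, g_h, g_w)$ are the top-left coordinates (or the center point), height and width of the ground truth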
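A minimal sketch of how such a distance vector can be computed, assuming the usual R-CNN parameterization (center offsets scaled by the box size, log ratios for width and height). The names `Box`, `Delta`, `encode_deltas` and `decode_deltas` are hypothetical, and named fields are used so that the field order does not matter.

```python
import math
from collections import namedtuple

# Hypothetical containers; (x, y) is assumed to be the box center.
Box = namedtuple("Box", "x y w h")
Delta = namedtuple("Delta", "dx dy dw dh")


def encode_deltas(b, g):
    """Distance vector from box b to ground truth g."""
    return Delta(
        dx=(g.x - b.x) / b.w,      # center shift, scaled by box width
        dy=(g.y - b.y) / b.h,      # center shift, scaled by box height
        dw=math.log(g.w / b.w),    # log width ratio
        dh=math.log(g.h / b.h),    # log height ratio
    )


def decode_deltas(b, d):
    """Inverse of encode_deltas: apply a distance vector to box b."""
    return Box(
        x=b.x + d.dx * b.w,
        y=b.y + d.dy * b.h,
        w=b.w * math.exp(d.dw),
        h=b.h * math.exp(d.dh),
    )


proposal = Box(10.0, 20.0, 2.0, 4.0)
ground_truth = Box(11.0, 22.0, 3.0, 8.0)
delta = encode_deltas(proposal, ground_truth)
recovered = decode_deltas(proposal, delta)
```

A perfect prediction gives an all-zero vector, which matches the remark that a smaller $\Delta$ means a smaller loss.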

Because $\Delta$ is usually small (the adjustments to a bounding box are small), its effect on the loss is also small, so the regression loss tends to be much smaller than the classification loss. To address this:

To improve the effectiveness of multi-task learning, $\Delta$ is usually normalized by its mean and variance, i.e. $\delta_x' = (\delta_x - \mu_x) / \sigma_x$.

Faster R-CNN and R-FCN both do this.
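The normalization step can be sketched as follows. This is an illustration only: the statistics are computed here from a small made-up list of distance vectors, whereas real implementations typically use fixed, precomputed values. The function name `normalize_deltas` and the sample numbers are assumptions.

```python
from statistics import mean, pstdev


def normalize_deltas(deltas):
    """Standardize each component of the distance vectors.

    deltas: list of 4-tuples (dx, dy, dw, dh).
    Returns (normalized, means, stds).
    """
    columns = list(zip(*deltas))
    means = [mean(col) for col in columns]
    # Guard against a zero spread so the division is always defined.
    stds = [pstdev(col) or 1.0 for col in columns]
    normalized = [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in deltas
    ]
    return normalized, means, stds


# Made-up sample values, for illustration only.
sample = [
    (0.1, -0.2, 0.05, 0.0),
    (0.3, 0.0, -0.05, 0.1),
    (-0.1, 0.2, 0.15, -0.1),
]
normalized, means, stds = normalize_deltas(sample)
```

After this step every component has zero mean and unit spread, which puts the regression targets on a scale comparable to the classification loss.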

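The method in Figure 3(b) applies the same head several times in a row as a post-processing step: $f'(x, b) = f \circ f \circ \cdots \circ f(x, b)$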

$b$ is the proposal, $x$ is the image patch, and $f$ is the neural network

This has two drawbacks:

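As Figure 1 shows, an $f$ trained at $u = 0.5$ is optimal for $AP_{0.5}$ but may do worse at higher IoU levels, e.g. $AP_{0.85}$.
Figure 2 shows how the distribution changes across iterations. The distribution keeps shifting, so a single $f$ clearly cannot be optimal on all of them.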
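Iterative bbox regression is just the same function composed with itself, which the following sketch makes explicit. The regressor `toy_f` is a hypothetical stand-in, not a trained network: it moves each coordinate halfway toward a fixed target box passed in place of the image patch `x`.

```python
def iterative_bbox(f, x, b, num_iters=3):
    """Apply the same regressor f repeatedly: f(x, f(x, ... f(x, b)))."""
    for _ in range(num_iters):
        b = f(x, b)
    return b


def toy_f(x, b):
    """Stand-in regressor (assumption): x holds a target box, and each
    coordinate of b moves halfway toward it."""
    return tuple(c + 0.5 * (t - c) for c, t in zip(b, x))


target = (8.0, 8.0, 16.0, 16.0)
start = (0.0, 0.0, 8.0, 8.0)
result = iterative_bbox(toy_f, target, start, num_iters=3)
```

The toy regressor keeps improving because it is equally good on every input. A real $f$ trained at one $u$ is not, which is exactly the first drawback listed above.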

3.1.2 Classification

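For classification, a proposal is labelled by its IoU with a ground truth object $g$: $y = g_y$ if $IoU(x, g) \ge u$, otherwise $y = 0$ (background).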

$g_y$ is the GT label and $x$ is the image patch
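A small sketch of this labelling rule, assuming boxes in `(x1, y1, x2, y2)` corner format and class 0 as background. The function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def assign_label(proposal, gt_box, gt_label, u):
    """Return gt_label if the proposal overlaps the GT by at least u,
    otherwise 0 (background)."""
    return gt_label if iou(proposal, gt_box) >= u else 0


gt_box = (0.0, 0.0, 10.0, 10.0)
proposal = (0.0, 0.0, 10.0, 6.0)   # overlaps 60% of the GT box

label_loose = assign_label(proposal, gt_box, gt_label=3, u=0.5)
label_strict = assign_label(proposal, gt_box, gt_label=3, u=0.7)
```

The same proposal is a positive under the loose threshold and a negative under the strict one, which is why the choice of $u$ changes what the classifier learns.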

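If $u$ is too large, there are very few positives and the model over-fits easily. If $u$ is too small, the positives are more diverse, but the detector struggles to reject close false positives.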

In general, it is very difficult to ask a single classifier to perform uniformly well over all IoU levels.

One possible improvement is shown in Figure 3(c): an ensemble of classifiers, each trained at a different $u$

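As $u$ increases, the proposals that qualify are of higher quality but the number of positives falls. The different heads receive the same input yet see different numbers of positives, so they are trained to different quality levels, and the combined result may not be good.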

This solution fails to address the problem that the different losses of (4) operate on different numbers of positives.
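A sketch of such an ensemble loss, assuming one classifier per threshold whose cross-entropy losses are simply summed. The thresholds, probabilities and function names are illustrative assumptions, not values from the paper.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the target class."""
    return -math.log(probs[label])


def ensemble_loss(per_u_probs, proposal_iou, gt_label, thresholds):
    """Sum the classification loss over all thresholds.

    per_u_probs[i] is the class distribution predicted by the classifier
    tied to thresholds[i]; index 0 is background.
    """
    total = 0.0
    for u, probs in zip(thresholds, per_u_probs):
        # The target changes with u: positive only if the IoU reaches u.
        target = gt_label if proposal_iou >= u else 0
        total += cross_entropy(probs, target)
    return total


thresholds = (0.5, 0.6, 0.7)                       # assumed values
per_u_probs = [[0.2, 0.8], [0.2, 0.8], [0.2, 0.8]]  # made-up predictions
loss = ensemble_loss(per_u_probs, proposal_iou=0.65, gt_label=1,
                     thresholds=thresholds)
```

For this proposal the first two classifiers treat it as a positive and the third as background. The strictest classifier therefore sees the fewest positives, which is the imbalance the quoted sentence points out.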

3.2 Cascade R-CNN

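$f$ keeps pace with the changing bbox distribution, because each stage has its own regressor: $f(x, b) = f_T \circ f_{T-1} \circ \cdots \circ f_1(x, b)$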

The cascade uses three values of $u$, one per stage.
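The resampling effect of the cascade can be illustrated with a toy example. Everything here is an assumption made for illustration: the thresholds 0.5, 0.6 and 0.7 follow the $u$ values mentioned elsewhere in these notes, and `toy_regressor` cheats by moving boxes halfway toward the GT, which a real stage regressor cannot do because it never sees the GT at test time.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def toy_regressor(box, gt):
    """Stand-in for a trained stage regressor (uses the GT directly)."""
    return tuple(c + 0.5 * (g - c) for c, g in zip(box, gt))


def run_cascade(proposals, gt, thresholds):
    """Count positives at each stage's threshold, then refine the boxes
    and pass them on to the next stage."""
    boxes = list(proposals)
    positives_per_stage = []
    for u in thresholds:
        positives_per_stage.append(sum(1 for b in boxes if iou(b, gt) >= u))
        boxes = [toy_regressor(b, gt) for b in boxes]
    return positives_per_stage, boxes


gt = (0.0, 0.0, 10.0, 10.0)
proposals = [(0.0, 0.0, 10.0, 6.0), (0.0, 0.0, 10.0, 4.0), (0.0, 0.0, 10.0, 8.0)]
counts, refined = run_cascade(proposals, gt, thresholds=(0.5, 0.6, 0.7))
```

On the raw proposals only one box reaches IoU 0.7. After two refinement stages all three do, so the later, stricter stage is not starved of positives.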

4 Experiments

Cascading the IoU thresholds gradually deals with the close (but not correct) false positives

4.1 Datasets

MSCOCO 2017

4.2 Quality Mismatch

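The solid lines in (a) are the three values of $u$ trained separately, and (b) adds the GT boxes to the proposals. Comparing (a) and (b), the gain is largest at $u = 0.7$, which shows that a high $u$ needs matching high-quality proposals to work well. The authors also feed proposals taken from the 2nd and 3rd cascade stages to the heads trained with $u = 0.6, 0.7$, and the results improve as well (dashed lines).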

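This figure tests the heads trained at each of the three stages on the proposals of every stage. Comparing Figure 6 with Figure 5(a), training in a cascade gives better results than training each head separately.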

4.3 Comparison with Iterative BBox and Integral Loss

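(a) shows that iterating the same head three times actually makes the results worse.
In (b), 0.6 is the best and 0.7 the worst, and combining them does not do any better.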

4.4 Ablation Experiments

Note that the classification results of the different stages are averaged, while the localization result is taken from the 3rd stage.
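A sketch of that combination rule for a single proposal, assuming each stage outputs a class-score list and a box. The function name and the numbers are made up for illustration.

```python
def combine_stage_outputs(stage_scores, stage_boxes):
    """Average the class scores over all stages and keep the box
    predicted by the last stage."""
    num_stages = len(stage_scores)
    num_classes = len(stage_scores[0])
    avg_scores = [
        sum(scores[c] for scores in stage_scores) / num_stages
        for c in range(num_classes)
    ]
    return avg_scores, stage_boxes[-1]


# Made-up outputs of three stages for one proposal.
stage_scores = [[0.2, 0.8], [0.4, 0.6], [0.3, 0.7]]
stage_boxes = [
    (0.0, 0.0, 10.0, 8.0),
    (0.0, 0.0, 10.0, 9.0),
    (0.0, 0.0, 10.0, 9.5),
]
avg_scores, final_box = combine_stage_outputs(stage_scores, stage_boxes)
```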

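I do not fully understand the details of this table. In principle the first row should be Iterative BBox, but it does not match the earlier results. How exactly "stat" acts inside the cascade structure is still not clear to me.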

The 3-stage cascade works best

4.5 Comparison with the state-of-the-art

4.6 Generalization Capacity

A comparison with and without the cascade structure across several architectures

As the table shows, models with a complex head become considerably larger once the cascade is added

4.7 Results on PASCAL VOC

5 Conclusion (own)

cost-sensitive learning

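Reference: "關於不平衡數據集以及代價敏感學習的探討" (a discussion of imbalanced datasets and cost-sensitive learning)
"Usually, there is no benefit beyond applying $f$ twice." (iterative bounding box regression)
By "hypothesis" the authors mean the proposals fed into the head, e.g. for the 1st stage these are the RPN outputs.
Table 3 is still an open question for me (the "stat" rows).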

【Cascade R-CNN】《Cascade R-CNN: Delving into High Quality Object Detection》
