CVPR-2019

1 Background and Motivation

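Compared with model architectures, the training process tends to be overlooked, yet it matters just as much in object detection.

Whether one-stage or two-stage, detectors follow the same training pipeline: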

sampling regions
extracting features
jointly recognizing the categories and refining the locations under the guidance of a standard multi-task objective function

This raises three questions:

whether the selected region samples are representative (sample level)
whether the extracted visual features are fully utilized (feature level)
whether the designed objective function is optimal (objective level)

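More concretely:

a) Easy samples, which contribute little, make up the majority, while hard samples, which contribute a lot, are the minority

OHEM addresses this, but it is sensitive to noisy labels and incurs considerable memory and computing costs. Focal loss works for one-stage detectors, but in two-stage detectors its effect is largely washed out by the second stage.

b) The integration of low-level and high-level features is not optimal and leaves room for improvement

Both FPN and PANet focus more on adjacent resolutions and less on the others:

The semantic information contained in non-adjacent levels would be diluted once per fusion during the information flow. (a really well-put sentence)

c) Imbalance in the objective function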

If they are not properly balanced, one goal may be compromised, leading to suboptimal performance overall.

To alleviate these three problems, the authors propose three corresponding remedies:

IoU-balanced sampling
balanced feature pyramid
balanced L1 loss

The model is named Libra R-CNN (Libra, the scales).

2 Advantages / Contributions

Systematically revisits the training process of detectors and identifies imbalance at three levels: sample, feature and objective
Proposes Libra R-CNN to alleviate the imbalance at the sample, feature and objective levels
Achieves significant improvements over state-of-the-art detectors

3 Method

3.1. IoU balanced Sampling

Is the overlap between a training sample and its corresponding ground truth associated with its difficulty?

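During training, a candidate whose IoU with the ground truth is below 0.5 counts as a negative. In the paper's IoU-distribution figure, the orange bars in the middle give the actual share of hard negatives per IoU interval: hard negatives are clearly not spread uniformly over the 0 to 0.5 range.

The same figure shows that random sampling does not match this distribution. It oversamples candidates at very low IoU (e.g. 0 to 0.05) and undersamples those at higher IoU, so part of the hard negatives are missed.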

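The authors' fix: suppose N negative samples must be drawn from M candidates. Under random sampling, each candidate is selected with probability

$$p = \frac{N}{M}$$

To raise the chance of picking hard negatives, the authors split the candidates into K bins by IoU, spread the N samples evenly across the bins, and sample uniformly within each bin. A candidate in bin k is then selected with probability

$$p_k = \frac{N}{K} \cdot \frac{1}{M_k}, \quad k \in [0, K)$$

where $M_k$ is the number of candidates that fall into bin k. The paper reports that the performance is not sensitive to K.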

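After this change, the sampled distribution looks like the green one in the paper's IoU-distribution figure. The goal is simply to get closer to the true hard-negative distribution, and the change amounts to replacing random sampling with stratified sampling (see: 2019CVPR Libra RCNN目標檢測算法（特徵融合）). Because hard negatives are distributed differently in each IoU interval, binning the range and sampling uniformly inside each bin guarantees broader coverage (it is the stratified sampling from school biology textbooks, which makes it easy to picture).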
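The binning idea above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function name `iou_balanced_sample`, its parameters, and the random top-up used when a bin has too few candidates are all assumptions made for this example.

```python
import random
from typing import List, Optional


def iou_balanced_sample(ious: List[float], num_samples: int, num_bins: int = 3,
                        max_iou: float = 0.5,
                        rng: Optional[random.Random] = None) -> List[int]:
    """Pick negative indices so that each IoU bin contributes about N / K samples."""
    rng = rng or random.Random()
    # Group candidate indices by IoU bin: bin k covers [k * width, (k + 1) * width).
    width = max_iou / num_bins
    bins: List[List[int]] = [[] for _ in range(num_bins)]
    for idx, iou in enumerate(ious):
        if 0.0 <= iou < max_iou:
            k = min(int(iou / width), num_bins - 1)
            bins[k].append(idx)

    per_bin = num_samples // num_bins
    chosen: List[int] = []
    for members in bins:
        # Uniform sampling inside a bin; take everything if the bin is too small.
        chosen.extend(rng.sample(members, min(per_bin, len(members))))

    # Assumed fallback: top up at random from the rest if some bins ran short.
    shortfall = num_samples - len(chosen)
    if shortfall > 0:
        taken = set(chosen)
        rest = [i for members in bins for i in members if i not in taken]
        chosen.extend(rng.sample(rest, min(shortfall, len(rest))))
    return chosen


# Toy data: 900 candidates with almost no overlap, 100 with higher overlap.
demo_rng = random.Random(0)
demo_ious = ([demo_rng.uniform(0.0, 0.05) for _ in range(900)]
             + [demo_rng.uniform(0.05, 0.5) for _ in range(100)])
picked = iou_balanced_sample(demo_ious, num_samples=30, num_bins=3, rng=demo_rng)
```

With purely random sampling, roughly nine out of ten picks on this toy data would come from the near-zero-IoU group. With binning, two thirds of the picks come from the two higher-IoU bins, which is the effect the section describes.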

3.2. Balanced Feature Pyramid

1）Obtaining balanced semantic features

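First, the four levels are merged into one map at a single resolution: each level is resized by interpolation or max pooling, and the resized maps are averaged.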

non-parametric method

2）Refining balanced semantic features

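Gaussian non-local attention is used to enhance the integrated features. Plain convolutions work about as well, but the non-local module is more stable.

Finally, the refined features are added back to the original features (an identity connection), which strengthens them.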
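The gather, average and add-back steps can be illustrated with a toy sketch. It rests on loudly simplified assumptions: each "feature map" is a 1-D list of floats, level sizes differ by integer factors, and the refinement step is left as the identity instead of non-local attention. The names `resize` and `balanced_pyramid` are hypothetical.

```python
from typing import List

Feature = List[float]  # toy 1-D "feature map"; real maps are C x H x W tensors


def resize(feat: Feature, size: int) -> Feature:
    """Nearest-neighbour upsampling or max-pool downsampling to `size` elements."""
    n = len(feat)
    if size >= n:
        factor = size // n
        return [v for v in feat for _ in range(factor)]
    factor = n // size
    return [max(feat[i * factor:(i + 1) * factor]) for i in range(size)]


def balanced_pyramid(levels: List[Feature], ref: int) -> List[Feature]:
    """Integrate all levels at the resolution of levels[ref], then strengthen each level."""
    size = len(levels[ref])
    gathered = [resize(f, size) for f in levels]
    # Step 1: balanced semantic features = plain average over all levels.
    balanced = [sum(vals) / len(levels) for vals in zip(*gathered)]
    # Step 2: refinement is the identity here (assumption); the paper refines
    # the integrated features with non-local attention.
    refined = balanced
    # Scatter back to every level's own resolution and add to the original features.
    return [[o + r for o, r in zip(f, resize(refined, len(f)))] for f in levels]


# Four toy levels, from highest to lowest resolution.
toy_levels = [[1.0] * 8, [2.0] * 4, [3.0] * 2, [6.0]]
strengthened = balanced_pyramid(toy_levels, ref=2)
```

Every level receives the same averaged signal, so no level is favoured over a non-adjacent one. That is the point of the design compared with the top-down pathway of FPN.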

3.3. Balanced L1 Loss

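The objective function is:

$$L_{p,u,t^u,v} = L_{cls}(p, u) + \lambda [u \geq 1] L_{loc}(t^u, v)$$

Here $L_{cls}$ and $L_{loc}$ are the recognition and localization losses, and the weight $\lambda$ can be tuned to balance the two tasks.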

However, owing to the unbounded regression targets, directly raising the weight of localization loss will make the model more sensitive to outliers.

The authors define samples with a loss greater than or equal to 1.0 as outliers and all other samples as inliers:

outliers can be regarded as hard samples
inliers can be regarded as easy samples

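First, look at the original Smooth L1 loss:

$$\mathrm{SmoothL1}(x) = \begin{cases} 0.5x^2 & \text{if } |x| < 1 \\ |x| - 0.5 & \text{otherwise} \end{cases}$$

It has an inflection point that separates inliers from outliers and clips the large gradients produced by outliers at a maximum value of 1.0 (so that is the reason).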

To be more specific, inliers only contribute 30% gradients average per sample compared with outliers.
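As a quick check of the clipping behaviour, here is a small sketch of the standard Smooth L1 loss with its inflection point at 1.0, plus its gradient magnitude. The function names are chosen for this example only.

```python
def smooth_l1(x: float) -> float:
    """Standard Smooth L1 loss: quadratic for |x| < 1, linear beyond that."""
    ax = abs(x)
    return 0.5 * ax * ax if ax < 1.0 else ax - 0.5


def smooth_l1_grad(x: float) -> float:
    """Gradient magnitude w.r.t. |x|: linear for inliers, clipped at 1.0 for outliers."""
    return min(abs(x), 1.0)


# An inlier with a small error gets a small gradient; every outlier gets exactly 1.0.
inlier_grad = smooth_l1_grad(0.5)
outlier_grad = smooth_l1_grad(3.0)
```

Because an inlier's gradient equals its own small error while every outlier contributes the full 1.0, the accumulated gradient is dominated by outliers. This is the imbalance that the balanced L1 loss targets.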

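To take the middle path and raise the gradients of inlier samples, the authors modify Smooth L1. Look at the paper's figure of gradient and loss curves first.

The original Smooth L1 is the red dashed line, with the gradient on the left and the loss on the right (matching the loss plot I drew above). The inflection point is at 1.0!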

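The modified gradient is:

$$\frac{\partial L_b}{\partial x} = \begin{cases} \alpha \ln(b|x| + 1) & \text{if } |x| < 1 \\ \gamma & \text{otherwise} \end{cases}$$

For inliers, the gradient of the original loss is $x$. It now becomes $\alpha \ln(b|x|+1)$, so linear growth turns into logarithmic growth. The smaller $\alpha$ is, the more the inlier gradients are raised.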

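The corresponding loss is therefore:

$$L_b(x) = \begin{cases} \frac{\alpha}{b}(b|x| + 1)\ln(b|x| + 1) - \alpha|x| & \text{if } |x| < 1 \\ \gamma|x| + C & \text{otherwise} \end{cases}$$

Differentiating it gives back the gradient formula above, where the parameters satisfy

$$\alpha \ln(b + 1) = \gamma$$

In the experiments the authors set $\alpha$ to 0.5 and $\gamma$ to 1.5.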
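The loss and gradient described above can be sketched in plain Python and checked numerically. Two points are assumptions of this sketch and are not stated in these notes: $b$ is solved from the constraint $\alpha \ln(b+1) = \gamma$, and the constant $C$ is set to $\gamma / b - \alpha$ so that the two branches meet at $|x| = 1$. The function names are hypothetical.

```python
import math


def balanced_l1(x: float, alpha: float = 0.5, gamma: float = 1.5) -> float:
    """Balanced L1 loss of one regression error x (inlier/outlier split at |x| = 1)."""
    b = math.exp(gamma / alpha) - 1.0  # from alpha * ln(b + 1) = gamma
    ax = abs(x)
    if ax < 1.0:
        return alpha / b * (b * ax + 1.0) * math.log(b * ax + 1.0) - alpha * ax
    # Assumed constant C = gamma / b - alpha keeps the loss continuous at |x| = 1.
    return gamma * ax + gamma / b - alpha


def balanced_l1_grad(x: float, alpha: float = 0.5, gamma: float = 1.5) -> float:
    """Gradient magnitude w.r.t. |x|: logarithmic for inliers, capped at gamma for outliers."""
    b = math.exp(gamma / alpha) - 1.0
    ax = abs(x)
    return alpha * math.log(b * ax + 1.0) if ax < 1.0 else gamma


# Central-difference check that the loss really differentiates to the gradient formula.
h = 1e-6
numeric_grad = (balanced_l1(0.3 + h) - balanced_l1(0.3 - h)) / (2 * h)
analytic_grad = balanced_l1_grad(0.3)
```

For any inlier error, `balanced_l1_grad` returns a larger value than the Smooth L1 gradient `min(|x|, 1)`, while outliers stay capped at $\gamma$. The inliers therefore gain weight without the outlier gradients being left unbounded.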

4 Experiments

4.1 Datasets

COCO data

train-2017: training
val-2017: ablation results
test-dev: testing

4.2 Main Results

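1）Head-to-head comparison with state-of-the-art detectors

Table 3 shows that switching to a more advanced backbone brings less improvement than the proposed IoU-balanced sampling.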

4.3 Ablation Experiments

The ablation does not report further combinations of the components.

1）IoU-balanced Sampling

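The paper also describes how positive samples are handled: in short, their sampling also moves away from purely random selection. Figure 6 shows that more hard negative samples are captured (the positives are covered by the negatives in the plot, so the change for them cannot be seen).

Sampling an equal number of positive samples for each ground truth is called Pos Balance. Positives are scarce to begin with (few candidates exceed the IoU threshold), so random sampling above the threshold already captures most of them, which is why the gain is small.

The result is not sensitive to the number of bins: 2, 3 and 5 differ little, as long as the hard negatives are more likely selected.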

2）Balanced Feature Pyramid

3）Balanced L1 Loss

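The gain from balanced L1 loss comes mainly from AP75, which indicates a substantial improvement in localization accuracy.

Note that with loss weight = 2.0 the result actually gets worse. The authors explain it as follows: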

These results show that the outliers bring negative influence on the training process, and leave the potential of model architecture from being fully exploited.

5 Conclusion（own）

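Moving from random sampling to stratified sampling improves the sampling of hard negatives, and the selection of positives is adjusted in the same spirit
The inflection point of Smooth L1 loss turns out to be carefully designed (outlier vs. inlier), which gave me a much deeper understanding of what tuning the classification and localization loss weights really changes. Everything should be reasoned from the loss and its gradients!
The importance of hard negatives is shown quantitatively, which is great
For the FPN improvement, the "shadow clone" implementation details really cannot be worked out from the paper alone; thanks to 2019CVPR Libra RCNN目標檢測算法（特徵融合）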

【Libra R-CNN】《Libra R-CNN: Towards Balanced Learning for Object Detection》
