1 Introduction
In object detection, classification and bounding box regression are the two core tasks, so the choice of loss function has a large impact on model performance. Classification losses are not covered here (Focal Loss, which balances positive/negative samples and hard/easy examples, is already in wide use). This post therefore focuses on bounding box regression losses, following their evolution: SmoothL1 → IoU → GIoU → DIoU → CIoU.
2 Bounding Box Regression Loss Functions
2.1 Smooth L1
Smooth L1 was first used in Ross Girshick's Fast R-CNN. Letting x denote the difference between a predicted and a ground-truth coordinate, it is defined piecewise: SmoothL1(x) = 0.5x² when |x| < 1, and |x| − 0.5 otherwise. When the absolute difference exceeds 1 the loss is linear, so its derivative is constant and large differences cannot cause exploding gradients; when the difference is below 1 the loss is quadratic, so the gradient shrinks near zero and the model can converge to higher precision.
2.1.1 Smooth L1 PyTorch Implementation
import torch

def Smooth_L1(predict_box, gt_box, th=1.0, reduction="mean"):
    '''
    predict_box: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    gt_box:      [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    th: float, threshold between the quadratic and linear pieces
    reduction: "mean" or "sum"
    return: loss
    '''
    # element-wise absolute coordinate differences
    x_diff = torch.abs(predict_box - gt_box)
    print("x_diff:\n", x_diff)
    # torch.where(condition, x, y): picks x where the condition holds, y elsewhere
    loss = torch.where(x_diff < th, 0.5 * x_diff * x_diff, x_diff - 0.5)
    print("loss_first_stage:\n", loss)
    if reduction == "mean":
        loss = torch.mean(loss)
    elif reduction == "sum":
        loss = torch.sum(loss)
    print("loss_last_stage:\n", loss)
    return loss

if __name__ == "__main__":
    pred_box = torch.tensor([[2, 4, 6, 8], [5, 9, 13, 12]])
    gt_box = torch.tensor([[3, 4, 7, 9]])
    loss = Smooth_L1(predict_box=pred_box, gt_box=gt_box)
    # Output:
    """
    x_diff:
    tensor([[1, 0, 1, 1],
            [2, 5, 6, 3]])
    loss_first_stage:
    tensor([[0.5000, 0.0000, 0.5000, 0.5000],
            [1.5000, 4.5000, 5.5000, 2.5000]])
    loss_last_stage:
    tensor(1.9375)
    """
2.1.2 Drawbacks of Smooth L1
NMS and AP computation both evaluate detection boxes by IoU, so Smooth L1 is only loosely correlated with the evaluation metric: a pair of boxes with the same Smooth L1 value can have different IoUs. To address this, IoU itself was adopted as a loss function, described next.
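To make the mismatch concrete, here is a small numeric check (the boxes and helper functions are my own illustration, using the plain area convention without the +1 offset): two predictions with the same total Smooth L1 loss against one ground-truth box have clearly different IoUs.

```python
import torch

gt = torch.tensor([0., 0., 10., 10.])
pred_a = torch.tensor([1., 1., 11., 11.])    # shifted diagonally by 1 pixel
pred_b = torch.tensor([0., 0., 10., 12.5])   # stretched on one side only

def smooth_l1(p, g, th=1.0):
    d = torch.abs(p - g)
    return torch.where(d < th, 0.5 * d * d, d - 0.5).sum()

def iou(p, g):
    x1, y1 = torch.max(p[0], g[0]), torch.max(p[1], g[1])
    x2, y2 = torch.min(p[2], g[2]), torch.min(p[3], g[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    union = (p[2] - p[0]) * (p[3] - p[1]) + (g[2] - g[0]) * (g[3] - g[1]) - inter
    return inter / union

print(smooth_l1(pred_a, gt), smooth_l1(pred_b, gt))  # both 2.0
print(iou(pred_a, gt), iou(pred_b, gt))              # ~0.68 vs 0.80
```

An optimizer minimizing Smooth L1 sees these two predictions as equally good, even though the evaluation metric does not.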
2.2 IoU
IoU is the intersection over union: the ratio of the area where the predicted and ground-truth boxes overlap to the area of their union, IoU = |A ∩ B| / |A ∪ B|. To use it as a loss, the implementation below outputs the negative log of the IoU (1 − IoU is another common choice).
2.2.1 IoU PyTorch Implementation
import torch

def Iou_loss(preds, bbox, eps=1e-6, reduction='mean'):
    '''
    preds: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    bbox:  [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    eps: small constant to avoid log(0)
    reduction: "mean" or "sum"
    return: loss
    '''
    # intersection rectangle
    x1 = torch.max(preds[:, 0], bbox[:, 0])
    y1 = torch.max(preds[:, 1], bbox[:, 1])
    x2 = torch.min(preds[:, 2], bbox[:, 2])
    y2 = torch.min(preds[:, 3], bbox[:, 3])
    w = (x2 - x1 + 1.0).clamp(min=0.)
    h = (y2 - y1 + 1.0).clamp(min=0.)
    inters = w * h
    print("inters:\n", inters)
    # union area (the +1.0 follows the inclusive pixel-coordinate convention)
    uni = ((preds[:, 2] - preds[:, 0] + 1.0) * (preds[:, 3] - preds[:, 1] + 1.0)
           + (bbox[:, 2] - bbox[:, 0] + 1.0) * (bbox[:, 3] - bbox[:, 1] + 1.0)
           - inters)
    print("uni:\n", uni)
    ious = (inters / uni).clamp(min=eps)
    loss = -ious.log()
    if reduction == 'mean':
        loss = torch.mean(loss)
    elif reduction == 'sum':
        loss = torch.sum(loss)
    else:
        raise NotImplementedError
    print("last_loss:\n", loss)
    return loss

if __name__ == "__main__":
    pred_box = torch.tensor([[2, 4, 6, 8], [5, 9, 13, 12]])
    gt_box = torch.tensor([[3, 4, 7, 9]])
    loss = Iou_loss(preds=pred_box, bbox=gt_box)
    # Output:
    """
    inters:
    tensor([20., 3.])
    uni:
    tensor([35., 63.])
    last_loss:
    tensor(1.8021)
    """
2.2.2 Drawbacks of IoU
When the predicted and ground-truth boxes do not overlap at all, the IoU is 0, so the loss provides no gradient over a large part of the search space. GIoU was proposed to address this.
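The dead-gradient problem can be demonstrated directly (illustrative boxes and a minimal iou helper of my own, plain area convention): any non-overlapping prediction gives IoU = 0, so a nearby miss and a faraway miss produce the same loss and an all-zero gradient.

```python
import torch

def iou(p, g):
    x1, y1 = torch.max(p[0], g[0]), torch.max(p[1], g[1])
    x2, y2 = torch.min(p[2], g[2]), torch.min(p[3], g[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    union = (p[2] - p[0]) * (p[3] - p[1]) + (g[2] - g[0]) * (g[3] - g[1]) - inter
    return inter / union

gt = torch.tensor([0., 0., 10., 10.])
near = torch.tensor([11., 0., 21., 10.], requires_grad=True)   # 1 pixel away
far = torch.tensor([100., 0., 110., 10.], requires_grad=True)  # 90 pixels away
for p in (near, far):
    loss = 1 - iou(p, gt)
    loss.backward()
    print(loss.item(), p.grad)  # loss 1.0 and a zero gradient in both cases
```

The zero comes from `clamp(min=0)`: once the intersection is clipped to zero, no gradient flows back to the box coordinates.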
2.3 GIoU
GIoU Loss first computes the smallest enclosing region C (the smallest box containing both the predicted and ground-truth boxes), then the fraction of C not covered by either box, and subtracts that fraction from the IoU: GIoU = IoU − |C \ (A ∪ B)| / |C|, with 1 − GIoU used as the loss. This yields a meaningful distance even when the boxes do not overlap, and it distinguishes different ways of overlapping. Paper: https://arxiv.org/pdf/1902.09630.pdf
2.3.1 GIoU PyTorch Implementation
import torch

def Giou_loss(preds, bbox, eps=1e-7, reduction='mean'):
    '''
    https://github.com/sfzhang15/ATSS/blob/master/atss_core/modeling/rpn/atss/loss.py#L36
    :param preds: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    :param bbox:  [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    :param eps: small constant to avoid division by zero
    :param reduction: "mean" or "sum"
    :return: loss
    '''
    # intersection rectangle
    ix1 = torch.max(preds[:, 0], bbox[:, 0])
    iy1 = torch.max(preds[:, 1], bbox[:, 1])
    ix2 = torch.min(preds[:, 2], bbox[:, 2])
    iy2 = torch.min(preds[:, 3], bbox[:, 3])
    iw = (ix2 - ix1 + 1.0).clamp(min=0.)
    ih = (iy2 - iy1 + 1.0).clamp(min=0.)
    # overlap
    inters = iw * ih
    print("inters:\n", inters)
    # union
    uni = ((preds[:, 2] - preds[:, 0] + 1.0) * (preds[:, 3] - preds[:, 1] + 1.0)
           + (bbox[:, 2] - bbox[:, 0] + 1.0) * (bbox[:, 3] - bbox[:, 1] + 1.0)
           - inters + eps)
    print("uni:\n", uni)
    # ious
    ious = inters / uni
    print("Iou:\n", ious)
    # smallest enclosing box
    ex1 = torch.min(preds[:, 0], bbox[:, 0])
    ey1 = torch.min(preds[:, 1], bbox[:, 1])
    ex2 = torch.max(preds[:, 2], bbox[:, 2])
    ey2 = torch.max(preds[:, 3], bbox[:, 3])
    ew = (ex2 - ex1 + 1.0).clamp(min=0.)
    eh = (ey2 - ey1 + 1.0).clamp(min=0.)
    # enclose area
    enclose = ew * eh + eps
    print("enclose:\n", enclose)
    giou = ious - (enclose - uni) / enclose
    loss = 1 - giou
    if reduction == 'mean':
        loss = torch.mean(loss)
    elif reduction == 'sum':
        loss = torch.sum(loss)
    else:
        raise NotImplementedError
    print("last_loss:\n", loss)
    return loss

if __name__ == "__main__":
    pred_box = torch.tensor([[2, 4, 6, 8], [5, 9, 13, 12]])
    gt_box = torch.tensor([[3, 4, 7, 9]])
    loss = Giou_loss(preds=pred_box, bbox=gt_box)
    # Output:
    """
    inters:
    tensor([20., 3.])
    uni:
    tensor([35., 63.])
    Iou:
    tensor([0.5714, 0.0476])
    enclose:
    tensor([36., 99.])
    last_loss:
    tensor(0.8862)
    """
2.3.2 Drawbacks of GIoU
As shown in the figure below, when the predicted box contains the ground-truth box (or vice versa), the enclosing region equals the union, so GIoU degenerates to IoU and cannot distinguish the different ways one box can sit inside the other. DIoU and CIoU were proposed to address this.
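A quick check of this degeneration (my own boxes and minimal giou helper, plain area convention): when the prediction lies fully inside the ground truth, the enclosing box equals the ground-truth box, the GIoU penalty term vanishes, and two very different placements get identical GIoU.

```python
import torch

def giou(p, g):
    x1, y1 = torch.max(p[0], g[0]), torch.max(p[1], g[1])
    x2, y2 = torch.min(p[2], g[2]), torch.min(p[3], g[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    union = (p[2] - p[0]) * (p[3] - p[1]) + (g[2] - g[0]) * (g[3] - g[1]) - inter
    iou = inter / union
    # smallest enclosing box
    ex1, ey1 = torch.min(p[0], g[0]), torch.min(p[1], g[1])
    ex2, ey2 = torch.max(p[2], g[2]), torch.max(p[3], g[3])
    enclose = (ex2 - ex1) * (ey2 - ey1)
    return iou - (enclose - union) / enclose

gt = torch.tensor([0., 0., 10., 10.])
inner_a = torch.tensor([0., 0., 5., 5.])    # tucked into the top-left corner
inner_b = torch.tensor([5., 5., 10., 10.])  # tucked into the bottom-right corner
print(giou(inner_a, gt), giou(inner_b, gt))  # identical: 0.2500 for both
```

In both cases the enclosing box is exactly `gt`, so `enclose == union` and GIoU reduces to plain IoU.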
2.4 DIoU and CIoU
DIoU and CIoU were proposed together in the same paper (https://arxiv.org/pdf/1911.08287.pdf). The authors compare them against IoU and GIoU: (1) IoU and GIoU converge slowly, so they need more training iterations; (2) in special cases (one box containing the other), IoU and GIoU still have high error rates, as shown in the figure below.
The paper's contributions can be summarized in four points:
(1) DIoU is more efficient than GIoU and makes the model converge faster.
(2) CIoU further benefits accuracy by considering three important geometric factors: overlap area, center distance, and aspect ratio, and so describes box regression better.
(3) DIoU can be plugged directly into the NMS post-processing step, where it performs better than the IoU used in standard NMS.
(4) DIoU and CIoU are highly portable across detectors.
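Point (3) can be sketched as a drop-in change to greedy NMS: keep the highest-scoring box, then suppress the remaining boxes whose DIoU with it exceeds a threshold. The function name, threshold value, and plain area convention below are my own choices, not the paper's reference code.

```python
import torch

def diou_nms(boxes, scores, thresh=0.5):
    """Greedy NMS with the IoU test replaced by DIoU (a sketch).

    boxes: [N, 4] as (x1, y1, x2, y2); scores: [N]."""
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0].item()
        keep.append(i)
        if order.numel() == 1:
            break
        top, rest = boxes[i], boxes[order[1:]]
        # IoU of the top-scoring box against all remaining boxes
        x1 = torch.max(top[0], rest[:, 0])
        y1 = torch.max(top[1], rest[:, 1])
        x2 = torch.min(top[2], rest[:, 2])
        y2 = torch.min(top[3], rest[:, 3])
        inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
        union = ((top[2] - top[0]) * (top[3] - top[1])
                 + (rest[:, 2] - rest[:, 0]) * (rest[:, 3] - rest[:, 1]) - inter)
        iou = inter / union
        # DIoU penalty: squared center distance over squared enclosing diagonal
        cd = ((top[0] + top[2]) / 2 - (rest[:, 0] + rest[:, 2]) / 2) ** 2 \
           + ((top[1] + top[3]) / 2 - (rest[:, 1] + rest[:, 3]) / 2) ** 2
        od = (torch.max(top[2], rest[:, 2]) - torch.min(top[0], rest[:, 0])) ** 2 \
           + (torch.max(top[3], rest[:, 3]) - torch.min(top[1], rest[:, 1])) ** 2
        diou = iou - cd / od
        order = order[1:][diou <= thresh]  # drop boxes the top box suppresses
    return keep

boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.],
                      [50., 50., 60., 60.]])
scores = torch.tensor([0.9, 0.8, 0.7])
print(diou_nms(boxes, scores))  # box 1, a near-duplicate of box 0, is suppressed: [0, 2]
```

Because the center-distance penalty lowers the DIoU of overlapping boxes with distant centers, this variant is less likely to suppress a genuine neighboring object than IoU-based NMS.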
DIoU is computed as: DIoU = IoU − ρ²(b, b^gt) / c², where b and b^gt are the centers of the predicted and ground-truth boxes, ρ(·) is the Euclidean distance (so ρ² is the squared center distance), and c is the diagonal length of the smallest box enclosing both.
2.4.1 DIoU PyTorch Implementation
import torch

def Diou_loss(preds, bbox, eps=1e-7, reduction='mean'):
    '''
    preds: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    bbox:  [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    eps: small constant to avoid division by zero
    reduction: "mean" or "sum"
    return: diou loss
    '''
    # intersection rectangle
    ix1 = torch.max(preds[:, 0], bbox[:, 0])
    iy1 = torch.max(preds[:, 1], bbox[:, 1])
    ix2 = torch.min(preds[:, 2], bbox[:, 2])
    iy2 = torch.min(preds[:, 3], bbox[:, 3])
    iw = (ix2 - ix1 + 1.0).clamp(min=0.)
    ih = (iy2 - iy1 + 1.0).clamp(min=0.)
    # overlaps
    inters = iw * ih
    # union
    uni = ((preds[:, 2] - preds[:, 0] + 1.0) * (preds[:, 3] - preds[:, 1] + 1.0)
           + (bbox[:, 2] - bbox[:, 0] + 1.0) * (bbox[:, 3] - bbox[:, 1] + 1.0)
           - inters)
    # iou
    iou = inters / (uni + eps)
    print("iou:\n", iou)
    # squared distance between the two box centers
    cxpreds = (preds[:, 2] + preds[:, 0]) / 2
    cypreds = (preds[:, 3] + preds[:, 1]) / 2
    cxbbox = (bbox[:, 2] + bbox[:, 0]) / 2
    cybbox = (bbox[:, 3] + bbox[:, 1]) / 2
    inter_diag = (cxbbox - cxpreds) ** 2 + (cybbox - cypreds) ** 2
    print("inter_diag:\n", inter_diag)
    # squared diagonal of the smallest enclosing box
    ox1 = torch.min(preds[:, 0], bbox[:, 0])
    oy1 = torch.min(preds[:, 1], bbox[:, 1])
    ox2 = torch.max(preds[:, 2], bbox[:, 2])
    oy2 = torch.max(preds[:, 3], bbox[:, 3])
    outer_diag = (ox1 - ox2) ** 2 + (oy1 - oy2) ** 2
    print("outer_diag:\n", outer_diag)
    diou = iou - inter_diag / outer_diag
    diou = torch.clamp(diou, min=-1.0, max=1.0)
    diou_loss = 1 - diou
    print("last_loss:\n", diou_loss)
    if reduction == 'mean':
        loss = torch.mean(diou_loss)
    elif reduction == 'sum':
        loss = torch.sum(diou_loss)
    else:
        raise NotImplementedError
    return loss

if __name__ == "__main__":
    # float tensors, so the center coordinates are not truncated by integer division
    pred_box = torch.tensor([[2., 4., 6., 8.], [5., 9., 13., 12.]])
    gt_box = torch.tensor([[3., 4., 7., 9.]])
    loss = Diou_loss(preds=pred_box, bbox=gt_box)
    # Output (per-element loss is printed before the reduction):
    """
    iou:
    tensor([0.5714, 0.0476])
    inter_diag:
    tensor([ 1.2500, 32.0000])
    outer_diag:
    tensor([ 50., 164.])
    last_loss:
    tensor([0.4536, 1.1475])
    """
CIoU improves on DIoU: DIoU considers only the center distance and the overlap area, not the aspect ratio. CIoU adds an aspect-ratio penalty:
CIoU = IoU − ρ²(b, b^gt)/c² − αv, where v = (4/π²)(arctan(w^gt/h^gt) − arctan(w/h))² measures the aspect-ratio difference and α = v / ((1 − IoU) + v) is its weight.
2.4.2 CIoU PyTorch Implementation
import math
import torch

def Ciou_loss(preds, bbox, eps=1e-7, reduction='mean'):
    '''
    https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/loss/multibox_loss.py
    :param preds: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    :param bbox:  [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    :param eps: small constant to avoid division by zero
    :param reduction: "mean" or "sum"
    :return: ciou loss
    '''
    # intersection rectangle
    ix1 = torch.max(preds[:, 0], bbox[:, 0])
    iy1 = torch.max(preds[:, 1], bbox[:, 1])
    ix2 = torch.min(preds[:, 2], bbox[:, 2])
    iy2 = torch.min(preds[:, 3], bbox[:, 3])
    iw = (ix2 - ix1 + 1.0).clamp(min=0.)
    ih = (iy2 - iy1 + 1.0).clamp(min=0.)
    # overlaps
    inters = iw * ih
    # union
    uni = ((preds[:, 2] - preds[:, 0] + 1.0) * (preds[:, 3] - preds[:, 1] + 1.0)
           + (bbox[:, 2] - bbox[:, 0] + 1.0) * (bbox[:, 3] - bbox[:, 1] + 1.0)
           - inters)
    # iou
    iou = inters / (uni + eps)
    print("iou:\n", iou)
    # squared distance between the two box centers
    cxpreds = (preds[:, 2] + preds[:, 0]) / 2
    cypreds = (preds[:, 3] + preds[:, 1]) / 2
    cxbbox = (bbox[:, 2] + bbox[:, 0]) / 2
    cybbox = (bbox[:, 3] + bbox[:, 1]) / 2
    inter_diag = (cxbbox - cxpreds) ** 2 + (cybbox - cypreds) ** 2
    # squared diagonal of the smallest enclosing box
    ox1 = torch.min(preds[:, 0], bbox[:, 0])
    oy1 = torch.min(preds[:, 1], bbox[:, 1])
    ox2 = torch.max(preds[:, 2], bbox[:, 2])
    oy2 = torch.max(preds[:, 3], bbox[:, 3])
    outer_diag = (ox1 - ox2) ** 2 + (oy1 - oy2) ** 2
    diou = iou - inter_diag / outer_diag
    print("diou:\n", diou)
    # aspect-ratio penalty v and its weight alpha
    wbbox = bbox[:, 2] - bbox[:, 0] + 1.0
    hbbox = bbox[:, 3] - bbox[:, 1] + 1.0
    wpreds = preds[:, 2] - preds[:, 0] + 1.0
    hpreds = preds[:, 3] - preds[:, 1] + 1.0
    v = torch.pow((torch.atan(wbbox / hbbox) - torch.atan(wpreds / hpreds)), 2) * (4 / (math.pi ** 2))
    alpha = v / (1 - iou + v)  # the paper treats alpha as a constant during backprop
    ciou = diou - alpha * v
    ciou = torch.clamp(ciou, min=-1.0, max=1.0)
    ciou_loss = 1 - ciou
    if reduction == 'mean':
        loss = torch.mean(ciou_loss)
    elif reduction == 'sum':
        loss = torch.sum(ciou_loss)
    else:
        raise NotImplementedError
    print("last_loss:\n", loss)
    return loss

if __name__ == "__main__":
    # float tensors, so the center coordinates are not truncated by integer division
    pred_box = torch.tensor([[2., 4., 6., 8.], [5., 9., 13., 12.]])
    gt_box = torch.tensor([[3., 4., 7., 9.]])
    loss = Ciou_loss(preds=pred_box, bbox=gt_box)
    # Output:
    """
    iou:
    tensor([0.5714, 0.0476])
    diou:
    tensor([ 0.5464, -0.1475])
    last_loss:
    tensor(0.8040)
    """
2.4.3 Drawbacks of DIoU and CIoU
From my reading of the paper, DIoU and CIoU still leave some special cases unresolved. In the figure below, red is the ground-truth box and blue the prediction. When the two boxes share a center, DIoU degenerates to IoU; CIoU handles that case well thanks to its aspect-ratio penalty, but when the boxes share a center and also have the same aspect ratio, CIoU degenerates to IoU too. I therefore think the α·v penalty term of CIoU is the place to start when designing a new penalty that avoids this case, and I would like to call such a future box regression loss Yiyi_Iou.
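The concentric, equal-aspect-ratio failure case is easy to verify numerically (my own boxes and minimal ciou helper, plain area convention): both the center-distance term and the aspect-ratio term are exactly zero, so CIoU collapses to plain IoU.

```python
import math
import torch

def ciou(p, g):
    x1, y1 = torch.max(p[0], g[0]), torch.max(p[1], g[1])
    x2, y2 = torch.min(p[2], g[2]), torch.min(p[3], g[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    union = (p[2] - p[0]) * (p[3] - p[1]) + (g[2] - g[0]) * (g[3] - g[1]) - inter
    iou = inter / union
    # center-distance penalty: squared center gap over squared enclosing diagonal
    cd = ((p[0] + p[2]) / 2 - (g[0] + g[2]) / 2) ** 2 \
       + ((p[1] + p[3]) / 2 - (g[1] + g[3]) / 2) ** 2
    od = (torch.max(p[2], g[2]) - torch.min(p[0], g[0])) ** 2 \
       + (torch.max(p[3], g[3]) - torch.min(p[1], g[1])) ** 2
    # aspect-ratio penalty
    v = (4 / math.pi ** 2) * (torch.atan((g[2] - g[0]) / (g[3] - g[1]))
                              - torch.atan((p[2] - p[0]) / (p[3] - p[1]))) ** 2
    alpha = v / (1 - iou + v)
    return iou - cd / od - alpha * v

gt = torch.tensor([0., 0., 20., 10.])     # 2:1 aspect ratio
pred = torch.tensor([5., 2.5, 15., 7.5])  # same center, also 2:1 ratio
print(ciou(pred, gt))  # 0.2500, exactly the plain IoU
```

Any new penalty term would have to stay nonzero for this configuration, for example by depending on the absolute width/height gap rather than only the width-to-height ratio.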
3 Summary
I hope this article has been helpful to you.