1 Introduction
In object detection, classification and bounding box regression are the two core tasks, so the choice of loss function has a large impact on model performance. Classification losses are not covered here (Focal Loss, which balances positive/negative samples and hard/easy examples, is already in wide use). This post therefore focuses on bounding box regression losses, following their evolution: SmoothL1 → IoU → GIoU → DIoU → CIoU.
2 Bounding Box Regression Loss Functions
2.1 Smooth L1
Smooth L1 was first used in Ross Girshick's Fast R-CNN. Letting x denote the difference between a predicted and a ground-truth coordinate, it is defined piecewise: SmoothL1(x) = 0.5x² when |x| < 1, and |x| − 0.5 otherwise. When the absolute difference exceeds 1 the loss is linear, so its derivative is constant and large differences cannot cause exploding gradients; when the difference is below 1 the loss is quadratic, so the gradient shrinks near zero and the model can converge to higher precision.
2.1.1 Smooth L1 PyTorch Implementation
import torch

def Smooth_L1(predict_box, gt_box, th=1.0, reduction="mean"):
    '''
    predict_box: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    gt_box:      [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    th: float, threshold between the quadratic and linear pieces
    reduction: "mean" or "sum"
    return: loss
    '''
    # element-wise absolute coordinate differences
    x_diff = torch.abs(predict_box - gt_box)
    print("x_diff:\n", x_diff)
    # torch.where(condition, x, y): picks x where the condition holds, y elsewhere
    loss = torch.where(x_diff < th, 0.5 * x_diff * x_diff, x_diff - 0.5)
    print("loss_first_stage:\n", loss)
    if reduction == "mean":
        loss = torch.mean(loss)
    elif reduction == "sum":
        loss = torch.sum(loss)
    print("loss_last_stage:\n", loss)
    return loss

if __name__ == "__main__":
    pred_box = torch.tensor([[2, 4, 6, 8], [5, 9, 13, 12]])
    gt_box = torch.tensor([[3, 4, 7, 9]])
    loss = Smooth_L1(predict_box=pred_box, gt_box=gt_box)
    # Output:
    """
    x_diff:
    tensor([[1, 0, 1, 1],
            [2, 5, 6, 3]])
    loss_first_stage:
    tensor([[0.5000, 0.0000, 0.5000, 0.5000],
            [1.5000, 4.5000, 5.5000, 2.5000]])
    loss_last_stage:
    tensor(1.9375)
    """
2.1.2 Drawbacks of Smooth L1
NMS and AP computation both evaluate detection boxes by IoU, so Smooth L1 is only loosely correlated with the evaluation metric: a pair of boxes with the same Smooth L1 value can have different IoUs. To address this, IoU itself was adopted as a loss function, described next.
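To make the mismatch concrete, here is a small numeric check (the boxes and helper functions are my own illustration, using the plain area convention without the +1 offset): two predictions with the same total Smooth L1 loss against one ground-truth box have clearly different IoUs.

```python
import torch

gt = torch.tensor([0., 0., 10., 10.])
pred_a = torch.tensor([1., 1., 11., 11.])    # shifted diagonally by 1 pixel
pred_b = torch.tensor([0., 0., 10., 12.5])   # stretched on one side only

def smooth_l1(p, g, th=1.0):
    d = torch.abs(p - g)
    return torch.where(d < th, 0.5 * d * d, d - 0.5).sum()

def iou(p, g):
    x1, y1 = torch.max(p[0], g[0]), torch.max(p[1], g[1])
    x2, y2 = torch.min(p[2], g[2]), torch.min(p[3], g[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    union = (p[2] - p[0]) * (p[3] - p[1]) + (g[2] - g[0]) * (g[3] - g[1]) - inter
    return inter / union

print(smooth_l1(pred_a, gt), smooth_l1(pred_b, gt))  # both 2.0
print(iou(pred_a, gt), iou(pred_b, gt))              # ~0.68 vs 0.80
```

An optimizer minimizing Smooth L1 sees these two predictions as equally good, even though the evaluation metric does not.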
2.2 IoU
IoU is the intersection over union: the ratio of the area where the predicted and ground-truth boxes overlap to the area of their union, IoU = |A ∩ B| / |A ∪ B|. To use it as a loss, the implementation below outputs the negative log of the IoU (1 − IoU is another common choice).
2.2.1 IoU PyTorch Implementation
import torch

def Iou_loss(preds, bbox, eps=1e-6, reduction='mean'):
    '''
    preds: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    bbox:  [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    eps: small constant to avoid log(0)
    reduction: "mean" or "sum"
    return: loss
    '''
    # intersection rectangle
    x1 = torch.max(preds[:, 0], bbox[:, 0])
    y1 = torch.max(preds[:, 1], bbox[:, 1])
    x2 = torch.min(preds[:, 2], bbox[:, 2])
    y2 = torch.min(preds[:, 3], bbox[:, 3])
    w = (x2 - x1 + 1.0).clamp(min=0.)
    h = (y2 - y1 + 1.0).clamp(min=0.)
    inters = w * h
    print("inters:\n", inters)
    # union area (the +1.0 follows the inclusive pixel-coordinate convention)
    uni = ((preds[:, 2] - preds[:, 0] + 1.0) * (preds[:, 3] - preds[:, 1] + 1.0)
           + (bbox[:, 2] - bbox[:, 0] + 1.0) * (bbox[:, 3] - bbox[:, 1] + 1.0)
           - inters)
    print("uni:\n", uni)
    ious = (inters / uni).clamp(min=eps)
    loss = -ious.log()
    if reduction == 'mean':
        loss = torch.mean(loss)
    elif reduction == 'sum':
        loss = torch.sum(loss)
    else:
        raise NotImplementedError
    print("last_loss:\n", loss)
    return loss

if __name__ == "__main__":
    pred_box = torch.tensor([[2, 4, 6, 8], [5, 9, 13, 12]])
    gt_box = torch.tensor([[3, 4, 7, 9]])
    loss = Iou_loss(preds=pred_box, bbox=gt_box)
    # Output:
    """
    inters:
    tensor([20., 3.])
    uni:
    tensor([35., 63.])
    last_loss:
    tensor(1.8021)
    """
2.2.2 Drawbacks of IoU
When the predicted and ground-truth boxes do not overlap at all, the IoU is 0, so the loss provides no gradient over a large part of the search space. GIoU was proposed to address this.
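The dead-gradient problem can be demonstrated directly (illustrative boxes and a minimal iou helper of my own, plain area convention): any non-overlapping prediction gives IoU = 0, so a nearby miss and a faraway miss produce the same loss and an all-zero gradient.

```python
import torch

def iou(p, g):
    x1, y1 = torch.max(p[0], g[0]), torch.max(p[1], g[1])
    x2, y2 = torch.min(p[2], g[2]), torch.min(p[3], g[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    union = (p[2] - p[0]) * (p[3] - p[1]) + (g[2] - g[0]) * (g[3] - g[1]) - inter
    return inter / union

gt = torch.tensor([0., 0., 10., 10.])
near = torch.tensor([11., 0., 21., 10.], requires_grad=True)   # 1 pixel away
far = torch.tensor([100., 0., 110., 10.], requires_grad=True)  # 90 pixels away
for p in (near, far):
    loss = 1 - iou(p, gt)
    loss.backward()
    print(loss.item(), p.grad)  # loss 1.0 and a zero gradient in both cases
```

The zero comes from `clamp(min=0)`: once the intersection is clipped to zero, no gradient flows back to the box coordinates.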
2.3 GIoU
GIoU Loss first computes the smallest enclosing region C (the smallest box containing both the predicted and ground-truth boxes), then the fraction of C not covered by either box, and subtracts that fraction from the IoU: GIoU = IoU − |C \ (A ∪ B)| / |C|, with 1 − GIoU used as the loss. This yields a meaningful distance even when the boxes do not overlap, and it distinguishes different ways of overlapping. Paper: https://arxiv.org/pdf/1902.09630.pdf
2.3.1 GIoU PyTorch Implementation
import torch

def Giou_loss(preds, bbox, eps=1e-7, reduction='mean'):
    '''
    https://github.com/sfzhang15/ATSS/blob/master/atss_core/modeling/rpn/atss/loss.py#L36
    :param preds: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    :param bbox:  [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    :param eps: small constant to avoid division by zero
    :param reduction: "mean" or "sum"
    :return: loss
    '''
    # intersection rectangle
    ix1 = torch.max(preds[:, 0], bbox[:, 0])
    iy1 = torch.max(preds[:, 1], bbox[:, 1])
    ix2 = torch.min(preds[:, 2], bbox[:, 2])
    iy2 = torch.min(preds[:, 3], bbox[:, 3])
    iw = (ix2 - ix1 + 1.0).clamp(min=0.)
    ih = (iy2 - iy1 + 1.0).clamp(min=0.)
    # overlap
    inters = iw * ih
    print("inters:\n", inters)
    # union
    uni = ((preds[:, 2] - preds[:, 0] + 1.0) * (preds[:, 3] - preds[:, 1] + 1.0)
           + (bbox[:, 2] - bbox[:, 0] + 1.0) * (bbox[:, 3] - bbox[:, 1] + 1.0)
           - inters + eps)
    print("uni:\n", uni)
    # ious
    ious = inters / uni
    print("Iou:\n", ious)
    # smallest enclosing box
    ex1 = torch.min(preds[:, 0], bbox[:, 0])
    ey1 = torch.min(preds[:, 1], bbox[:, 1])
    ex2 = torch.max(preds[:, 2], bbox[:, 2])
    ey2 = torch.max(preds[:, 3], bbox[:, 3])
    ew = (ex2 - ex1 + 1.0).clamp(min=0.)
    eh = (ey2 - ey1 + 1.0).clamp(min=0.)
    # enclose area
    enclose = ew * eh + eps
    print("enclose:\n", enclose)
    giou = ious - (enclose - uni) / enclose
    loss = 1 - giou
    if reduction == 'mean':
        loss = torch.mean(loss)
    elif reduction == 'sum':
        loss = torch.sum(loss)
    else:
        raise NotImplementedError
    print("last_loss:\n", loss)
    return loss

if __name__ == "__main__":
    pred_box = torch.tensor([[2, 4, 6, 8], [5, 9, 13, 12]])
    gt_box = torch.tensor([[3, 4, 7, 9]])
    loss = Giou_loss(preds=pred_box, bbox=gt_box)
    # Output:
    """
    inters:
    tensor([20., 3.])
    uni:
    tensor([35., 63.])
    Iou:
    tensor([0.5714, 0.0476])
    enclose:
    tensor([36., 99.])
    last_loss:
    tensor(0.8862)
    """
2.3.2 Drawbacks of GIoU
As shown in the figure below, when the predicted box contains the ground-truth box (or vice versa), the enclosing region equals the union, so GIoU degenerates to IoU and cannot distinguish the different ways one box can sit inside the other. DIoU and CIoU were proposed to address this.
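A quick check of this degeneration (my own boxes and minimal giou helper, plain area convention): when the prediction lies fully inside the ground truth, the enclosing box equals the ground-truth box, the GIoU penalty term vanishes, and two very different placements get identical GIoU.

```python
import torch

def giou(p, g):
    x1, y1 = torch.max(p[0], g[0]), torch.max(p[1], g[1])
    x2, y2 = torch.min(p[2], g[2]), torch.min(p[3], g[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    union = (p[2] - p[0]) * (p[3] - p[1]) + (g[2] - g[0]) * (g[3] - g[1]) - inter
    iou = inter / union
    # smallest enclosing box
    ex1, ey1 = torch.min(p[0], g[0]), torch.min(p[1], g[1])
    ex2, ey2 = torch.max(p[2], g[2]), torch.max(p[3], g[3])
    enclose = (ex2 - ex1) * (ey2 - ey1)
    return iou - (enclose - union) / enclose

gt = torch.tensor([0., 0., 10., 10.])
inner_a = torch.tensor([0., 0., 5., 5.])    # tucked into the top-left corner
inner_b = torch.tensor([5., 5., 10., 10.])  # tucked into the bottom-right corner
print(giou(inner_a, gt), giou(inner_b, gt))  # identical: 0.2500 for both
```

In both cases the enclosing box is exactly `gt`, so `enclose == union` and GIoU reduces to plain IoU.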
2.4 DIoU and CIoU
DIoU and CIoU were proposed together in the same paper (https://arxiv.org/pdf/1911.08287.pdf). The authors compare them against IoU and GIoU: (1) IoU and GIoU converge slowly, so they need more training iterations; (2) in special cases (one box containing the other), IoU and GIoU still have high error rates, as shown in the figure below.
The paper's contributions can be summarized in four points:
(1) DIoU is more efficient than GIoU and makes the model converge faster.
(2) CIoU further benefits accuracy by considering three important geometric factors: overlap area, center distance, and aspect ratio, and so describes box regression better.
(3) DIoU can be plugged directly into the NMS post-processing step, where it performs better than the IoU used in standard NMS.
(4) DIoU and CIoU are highly portable across detectors.
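Point (3) can be sketched as a drop-in change to greedy NMS: keep the highest-scoring box, then suppress the remaining boxes whose DIoU with it exceeds a threshold. The function name, threshold value, and plain area convention below are my own choices, not the paper's reference code.

```python
import torch

def diou_nms(boxes, scores, thresh=0.5):
    """Greedy NMS with the IoU test replaced by DIoU (a sketch).

    boxes: [N, 4] as (x1, y1, x2, y2); scores: [N]."""
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        i = order[0].item()
        keep.append(i)
        if order.numel() == 1:
            break
        top, rest = boxes[i], boxes[order[1:]]
        # IoU of the top-scoring box against all remaining boxes
        x1 = torch.max(top[0], rest[:, 0])
        y1 = torch.max(top[1], rest[:, 1])
        x2 = torch.min(top[2], rest[:, 2])
        y2 = torch.min(top[3], rest[:, 3])
        inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
        union = ((top[2] - top[0]) * (top[3] - top[1])
                 + (rest[:, 2] - rest[:, 0]) * (rest[:, 3] - rest[:, 1]) - inter)
        iou = inter / union
        # DIoU penalty: squared center distance over squared enclosing diagonal
        cd = ((top[0] + top[2]) / 2 - (rest[:, 0] + rest[:, 2]) / 2) ** 2 \
           + ((top[1] + top[3]) / 2 - (rest[:, 1] + rest[:, 3]) / 2) ** 2
        od = (torch.max(top[2], rest[:, 2]) - torch.min(top[0], rest[:, 0])) ** 2 \
           + (torch.max(top[3], rest[:, 3]) - torch.min(top[1], rest[:, 1])) ** 2
        diou = iou - cd / od
        order = order[1:][diou <= thresh]  # drop boxes the top box suppresses
    return keep

boxes = torch.tensor([[0., 0., 10., 10.],
                      [1., 1., 11., 11.],
                      [50., 50., 60., 60.]])
scores = torch.tensor([0.9, 0.8, 0.7])
print(diou_nms(boxes, scores))  # box 1, a near-duplicate of box 0, is suppressed: [0, 2]
```

Because the center-distance penalty lowers the DIoU of overlapping boxes with distant centers, this variant is less likely to suppress a genuine neighboring object than IoU-based NMS.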
DIoU is computed as: DIoU = IoU − ρ²(b, b^gt) / c², where b and b^gt are the centers of the predicted and ground-truth boxes, ρ(·) is the Euclidean distance (so ρ² is the squared center distance), and c is the diagonal length of the smallest box enclosing both.
2.4.1 DIoU PyTorch Implementation
import torch

def Diou_loss(preds, bbox, eps=1e-7, reduction='mean'):
    '''
    preds: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    bbox:  [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    eps: small constant to avoid division by zero
    reduction: "mean" or "sum"
    return: diou loss
    '''
    # intersection rectangle
    ix1 = torch.max(preds[:, 0], bbox[:, 0])
    iy1 = torch.max(preds[:, 1], bbox[:, 1])
    ix2 = torch.min(preds[:, 2], bbox[:, 2])
    iy2 = torch.min(preds[:, 3], bbox[:, 3])
    iw = (ix2 - ix1 + 1.0).clamp(min=0.)
    ih = (iy2 - iy1 + 1.0).clamp(min=0.)
    # overlaps
    inters = iw * ih
    # union
    uni = ((preds[:, 2] - preds[:, 0] + 1.0) * (preds[:, 3] - preds[:, 1] + 1.0)
           + (bbox[:, 2] - bbox[:, 0] + 1.0) * (bbox[:, 3] - bbox[:, 1] + 1.0)
           - inters)
    # iou
    iou = inters / (uni + eps)
    print("iou:\n", iou)
    # squared distance between the two box centers
    cxpreds = (preds[:, 2] + preds[:, 0]) / 2
    cypreds = (preds[:, 3] + preds[:, 1]) / 2
    cxbbox = (bbox[:, 2] + bbox[:, 0]) / 2
    cybbox = (bbox[:, 3] + bbox[:, 1]) / 2
    inter_diag = (cxbbox - cxpreds) ** 2 + (cybbox - cypreds) ** 2
    print("inter_diag:\n", inter_diag)
    # squared diagonal of the smallest enclosing box
    ox1 = torch.min(preds[:, 0], bbox[:, 0])
    oy1 = torch.min(preds[:, 1], bbox[:, 1])
    ox2 = torch.max(preds[:, 2], bbox[:, 2])
    oy2 = torch.max(preds[:, 3], bbox[:, 3])
    outer_diag = (ox1 - ox2) ** 2 + (oy1 - oy2) ** 2
    print("outer_diag:\n", outer_diag)
    diou = iou - inter_diag / outer_diag
    diou = torch.clamp(diou, min=-1.0, max=1.0)
    diou_loss = 1 - diou
    print("last_loss:\n", diou_loss)
    if reduction == 'mean':
        loss = torch.mean(diou_loss)
    elif reduction == 'sum':
        loss = torch.sum(diou_loss)
    else:
        raise NotImplementedError
    return loss

if __name__ == "__main__":
    # float tensors, so the center coordinates are not truncated by integer division
    pred_box = torch.tensor([[2., 4., 6., 8.], [5., 9., 13., 12.]])
    gt_box = torch.tensor([[3., 4., 7., 9.]])
    loss = Diou_loss(preds=pred_box, bbox=gt_box)
    # Output (per-element loss is printed before the reduction):
    """
    iou:
    tensor([0.5714, 0.0476])
    inter_diag:
    tensor([ 1.2500, 32.0000])
    outer_diag:
    tensor([ 50., 164.])
    last_loss:
    tensor([0.4536, 1.1475])
    """
CIoU improves on DIoU: DIoU considers only the center distance and the overlap area, not the aspect ratio. CIoU adds an aspect-ratio penalty:
CIoU = IoU − ρ²(b, b^gt)/c² − αv, where v = (4/π²)(arctan(w^gt/h^gt) − arctan(w/h))² measures the aspect-ratio difference and α = v / ((1 − IoU) + v) is its weight.
2.4.2 CIoU PyTorch Implementation
import math
import torch

def Ciou_loss(preds, bbox, eps=1e-7, reduction='mean'):
    '''
    https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/loss/multibox_loss.py
    :param preds: [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    :param bbox:  [[x1,y1,x2,y2], [x1,y1,x2,y2], ...]
    :param eps: small constant to avoid division by zero
    :param reduction: "mean" or "sum"
    :return: ciou loss
    '''
    # intersection rectangle
    ix1 = torch.max(preds[:, 0], bbox[:, 0])
    iy1 = torch.max(preds[:, 1], bbox[:, 1])
    ix2 = torch.min(preds[:, 2], bbox[:, 2])
    iy2 = torch.min(preds[:, 3], bbox[:, 3])
    iw = (ix2 - ix1 + 1.0).clamp(min=0.)
    ih = (iy2 - iy1 + 1.0).clamp(min=0.)
    # overlaps
    inters = iw * ih
    # union
    uni = ((preds[:, 2] - preds[:, 0] + 1.0) * (preds[:, 3] - preds[:, 1] + 1.0)
           + (bbox[:, 2] - bbox[:, 0] + 1.0) * (bbox[:, 3] - bbox[:, 1] + 1.0)
           - inters)
    # iou
    iou = inters / (uni + eps)
    print("iou:\n", iou)
    # squared distance between the two box centers
    cxpreds = (preds[:, 2] + preds[:, 0]) / 2
    cypreds = (preds[:, 3] + preds[:, 1]) / 2
    cxbbox = (bbox[:, 2] + bbox[:, 0]) / 2
    cybbox = (bbox[:, 3] + bbox[:, 1]) / 2
    inter_diag = (cxbbox - cxpreds) ** 2 + (cybbox - cypreds) ** 2
    # squared diagonal of the smallest enclosing box
    ox1 = torch.min(preds[:, 0], bbox[:, 0])
    oy1 = torch.min(preds[:, 1], bbox[:, 1])
    ox2 = torch.max(preds[:, 2], bbox[:, 2])
    oy2 = torch.max(preds[:, 3], bbox[:, 3])
    outer_diag = (ox1 - ox2) ** 2 + (oy1 - oy2) ** 2
    diou = iou - inter_diag / outer_diag
    print("diou:\n", diou)
    # aspect-ratio penalty v and its weight alpha
    wbbox = bbox[:, 2] - bbox[:, 0] + 1.0
    hbbox = bbox[:, 3] - bbox[:, 1] + 1.0
    wpreds = preds[:, 2] - preds[:, 0] + 1.0
    hpreds = preds[:, 3] - preds[:, 1] + 1.0
    v = torch.pow((torch.atan(wbbox / hbbox) - torch.atan(wpreds / hpreds)), 2) * (4 / (math.pi ** 2))
    alpha = v / (1 - iou + v)  # the paper treats alpha as a constant during backprop
    ciou = diou - alpha * v
    ciou = torch.clamp(ciou, min=-1.0, max=1.0)
    ciou_loss = 1 - ciou
    if reduction == 'mean':
        loss = torch.mean(ciou_loss)
    elif reduction == 'sum':
        loss = torch.sum(ciou_loss)
    else:
        raise NotImplementedError
    print("last_loss:\n", loss)
    return loss

if __name__ == "__main__":
    # float tensors, so the center coordinates are not truncated by integer division
    pred_box = torch.tensor([[2., 4., 6., 8.], [5., 9., 13., 12.]])
    gt_box = torch.tensor([[3., 4., 7., 9.]])
    loss = Ciou_loss(preds=pred_box, bbox=gt_box)
    # Output:
    """
    iou:
    tensor([0.5714, 0.0476])
    diou:
    tensor([ 0.5464, -0.1475])
    last_loss:
    tensor(0.8040)
    """
2.4.3 Drawbacks of DIoU and CIoU
From my reading of the paper, DIoU and CIoU still leave some special cases unresolved. In the figure below, red is the ground-truth box and blue the prediction. When the two boxes share a center, DIoU degenerates to IoU; CIoU handles that case well thanks to its aspect-ratio penalty, but when the boxes share a center and also have the same aspect ratio, CIoU degenerates to IoU too. I therefore think the α·v penalty term of CIoU is the place to start when designing a new penalty that avoids this case, and I would like to call such a future box regression loss Yiyi_Iou.
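The concentric, equal-aspect-ratio failure case is easy to verify numerically (my own boxes and minimal ciou helper, plain area convention): both the center-distance term and the aspect-ratio term are exactly zero, so CIoU collapses to plain IoU.

```python
import math
import torch

def ciou(p, g):
    x1, y1 = torch.max(p[0], g[0]), torch.max(p[1], g[1])
    x2, y2 = torch.min(p[2], g[2]), torch.min(p[3], g[3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    union = (p[2] - p[0]) * (p[3] - p[1]) + (g[2] - g[0]) * (g[3] - g[1]) - inter
    iou = inter / union
    # center-distance penalty: squared center gap over squared enclosing diagonal
    cd = ((p[0] + p[2]) / 2 - (g[0] + g[2]) / 2) ** 2 \
       + ((p[1] + p[3]) / 2 - (g[1] + g[3]) / 2) ** 2
    od = (torch.max(p[2], g[2]) - torch.min(p[0], g[0])) ** 2 \
       + (torch.max(p[3], g[3]) - torch.min(p[1], g[1])) ** 2
    # aspect-ratio penalty
    v = (4 / math.pi ** 2) * (torch.atan((g[2] - g[0]) / (g[3] - g[1]))
                              - torch.atan((p[2] - p[0]) / (p[3] - p[1]))) ** 2
    alpha = v / (1 - iou + v)
    return iou - cd / od - alpha * v

gt = torch.tensor([0., 0., 20., 10.])     # 2:1 aspect ratio
pred = torch.tensor([5., 2.5, 15., 7.5])  # same center, also 2:1 ratio
print(ciou(pred, gt))  # 0.2500, exactly the plain IoU
```

Any new penalty term would have to stay nonzero for this configuration, for example by depending on the absolute width/height gap rather than only the width-to-height ratio.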
3 Summary
I hope this article has been helpful to you.