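Loss design is one of the most important parts of deep learning, and in object detection there is no getting around IoU and its variants. Early detectors used a distance between bbox coordinates as the regression loss; in recent years IoU variants have been used directly as the training loss, which noticeably improves accuracy and makes it easier for the network to learn box positions.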
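IoU (Intersection over Union)
- The area of the intersection of two rects divided by the area of their union; very intuitive.
- Not suitable as a loss on its own: it is 0 for any two rects that do not overlap, however far apart they are, so it gives the network nothing to learn from in that case.
- Before GIoU appeared, a bbox distance was still generally used as the loss.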
import tensorflow as tf

# Written as a method, hence the unused `self`. Boxes are (cx, cy, w, h).
def bbox_iou(self, boxes1, boxes2):
    # Areas from width * height.
    boxes1_area = boxes1[..., 2] * boxes1[..., 3]
    boxes2_area = boxes2[..., 2] * boxes2[..., 3]

    # Convert (cx, cy, w, h) to corners (x_min, y_min, x_max, y_max).
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    # Intersection rectangle; clamp at 0 so disjoint boxes give zero area.
    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]

    union_area = boxes1_area + boxes2_area - inter_area
    iou = 1.0 * inter_area / union_area
    return iou
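To make the "not suitable as a loss" point concrete, here is a minimal plain-Python sketch. It assumes corner-format boxes `(x1, y1, x2, y2)` with positive area rather than the `(cx, cy, w, h)` format used above, and the helper name `iou_xyxy` and the sample boxes are made up for illustration. It shows that a box just next to the target and a box far away both get exactly the same IoU of 0.

```python
def iou_xyxy(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Width and height of the intersection, clamped at 0 for disjoint boxes.
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


target = (0.0, 0.0, 2.0, 2.0)
overlapping = (1.0, 0.0, 3.0, 2.0)   # shares a 1 x 2 strip with the target
near = (3.0, 0.0, 5.0, 2.0)          # disjoint, 1 unit away
far = (30.0, 0.0, 32.0, 2.0)         # disjoint, 28 units away

iou_overlapping = iou_xyxy(target, overlapping)  # 2 / 6
iou_near = iou_xyxy(target, near)                # 0.0
iou_far = iou_xyxy(target, far)                  # 0.0, same as the near box
print(iou_overlapping, iou_near, iou_far)
```

Since `near` and `far` score identically, a loss of `1 - IoU` cannot tell the network which direction to move a box that does not yet touch its target.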
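CVPR 2019, GIoU: https://arxiv.org/pdf/1902.09630.pdf
- GIoU = IoU - (area of the smallest enclosing rect - union area) / area of the smallest enclosing rect
- Using 1 - GIoU in place of the bbox distance as the loss gives better results on the datasets.
- The value range is (-1, 1]. When two rects do not overlap, IoU is 0, but GIoU is non-zero and keeps decreasing as the rects move apart, which is why it works as a loss.
- It reflects how close two rects intuitively look better than IoU does; most importantly, it improves accuracy.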
# Requires: import tensorflow as tf. Boxes are (cx, cy, w, h).
def bbox_giou(self, boxes1, boxes2):
    # Convert (cx, cy, w, h) to corners (x_min, y_min, x_max, y_max).
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    # Order each corner pair so a negative predicted w or h cannot flip the box.
    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    # Plain IoU.
    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    iou = inter_area / union_area

    # Smallest rect enclosing both boxes.
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose = tf.maximum(enclose_right_down - enclose_left_up, 0.0)
    enclose_area = enclose[..., 0] * enclose[..., 1]

    # Penalize the part of the enclosing rect not covered by the union.
    giou = iou - 1.0 * (enclose_area - union_area) / enclose_area
    return giou
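As a quick numerical check of the GIoU behaviour described above, the following plain-Python sketch assumes corner-format boxes `(x1, y1, x2, y2)` with positive area; the name `giou_xyxy` and the sample boxes are hypothetical. For two disjoint boxes the value is negative and moves toward -1 as the gap grows, so `1 - GIoU` still distinguishes a near miss from a far one.

```python
def giou_xyxy(a, b):
    """GIoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    # Area of the smallest rect enclosing both boxes.
    enclose = (max(a[2], b[2]) - min(a[0], b[0])) * (max(a[3], b[3]) - min(a[1], b[1]))
    return iou - (enclose - union) / enclose


target = (0.0, 0.0, 2.0, 2.0)
near = (3.0, 0.0, 5.0, 2.0)     # enclosing rect is 5 x 2, union is 8
far = (30.0, 0.0, 32.0, 2.0)    # enclosing rect is 32 x 2, union is 8

giou_same = giou_xyxy(target, target)  # 1.0: identical boxes
giou_near = giou_xyxy(target, near)    # 0 - (10 - 8) / 10
giou_far = giou_xyxy(target, far)      # 0 - (64 - 8) / 64
print(giou_same, giou_near, giou_far)
```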
Rezatofighi H, Tsoi N, Gwak J Y, et al. Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression[J]. 2019.
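AAAI 2020, DIoU/CIoU: https://arxiv.org/pdf/1911.08287.pdf
- DIoU = IoU - (distance between the two centers / diagonal length of the smallest enclosing rect)²
- As a loss it converges faster and improves accuracy further.
- In short, just use it.
- CIoU goes one step further and takes the aspect ratio into account, adding it as an extra penalty term.
The last row (of the results table) corresponds to training with CIoU as the loss and using DIoU for NMS.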
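Using DIoU for NMS means replacing the IoU overlap test in greedy NMS with DIoU, so that two heavily overlapping boxes whose centers are far apart are less likely to suppress each other. Below is a minimal plain-Python sketch of that idea, not the paper's reference implementation. It assumes corner-format boxes `(x1, y1, x2, y2)` with positive area, and the function names, the 0.5 threshold, and the sample data are all made up for illustration.

```python
def diou_xyxy(a, b):
    """DIoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    iou = inter / union
    # Squared distance between the two box centers.
    dx = (a[0] + a[2]) / 2.0 - (b[0] + b[2]) / 2.0
    dy = (a[1] + a[3]) / 2.0 - (b[1] + b[3]) / 2.0
    # Squared diagonal of the smallest enclosing rect.
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    return iou - (dx * dx + dy * dy) / (cw * cw + ch * ch)


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box when its DIoU with a kept box >= threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)          # highest-scoring remaining box
        keep.append(best)
        order = [i for i in order if diou_xyxy(boxes[best], boxes[i]) < threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = diou_nms(boxes, scores)
print(kept)  # [0, 2]: box 1 nearly coincides with box 0 and is suppressed
```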
# Requires: import tensorflow as tf. Boxes are (cx, cy, w, h).
def bbox_diou_ciou(self, boxes1, boxes2):
    # Offset between the two box centers.
    center_vec = boxes1[..., :2] - boxes2[..., :2]
    # Aspect-ratio term v = (4 / pi^2) * (atan(w1/h1) - atan(w2/h2))^2;
    # the constant is approximately 4 / pi^2.
    v_ = 0.4052847483961759 * (tf.atan(boxes1[..., 2] / boxes1[..., 3])
                               - tf.atan(boxes2[..., 2] / boxes2[..., 3])) ** 2

    # Convert (cx, cy, w, h) to corners (x_min, y_min, x_max, y_max).
    boxes1 = tf.concat([boxes1[..., :2] - boxes1[..., 2:] * 0.5,
                        boxes1[..., :2] + boxes1[..., 2:] * 0.5], axis=-1)
    boxes2 = tf.concat([boxes2[..., :2] - boxes2[..., 2:] * 0.5,
                        boxes2[..., :2] + boxes2[..., 2:] * 0.5], axis=-1)

    # Order each corner pair so a negative predicted w or h cannot flip the box.
    boxes1 = tf.concat([tf.minimum(boxes1[..., :2], boxes1[..., 2:]),
                        tf.maximum(boxes1[..., :2], boxes1[..., 2:])], axis=-1)
    boxes2 = tf.concat([tf.minimum(boxes2[..., :2], boxes2[..., 2:]),
                        tf.maximum(boxes2[..., :2], boxes2[..., 2:])], axis=-1)

    boxes1_area = (boxes1[..., 2] - boxes1[..., 0]) * (boxes1[..., 3] - boxes1[..., 1])
    boxes2_area = (boxes2[..., 2] - boxes2[..., 0]) * (boxes2[..., 3] - boxes2[..., 1])

    # Plain IoU.
    left_up = tf.maximum(boxes1[..., :2], boxes2[..., :2])
    right_down = tf.minimum(boxes1[..., 2:], boxes2[..., 2:])
    inter_section = tf.maximum(right_down - left_up, 0.0)
    inter_area = inter_section[..., 0] * inter_section[..., 1]
    union_area = boxes1_area + boxes2_area - inter_area
    iou = inter_area / union_area

    # Smallest rect enclosing both boxes.
    enclose_left_up = tf.minimum(boxes1[..., :2], boxes2[..., :2])
    enclose_right_down = tf.maximum(boxes1[..., 2:], boxes2[..., 2:])
    enclose_vec = enclose_right_down - enclose_left_up

    # DIoU uses the SQUARED center distance over the SQUARED enclosing diagonal.
    center_dis_sq = center_vec[..., 0] ** 2 + center_vec[..., 1] ** 2
    enclose_dis_sq = enclose_vec[..., 0] ** 2 + enclose_vec[..., 1] ** 2
    diou = iou - 1.0 * center_dis_sq / enclose_dis_sq

    # Trade-off weight alpha = v / ((1 - IoU) + v). This is 0 / 0 when
    # iou == 1 and v_ == 0; add a small epsilon if that case can occur.
    a_ = v_ / (1 - iou + v_)
    ciou = diou - a_ * v_
    return diou, ciou
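To see how the two penalty terms behave, here is a small plain-Python worked example. It keeps the `(cx, cy, w, h)` box format used above; the function name `diou_ciou_cxcywh` and the sample boxes are hypothetical, and boxes are assumed to have positive width and height. When two boxes share the same aspect ratio the CIoU penalty vanishes and CIoU equals DIoU; when the ratios differ, CIoU is strictly lower.

```python
import math


def diou_ciou_cxcywh(a, b):
    """Return (DIoU, CIoU) for two boxes given as (cx, cy, w, h) with w, h > 0."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    iou = inter / (a[2] * a[3] + b[2] * b[3] - inter)
    # Squared center distance over squared diagonal of the enclosing rect.
    center_sq = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    diag_sq = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2
    diou = iou - center_sq / diag_sq
    # Aspect-ratio term v and its weight alpha; alpha is set to 0 when v is 0
    # to avoid 0 / 0 for identical boxes (a choice made for this sketch).
    v = (4.0 / math.pi ** 2) * (math.atan(a[2] / a[3]) - math.atan(b[2] / b[3])) ** 2
    alpha = v / (1.0 - iou + v) if v > 0.0 else 0.0
    return diou, diou - alpha * v


# Two 2 x 2 boxes, centers 1 unit apart: IoU = 1/3, penalty = 1/13.
same_ratio = diou_ciou_cxcywh((1.0, 1.0, 2.0, 2.0), (2.0, 1.0, 2.0, 2.0))
# A 2 x 2 box against a 4 x 1 box: IoU = 1/3, penalty = 1/20, plus a ratio penalty.
diff_ratio = diou_ciou_cxcywh((1.0, 1.0, 2.0, 2.0), (2.0, 1.0, 4.0, 1.0))
print(same_ratio, diff_ratio)
```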
Zheng Z, Wang P, Liu W, et al. Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression[J]. 2019.
References:
https://zhuanlan.zhihu.com/p/94799295
https://github.com/YunYang1994/tensorflow-yolov3