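Non-Maximum Suppression

NMS is short for non-maximum suppression.

Put simply, the model outputs several candidate boxes that overlap one another, and we only need to keep one of them. The other overlapping candidates are deleted. The figure below shows the effect:

Intersection over Union

IoU stands for Intersection over Union: the area of the intersection of two candidate boxes divided by the area of their union. The figure below illustrates it: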
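To make the definition concrete, here is a minimal pure-Python sketch of IoU for two axis-aligned boxes. The function name `iou` and the `(x1, y1, x2, y2)` corner layout are assumptions made for this illustration, not something the post prescribes.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2), with x1 < x2 and y1 < y2."""
    # Corners of the overlap rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so that disjoint boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union


a = (0.0, 0.0, 2.0, 2.0)  # area 4
b = (1.0, 1.0, 3.0, 3.0)  # area 4, overlaps a in a 1x1 square
print(iou(a, b))          # intersection 1, union 4 + 4 - 1 = 7
```

Identical boxes give an IoU of 1 and disjoint boxes give 0, which is why a single threshold between 0 and 1 is enough to decide whether two boxes overlap "too much".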

hard-NMS

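hard-NMS is simply the classic version of NMS. Sort the boxes by the confidence the model assigns to each one, from high to low, keep the box with the highest confidence, and delete all other candidate boxes whose IoU with it exceeds a threshold.

Here is an example with 4 candidate boxes:
(box1,0.8),(box2,0.9),
(box3,0.7),(box4,0.5)

Sort the four candidates by confidence from high to low:
box2>box1>box3>box4

Keep box2, which has the highest confidence, and compute the IoU between box2 and each of the remaining three boxes. Any box whose IoU is greater than a preset threshold is deleted. Suppose the threshold is 0.5:
IoU(box1,box2)=0.1<0.5, keep; IoU(box3,box2)=0.7>0.5, delete; IoU(box4,box2)=0.2<0.5, keep.

Now box1 and box4 remain. Repeat the same process on them: sort, keep the top box, and delete the boxes that overlap it too much.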
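The walkthrough above can be traced with a few lines of plain Python. This is only a sketch: the helper `greedy_nms` and the IoU lookup table are hypothetical, and since the example does not state IoU(box1,box4), the value 0.0 used here is an assumption.

```python
scores = {"box1": 0.8, "box2": 0.9, "box3": 0.7, "box4": 0.5}

# Pairwise IoU values. The three pairs involving box2 come from the example;
# IoU(box1, box4) is not given there and is assumed to be 0.0.
pair_iou = {
    frozenset(("box1", "box2")): 0.1,
    frozenset(("box3", "box2")): 0.7,
    frozenset(("box4", "box2")): 0.2,
    frozenset(("box1", "box4")): 0.0,  # assumption for illustration
}


def greedy_nms(scores, pair_iou, iou_threshold):
    """Keep the best box, drop boxes overlapping it too much, repeat."""
    remaining = sorted(scores, key=scores.get, reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)  # highest confidence still in play
        kept.append(best)
        remaining = [
            name for name in remaining
            if pair_iou[frozenset((best, name))] <= iou_threshold
        ]
    return kept


kept = greedy_nms(scores, pair_iou, iou_threshold=0.5)
print(kept)
```

Under that assumption, box3 is removed in the first round and box1 and box4 both survive the second.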

Below is a Python implementation of hard-NMS:

import torch


def hard_nms(box_scores, iou_threshold, top_k=-1, candidate_size=200):
    """
    Args:
        box_scores (N, 5): the set of boxes; N is the number of boxes, 5 is 4 (corner coordinates) + 1 (probability that the box contains an object)
        iou_threshold: IoU threshold used to remove redundant detections
        top_k: how many of the surviving boxes to keep; -1 keeps them all
        candidate_size: number of boxes that take part in the computation
    Returns:
         picked: the boxes kept after NMS
    """
    scores = box_scores[:, -1]                # last element of each row: probability that the box contains an object
    boxes = box_scores[:, :-1]                # the four coordinates of each box (top-left, bottom-right)
    picked = []
    _, indexes = scores.sort(descending=True) # sort the scores in descending order; indexes are positions in the original tensor
    indexes = indexes[:candidate_size]        # only the top candidate_size boxes are considered
    while len(indexes) > 0:
        current = indexes[0]                  # index of the highest-scoring box still in indexes
        picked.append(current.item())         # store it in the result
        if 0 < top_k == len(picked) or len(indexes) == 1:
            break
        current_box = boxes[current, :]       # the current highest-scoring box
        indexes = indexes[1:]
        rest_boxes = boxes[indexes, :]        # all remaining boxes
        iou = iou_of(                         # IoU between the current box and each remaining box
            rest_boxes,
            current_box.unsqueeze(0),
        )
        indexes = indexes[iou <= iou_threshold]  # keep only boxes whose IoU with the current box does not exceed the threshold

    return box_scores[picked, :]

How are the box areas and the IoU computed? The implementation is below:

def area_of(left_top, right_bottom) -> torch.Tensor:
    """Compute the areas of rectangles given two corners.

    Args:
        left_top (N, 2): left top corner.
        right_bottom (N, 2): right bottom corner.

    Returns:
        area (N): return the area.
    """
    hw = torch.clamp(right_bottom - left_top, min=0.0)
    return hw[..., 0] * hw[..., 1]


def iou_of(boxes0, boxes1, eps=1e-5):
    """Return intersection-over-union (Jaccard index) of boxes.

    Args:
        boxes0 (N, 4): ground truth boxes.
        boxes1 (N or 1, 4): predicted boxes.
        eps: a small number to avoid 0 as denominator.
    Returns:
        iou (N): IoU values.
    """
    overlap_left_top = torch.max(boxes0[..., :2], boxes1[..., :2])
    overlap_right_bottom = torch.min(boxes0[..., 2:], boxes1[..., 2:])

    overlap_area = area_of(overlap_left_top, overlap_right_bottom)
    area0 = area_of(boxes0[..., :2], boxes0[..., 2:])
    area1 = area_of(boxes1[..., :2], boxes1[..., 2:])
    return overlap_area / (area0 + area1 - overlap_area + eps)

soft-NMS

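In dense object detection, hard-NMS runs into a problem. Consider the example below:

Two objects overlap, but hard-NMS would delete the green box.

Soft-NMS changes just one thing. When comparing the highest-confidence box with the other boxes by IoU, it applies a coefficient to their scores, which gives a better way of deciding which boxes are really redundant.

For hard-NMS, a box is kept when \(iou(M,b_i)<N_t\) and deleted when the IoU is greater than or equal to \(N_t\); \(s\) denotes the confidence:

\[s_i = \begin{cases} s_i, & iou(M,b_i) < N_t \\ 0, & iou(M,b_i) \ge N_t \end{cases}\]

For soft-NMS, a box is kept unchanged when \(iou(M,b_i)<N_t\), and its confidence is reduced when the IoU is greater than or equal to \(N_t\):

\[s_i = \begin{cases} s_i, & iou(M,b_i) < N_t \\ s_i(1-iou(M,b_i)), & iou(M,b_i) \ge N_t \end{cases}\]

As these rules show, hard-NMS sets the confidence of any candidate box whose IoU exceeds the threshold straight to 0, which amounts to deleting the box. Soft-NMS instead reduces the confidence by an amount that depends on the IoU, leaving the box some room to survive.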

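【How to reduce the confidence】
There are two ways to lower the confidence of overlapping candidate boxes:

\(s=s(1-iou(M,b))\): simple linear decay;
\(s = se^{-\frac{iou(M,b)^2}{\sigma}}\): exponential (Gaussian) decay, where \(\sigma\) is a constant, typically 0.5.

The second method is more common.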
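To get a feel for how the two decay rules behave, the sketch below evaluates both for a box with confidence 0.8 at a few IoU values. The function names `linear_decay` and `gaussian_decay` are hypothetical, chosen only for this illustration.

```python
import math


def linear_decay(score, iou):
    """s = s * (1 - iou): the score drops in proportion to the overlap."""
    return score * (1.0 - iou)


def gaussian_decay(score, iou, sigma=0.5):
    """s = s * exp(-iou^2 / sigma): gentle for small overlaps, stronger for large ones."""
    return score * math.exp(-(iou ** 2) / sigma)


# Compare the two rules for the same starting confidence.
for overlap in (0.0, 0.3, 0.7, 1.0):
    print(overlap,
          round(linear_decay(0.8, overlap), 3),
          round(gaussian_decay(0.8, overlap), 3))
```

Both rules leave the score untouched at an IoU of 0. At an IoU of 1 the linear rule drives the score to 0, while the exponential rule with sigma 0.5 still leaves a small positive value, so whether the box is finally dropped depends on the score threshold.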

Below is a Python implementation of soft-NMS. The essential difference from hard-NMS is a single line of code:

def soft_nms(box_scores, score_threshold, sigma=0.5, top_k=-1):
    """Soft NMS implementation.

    References:
        https://arxiv.org/abs/1704.04503
        https://github.com/facebookresearch/Detectron/blob/master/detectron/utils/cython_nms.pyx

    Args:
        box_scores (N, 5): boxes in corner-form and probabilities.
        score_threshold: boxes with scores less than value are not considered.
        sigma: the parameter in score re-computation.
            scores[i] = scores[i] * exp(-(iou_i)^2 / sigma)
        top_k: keep top_k results. If k <= 0, keep all the results.
    Returns:
         picked_box_scores (K, 5): results of NMS.
    """
    box_scores = box_scores.clone()  # work on a copy so the caller's tensor is not modified in place
    picked_box_scores = []
    while box_scores.size(0) > 0:
        max_score_index = torch.argmax(box_scores[:, 4])
        cur_box_prob = box_scores[max_score_index, :].clone()  # copy the row before it is overwritten below
        picked_box_scores.append(cur_box_prob)
        if len(picked_box_scores) == top_k > 0 or box_scores.size(0) == 1:
            break
        cur_box = cur_box_prob[:-1]
        # Remove the picked row by moving the last row into its slot and dropping the last row.
        box_scores[max_score_index, :] = box_scores[-1, :]
        box_scores = box_scores[:-1, :]
        ious = iou_of(cur_box.unsqueeze(0), box_scores[:, :-1])

        # This is the line soft-NMS adds: rather than deleting overlapping boxes
        # outright as hard-NMS does, it decays their scores according to the IoU.
        box_scores[:, -1] = box_scores[:, -1] * torch.exp(-(ious * ious) / sigma)

        # Drop boxes whose decayed score no longer exceeds the score threshold.
        box_scores = box_scores[box_scores[:, -1] > score_threshold, :]
    if len(picked_box_scores) > 0:
        return torch.stack(picked_box_scores)
    else:
        return torch.tensor([])
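To see the practical difference on the overlapping-objects case, here is a small dependency-free sketch that runs both strategies on two heavily overlapping detections. The helpers `iou`, `hard_nms_py` and `soft_nms_py`, the box coordinates, and the thresholds are all hypothetical values chosen for illustration; they are simplified stand-ins, not the torch implementations above.

```python
import math


def iou(a, b):
    """IoU of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def hard_nms_py(dets, iou_threshold):
    """dets: list of (box, score). Overlapping boxes are deleted outright."""
    rest = sorted(dets, key=lambda d: d[1], reverse=True)
    kept = []
    while rest:
        best = rest.pop(0)
        kept.append(best)
        rest = [d for d in rest if iou(best[0], d[0]) <= iou_threshold]
    return kept


def soft_nms_py(dets, score_threshold, sigma=0.5):
    """dets: list of (box, score). Overlapping boxes have their scores decayed."""
    rest = list(dets)
    kept = []
    while rest:
        best = max(rest, key=lambda d: d[1])
        rest.remove(best)
        kept.append(best)
        # Exponential decay of every remaining score by its IoU with the picked box.
        rescored = [(box, s * math.exp(-(iou(best[0], box) ** 2) / sigma))
                    for box, s in rest]
        rest = [d for d in rescored if d[1] > score_threshold]
    return kept


# Two objects side by side: the boxes share 70 of 130 units of area (IoU about 0.54).
dets = [((0.0, 0.0, 10.0, 10.0), 0.9), ((3.0, 0.0, 13.0, 10.0), 0.8)]
hard = hard_nms_py(dets, iou_threshold=0.5)
soft = soft_nms_py(dets, score_threshold=0.1)
print(len(hard), len(soft))
```

With these numbers the hard variant keeps only the first box, while the soft variant keeps both and merely lowers the second box's score, which is the behaviour the dense-detection example is after.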

