Faster R-CNN code walkthrough (2): anchor generation

Reference implementations for this Faster R-CNN code walkthrough:

https://github.com/adityaarun1/pytorch_fast-er_rcnn

https://github.com/jwyang/faster-rcnn.pytorch

This post looks at anchor generation on its own.

1. Generating the anchors for a single pixel


import numpy as np


def generate_anchors(base_size=16, ratios=[0.5, 1, 2], scales=2**np.arange(3, 6)):
    """
  Generate anchor (reference) windows by enumerating aspect ratios X
  scales wrt a reference (0, 0, 15, 15) window.
  """
    # Reference window (0, 0, base_size - 1, base_size - 1)
    base_anchor = np.array([1, 1, base_size, base_size]) - 1
    # One anchor per aspect ratio, all with roughly the same area
    ratio_anchors = _ratio_enum(base_anchor, ratios)
    # For every ratio anchor, one anchor per scale, stacked row by row
    anchors = np.vstack([
        _scale_enum(ratio_anchors[i, :], scales)
        for i in range(ratio_anchors.shape[0])
    ])
    return anchors

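A few parameters appear here:

base_size: the side length of the reference window, (0, 0, 15, 15) by default. Every anchor is derived from this window.

ratios: the aspect ratios (height / width) applied to the reference window. Each ratio changes the shape while keeping the area roughly equal to base_size * base_size.

scales: the factors by which the width and height of each ratio anchor are then multiplied.

The generated anchors are in (x1, y1, x2, y2) form, that is, top-left and bottom-right corner coordinates.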
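To make these parameters concrete, here is a minimal sketch that reproduces the default anchor set with broadcasting instead of the helper functions. The name anchors_for_one_pixel is hypothetical and does not come from either referenced repository; the sketch assumes the same rounding and the same ratio-major ordering as generate_anchors.

```python
import numpy as np


def anchors_for_one_pixel(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    """Hypothetical compact equivalent of generate_anchors (illustration only)."""
    ratios = np.asarray(ratios, dtype=np.float64)
    scales = np.asarray(scales, dtype=np.float64)
    ctr = 0.5 * (base_size - 1)  # center of the (0, 0, 15, 15) window
    # Aspect-ratio step: keep the area close to base_size**2, set h / w = ratio.
    ws = np.round(np.sqrt(base_size * base_size / ratios))
    hs = np.round(ws * ratios)
    # Scale step: multiply every (w, h) pair by every scale, ratio-major order.
    ws = (ws[:, None] * scales[None, :]).ravel()
    hs = (hs[:, None] * scales[None, :]).ravel()
    # Back to (x1, y1, x2, y2) corners around the shared center.
    return np.stack([ctr - 0.5 * (ws - 1), ctr - 0.5 * (hs - 1),
                     ctr + 0.5 * (ws - 1), ctr + 0.5 * (hs - 1)], axis=1)


anchors = anchors_for_one_pixel()
print(anchors.shape)        # (9, 4)
print(anchors[0].tolist())  # ratio 0.5, scale 8:  [-84.0, -40.0, 99.0, 55.0]
print(anchors[4].tolist())  # ratio 1,   scale 16: [-120.0, -120.0, 135.0, 135.0]
```

With the defaults there are 3 ratios x 3 scales = 9 boxes, all centered on (7.5, 7.5). The negative coordinates are expected: the boxes are much larger than the 16 x 16 reference window they are centered on.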

def _ratio_enum(anchor, ratios):
    """
  Enumerate a set of anchors for each aspect ratio wrt an anchor.
  """
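    w, h, x_ctr, y_ctr = _whctrs(anchor)  # width, height and center of the anchor
    size = w * h                          # area of the anchor
    size_ratios = size / ratios           # area divided by each ratio, i.e. the target w ** 2
    ws = np.round(np.sqrt(size_ratios))   # one width per ratio
    hs = np.round(ws * ratios)            # one height per ratio, so that h / w is about ratio
    # One anchor per (w, h) pair, as top-left and bottom-right corner coordinates
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)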
    return anchors
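The arithmetic inside _ratio_enum can be checked by hand for the default 16 x 16 reference window. The snippet below is only a sketch of that calculation with the values written out; the variable names mirror the function body.

```python
import numpy as np

ratios = np.array([0.5, 1.0, 2.0])
w = h = 16                      # the reference window (0, 0, 15, 15) is 16 x 16
size = w * h                    # 256
size_ratios = size / ratios     # [512., 256., 128.]
ws = np.round(np.sqrt(size_ratios))
hs = np.round(ws * ratios)
print(ws.tolist())              # [23.0, 16.0, 11.0]
print(hs.tolist())              # [12.0, 16.0, 22.0]
print((ws * hs).tolist())       # [276.0, 256.0, 242.0]
```

The areas come out close to 256 rather than exactly 256 because both the width and the height are rounded to whole pixels. Note also that np.round rounds halves to the nearest even value, so 23 * 0.5 = 11.5 becomes 12.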

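_ratio_enum reshapes the base anchor according to ratios: the aspect ratio changes while the area stays roughly the same. _whctrs is nothing special. It converts the top-left and bottom-right coordinates into center + w + h. The center fixes the position, and w and h are the quantities that get transformed.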

def _whctrs(anchor):
    """
  Return width, height, x center, and y center for an anchor (window),
  i.e. the size and center coordinates of the sliding-window anchor.
  """
    w = anchor[2] - anchor[0] + 1
    h = anchor[3] - anchor[1] + 1
    x_ctr = anchor[0] + 0.5 * (w - 1)
    y_ctr = anchor[1] + 0.5 * (h - 1)
    return w, h, x_ctr, y_ctr

_mkanchors converts the center + w + h form back into the anchor form, that is, the top-left and bottom-right corner representation.


def _mkanchors(ws, hs, x_ctr, y_ctr):
    """
  Given a vector of widths (ws) and heights (hs) around a center
  (x_ctr, y_ctr), output a set of anchors (windows).
  Every anchor shares the center (x_ctr, y_ctr); ws and hs give its width and height.
  """
    ws = ws[:, np.newaxis]
    hs = hs[:, np.newaxis]
    anchors = np.hstack((x_ctr - 0.5 * (ws - 1), y_ctr - 0.5 * (hs - 1),
                         x_ctr + 0.5 * (ws - 1), y_ctr + 0.5 * (hs - 1)))
    return anchors
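The two conversions are inverses of each other, which a short round trip makes easy to see. The helper names to_center and to_corners below are hypothetical stand-ins written for a single box; they assume the same pixel-inclusive convention (the + 1 and - 1 terms) as _whctrs and _mkanchors.

```python
import numpy as np


def to_center(box):
    """(x1, y1, x2, y2) -> (w, h, x_ctr, y_ctr), pixel-inclusive like _whctrs."""
    w = box[2] - box[0] + 1
    h = box[3] - box[1] + 1
    return w, h, box[0] + 0.5 * (w - 1), box[1] + 0.5 * (h - 1)


def to_corners(w, h, x_ctr, y_ctr):
    """(w, h, x_ctr, y_ctr) -> (x1, y1, x2, y2), the inverse of to_center."""
    return np.array([x_ctr - 0.5 * (w - 1), y_ctr - 0.5 * (h - 1),
                     x_ctr + 0.5 * (w - 1), y_ctr + 0.5 * (h - 1)])


base = np.array([0, 0, 15, 15])
w, h, x_ctr, y_ctr = to_center(base)
print(w, h, x_ctr, y_ctr)                          # 16 16 7.5 7.5
print(to_corners(w, h, x_ctr, y_ctr).tolist())     # [0.0, 0.0, 15.0, 15.0]
print(to_corners(23, 12, x_ctr, y_ctr).tolist())   # [-3.5, 2.0, 18.5, 13.0]
```

The last line is the ratio-0.5 anchor: a 23 x 12 box placed around the same center (7.5, 7.5) as the reference window.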

_scale_enum does the analogous transformation for scale.

def _scale_enum(anchor, scales):
    """
  Enumerate a set of anchors for each scale wrt an anchor.
  """
    w, h, x_ctr, y_ctr = _whctrs(anchor)  # width, height and center of the anchor
    ws = w * scales  # one width per scale
    hs = h * scales  # one height per scale
    # One anchor per (w, h) pair, as top-left and bottom-right corner coordinates
    anchors = _mkanchors(ws, hs, x_ctr, y_ctr)
    return anchors
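As a quick check of the scale step, the sketch below applies the default scales to the ratio-1 anchor (0, 0, 15, 15) with the arithmetic written out inline. It is an illustration of what _scale_enum computes for that one input, not a replacement for it.

```python
import numpy as np

scales = 2 ** np.arange(3, 6)   # 8, 16, 32
w = h = 16                      # the ratio-1 anchor (0, 0, 15, 15)
x_ctr = y_ctr = 7.5
ws = w * scales                 # 128, 256, 512
hs = h * scales
boxes = np.stack([x_ctr - 0.5 * (ws - 1), y_ctr - 0.5 * (hs - 1),
                  x_ctr + 0.5 * (ws - 1), y_ctr + 0.5 * (hs - 1)], axis=1)
print(boxes.tolist())
# [[-56.0, -56.0, 71.0, 71.0], [-120.0, -120.0, 135.0, 135.0], [-248.0, -248.0, 263.0, 263.0]]
```

Scaling multiplies width and height by the same factor, so the aspect ratio chosen in the previous step is preserved and only the size changes.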

2. Tiling the len(scales) * len(ratios) anchors over the feature map, one set for every pixel, feat_stride apart in the input image

def generate_anchors_pre(height,
                         width,
                         feat_stride,
                         anchor_scales=(8, 16, 32),
                         anchor_ratios=(0.5, 1, 2)):
    """ A wrapper function to generate anchors given different scales
    Also return the number of anchors in variable 'length'
  """
    # Build the standard set of len(anchor_ratios) * len(anchor_scales) anchors
    anchors = generate_anchors(ratios=np.array(anchor_ratios), scales=np.array(anchor_scales))
    A = anchors.shape[0]
    shift_x = np.arange(0, width) * feat_stride   # x offsets, in input-image pixels
    shift_y = np.arange(0, height) * feat_stride  # y offsets, in input-image pixels
    # Sampling grid: the offset of every feature-map cell mapped back onto the input image
    shift_x, shift_y = np.meshgrid(shift_x, shift_y)
    shifts = np.vstack((shift_x.ravel(), shift_y.ravel(), shift_x.ravel(),
                        shift_y.ravel())).transpose()
    K = shifts.shape[0]
    # width changes faster, so here it is H, W, C
    anchors = anchors.reshape((1, A, 4)) + shifts.reshape((1, K, 4)).transpose((1, 0, 2))
    anchors = anchors.reshape(( K * A, 4)).astype(np.float32, copy=False)
    length = np.int32(anchors.shape[0])

    return anchors, length

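There are a few parameters here:

height, width: the size of the feature map produced earlier by VGG16. The feature map itself does not need to be passed in, since only its size is needed to generate anchors.

feat_stride: the same idea as a convolution stride. Moving one cell on the feature map corresponds to moving feat_stride pixels in the input image.

meshgrid is an interesting function: what it does is easy to follow, but getting from its input to its output takes a moment. It enumerates the points of a sampling grid. Given the x offsets and the y offsets, it returns every (x, y) combination (the Cartesian product), with the x and y coordinates kept in two separate arrays. ravel then flattens each of them to 1-D. shift_x and shift_y are each stacked twice in vstack because both corners of an anchor, top-left and bottom-right, have to be moved by the same offset.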
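A tiny grid makes the meshgrid and ravel steps easier to follow. The sketch below uses a made-up 2 x 3 feature map with a stride of 16; these sizes are chosen only for illustration. The variable names grid_x and grid_y are used instead of reassigning shift_x and shift_y so that both the inputs and the outputs stay visible.

```python
import numpy as np

feat_stride = 16
height, width = 2, 3                            # a tiny 2 x 3 feature map
shift_x = np.arange(0, width) * feat_stride     # 0, 16, 32
shift_y = np.arange(0, height) * feat_stride    # 0, 16
grid_x, grid_y = np.meshgrid(shift_x, shift_y)
print(grid_x.tolist())   # [[0, 16, 32], [0, 16, 32]]
print(grid_y.tolist())   # [[0, 0, 0], [16, 16, 16]]

# One row per feature-map cell: (dx, dy, dx, dy), applied to (x1, y1, x2, y2).
shifts = np.vstack((grid_x.ravel(), grid_y.ravel(),
                    grid_x.ravel(), grid_y.ravel())).transpose()
print(shifts.shape)      # (6, 4)
print(shifts.tolist())
# [[0, 0, 0, 0], [16, 0, 16, 0], [32, 0, 32, 0], [0, 16, 0, 16], [16, 16, 16, 16], [32, 16, 32, 16]]
```

The rows walk along x first and then along y, which is what the code comment "width changes faster" refers to.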

A = len(anchor_ratios)*len(anchor_scales) = anchor_num

K = (feat_width *feat_height)

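Note that NumPy stores arrays in row-major (C) order by default, so the last axis varies fastest. That is why reshape shows up so often: it reinterprets the same row-major data with a different shape.

The final output for one feature map therefore has shape (K * A, 4), where K * A = (feat_width * feat_height) * anchor_num.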
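The broadcasting step that combines the A anchors with the K shifts can be sketched on placeholder data. The arrays below are filled with arbitrary numbers rather than real anchors, and shifts.reshape(K, 1, 4) is used as an equivalent of the reshape((1, K, 4)).transpose((1, 0, 2)) in generate_anchors_pre, since both give a (K, 1, 4) array with the same contents.

```python
import numpy as np

A, K = 9, 6   # 9 anchors per cell, 6 cells (e.g. a 2 x 3 feature map)
# Placeholder boxes and shifts, not real anchor values.
anchors = np.arange(A * 4, dtype=np.float32).reshape(A, 4)
shifts = 16.0 * np.arange(K, dtype=np.float32).reshape(K, 1).repeat(4, axis=1)

# (1, A, 4) + (K, 1, 4) broadcasts to (K, A, 4): every anchor at every shift.
all_anchors = anchors.reshape(1, A, 4) + shifts.reshape(K, 1, 4)
all_anchors = all_anchors.reshape(K * A, 4)
print(all_anchors.shape)   # (54, 4)

# Row k * A + a is anchor a moved by shift k.
k, a = 4, 2
print(np.array_equal(all_anchors[k * A + a], anchors[a] + shifts[k]))   # True
```

Because of the row-major layout, the A anchors of one cell stay contiguous in the output. As a size check, a 38 x 50 feature map with 9 anchors per cell would give 38 * 50 * 9 = 17100 rows.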
