Notes on Understanding the RPN, the Core Component of Faster R-CNN

Original post

wakojosin

2018-09-04 18:07

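I have been studying Faster R-CNN detection for a while, and only recently worked through and organized my understanding of its core component, the RPN. If anything here is off, corrections are very welcome.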

RPN (Region Proposal Network)

1. After the five conv, pooling, and ReLU stages, the output of conv5 is taken and fed to the RPN:

layer {
  type: "Convolution"
  bottom: "conv5"
  top: "rpn_conv1"
  param { lr_mult: 1.0 }  # learning-rate multiplier for the weights
  param { lr_mult: 2.0 }  # learning-rate multiplier for the biases
  convolution_param {
    num_output: 256
    kernel_size: 3 pad: 1 stride: 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}
layer {
  type: "ReLU"
  bottom: "rpn_conv1"
  top: "rpn_conv1"
}

A single 4-D convolution kernel of shape 3×3×256×256 is all that is needed to turn every 3×3 sliding window into a 256-dimensional vector, so each point of the resulting feature map is 256-d.
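To make the shape bookkeeping concrete, here is a minimal sketch of the arithmetic behind this layer. The 38 x 50 feature-map size is a made-up example, and both helper names are hypothetical; the kernel, pad, and stride values come from the prototxt above.

```python
def conv_output_size(size, kernel=3, pad=1, stride=1):
    """Spatial size along one axis after a convolution."""
    return (size + 2 * pad - kernel) // stride + 1


def conv_weight_count(in_channels=256, out_channels=256, kernel=3):
    """Number of weights in a kernel x kernel convolution (biases excluded)."""
    return kernel * kernel * in_channels * out_channels


# Hypothetical conv5 feature map with 38 x 50 positions.
feat_h, feat_w = 38, 50
print(conv_output_size(feat_h), conv_output_size(feat_w))  # spatial size is unchanged
print(conv_weight_count())                                  # 3 * 3 * 256 * 256 weights
```

With kernel_size 3, pad 1, and stride 1 the spatial size is preserved, which is why every position of conv5 ends up with its own 256-d vector.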

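Anchors. For every position of this 256-d feature map, 9 anchors are computed from three scales (128×128, 256×256, 512×512) and three aspect ratios (2:1, 1:1, 1:2). The anchors are simply a set of rectangles generated by rpn/generate_anchors.py. Running generate_anchors.py directly prints: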

[[ -84.  -40.   99.   55.]
 [-176.  -88.  191.  103.]
 [-360. -184.  375.  199.]
 [ -56.  -56.   71.   71.]
 [-120. -120.  135.  135.]
 [-248. -248.  263.  263.]
 [ -36.  -80.   51.   95.]
 [ -80. -168.   95.  183.]
 [-168. -344.  183.  359.]]
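Each row of that matrix is one anchor in [x1, y1, x2, y2] form. The following is a simplified standard-library sketch that reproduces the same nine rows. It is not the actual generate_anchors.py (which uses NumPy), and the defaults (a 16-pixel base cell, height/width ratios 0.5, 1, 2, and scales 8, 16, 32) are assumptions chosen to match the printed values.

```python
import math


def generate_anchors(base_size=16, ratios=(0.5, 1, 2), scales=(8, 16, 32)):
    """Return 9 anchors as [x1, y1, x2, y2], centred on one base_size cell."""
    ctr = 0.5 * (base_size - 1)      # centre of the 16 x 16 reference box
    area = base_size * base_size
    anchors = []
    for ratio in ratios:             # ratio = height / width
        w = round(math.sqrt(area / ratio))
        h = round(w * ratio)
        for scale in scales:
            ws, hs = w * scale, h * scale
            anchors.append([
                ctr - 0.5 * (ws - 1),
                ctr - 0.5 * (hs - 1),
                ctr + 0.5 * (ws - 1),
                ctr + 0.5 * (hs - 1),
            ])
    return anchors


for anchor in generate_anchors():
    print(anchor)
```

The first three rows are the wide (2:1) anchors, the middle three are square, and the last three are tall (1:2), each at the three scales.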

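Each position of the feature map thus has 9 anchors, and each anchor is assigned a binary label (foreground or background). The positive (foreground) label goes to two kinds of anchors: 1) the anchor with the highest IoU overlap with a given ground-truth (GT) bounding box (this IoU may be below 0.7), and 2) any anchor whose IoU with any GT box is greater than 0.7. Note that a single GT box may assign positive labels to several anchors. The negative (background) label goes to anchors whose IoU with every GT box is below 0.3. Anchors that are neither positive nor negative do not contribute to the training objective. The classification output therefore has 2 × 9 = 18 channels per position: a background score and a foreground score for each of the 9 anchors.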
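The labelling rule can be sketched in a few lines. This is an illustration under stated assumptions rather than the reference implementation: boxes are plain [x1, y1, x2, y2] lists with continuous coordinates (no +1 pixel convention), gt_boxes is non-empty, and the function names and the -1 "ignored" value are my own choices.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def label_anchors(anchors, gt_boxes, pos_thresh=0.7, neg_thresh=0.3):
    """Return one label per anchor: 1 = foreground, 0 = background, -1 = ignored."""
    overlaps = [[iou(a, g) for g in gt_boxes] for a in anchors]
    labels = [-1] * len(anchors)
    for i, row in enumerate(overlaps):
        best = max(row)
        if best < neg_thresh:        # low IoU with every GT box
            labels[i] = 0
        elif best > pos_thresh:      # rule 2: IoU above 0.7 with some GT box
            labels[i] = 1
    # Rule 1: the best-matching anchor of each GT box is positive,
    # even if its IoU is below pos_thresh.
    for j in range(len(gt_boxes)):
        column = [row[j] for row in overlaps]
        best = max(column)
        if best > 0:
            labels[column.index(best)] = 1
    return labels


gt = [[0, 0, 10, 10]]
anchors = [[0, 0, 10, 10], [0, 0, 10, 8], [20, 20, 30, 30], [0, 0, 10, 4]]
print(label_anchors(anchors, gt))
```

In this toy example the fourth anchor has an IoU of 0.4, so it is neither positive nor negative and is left out of training.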

layer {
  type: "Convolution"
  bottom: "rpn_conv1"
  top: "rpn_cls_score"
  param { lr_mult: 1.0 }  # learning-rate multiplier for the weights
  param { lr_mult: 2.0 }  # learning-rate multiplier for the biases
  convolution_param {
    num_output: 18  # 2 (bg/fg) * 9 (anchors)
    kernel_size: 1 pad: 0 stride: 1
    weight_filler { type: "gaussian" std: 0.01 }
    bias_filler { type: "constant" value: 0 }
  }
}

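Softmax is applied to these scores to classify each anchor as foreground or background, which gives the anchor's class together with its softmax score.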
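As a small illustration of this step, the sketch below turns the 18 scores at one feature-map position into 9 foreground probabilities. The channel layout (first 9 channels background, last 9 foreground) is an assumption that follows my reading of the py-faster-rcnn proposal layer; the function name is hypothetical.

```python
import math


def fg_probabilities(scores, num_anchors=9):
    """Softmax over each (bg, fg) score pair; returns one fg probability per anchor."""
    assert len(scores) == 2 * num_anchors
    probs = []
    for k in range(num_anchors):
        # Assumed layout: channel k is bg, channel num_anchors + k is fg.
        bg, fg = scores[k], scores[num_anchors + k]
        m = max(bg, fg)              # subtract the max for numerical stability
        e_bg, e_fg = math.exp(bg - m), math.exp(fg - m)
        probs.append(e_fg / (e_bg + e_fg))
    return probs


# Equal bg and fg scores give a probability of 0.5 for every anchor.
print(fg_probabilities([0.0] * 18))
```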

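The foreground anchors have already been identified above. Bounding box regression is then used to obtain the transformation parameters from each preset anchor box to its ground-truth box, namely the translation (dx and dy) and the scaling (dw and dh), which yields the preliminary proposals.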

For the principle behind bounding box regression, see http://blog.csdn.net/elaine_bao/article/details/60469036
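The (dx, dy, dw, dh) parameterization can be sketched as a pair of inverse functions: one computes the regression targets from an anchor and a GT box, the other applies predicted deltas to an anchor. This assumes the usual R-CNN style parameterization (centre offsets normalized by anchor size, log-scale width and height) and continuous coordinates; the function names are hypothetical.

```python
import math


def compute_deltas(anchor, gt):
    """Regression targets (dx, dy, dw, dh) that map anchor onto gt."""
    aw, ah = anchor[2] - anchor[0], anchor[3] - anchor[1]
    acx, acy = anchor[0] + 0.5 * aw, anchor[1] + 0.5 * ah
    gw, gh = gt[2] - gt[0], gt[3] - gt[1]
    gcx, gcy = gt[0] + 0.5 * gw, gt[1] + 0.5 * gh
    return [(gcx - acx) / aw, (gcy - acy) / ah,
            math.log(gw / aw), math.log(gh / ah)]


def apply_deltas(anchor, deltas):
    """Shift and rescale an anchor by (dx, dy, dw, dh) to get a proposal."""
    dx, dy, dw, dh = deltas
    w, h = anchor[2] - anchor[0], anchor[3] - anchor[1]
    cx, cy = anchor[0] + 0.5 * w, anchor[1] + 0.5 * h
    pcx, pcy = cx + dx * w, cy + dy * h          # translate the centre
    pw, ph = w * math.exp(dw), h * math.exp(dh)  # scale width and height
    return [pcx - 0.5 * pw, pcy - 0.5 * ph, pcx + 0.5 * pw, pcy + 0.5 * ph]


anchor = [0.0, 0.0, 10.0, 10.0]
gt = [2.0, 1.0, 14.0, 9.0]
deltas = compute_deltas(anchor, gt)
print(deltas)
print(apply_deltas(anchor, deltas))  # recovers gt up to rounding error
```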

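The pre-proposals are then mapped back onto the original image using feat_stride and im_info, and each one is checked against the image boundary; those that extend far beyond it are discarded.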
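A rough sketch of this mapping: the base anchors are shifted by feat_stride pixels for every feature-map cell, and each resulting box is then tested against the image size. The stride of 16, the helper names, and the allowed_border parameter are assumptions for illustration; clip_box shows the alternative of clamping a box to the image instead of discarding it.

```python
def shift_anchors(base_anchors, feat_h, feat_w, feat_stride=16):
    """Place every base anchor at every feature-map cell (row-major order)."""
    shifted = []
    for row in range(feat_h):
        for col in range(feat_w):
            sx, sy = col * feat_stride, row * feat_stride
            for x1, y1, x2, y2 in base_anchors:
                shifted.append([x1 + sx, y1 + sy, x2 + sx, y2 + sy])
    return shifted


def inside_image(box, im_h, im_w, allowed_border=0):
    """True if the box exceeds the image by at most allowed_border pixels."""
    x1, y1, x2, y2 = box
    return (x1 >= -allowed_border and y1 >= -allowed_border
            and x2 < im_w + allowed_border and y2 < im_h + allowed_border)


def clip_box(box, im_h, im_w):
    """Clamp a box to the image instead of discarding it."""
    x1, y1, x2, y2 = box
    return [min(max(x1, 0), im_w - 1), min(max(y1, 0), im_h - 1),
            min(max(x2, 0), im_w - 1), min(max(y2, 0), im_h - 1)]


base = [[-56.0, -56.0, 71.0, 71.0]]          # one of the nine base anchors
boxes = shift_anchors(base, feat_h=2, feat_w=3)
im_h, im_w = 600, 800                        # hypothetical image size
print(len(boxes))
print([inside_image(b, im_h, im_w) for b in boxes])
print(clip_box(boxes[0], im_h, im_w))
```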

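The remaining pre-proposals are sorted by softmax score in descending order and the top 2000 are kept. NMS (non-maximum suppression) is applied to these 2000, the survivors are sorted again, and 300 proposals are output.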
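The sort, top-N, and NMS sequence can be sketched with a plain greedy NMS. The 2000 and 300 limits are the numbers from the text; the 0.7 IoU threshold and the function names are assumptions, and the quadratic loop is only meant for illustration.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def select_proposals(boxes, scores, pre_nms_top_n=2000, post_nms_top_n=300,
                     nms_thresh=0.7):
    """Return indices of the kept proposals, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    order = order[:pre_nms_top_n]            # keep the top-scoring candidates
    keep = []
    for i in order:
        # Greedy NMS: drop a box that overlaps an already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
            if len(keep) == post_nms_top_n:
                break
    return keep


boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.8, 0.9, 0.7]
print(select_proposals(boxes, scores, nms_thresh=0.5))
```

Because candidates are visited in descending score order, the kept indices are already sorted by score, so no separate second sort is needed in this sketch.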

Next:

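RoI pooling is applied to the 300 proposals to extract fixed-length features, which are fed into the fully connected layers.

Softmax classification then computes the class scores, and a second bounding box regression gives the precise location.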
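To show why RoI pooling yields a fixed-length feature regardless of proposal size, here is a single-channel sketch of RoI max pooling. The RoI is given in feature-map cells with inclusive corners, the bin boundaries use a floor/ceil scheme in the style of the Caffe layer, and the 2 x 2 output size is only for the toy example; all names are hypothetical.

```python
import math


def roi_max_pool(feature, roi, pooled_h=2, pooled_w=2):
    """Max-pool one RoI of a 2-D feature map into a pooled_h x pooled_w grid."""
    x1, y1, x2, y2 = roi
    rows, cols = len(feature), len(feature[0])
    bin_h = max(y2 - y1 + 1, 1) / pooled_h
    bin_w = max(x2 - x1 + 1, 1) / pooled_w
    out = []
    for ph in range(pooled_h):
        h0 = min(max(y1 + math.floor(ph * bin_h), 0), rows)
        h1 = min(max(y1 + math.ceil((ph + 1) * bin_h), 0), rows)
        out_row = []
        for pw in range(pooled_w):
            w0 = min(max(x1 + math.floor(pw * bin_w), 0), cols)
            w1 = min(max(x1 + math.ceil((pw + 1) * bin_w), 0), cols)
            cells = [feature[r][c] for r in range(h0, h1) for c in range(w0, w1)]
            out_row.append(max(cells) if cells else 0.0)  # empty bin -> 0
        out.append(out_row)
    return out


feature = [[r * 4 + c for c in range(4)] for r in range(4)]  # values 0..15
print(roi_max_pool(feature, (0, 0, 3, 3)))  # 4 x 4 RoI
print(roi_max_pool(feature, (1, 1, 3, 3)))  # 3 x 3 RoI, same output size
```

Both RoIs produce a 2 x 2 output even though they cover different numbers of cells, which is what lets the fully connected layers accept proposals of any size.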

Reference links:

http://www.cnblogs.com/zf-blog/p/7286405.html

http://lib.csdn.net/article/deeplearning/61641

http://blog.csdn.net/mllearnertj/article/details/53709766

If my understanding is off anywhere, please leave a comment to correct me. Thanks!
