Paper: Learning to Segment Object Candidates


2018-09-01 20:30

9:40-11:35

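This is a 2015 paper and one of the earlier works on pixel-level image segmentation. But is it overkill to use segmented pixels merely as candidate regions? Supervised training requires pixel-level annotations, which are far harder to produce than ground-truth boxes. If the output is only used for box proposals, the demands on segmentation quality are not very high, because the network will regress the proposal boxes at the end anyway.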

The network architecture diagram is as follows:

Network architecture:

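As in typical detection work, the architecture is based on VGG. The input is 3x224x224 (in the paper's notation). The last pooling layer of VGG-A is removed, giving an output of 512x14x14, which feeds two branches:
1. Segmentation branch: conv layers reduce the features to 512x1x1, which is fully connected to a 56x56 output map. Values above a threshold of 0.x (depending on the dataset) are set to 1, and the map is bilinearly upsampled to the original image size to produce the mask.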

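Unlike deconv, this layer acts as h×w pixel classifiers: in effect one binary classification problem per pixel, each deciding whether that pixel belongs to the object. The paper notes that one could "use either locally or fully connected pixel classifiers"; the former only sees local information, while the latter has a large number of redundant parameters. What is actually used is a 1x1 conv layer. The paper also sets "the output of the classification layer to be h'×w' with h' < h and w' < w and upsample the output to h × w to match the input dimensions", which is the bilinear upsampling mentioned above.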

2. Scoring branch: a 2x2 max-pool downsamples the features, followed by two fc (fully connected) layers that output a single real value (the "output is a single 'objectness' score").
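The shapes quoted above can be checked with a little arithmetic. The following is a minimal sketch, not the authors' code: it assumes the trunk keeps four of VGG-A's five 2x2 stride-2 poolings, and it treats the 512x1x1 stage as a 512-unit bottleneck in front of the 56x56 output. The variable names are made up for illustration.

```python
# Shape and parameter arithmetic for the two branches (illustrative only).

def trunk_output_size(input_size=224, num_pools=4):
    """Spatial size after `num_pools` 2x2, stride-2 max-pool layers.

    VGG-A has five such poolings; with the last one removed, four remain.
    """
    size = input_size
    for _ in range(num_pools):
        size //= 2
    return size

channels = 512
feat = trunk_output_size()               # 224 -> 112 -> 56 -> 28 -> 14
feat_dim = channels * feat * feat        # flattened 512x14x14 feature map

# Scoring branch: a 2x2 max-pool turns 512x14x14 into 512x7x7 before the fc layers.
score_in = channels * (feat // 2) ** 2

# Segmentation branch: one classifier per pixel of the 56x56 output.
mask_out = 56 * 56
hidden = 512                             # assumed width of the 512x1x1 bottleneck

# Weights (ignoring biases) if every output pixel were fully connected
# to the whole feature map, versus going through the bottleneck first.
full_params = feat_dim * mask_out
bottleneck_params = feat_dim * hidden + hidden * mask_out

print("trunk output:", (channels, feat, feat))
print("scoring fc input size:", score_in)
print("fully connected pixel classifiers:", full_params)
print("with 512-unit bottleneck:", bottleneck_params)
```

Under these assumptions the bottleneck needs roughly one sixth of the weights of direct fully connected pixel classifiers, which is one way to read the remark about redundant parameters.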


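In the paper's loss function, lambda = 1/32 is used without further discussion, even though the balance against the overall pixel-level loss is sensitive to this parameter. When the score label is a negative example, the segmentation loss is 0.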
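To make the two properties above concrete (the 1/32 weight, and the segmentation term vanishing for negatives), here is a minimal per-sample sketch. It assumes a logistic loss on both outputs with labels in {-1, +1} and a factor (1 + y) / 2 that switches off the mask term when y = -1. This form is a reconstruction for illustration, and the function names are hypothetical.

```python
import math

LAMBDA = 1.0 / 32  # weight on the score term, as quoted in the post


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def joint_loss(mask_logits, mask_labels, score_logit, y, lam=LAMBDA):
    """Return (segmentation_term, score_term) for one training sample.

    mask_logits, mask_labels: 2D lists of equal shape; labels are +1 or -1.
    score_logit: scalar objectness output; y: +1 (positive) or -1 (negative).
    Assumed form: (1 + y) / (2 * w * h) * sum_ij log(1 + exp(-m_ij * f_ij))
                  + lam * log(1 + exp(-y * score_logit)).
    """
    h, w = len(mask_logits), len(mask_logits[0])
    pixel_sum = sum(
        softplus(-m * f)
        for row_f, row_m in zip(mask_logits, mask_labels)
        for f, m in zip(row_f, row_m)
    )
    seg_term = (1 + y) / (2.0 * w * h) * pixel_sum
    score_term = lam * softplus(-y * score_logit)
    return seg_term, score_term


# A negative sample contributes nothing through the mask branch.
logits = [[0.3, -1.2], [2.0, 0.1]]
labels = [[1, -1], [1, -1]]
neg_seg, neg_score = joint_loss(logits, labels, score_logit=0.5, y=-1)
pos_seg, pos_score = joint_loss(logits, labels, score_logit=0.5, y=+1)
print("negative sample:", neg_seg, neg_score)
print("positive sample:", pos_seg, pos_score)
```

Because the mask term is averaged over pixels while the score term is a single logistic loss, `lam` sets how strongly the objectness output pulls on the shared trunk relative to the mask output.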


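"During full image inference, we apply the model densely at multiple locations and scales." (At inference time the model is run once per location, 224/16 = 14 times, producing mask boxes at multiple scales and translation shifts.) The drawback is that this is not single-shot and is time-consuming.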
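The cost of dense inference can be estimated by counting how many 224x224 windows a stride of 16 produces across an image pyramid. The sketch below is a rough illustration only: the scale list (2^-2 to 2^1 in steps of 2^(1/2)) and the 480x640 image size are assumptions chosen for the example, and `count_windows` is a hypothetical helper rather than part of any released code.

```python
def count_windows(height, width, scales, window=224, stride=16):
    """Count sliding-window positions over a rescaled image pyramid.

    For each scale the image is resized to (height * s, width * s) and a
    `window` x `window` crop is placed every `stride` pixels.
    """
    total = 0
    for s in scales:
        h, w = int(round(height * s)), int(round(width * s))
        if h < window or w < window:
            continue  # the window does not fit at this scale
        rows = (h - window) // stride + 1
        cols = (w - window) // stride + 1
        total += rows * cols
    return total


# Assumed pyramid: 2^-2, 2^-1.5, ..., 2^1 (seven scales).
SCALES = [2 ** (k / 2) for k in range(-4, 3)]

# The stride matches the trunk's downsampling factor: 224 / 16 = 14.
print("feature positions per side:", 224 // 16)
print("windows for a 480x640 image:", count_windows(480, 640, SCALES))
```

Every counted window is a separate evaluation of both branches unless the trunk features are shared across positions, which is why this style of inference is slow compared with single-shot designs.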
