1707.ICCV.Neural Person Search Machines: reading notes on a detection + identification paper

Original post

2020-02-21 08:57

Neural Person Search Machines (NPSM): a novel end-to-end (detection + re-ID) person search method.

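Contributions:
1. Proposes the NPSM framework (an LSTM-based recurrent network with an attention mechanism) to mimic the human visual search process. Guided by a memory of the query/probe features, it recursively shrinks the search area from large to small, localizing the person region in the image that matches the query from coarse to fine.
2. Compared with the two-stage or combined strategies currently used on PRW and CUHK-SYSU, the method performs unconstrained detection and introduces a query-aware region shrinkage mechanism (which retains more context information), solving localization and query person matching jointly. The memory of the query person can also effectively guide the neural search model to find the right person.
3. Runs extensive experiments and reports the best performance on the recent PRW and CUHK-SYSU datasets.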
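To make the coarse-to-fine idea in contribution 1 concrete, here is a minimal toy sketch. It is not the authors' model: NPSM uses an LSTM with attention, whereas this sketch assumes a plain grid of feature vectors and a greedy rule that crops one row or column per step, keeping the crop whose cells are on average most similar to the query. The names `cosine` and `shrink_region` are hypothetical.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def shrink_region(feature_map, query, min_size=1):
    """Greedy query-guided shrinkage (illustrative assumption, not NPSM itself).

    feature_map: H x W grid (list of rows) of feature vectors.
    Returns the visited regions as (top, left, bottom, right), end-exclusive.
    """
    h, w = len(feature_map), len(feature_map[0])
    sim = [[cosine(cell, query) for cell in row] for row in feature_map]

    def mean_sim(t, l, b, r):
        vals = [sim[i][j] for i in range(t, b) for j in range(l, r)]
        return sum(vals) / len(vals)

    region = (0, 0, h, w)  # start from the whole image
    history = [region]
    while True:
        t, l, b, r = region
        candidates = []
        if b - t > min_size:  # drop the top or the bottom row
            candidates += [(t + 1, l, b, r), (t, l, b - 1, r)]
        if r - l > min_size:  # drop the left or the right column
            candidates += [(t, l + 1, b, r), (t, l, b, r - 1)]
        if not candidates:
            break
        best = max(candidates, key=lambda c: mean_sim(*c))
        if mean_sim(*best) <= mean_sim(*region):
            break  # no crop matches the query better, so stop shrinking
        region = best
        history.append(region)
    return history
```

Each step keeps only the sub-region that agrees best with the query, which is the sense in which the query memory steers the search. The real model learns this behaviour instead of applying a fixed rule.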

Drawback: the ranking step is time-consuming, especially on the PRW dataset.

Summary of the main related papers:
The authors build their model mainly on the following papers:
T. Xiao, S. Li, B. Wang, L. Lin, and X. Wang. Joint detection and identification feature learning for person search. arXiv:1604.01850, 2017.
S. Xingjian, Z. Chen, H. Wang, D.-Y. Yeung, W.-k. Wong, and W.-c. Woo. Convolutional LSTM network: A machine learning approach for precipitation nowcasting. In NIPS, pages 802–810, 2015.
K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–778, 2016.

Pedestrian detection papers:
1. Based on hand-crafted features:
DPM: P. F. Felzenszwalb, R. B. Girshick, D. McAllester, and D. Ramanan. Object detection with discriminatively trained part-based models. IEEE TPAMI, 32(9):1627–1645, 2010.
ACF: P. Dollár, R. Appel, S. Belongie, and P. Perona. Fast feature pyramids for object detection. IEEE TPAMI, 36(8):1532–1545, 2014.
LDCF: W. Nam, P. Dollár, and J. H. Han. Local decorrelation for improved pedestrian detection. NIPS, 1:424–432, 2014.

2. CNN-based:
DeepParts: Y. Tian, P. Luo, X. Wang, and X. Tang. Deep learning strong parts for pedestrian detection. In IEEE ICCV, pages 1904–1912, 2015.
CompACT boosting: Z. Cai, M. Saberian, and N. Vasconcelos. Learning complexity-aware cascades for deep pedestrian detection. In IEEE CVPR, pages 3361–3369, 2015.

CCF: B. Yang, J. Yan, Z. Lei, and S. Z. Li. Convolutional channel features. In IEEE ICCV, pages 82–90, 2015.

R-CNN: Rich feature hierarchies for accurate object detection and semantic segmentation.
Fast R-CNN: Fast R-CNN. https://github.com/rbgirshick/fast-rcnn
Faster R-CNN: S. Ren, K. He, R. Girshick, and J. Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In NIPS, pages 91–99, 2015.

Other:
Edgeboxes (proposal model): C. L. Zitnick and P. Dollár. Edge boxes: Locating object proposals from edges. In ECCV, pages 391–405. Springer, 2014.

Paper overview:

Comparison of the paper's framework with conventional methods:

Framework structure:

1. Uses Xiao Tong's OIM loss. The fully convolutional feature extractor is ResNet-50, followed by RoI pooling.
2. During training, a segmentation-like softmax loss is applied at each time step of the NSN as the "region shrinkage loss", so that the network produces attention maps that properly contain the target.
3. To make the learned features more discriminative, an identification loss is added during training, following the "Identification Net".
4. Region Shrinkage with Primitive Memory. More context information is included from a large region, and the number of person candidates irrelevant to the target person is recursively reduced during the search.
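The "segmentation-like softmax loss" in item 2 can be read as a per-location classification loss over the attention map. The sketch below is only one plausible reading, under loudly stated assumptions: each location carries two logits (background, target), the supervision at each time step is a binary mask derived from a box, and the loss is the mean cross-entropy over all steps and locations. The helper names and the way the per-step masks are built are hypothetical and not taken from the paper.

```python
from math import exp, log


def box_to_mask(height, width, box):
    """Binary mask (1 inside the box) for box = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = box
    return [[1 if top <= i < bottom and left <= j < right else 0
             for j in range(width)] for i in range(height)]


def softmax_cross_entropy(logits, label):
    """Numerically stable -log softmax(logits)[label]."""
    m = max(logits)
    log_z = m + log(sum(exp(v - m) for v in logits))
    return log_z - logits[label]


def region_shrinkage_loss(logit_maps, masks):
    """Mean per-location softmax loss over all time steps.

    logit_maps[t][i][j] is a (background_logit, target_logit) pair at step t;
    masks[t][i][j] is the assumed 0/1 supervision for that step.
    """
    total, count = 0.0, 0
    for step_logits, step_mask in zip(logit_maps, masks):
        for i, row in enumerate(step_logits):
            for j, logits in enumerate(row):
                total += softmax_cross_entropy(logits, step_mask[i][j])
                count += 1
    return total / count
```

With uninformative logits the loss equals log 2 per location, and it approaches zero as the attention map agrees with the mask at every step.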

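Experiments:
Evaluation metrics: mAP reflects the accuracy of detecting the query person from the gallery images. CMC is reported at top-1, and a prediction counts as a match only if its box overlaps the ground truth (GT) with IoU > 0.5.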
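The matching rule above can be sketched for a single query as follows. This is a minimal illustration, not the official benchmark code: it assumes boxes are `(x1, y1, x2, y2)`, predictions are `(score, image_id, box)` tuples, each gallery image holds at most one instance of the query person, and AP is the usual mean of precisions at the ranks of correct matches. The function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, gt, thr=0.5):
    """A prediction matches if its image contains the query person and IoU > thr."""
    _, image_id, box = pred
    return image_id in gt and iou(box, gt[image_id]) > thr


def evaluate_query(preds, gt, thr=0.5):
    """Return (average precision, top-1 hit) for one query.

    preds: list of (score, image_id, box); gt: dict image_id -> GT box.
    """
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    hits, precisions, matched = 0, [], set()
    for rank, pred in enumerate(ranked, start=1):
        # Count each ground-truth instance at most once.
        if pred[1] not in matched and is_match(pred, gt, thr):
            matched.add(pred[1])
            hits += 1
            precisions.append(hits / rank)
    ap = sum(precisions) / len(gt) if gt else 0.0
    top1 = bool(ranked) and is_match(ranked[0], gt, thr)
    return ap, top1
```

mAP is then the mean of the per-query AP values, and the top-1 CMC score is the fraction of queries whose highest-ranked prediction is a match.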

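Baselines, grouped by dataset (PRW and CUHK-SYSU):
PRW: combinations of detectors (R-CNN [7] detectors of DPM [6], CCF [36], ACF [4], LDCF [21]) and recognizers (LOMO, XQDA [17], IDEdet, CWS [41]). AlexNet is the base network of the R-CNN detector; DPM-AlexNet performs better than DPM combined with other networks (e.g. VGG, ResNet).
CUHK-SYSU: combinations of a CNN detector (Faster R-CNN with ResNet-50) + IDNet, as well as joint optimization built on these with OIM, among others.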

Analytic experiments on CUHK-SYSU benchmark to investigate the contribution of each component in our proposed NPSM architecture.