Main contributions:
- Devise pairwise costs for object tracking based on several 3D cues.
- The costs are agnostic to the data association method.
- Can be incorporated into any optimization framework.
The efficacy of the monocular 3D cues:
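- The first two rows show frame t and frame t+1 with their respective bounding boxes.
- Objects in frame t are lifted to 3D and their positions are ballooned; the result is projected onto the image observed at time t+1. Matching objects are searched for only inside this projected region, and the 3D-2D cost is computed there. This greatly shrinks the search area and lowers the cost of matching.
- Only the detections inside this projected region are backprojected to 3D, and the 3D-3D cost is computed from the overlap of the 3D volumes (the code implements this as 3D convex hull overlap).
- The various costs (not only the two above) are combined, and data association is done with the Hungarian method.
- Odometry estimates are obtained from ORB-SLAM.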
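The association step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the projected regions are taken as already computed boxes, the 3D-2D cost is approximated as one minus the 2D overlap (IoU) between a projected region and a detection, the 3D-3D cost values are made up, and the weights and the `max_cost` gate are hypothetical. To stay dependency-free, a brute-force search over assignments stands in for the Hungarian method; it returns the same minimum-cost assignment but is only practical for a handful of objects.

```python
from itertools import permutations


def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, w, h), top-left origin."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def combine_costs(cost_matrices, weights):
    """Weighted sum of equally sized cost matrices (lists of lists)."""
    rows, cols = len(cost_matrices[0]), len(cost_matrices[0][0])
    return [[sum(w * m[i][j] for m, w in zip(cost_matrices, weights))
             for j in range(cols)] for i in range(rows)]


def associate(cost, max_cost=0.9):
    """Minimum-cost one-to-one assignment (brute-force stand-in for Hungarian).

    Returns (row, col) pairs; pairs whose cost reaches max_cost are dropped.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    if n_rows <= n_cols:
        candidates = (list(zip(range(n_rows), p))
                      for p in permutations(range(n_cols), n_rows))
    else:
        candidates = (list(zip(p, range(n_cols)))
                      for p in permutations(range(n_rows), n_cols))
    best = min(candidates, key=lambda pairs: sum(cost[i][j] for i, j in pairs))
    return sorted((i, j) for i, j in best if cost[i][j] < max_cost)


# Hypothetical projected regions of two frame-t objects in frame t+1.
projected = [(100, 100, 50, 40), (300, 120, 60, 50)]
# Hypothetical detections in frame t+1.
detections = [(305, 118, 58, 52), (98, 104, 52, 38), (500, 90, 40, 40)]

# Approximate 3D-2D cost: low when a detection overlaps the projected region.
cost_3d2d = [[1.0 - iou(p, d) for d in detections] for p in projected]
# Made-up numbers standing in for a 3D-3D volume-overlap cost.
cost_3d3d = [[0.8, 0.2, 0.7], [0.1, 0.9, 0.6]]

total = combine_costs([cost_3d2d, cost_3d3d], weights=[0.5, 0.5])
matches = associate(total)
print(matches)
```

Because the costs are only combined into a matrix before association, any assignment solver can replace `associate`, which is the sense in which the costs are agnostic to the data association method.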
Composition of costs:
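The five-tuple in ① consists of the (x, y) coordinates of the top-left corner of the detection's bounding box, the width and height (w, h) of the bounding box, and the detector's confidence for that bounding box.
The first term in Equation (1) is the mean shape of an object category. In the second term, V is the deformation basis (a set of eigenvectors) that characterizes the directions along which the mean shape deforms.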
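To make the two terms concrete, here is a minimal sketch that assumes Equation (1) has the usual linear shape-prior form: a mean shape plus a weighted sum of basis vectors. The function name `deform_shape`, the coefficient list `coeffs`, and the flat keypoint layout are illustrative assumptions, and the numbers are toy values rather than a learned vehicle model.

```python
def deform_shape(mean_shape, basis, coeffs):
    """Return mean_shape + sum_b coeffs[b] * basis[b].

    mean_shape: flat list of 3K floats (x, y, z for each of K keypoints).
    basis:      list of B deformation vectors, each of length 3K.
    coeffs:     list of B deformation coefficients (assumed name).
    """
    if len(basis) != len(coeffs):
        raise ValueError("need one coefficient per basis vector")
    shape = list(mean_shape)
    for lam, vec in zip(coeffs, basis):
        if len(vec) != len(shape):
            raise ValueError("basis vector length must match the mean shape")
        for k in range(len(shape)):
            shape[k] += lam * vec[k]
    return shape


# Toy example with two keypoints and two deformation directions.
mean_shape = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
basis = [
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0],  # stretches the second keypoint along x
    [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],  # shifts both keypoints along y
]
instance = deform_shape(mean_shape, basis, coeffs=[0.5, -0.2])
print(instance)
```

Setting every coefficient to zero returns the mean shape itself, so the coefficients measure how far a particular instance departs from the category average along each basis direction.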
Illustration for understanding the concept of 3D-2D and 3D-3D costs:
Results
References
[7] J. K. Murthy, G. S. Krishna, F. Chhaya, and K. M. Krishna, “Reconstructing vehicles from a single image: Shape priors for road scene understanding,” in Proceedings of the IEEE Conference on Robotics and Automation, 2017.
[8] J. K. Murthy, S. Sharma, and M. Krishna, “Shape priors for real-time monocular object localization in dynamic environments,” in Proceedings of the IEEE Conference on Intelligent Robots and Systems (In Press), 2017.
[22] S. Song and M. Chandraker, “Joint sfm and detection cues for monocular 3d localization in road scenes,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.