RGBDTAM: A Cost-Effective and Accurate RGB-D Tracking and Mapping System

0. Introduction

IROS 2017. A niche, not particularly well-known RGB-D SLAM system that I stumbled upon while searching; it turns out to be open source, so let's study it.

Main contributions:

  • 1. The system combines a semi-dense photometric error and a dense geometric error as the VO optimization error, and shows that this combination gives the highest accuracy;
  • 2. It builds multi-view constraints and their error models for both the tracking and mapping threads;

In the case of the geometric error, all the pixels have a high signal/noise ratio. There are some degenerate cases, though, where some degrees of freedom are not constrained, and those justify the combination of both residuals: the photometric error is useless in texture-less scenarios, and the geometric one is useless in structure-less scenarios. As they are complementary, the minimization of both errors achieves the best performance.

System overview:

  • 1. vo_system launches the three threads: tracking, semi-dense mapping, and dense mapping (3D superpixels)
///Launch semidense tracker thread
boost::thread thread_semidense_tracker(&ThreadSemiDenseTracker,&images,&semidense_mapper,&semidense_tracker,&dense_mapper,&Map,&vis_pub,&pub_image);
//Launch semidense mapper thread
boost::thread thread_semidense_mapper(&ThreadSemiDenseMapper,&images,&images_previous_keyframe,&semidense_mapper,&semidense_tracker,&dense_mapper,&Map,&pub_cloud);
///Launch viewer updater.
boost::thread thread_viewer_updater(&ThreadViewerUpdater, &semidense_tracker,&semidense_mapper,&dense_mapper);
  • 2. The mapping thread builds new keyframes into the map $\mathcal{M}$, which consists of keyframes $\{\mathcal{K}_{1}, \ldots, \mathcal{K}_{j}, \ldots, \mathcal{K}_{m}\}$ with $\mathcal{K}_{j}=\{T_{w}^{j}, P^{j}\}$; each keyframe stores its pose $T_w^j$ and a point cloud $P_{w}^{j}=\{p_{w}^{1}, \ldots, p_{w}^{i}, \ldots, p_{w}^{n}\}$.
  • 3. The code uses OpenMP for acceleration and Boost to manage the threads.
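
To make the data layout of point 2 concrete, here is a minimal sketch (hypothetical types, not the repository's actual classes; Eigen assumed):

#include <vector>
#include <Eigen/Dense>

// Sketch of the map M: each keyframe K_j stores its world pose T_w^j
// and the point cloud P^j attached to it.
struct Keyframe {
  Eigen::Matrix4d T_w;                  // pose T_w^j
  std::vector<Eigen::Vector3d> points;  // P_w^j = {p_w^1, ..., p_w^i, ..., p_w^n}
};

struct MapModel {
  std::vector<Keyframe> keyframes;      // {K_1, ..., K_j, ..., K_m}
};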

1. Related systems (direct RGB-D odometry)

1. KinectFusion: uses only the depth information;
2. Kintinuous: an improvement of KinectFusion with a better memory scheme, enabling reconstruction of larger scenes and adding loop closure and pose optimization;
3. DVO-SLAM: based on graph optimization with keyframe constraints and on dense photometric and geometric error minimization; real-time on a CPU, but the image resolution fed to the system is reduced (not 640×480);
4. ElasticFusion: ICP + photometric reprojection error.


2. Tracking thread

The tracking thread minimizes the photometric error $r_{ph}$ and the geometric error $r_g$:

\{\hat{T}, \hat{a}, \hat{b}\}=\underset{T, a, b}{\arg \min }\; r_{ph}+\lambda r_{g}

where $a$ and $b$ are the gain and brightness of the current image, $T$ is the incremental motion estimate of the current camera pose, and $\lambda$ is a learned constant that weights the photometric and geometric terms. The tracking thread optimizes only the three quantities $T$, $a$, $b$.
The optimization is performed in the Lie algebra:

T=\left[\begin{array}{cc}{\exp _{\mathrm{SO}(3)}(\delta \omega)} & {\delta t} \\ {0_{1 \times 3}} & {1}\end{array}\right]
The Gauss-Newton result is applied as a right-multiplied update (a small sketch follows the equation):
T_{w}^{f} \leftarrow T_{w}^{f} \hat{T}^{-1}
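
A minimal sketch of this update step (Eigen assumed; function names are illustrative, not from the repository):

#include <Eigen/Dense>
#include <Eigen/Geometry>

// Build T = [ exp_SO(3)(dw)  dt ; 0_{1x3}  1 ] from a Gauss-Newton increment.
Eigen::Matrix4d incrementToPose(const Eigen::Vector3d& dw, const Eigen::Vector3d& dt) {
  Eigen::Matrix4d T = Eigen::Matrix4d::Identity();
  const double angle = dw.norm();
  if (angle > 1e-12)  // Rodrigues rotation via Eigen's angle-axis
    T.topLeftCorner<3, 3>() = Eigen::AngleAxisd(angle, dw / angle).toRotationMatrix();
  T.topRightCorner<3, 1>() = dt;  // translation enters directly, as in the matrix above
  return T;
}

// Right-multiplied update of the current frame pose: T_w^f <- T_w^f * T_hat^{-1}
void applyUpdate(Eigen::Matrix4d& T_w_f, const Eigen::Matrix4d& T_hat) {
  T_w_f = T_w_f * T_hat.inverse();
}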

2.1. Photometric error ($r_{ph}$)

We minimize the photometric error only for those pixels belonging to Canny edges. Their inverse depth is estimated using the mapping method.

Photometric error:

r_{ph}=\sum_{i=1}^{n} w_{p}\left(\frac{\left(I_{k}\left(\pi\left(T_{w}^{k} p_{w}^{i}\right)\right)-a I_{f}\left(\pi\left(T_{w}^{f} T^{-1} p_{w}^{i}\right)\right)+b\right)^{2}}{\sigma_{ph}^{2}}\right)
where:

  • $I_{k}\left(\pi\left(T_{w}^{k} p_{w}^{i}\right)\right)$ is the intensity (grayscale value) of the 3D point $p_w^i$ in the keyframe $I_k$;
  • $I_{f}\left(\pi\left(T_{w}^{f} \hat{T}^{-1} p_{w}^{i}\right)\right)$ is the intensity of the 3D point $p_w^i$ in the current frame $I_f$;
  • $\pi()$ is the reprojection function;
  • $a$ and $b$ are the gain and brightness of the current frame relative to the current keyframe; estimating them compensates for global illumination changes;
  • $w_p$ is the Geman-McClure robust cost function, used to suppress the influence of occlusions and dynamic objects (sketched after this list);
  • $\sigma_{ph}^{2}$ is the variance of the photometric residual. The paper does not define it at this point; it is estimated robustly from the previous frame's residuals, see Section 2.3 below.
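
Under these definitions, one pixel's contribution to $r_{ph}$ can be sketched as follows (the intensity lookups $I_k(\cdot)$ and $I_f(\cdot)$ are assumed to have been done already; Geman-McClure is the robust cost, per the paper):

// Geman-McClure robust cost: rho(r^2) = r^2 / (1 + r^2). It saturates for
// large residuals, which suppresses occlusions and dynamic objects.
double gemanMcClure(double r_sq) { return r_sq / (1.0 + r_sq); }

// One pixel's photometric term, given the keyframe and current-frame
// intensities, the gain/brightness (a, b) and the scale sigma_ph (Sec. 2.3).
double photometricTerm(double I_k, double I_f, double a, double b, double sigma_ph) {
  const double r = I_k - a * I_f + b;  // residual as in the equation above
  return gemanMcClure((r * r) / (sigma_ph * sigma_ph));
}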

2.2. Covariance-weighted geometric error ($r_g$)

(1) Covariance-weighted geometric error:

r_{g}=\sum_{i=1}^{n} w_{p}\left(\frac{\left(\frac{1}{e_{z}^{T} T_{w}^{f} T^{-1} p_{w}^{i}}-D_{f}\left(\pi\left(T_{w}^{f} T^{-1} p_{w}^{i}\right)\right)\right)^{2}}{\sigma_{g}^{2}}\right)

where:

  • $\frac{1}{e_{z}^{T} T_{w}^{f} T^{-1} p_{w}^{i}}$ is the predicted inverse depth of the map point $p_w^i$ in the current frame (the residual is the predicted inverse depth minus the measured inverse depth, which is what constrains $T$; see the sketch after this list);
  • $D_{f}$ is the measured inverse depth;
  • $e_{z}=[0,0,1]^{T}$ is the 3D unit vector selecting the z-component;
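
Correspondingly, one pixel's geometric term can be sketched as below (reusing gemanMcClure from the photometric sketch; D_f_at_projection stands for the measured inverse depth looked up at the point's projection):

#include <Eigen/Dense>

// One pixel's geometric term: predicted inverse depth of the map point in the
// current frame minus the measured inverse depth D_f at its projection.
double geometricTerm(const Eigen::Matrix4d& T_w_f_times_T_inv,  // T_w^f * T^{-1}
                     const Eigen::Vector4d& p_w,                // homogeneous map point
                     double D_f_at_projection,
                     double sigma_g) {
  const Eigen::Vector4d p_f = T_w_f_times_T_inv * p_w;
  const double predicted_inv_depth = 1.0 / p_f.z();  // 1 / (e_z^T T_w^f T^{-1} p_w^i)
  const double r = predicted_inv_depth - D_f_at_projection;
  return gemanMcClure((r * r) / (sigma_g * sigma_g));  // same robust weight w_p
}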

In order to achieve CPU real-time performance, we use four pyramid levels (from 80 × 60 to 640 × 480). For the first level we use all pixels. For the second, third and fourth levels we use one in every two, three and four pixels respectively, horizontally and vertically.
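
The subsampling rule is simple to express in code; a sketch (illustrative helper, assuming the strides 1, 2, 3, 4 per level quoted above):

#include <functional>

// Visit the pixels used at a given pyramid level: all pixels at the first
// level, then one in every 2, 3, 4 pixels (horizontally and vertically).
void forEachUsedPixel(int level, int width, int height,
                      const std::function<void(int, int)>& visit) {
  const int stride = level + 1;  // levels 0..3 -> strides 1, 2, 3, 4
  for (int y = 0; y < height; y += stride)
    for (int x = 0; x < width; x += stride)
      visit(x, y);
}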

(2) Covariance propagation for structured-light cameras:

Why only structured-light depth cameras?? Because such a sensor behaves exactly like a stereo pair: it measures a disparity $d$ over a baseline $b$ with focal length $f$, so the noise enters through the disparity!!
Depth:
z=\frac{f b}{d}
Inverse depth:
\rho=\frac{d}{f b}

Standard deviation of the depth (propagating the disparity noise $\sigma_d$):
\sigma_{z}=\frac{\partial z}{\partial d} \sigma_{d}=\frac{f b}{d^{2}} \sigma_{d}=\frac{z^{2}}{f b} \sigma_{d}

Standard deviation of the inverse depth:

\sigma_{\rho}=\frac{\partial \rho}{\partial d} \sigma_{d}=\frac{\sigma_{d}}{f b}
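
Both propagation rules are one-liners; a sketch:

#include <cmath>

// Propagate disparity noise sigma_d through z = f*b/d and rho = d/(f*b).
double depthSigma(double z, double f, double b, double sigma_d) {
  return (z * z) / (f * b) * sigma_d;  // sigma_z grows quadratically with depth
}
double inverseDepthSigma(double f, double b, double sigma_d) {
  return sigma_d / (f * b);            // sigma_rho is constant in the disparity
}

Note the design point this exposes: the inverse-depth uncertainty does not depend on the measured depth, which is one reason inverse depth is the preferred parametrization.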

2.3. Scaling parameters

As we combine residuals of different magnitudes, we need to scale them according to their covariances. For the geometric error we propagate its uncertainty using equations (8) and (9), i.e., the standard deviations derived above. For the photometric error we use the median absolute deviation of the residuals of the previous frame to extract a robust estimation of the standard deviation:
\sigma_{ph}=1.482 \cdot \operatorname{median}\left(\left|r_{ph}-\operatorname{median}\left(r_{ph}\right)\right|\right)
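
A sketch of this MAD-based estimate (note the absolute value inside the outer median; 1.482 is the usual consistency factor that makes the MAD match a Gaussian standard deviation):

#include <algorithm>
#include <cmath>
#include <vector>

// Median of a vector (takes a copy so the caller's data is untouched).
double median(std::vector<double> v) {
  const size_t k = v.size() / 2;
  std::nth_element(v.begin(), v.begin() + k, v.end());
  return v[k];
}

// Robust sigma of the previous frame's photometric residuals via the MAD.
double robustSigma(const std::vector<double>& residuals) {
  const double med = median(residuals);
  std::vector<double> abs_dev(residuals.size());
  for (size_t i = 0; i < residuals.size(); ++i)
    abs_dev[i] = std::abs(residuals[i] - med);
  return 1.482 * median(abs_dev);
}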

3. Mapping thread

Keyframes are added to the map. Each pixel has two ways to estimate its inverse depth: the sensor measurement $\rho_1$ and multi-view triangulation $\rho_2$. The inverse depth $\rho_2$ for every high-gradient pixel $u^*$ in a keyframe $I_j$ is estimated by minimizing its photometric error $r_{ph}$ with respect to several overlapping views $I_o$:

\hat{\rho}_{2}=\underset{\rho_{2}}{\arg \min }\; r_{ph}
r_{ph}=\sum_{o}\left\|I_{j}\left(s_{u^{*}}\right)-I_{o}\left(G\left(s_{u^{*}}, T_{w}^{j}, T_{w}^{o}, \rho\right)\right)\right\|_{2}^{2}

where:

  • $\rho=\frac{\sum_{j=1}^{2} \rho_{j} / \sigma_{j}^{2}}{\sum_{j=1}^{2} 1 / \sigma_{j}^{2}}, \quad \sigma^{2}=\frac{1}{\sum_{j=1}^{2} 1 / \sigma_{j}^{2}}$: the two estimates $\rho_1$ and $\rho_2$ are fused by inverse-variance weighting (see the sketch after this list);
  • $s_{u^{*}}$: the pixel coordinates of $u^*$;
  • $G()$: the warping function that projects from $I_j$ into $I_o$;
  • $\sigma_{j}$: the standard deviations derived in Section 2.2.
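
The fusion in the first bullet is standard inverse-variance weighting; a sketch:

// Fuse the sensor inverse depth rho1 and the triangulated estimate rho2
// by inverse-variance weighting; fused_var receives 1 / (1/var1 + 1/var2).
double fuseInverseDepth(double rho1, double var1,
                        double rho2, double var2,
                        double& fused_var) {
  const double w1 = 1.0 / var1, w2 = 1.0 / var2;
  fused_var = 1.0 / (w1 + w2);
  return (w1 * rho1 + w2 * rho2) * fused_var;
}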

4. LOOP CLOSURE AND MAP REUSE

Skipped.
