Learning Algorithms by Reading Source Code: the TSDF Volume Model

Figure 1. Mesh and TSDF model generated by ScalableTSDFVolume

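Dense 3D reconstruction currently relies on two main frameworks: the voxel-based (volumetric) TSDF framework and the surfel-based framework. Because a voxel-based framework maintains the history of the reconstruction, it produces compact surfaces and high-quality results, and it is widely used in KinectFusion and a series of other classic methods.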

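First, I recommend an open-source 3D library: Open3D. It implements many classic geometry-processing algorithms for 3D data, and its code style is friendly and very easy to read.
Open3D introduction: http://www.open3d.org/docs/introduction.html
GitHub: https://github.com/intel-isl/Open3D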

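My understanding of TSDF from reading papers used to be superficial. For my research I needed to study the TSDF framework and the marching cubes mesh-extraction algorithm in depth, so I recently read the TSDF part of Open3D in detail. The code is at: https://github.com/intel-isl/Open3D/blob/master/examples/Cpp/IntegrateRGBD.cpp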

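The test code uses ScalableTSDFVolume. It is built on top of the original TSDF volume (UniformTSDFVolume) and applies the Voxel Hashing idea, which makes it suitable for reconstructing larger scenes while managing memory efficiently. Here I mainly cover the implementation of UniformTSDFVolume, which comes down to two functions: Integrate and ExtractTriangleMesh.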

1. UniformTSDFVolume::Integrate

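Leaving aside the long list of checks at the top, this function essentially consists of two lines:

// For every pixel: (distance from the back-projected point to the optical center) / (depth of that point)
auto depth2cameradistance = geometry::Image::CreateDepthToCameraDistanceMultiplierFloatImage(intrinsic);
// The actual integration
IntegrateWithDepthToCameraDistanceMultiplier(image, intrinsic, extrinsic,*depth2cameradistance);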

1.1 CreateDepthToCameraDistanceMultiplierFloatImage

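Let us step into the first function, CreateDepthToCameraDistanceMultiplierFloatImage. From the camera intrinsics and each pixel's position, it computes one distance-to-depth ratio per pixel, so that a depth value can later be converted directly into the actual distance from the optical center.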

std::shared_ptr<Image> Image::CreateDepthToCameraDistanceMultiplierFloatImage(
        const camera::PinholeCameraIntrinsic &intrinsic) {
    auto fimage = std::make_shared<Image>();
    fimage->Prepare(intrinsic.width_, intrinsic.height_, 1, 4);
    float ffl_inv[2] = {
            1.0f / (float)intrinsic.GetFocalLength().first,  //fx
            1.0f / (float)intrinsic.GetFocalLength().second, //fy
    };
    float fpp[2] = {
            (float)intrinsic.GetPrincipalPoint().first,     //cx
            (float)intrinsic.GetPrincipalPoint().second,    //cy
    };
    std::vector<float> xx(intrinsic.width_);    // one entry per column, e.g. 640
    std::vector<float> yy(intrinsic.height_);   // one entry per row, e.g. 480
    for (int j = 0; j < intrinsic.width_; j++) { 
        xx[j] = (j - fpp[0]) * ffl_inv[0];      // (j-cx)/fx
    }
    for (int i = 0; i < intrinsic.height_; i++) {
        yy[i] = (i - fpp[1]) * ffl_inv[1];      // (i-cy)/fy
    }
    for (int i = 0; i < intrinsic.height_; i++) {
        float *fp =(float *)(fimage->data_.data() + i * fimage->BytesPerLine());
        for (int j = 0; j < intrinsic.width_; j++, fp++) {
            *fp = sqrtf(xx[j] * xx[j] + yy[i] * yy[i] + 1.0f);  // sqrt(X^2+Y^2+Z^2)/Z, the distance-to-depth ratio
        }
    }
    return fimage;
}

The camera model is:

$Z\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x\\ 0 & f_y & c_y\\ 0 & 0 & 1 \end{bmatrix} \cdot \begin{bmatrix}X \\ Y \\ Z \end{bmatrix}$

Expanded, this reads:

$u = X f_x / Z + c_x$, $v = Y f_y / Z + c_y$

Then,

$(u - c_x) / f_x = X/Z$, $(v - c_y) / f_y = Y/Z$

so

$\sqrt{X^2/Z^2 + Y^2/Z^2 + 1} = \sqrt{X^2/Z^2 + Y^2/Z^2 + Z^2/Z^2} = \sqrt{X^2 + Y^2 + Z^2} / Z$
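The derivation can be checked with a small standalone sketch. The helper names below are hypothetical (they are not Open3D functions), and the intrinsics in the usage note are made-up values chosen to keep the arithmetic simple.

```cpp
#include <cmath>

// Hypothetical helper (not part of Open3D): ratio between the distance to the
// optical center and the depth Z for the pixel (u, v).
float DistanceMultiplier(int u, int v, float fx, float fy, float cx, float cy) {
    const float x_over_z = (static_cast<float>(u) - cx) / fx;  // X / Z
    const float y_over_z = (static_cast<float>(v) - cy) / fy;  // Y / Z
    return std::sqrt(x_over_z * x_over_z + y_over_z * y_over_z + 1.0f);
}

// Distance from the optical center to the point seen at pixel (u, v) with depth z.
float DepthToDistance(int u, int v, float z,
                      float fx, float fy, float cx, float cy) {
    return z * DistanceMultiplier(u, v, fx, fy, cx, cy);
}
```

With assumed intrinsics fx = fy = 100, cx = 320, cy = 240, the multiplier is 1 at the principal point and grows toward the image border; at pixel (420, 240) we have X/Z = 1, so the ratio is sqrt(2).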

1.2 IntegrateWithDepthToCameraDistanceMultiplier

Now for the second function, IntegrateWithDepthToCameraDistanceMultiplier, which does most of the work.
The code comes first, followed by the analysis.

void UniformTSDFVolume::IntegrateWithDepthToCameraDistanceMultiplier(
        const geometry::RGBDImage &image,
        const camera::PinholeCameraIntrinsic &intrinsic,
        const Eigen::Matrix4d &extrinsic,
        const geometry::Image &depth_to_camera_distance_multiplier) {
    const float fx = static_cast<float>(intrinsic.GetFocalLength().first);
    const float fy = static_cast<float>(intrinsic.GetFocalLength().second);
    const float cx = static_cast<float>(intrinsic.GetPrincipalPoint().first);
    const float cy = static_cast<float>(intrinsic.GetPrincipalPoint().second);
    const Eigen::Matrix4f extrinsic_f = extrinsic.cast<float>();
    const float voxel_length_f = static_cast<float>(voxel_length_);
    const float half_voxel_length_f = voxel_length_f * 0.5f;
    const float sdf_trunc_f = static_cast<float>(sdf_trunc_);
    const float sdf_trunc_inv_f = 1.0f / sdf_trunc_f;
    const Eigen::Matrix4f extrinsic_scaled_f = extrinsic_f * voxel_length_f;
    const float safe_width_f = intrinsic.width_ - 0.0001f;
    const float safe_height_f = intrinsic.height_ - 0.0001f;

    for (int x = 0; x < resolution_; x++) {
        for (int y = 0; y < resolution_; y++) {
            Eigen::Vector4f pt_3d_homo(float(half_voxel_length_f + voxel_length_f * x + origin_(0)),
                                       float(half_voxel_length_f + voxel_length_f * y + origin_(1)),
                                       float(half_voxel_length_f + origin_(2)),
                                       1.f);
            Eigen::Vector4f pt_camera = extrinsic_f * pt_3d_homo;
            
            for (int z = 0; z < resolution_; z++,  
                     pt_camera(0) += extrinsic_scaled_f(0, 2),
                     pt_camera(1) += extrinsic_scaled_f(1, 2),
                     pt_camera(2) += extrinsic_scaled_f(2, 2)) {
                // Skip if negative depth after projection
                if (pt_camera(2) <= 0)
                    continue;

                // Skip if x-y coordinate not in range
                float u_f = pt_camera(0) * fx / pt_camera(2) + cx + 0.5f;
                float v_f = pt_camera(1) * fy / pt_camera(2) + cy + 0.5f;  
                if (!(u_f >= 0.0001f && u_f < safe_width_f && v_f >= 0.0001f && v_f < safe_height_f)) 
                    continue;
               
                // Skip if negative depth in depth image
                int u = (int)u_f;
                int v = (int)v_f;
                float d = *image.depth_.PointerAt<float>(u, v);
                if (d <= 0.0f) 
                    continue;
               
                int v_ind = IndexOf(x, y, z);
                float sdf = (d - pt_camera(2)) *(*depth_to_camera_distance_multiplier.PointerAt<float>(u, v));
                if (sdf > -sdf_trunc_f) {  
                    // integrate
                    float tsdf = std::min(1.0f, sdf * sdf_trunc_inv_f);
                    voxels_[v_ind].tsdf_ = (voxels_[v_ind].tsdf_ * voxels_[v_ind].weight_ + tsdf) /
                            (voxels_[v_ind].weight_ + 1.0f);
                    if (color_type_ == TSDFVolumeColorType::RGB8) {
                        const uint8_t *rgb = image.color_.PointerAt<uint8_t>(u, v, 0);
                        Eigen::Vector3d rgb_f(rgb[0], rgb[1], rgb[2]);
                        voxels_[v_ind].color_ = (voxels_[v_ind].color_ * voxels_[v_ind].weight_ +rgb_f) /(voxels_[v_ind].weight_ + 1.0f);
                    } else if (color_type_ == TSDFVolumeColorType::Gray32) {
                        const float *intensity =image.color_.PointerAt<float>(u, v, 0);
                        voxels_[v_ind].color_ = (voxels_[v_ind].color_.array() * voxels_[v_ind].weight_ + (*intensity)) /(voxels_[v_ind].weight_ + 1.0f);
                    }
                    voxels_[v_ind].weight_ += 1.0f;
                }
            }
        }
    }
}

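Lines 19-33: traverse all voxels, starting from the $z = 0$ plane. To speed up the computation, the code first computes the world coordinates of the center (hence +half_voxel_length_f) of voxel $(x,y,0)$.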

Eigen::Vector4f pt_3d_homo(float(half_voxel_length_f + voxel_length_f * x + origin_(0)),
                           float(half_voxel_length_f + voxel_length_f * y + origin_(1)),
                           float(half_voxel_length_f + origin_(2)),
                           1.f);

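Note: the world origin may not coincide with voxel (0,0,0), but the world axes are generally aligned with the voxel grid axes (no rotation, though possibly a translation). If the world origin sits at position (2,2,0) of the voxel grid, then origin_ = (-2,-2,0). Adding origin_ simply shifts the origin of the voxel grid.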
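As a minimal sketch of this convention, the hypothetical helper below (not an Open3D function) computes the world-space center of a voxel from its integer index, the voxel length and the grid origin. A voxel length of 1 is assumed in the usage note so that the numbers match the example above.

```cpp
#include <array>

// Hypothetical helper: world-space center of voxel (x, y, z) in a grid whose
// corner (the corner of voxel (0, 0, 0)) lies at `origin` in world coordinates.
std::array<double, 3> VoxelCenter(int x, int y, int z, double voxel_length,
                                  const std::array<double, 3>& origin) {
    const double half = 0.5 * voxel_length;
    return {half + voxel_length * x + origin[0],
            half + voxel_length * y + origin[1],
            half + voxel_length * z + origin[2]};
}
```

With origin = (-2, -2, 0) and a voxel length of 1, voxel (2, 2, 0) has its center at (0.5, 0.5, 0.5), i.e. the world origin lies at the corner of that voxel.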

Eigen::Vector4f pt_camera = extrinsic_f * pt_3d_homo;
for (int z = 0; z < resolution_; z++,    
     pt_camera(0) += extrinsic_scaled_f(0, 2),
     pt_camera(1) += extrinsic_scaled_f(1, 2),
     pt_camera(2) += extrinsic_scaled_f(2, 2))

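pt_camera is pt_3d_homo transformed from the world frame into the current camera frame, but only for the voxel at $(x,y,0)$. Each time $z$ increases by 1, the world point moves by voxel_length_f along the z axis, so the camera-space point changes by voxel_length_f times the third column of extrinsic_f. It is therefore enough to keep adding extrinsic_scaled_f(0/1/2, 2) to pt_camera.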

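Lines 36-49: project pt_camera from the camera frame into the image; the +0.5 rounds to the nearest pixel. This yields the image position $(u,v)$.
Next, the difference between the depth measured at that pixel and the voxel's own depth, multiplied by the distance-to-depth ratio, gives the difference in distance along the ray, which is the sdf value.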

float u_f = pt_camera(0) * fx / pt_camera(2) + cx + 0.5f;
float v_f = pt_camera(1) * fy / pt_camera(2) + cy + 0.5f; 
int u = (int)u_f;
int v = (int)v_f;
float d = *image.depth_.PointerAt<float>(u, v);  // depth at pixel (u, v); the distance-to-depth ratio from 1.1 then turns the depth difference into a distance
float sdf = (d - pt_camera(2)) * (*depth_to_camera_distance_multiplier.PointerAt<float>(u, v));
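The projection, bounds check and sdf computation can be isolated in a short sketch. Pixel, ProjectToPixel and SignedDistance are hypothetical names introduced for illustration; the bounds follow the 0.0001 margins used in the listing.

```cpp
struct Pixel {
    int u;
    int v;
    bool valid;  // false if the point is behind the camera or off the image
};

// Project a camera-space point to the nearest pixel (the +0.5 followed by
// truncation rounds to nearest), rejecting points outside the image.
Pixel ProjectToPixel(float x, float y, float z, float fx, float fy,
                     float cx, float cy, int width, int height) {
    if (z <= 0.0f) return {0, 0, false};
    const float u_f = x * fx / z + cx + 0.5f;
    const float v_f = y * fy / z + cy + 0.5f;
    const bool inside = u_f >= 0.0001f && u_f < width - 0.0001f &&
                        v_f >= 0.0001f && v_f < height - 0.0001f;
    if (!inside) return {0, 0, false};
    return {static_cast<int>(u_f), static_cast<int>(v_f), true};
}

// Signed distance along the ray: positive in front of the measured surface,
// negative behind it.
float SignedDistance(float measured_depth, float voxel_depth, float multiplier) {
    return (measured_depth - voxel_depth) * multiplier;
}
```

A voxel at depth 1.5 that projects onto a pixel measuring depth 2.0 lies in front of the surface, so its sdf is positive; with a multiplier of 1 (principal point) it equals 0.5.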

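Lines 52-63: next comes the sdf fusion, a running average in which $tsdf$ is the sdf divided by the truncation distance and clamped to at most 1:
$vox.tsdf = (vox.tsdf \cdot vox.w + tsdf) / (vox.w + 1)$
$vox.w = vox.w + 1$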

 voxels_[v_ind].tsdf_ =(voxels_[v_ind].tsdf_ * voxels_[v_ind].weight_ + tsdf) /
                       (voxels_[v_ind].weight_ + 1.0f);
 voxels_[v_ind].weight_ += 1.0f;
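The fusion rule is easy to exercise in isolation. The sketch below uses a hypothetical Voxel struct and IntegrateObservation function; it mirrors the truncation test and the running average of the listing but leaves out color.

```cpp
#include <algorithm>

struct Voxel {
    float tsdf = 0.0f;
    float weight = 0.0f;
};

// Fuse one sdf observation into a voxel. Observations further than sdf_trunc
// behind the surface are ignored. Returns true if the voxel was updated.
bool IntegrateObservation(Voxel& voxel, float sdf, float sdf_trunc) {
    if (sdf <= -sdf_trunc) return false;
    const float tsdf = std::min(1.0f, sdf / sdf_trunc);  // normalize and clamp
    voxel.tsdf = (voxel.tsdf * voxel.weight + tsdf) / (voxel.weight + 1.0f);
    voxel.weight += 1.0f;
    return true;
}
```

Starting from an empty voxel with sdf_trunc = 0.5, an observation of 0.25 gives tsdf 0.5 with weight 1; a second observation of 3.0 is clamped to 1 and the average becomes 0.75 with weight 2; an observation of -2.0 is rejected and leaves the voxel unchanged.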

2. UniformTSDFVolume::ExtractTriangleMesh

std::shared_ptr<geometry::TriangleMesh> UniformTSDFVolume::ExtractTriangleMesh() {
    // implementation of marching cubes, based on http://paulbourke.net/geometry/polygonise/
    auto mesh = std::make_shared<geometry::TriangleMesh>();
    double half_voxel_length = voxel_length_ * 0.5;
    // Map of "edge_index = (x, y, z, 0) + edge_shift" to "global vertex index"
    std::unordered_map<
            Eigen::Vector4i, int, utility::hash_eigen::hash<Eigen::Vector4i>,
            std::equal_to<Eigen::Vector4i>,
            Eigen::aligned_allocator<std::pair<const Eigen::Vector4i, int>>>
            edgeindex_to_vertexindex;
    int edge_to_index[12];
    for (int x = 0; x < resolution_ - 1; x++) {
        for (int y = 0; y < resolution_ - 1; y++) {
            for (int z = 0; z < resolution_ - 1; z++) {
                int cube_index = 0;
                float f[8];  // tsdf at the 8 corners of the cell, visited in turn
                Eigen::Vector3d c[8];
                for (int i = 0; i < 8; i++) {
                    Eigen::Vector3i idx = Eigen::Vector3i(x, y, z) + shift[i];
                    if (voxels_[IndexOf(idx)].weight_ == 0.0f) {
                        cube_index = 0;
                        break;
                    } else {
                        f[i] = voxels_[IndexOf(idx)].tsdf_;
                        if (f[i] < 0.0f) {
                            cube_index |= (1 << i); // corner inside the surface: set its bit to 1
                        }
                        if (color_type_ == TSDFVolumeColorType::RGB8) {
                            c[i] = voxels_[IndexOf(idx)].color_.cast<double>() / 255.0;
                        } else if (color_type_ == TSDFVolumeColorType::Gray32) {
                            c[i] = voxels_[IndexOf(idx)].color_.cast<double>();
                        }
                    }
                }
                // Skip cells entirely inside or outside the surface: no face passes through them
                if (cube_index == 0 || cube_index == 255) { 
                    continue;
                }
                for (int i = 0; i < 12; i++) { // visit the 12 edges of the cell in turn
                    if (edge_table[cube_index] & (1 << i)) { // the surface crosses edge i of this cell
                        Eigen::Vector4i edge_index = Eigen::Vector4i(x, y, z, 0) + edge_shift[i];
                        if (edgeindex_to_vertexindex.find(edge_index) == edgeindex_to_vertexindex.end()) {
                            edge_to_index[i] = (int)mesh->vertices_.size();  // index of the intersection vertex on this edge
                            edgeindex_to_vertexindex[edge_index] =(int)mesh->vertices_.size(); // record the mapping
                            Eigen::Vector3d pt( // center of the voxel at the edge's start point
                                    half_voxel_length + voxel_length_ * edge_index(0), 
                                    half_voxel_length + voxel_length_ * edge_index(1),
                                    half_voxel_length + voxel_length_ * edge_index(2));
                            double f0 = std::abs((double)f[edge_to_vert[i][0]]);  // |sdf| at the first endpoint of edge_index
                            double f1 = std::abs((double)f[edge_to_vert[i][1]]);  // |sdf| at the second endpoint of edge_index
                            pt(edge_index(3)) += f0 * voxel_length_ / (f0 + f1);  // interpolate to get the surface crossing
                            mesh->vertices_.push_back(pt + origin_); // append the new intersection vertex to the mesh

                            if (color_type_ != TSDFVolumeColorType::NoColor) {
                                const auto &c0 = c[edge_to_vert[i][0]];
                                const auto &c1 = c[edge_to_vert[i][1]];
                                mesh->vertex_colors_.push_back((f1 * c0 + f0 * c1) / (f0 + f1));
                            }
                        } else {
                            edge_to_index[i] = edgeindex_to_vertexindex.find(edge_index) ->second;
                        }
                    }
                }
                for (int i = 0; tri_table[cube_index][i] != -1; i += 3) {
                    mesh->triangles_.push_back(Eigen::Vector3i(
                            edge_to_index[tri_table[cube_index][i]],
                            edge_to_index[tri_table[cube_index][i + 2]],
                            edge_to_index[tri_table[cube_index][i + 1]]));
                }
            }
        }
    }
    return mesh;
}

Lines 18-34: visit the 8 corners of the cell at $(x,y,z)$ in turn. The shift array holds the offset of each of the 8 corners relative to $(x,y,z)$ and is defined as follows:

const Eigen::Vector3i shift[8] = {
        Eigen::Vector3i(0, 0, 0), Eigen::Vector3i(1, 0, 0),
        Eigen::Vector3i(1, 1, 0), Eigen::Vector3i(0, 1, 0),
        Eigen::Vector3i(0, 0, 1), Eigen::Vector3i(1, 0, 1),
        Eigen::Vector3i(1, 1, 1), Eigen::Vector3i(0, 1, 1),
};

At the same time an 8-bit mask is built: 1 marks a corner inside the surface (sdf < 0), 0 a corner outside.

f[i] = voxels_[IndexOf(idx)].tsdf_;
if (f[i] < 0.0f) 
    cube_index |= (1 << i); // corner inside the surface: bit set to 1, otherwise left at 0
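A standalone sketch of the mask construction follows. CubeIndex is a hypothetical helper; it takes the tsdf values of the 8 corners in the order given by shift and omits the weight check that the real loop performs.

```cpp
#include <array>

// Build the 8-bit marching cubes configuration: bit i is set when corner i
// lies inside the surface (tsdf < 0).
int CubeIndex(const std::array<float, 8>& tsdf) {
    int cube_index = 0;
    for (int i = 0; i < 8; ++i) {
        if (tsdf[i] < 0.0f) cube_index |= (1 << i);
    }
    return cube_index;
}
```

All corners outside gives 0 and all corners inside gives 255, the two cases that the extraction loop skips. If only corners 0 and 3 are inside, the result is 1 + 8 = 9.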

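Lines 39-41: visit the 12 edges of the cell at $(x,y,z)$ in turn. The edge_shift array gives the start point and direction of each edge (the direction fixes the end point). In its definition below, the first three numbers are the start point and the last one is the direction, as the comments make clear.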

// First 3 elements: edge start vertex coordinate (assume origin at (0, 0, 0))
// The last element: edge direction {0: x, 1: y, 2: z}
const Eigen::Vector4i edge_shift[12] = {
        Eigen::Vector4i(0, 0, 0, 0),  // Edge  0: {0, 1}
        Eigen::Vector4i(1, 0, 0, 1),  // Edge  1: {1, 2}
        Eigen::Vector4i(0, 1, 0, 0),  // Edge  2: {3, 2}
        Eigen::Vector4i(0, 0, 0, 1),  // Edge  3: {0, 3}
        Eigen::Vector4i(0, 0, 1, 0),  // Edge  4: {4, 5}
        Eigen::Vector4i(1, 0, 1, 1),  // Edge  5: {5, 6}
        Eigen::Vector4i(0, 1, 1, 0),  // Edge  6: {7, 6}
        Eigen::Vector4i(0, 0, 1, 1),  // Edge  7: {4, 7}
        Eigen::Vector4i(0, 0, 0, 2),  // Edge  8: {0, 4}
        Eigen::Vector4i(1, 0, 0, 2),  // Edge  9: {1, 5}
        Eigen::Vector4i(1, 1, 0, 2),  // Edge 10: {2, 6}
        Eigen::Vector4i(0, 1, 0, 2),  // Edge 11: {3, 7}
};

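Lines 42-61: note the variable edgeindex_to_vertexindex and how it is declared. It is a map that binds each edge crossing the model surface (edge_index) to the intersection vertex on that edge (vertex index); every intersection vertex is given a global index.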

std::unordered_map< Eigen::Vector4i, int, utility::hash_eigen::hash<Eigen::Vector4i>,
                    std::equal_to<Eigen::Vector4i>,
                    Eigen::aligned_allocator<std::pair<const Eigen::Vector4i, int>>
                  > edgeindex_to_vertexindex;

Lines 42-44: if the current edge edge_index is not yet in edgeindex_to_vertexindex, it is stored together with its intersection vertex:

edge_to_index[i] = (int)mesh->vertices_.size();  // index of the intersection vertex bound to this edge
edgeindex_to_vertexindex[edge_index] =(int)mesh->vertices_.size(); // store the mapping in edgeindex_to_vertexindex
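The point of the global edge key is that neighbouring cells share edges, so a shared edge must resolve to a single mesh vertex. The sketch below illustrates this with a std::map keyed on a std::array, which is an assumption made to avoid the custom Eigen hash used in the original; EdgeKey, GlobalEdge and VertexIndexForEdge are hypothetical names.

```cpp
#include <array>
#include <map>

using EdgeKey = std::array<int, 4>;  // (x, y, z, direction) of the edge start

// Global key of a cell edge: cell position plus the edge's edge_shift entry.
EdgeKey GlobalEdge(int x, int y, int z, const EdgeKey& edge_shift) {
    return {x + edge_shift[0], y + edge_shift[1], z + edge_shift[2],
            edge_shift[3]};
}

// Return the vertex index bound to `edge`, creating a new one on first use.
int VertexIndexForEdge(std::map<EdgeKey, int>& edge_to_vertex,
                       const EdgeKey& edge, int& next_vertex_index) {
    auto it = edge_to_vertex.find(edge);
    if (it != edge_to_vertex.end()) return it->second;  // already created
    const int index = next_vertex_index++;
    edge_to_vertex[edge] = index;
    return index;
}
```

For example, edge 1 of cell (0,0,0), with edge_shift (1,0,0,1), and edge 3 of cell (1,0,0), with edge_shift (0,0,0,1), both map to the key (1,0,0,1), so the second lookup reuses the vertex created by the first.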

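Lines 45-48: compute the center of the voxel at the start point of edge_index.
Lines 49-52: take the sdf at the two endpoints of edge_index, interpolate (the two sdf values necessarily have opposite signs) to get the intersection point, and append the new point to the mesh.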

pt(edge_index(3)) += f0 * voxel_length_ / (f0 + f1);
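This line is plain linear interpolation: if the sdf varies linearly from f_start to f_end along the edge, it reaches zero at the fraction f_start / (f_start - f_end), which equals |f_start| / (|f_start| + |f_end|) when the two values have opposite signs. A minimal sketch, with CrossingOffset as a hypothetical helper name:

```cpp
#include <cmath>

// Offset of the zero crossing from the edge's start point, measured along the
// edge, assuming the sdf changes linearly between the two endpoints.
double CrossingOffset(double f_start, double f_end, double voxel_length) {
    const double f0 = std::abs(f_start);
    const double f1 = std::abs(f_end);
    return f0 * voxel_length / (f0 + f1);
}
```

With endpoint values 0.25 and -0.75 and a voxel length of 1, the crossing lies a quarter of the way along the edge, closer to the endpoint whose sdf is smaller in magnitude.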

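Lines 59-61: if the current edge edge_index is already in edgeindex_to_vertexindex, its intersection vertex only needs to be looked up.
Lines 64-70: all intersection vertices are now available; each group of three listed in tri_table is assembled into a triangle and appended to the mesh.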

for (int i = 0; tri_table[cube_index][i] != -1; i += 3) {
    mesh->triangles_.push_back(Eigen::Vector3i(
            edge_to_index[tri_table[cube_index][i]],
            edge_to_index[tri_table[cube_index][i + 2]],
            edge_to_index[tri_table[cube_index][i + 1]]));
}
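The loop can be tried on a single table row. In this sketch EmitTriangles is a hypothetical helper, the row is passed in explicitly rather than read from the full tri_table, and the vertex indices in the usage note are made up. Note that the second and third entries of each triple are swapped, exactly as in the listing, which reverses the winding order relative to the table.

```cpp
#include <array>
#include <vector>

using Triangle = std::array<int, 3>;

// Turn one -1-terminated row of a marching cubes triangle table into
// triangles of global vertex indices, swapping the last two entries of each
// triple as the listing does.
std::vector<Triangle> EmitTriangles(const std::array<int, 16>& tri_row,
                                    const std::array<int, 12>& edge_to_index) {
    std::vector<Triangle> triangles;
    for (int i = 0; tri_row[i] != -1; i += 3) {
        triangles.push_back(Triangle{{edge_to_index[tri_row[i]],
                                      edge_to_index[tri_row[i + 2]],
                                      edge_to_index[tri_row[i + 1]]}});
    }
    return triangles;
}
```

For cube_index = 1 (only corner 0 inside), the row in Paul Bourke's table starts with 0, 8, 3 followed by -1. If those three edges carry vertex indices 10, 11 and 12 respectively, the loop emits the single triangle (10, 12, 11).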

