A Detailed Analysis of the PointNet Network Architecture

Original

小邋遢-lxh

2019-08-07 10:37

I. Key Points

Transforming point cloud data into regular 3D voxel grids or collections of images, however, renders the data unnecessarily voluminous and introduces quantization artifacts that can obscure natural invariances of the data.
PointNet learns to summarize an input point cloud by a sparse set of key points, which roughly corresponds to the skeleton of objects.
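Voxelizing a point cloud alters the original characteristics of the data, causes unnecessary information loss, and adds extra work. PointNet instead takes the raw point cloud as input, which preserves the spatial characteristics of the point cloud as far as possible, and it achieved good results in the final tests.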
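A symmetric function is invariant to the input order. For example, the + and * operators are symmetric binary functions.
One alternative for handling unordered input is to treat the points as a sequence and train an RNN on it, augmenting the training data with permutations; PointNet instead aggregates per-point features with a symmetric function (max pooling).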
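To make the symmetry property concrete, here is a minimal sketch in plain Python. It assumes the symmetric function is a channel-wise max over per-point feature vectors; the helper name `global_max_feature` and the feature values are made up for illustration and are not taken from the PointNet code.

```python
import random


def global_max_feature(point_features):
    """Channel-wise max over all points: a symmetric function of the point set."""
    # zip(*rows) groups the values of each feature channel across all points.
    return [max(channel) for channel in zip(*point_features)]


# Four points with three feature channels each (illustrative values).
features = [
    [0.1, 2.0, -1.0],
    [0.7, 0.3, 0.5],
    [-0.2, 1.5, 0.9],
    [0.4, -0.6, 0.0],
]

# Reorder the points; the pooled result must not change.
shuffled = features[:]
random.Random(0).shuffle(shuffled)

assert global_max_feature(features) == global_max_feature(shuffled)
print(global_max_feature(features))  # [0.7, 2.0, 0.9]
```

Any ordering of the four points gives the same pooled vector, which is why the network does not need to see every permutation of the input during training.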
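The same point cloud may undergo a rigid transformation in space (rotation or translation), which changes its coordinates, yet the network should still recognize the object correctly. PointNet addresses this with a Spatial Transformer Network (STN). However, the final experimental results and the follow-up paper PointNet++ suggest that the STN has little effect.
Basic idea: learn a spatial encoding for each point in the input point cloud, then use the features of all points to obtain a single global point cloud feature.
The first transform, the input transform, adjusts the point cloud in space; intuitively, it rotates the cloud to an angle that favors classification or segmentation (a rigid transformation of the point cloud). The second, the feature transform, transforms the extracted features in a way analogous to a rigid transformation of points, aiming for a feature-space "angle" that favors classification. The former has dim=3 and the latter dim=64; there is no essential difference between them.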
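Both transforms come down to multiplying every point (or per-point feature vector) by a k x k matrix, with k=3 for the input transform and k=64 for the feature transform. The sketch below illustrates only that operation. In PointNet the matrix is predicted by a small sub-network; here it is hand-specified as the inverse rotation, and the helper names `apply_transform` and `rotation_z` are hypothetical.

```python
import math


def apply_transform(points, matrix):
    """Right-multiply each point (a row vector of length k) by a k x k matrix."""
    k = len(matrix)
    return [
        [sum(p[i] * matrix[i][j] for i in range(k)) for j in range(k)]
        for p in points
    ]


def rotation_z(theta):
    """Rotation about the z-axis, laid out for row-vector multiplication."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


points = [[1.0, 0.0, 0.0], [0.0, 2.0, 1.0]]

# A rigid transformation changes the coordinates of the same object.
rotated = apply_transform(points, rotation_z(math.pi / 2))

# An ideal input transform would undo it; we supply the inverse by hand.
aligned = apply_transform(rotated, rotation_z(-math.pi / 2))

for original, restored in zip(points, aligned):
    assert all(abs(a - b) < 1e-9 for a, b in zip(original, restored))
```

Because `apply_transform` works for any k, the same function covers the 64-dimensional feature transform, which matches the remark that the two transforms differ only in dimension.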

II. Detailed Network Architecture
Keep the following points in mind when reading the detailed structure:

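kernel_shape = (kernel_h, kernel_w, num_in_channels, num_output_channels) is the convolution kernel shape defined in the code that implements the conv2d convolution.
tf_util.conv2d: activation = tf.nn.relu, batch_norm_for_conv2d. Every convolution uses ReLU and batch normalization (BN), so they are not drawn in the diagrams.
Data dimensions are written as (B, H, W, C), where B = batch, H = height, W = width, C = channel. The diagrams omit B and show only the last three dimensions, but the written shapes keep it.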
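The shape conventions above can be checked with a little arithmetic. The sketch below is not part of tf_util; `conv2d_output_shape` is a hypothetical helper that computes the output shape of a convolution without padding ('VALID') in (B, H, W, C) layout. The batch size of 32, the 1024 points, and the first-layer kernel (1, 3, 1, 64) spanning the xyz coordinates are assumptions chosen for illustration.

```python
def conv2d_output_shape(input_shape, kernel_shape, stride=(1, 1)):
    """Output shape of a no-padding ('VALID') 2D convolution, (B, H, W, C) layout."""
    b, h, w, c = input_shape
    kernel_h, kernel_w, num_in_channels, num_output_channels = kernel_shape
    assert c == num_in_channels, "input channels must match the kernel"
    out_h = (h - kernel_h) // stride[0] + 1
    out_w = (w - kernel_w) // stride[1] + 1
    return (b, out_h, out_w, num_output_channels)


# Assumed input: 32 clouds of 1024 points, xyz along W, one channel.
input_shape = (32, 1024, 3, 1)

# A 1x3 kernel collapses the xyz axis and produces 64 channels per point.
after_conv1 = conv2d_output_shape(input_shape, (1, 3, 1, 64))
print(after_conv1)  # (32, 1024, 1, 64)

# A 1x1 kernel then acts on each point independently (a shared per-point MLP).
after_conv2 = conv2d_output_shape(after_conv1, (1, 1, 64, 64))
print(after_conv2)  # (32, 1024, 1, 64)
```

H stays at the number of points throughout, which reflects that these convolutions never mix information between points; only the later pooling step does.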