再談單應性矩陣及位姿分解

最近在研究無人駕駛的定位問題，想利用車載攝像頭提取的車道線、配合廠商提供的GNSS+IMU(非ADMA級別方案，存在一定誤差)，還有地圖廠商提供的HD map(包含車道線元素)糾正定位。也就是說不依賴於視覺slam構造的特徵圖層，嘗試下在GNSS+IMU定位不精確的情況下，僅利用前置攝像頭的車道線，糾正lateral以及heading angle的誤差。

偶然之間，搜到一篇論文“Lane Level Localization; Using Images and HD Maps to Mitigate the Lateral Error”.

論文大致思想通過初始定位位置搜索到地圖對應的車道線元素，然後假設車身在longitudinal方向的錯位尚可，在攝像頭提取的左和右車道線上分別提取2個對共4個點，取車頭前方D米以及2D米的視覺車道線，同理在地圖的對應位置上也提取這麼4個點，然後求圖像平面到地圖平面的單應性變換關係，再分解單應性矩陣得到R和T作爲修正汽車定位的依據。(注意：廠商提供的地圖和視覺車道線，可能是N階曲線的係數，自己採樣即可；另外，提取的車道線越遠誤差越大，但是地圖誤差不會隨着距離增加而加大)

這篇論文的假設在一定情況下是成立的，汽車雖然行徑方向誤差較大，上述車道線採樣的點和地圖採樣的點實際上不一定匹配，但是只要是車頭前方不遠處的點，都不影響糾正lateral誤差。

本文打算總結下我所瞭解單應性矩陣，因爲針對不同場景的單應性矩陣，矩陣分解有不同方法。

1.什麼是單應性矩陣

平面的單應性即爲一個平面到另一個平面的映射。這個平面可以是圖像平面，也可以是三維世界中的平面(z=0)

數學定義爲，x=Mx'

單應性矩陣解法，一般就是DLT 4點法。如果給定的匹配點比較多，可以考慮RANSAC之類的，opencv有很方便的庫

2.單應性矩陣有哪些場景

單應性變換適用於如下兩種場景，這種變換矩陣都叫做單應性矩陣。但是他們是不同的。

a.圖像平面與物理平面

如下例子相機與物理平面不必是正對的，可以想想下無人機定位航拍，俯視角度做定位

b.圖像平面與圖像平面

兩個相機平面看着同樣的物理平面，但是相機在不同位置拍攝

3.如何分解R和T

a.針對圖像平面與物理世界平面

上述公式是套用了相機模型的投影公式，理論上旋轉和平移應該是，

但是由於噪聲，實際情況是h1和h2並不正交，所以有如下處理

這裏，解是唯一的，也是易於理解的。有人這麼幹的，

void cameraPoseFromHomography(const Mat& H, Mat& pose)
{
    pose = Mat::eye(3, 4, CV_32FC1);      // 3x4 matrix, the camera pose
    float norm1 = (float)norm(H.col(0));  
    float norm2 = (float)norm(H.col(1));  
    float tnorm = (norm1 + norm2) / 2.0f; // Normalization value

    Mat p1 = H.col(0);       // Pointer to first column of H
    Mat p2 = pose.col(0);    // Pointer to first column of pose (empty)

    cv::normalize(p1, p2);   // Normalize the rotation, and copies the column to pose

    p1 = H.col(1);           // Pointer to second column of H
    p2 = pose.col(1);        // Pointer to second column of pose (empty)

    cv::normalize(p1, p2);   // Normalize the rotation and copies the column to pose

    p1 = pose.col(0);
    p2 = pose.col(1);

    Mat p3 = p1.cross(p2);   // Computes the cross-product of p1 and p2
    Mat c2 = pose.col(2);    // Pointer to third column of pose
    p3.copyTo(c2);       // Third column is the crossproduct of columns one and two

    pose.col(3) = H.col(2) / tnorm;  //vector t [R|t] is the last column of pose
}

b.針對圖像平面之間的情況

假設3D平面方程爲（一般式）

那麼有，

根據兩個相機的成像模型，我們有

我們對第二個圖像畫面做如下擴展，

上述公式只是證明了，這種變換也是單應性變換。接下來怎麼求這個R和t呢？比較麻煩

數值法：

解析法：(opencv函數就是這種)

雖然能做R和T的分解，但是對於位移T來說單目視覺沒有實際的scale深度信息，只能知道一個位移的方向。

所以那個論文被我同事鄙視了，lateral方向不能給出明確的偏移量。關於位移量的問題，可以看如下仿真代碼。

#include <opencv2/opencv.hpp>
#include <iostream>
#include <vector>


int main() {
  // set up a virtual camera
  float f = 100, w = 640, h = 480;

  cv::Mat1f K = (cv::Mat1f(3, 3) <<
      f, 0, w/2,
      0, f, h/2,
      0, 0,   1);

  // set transformation from 1st to 2nd camera (assume K is unchanged)
  cv::Mat1f rvecDeg = (cv::Mat1f(3, 1) << 45, 12, 66);
  cv::Mat1f t = (cv::Mat1f(3, 1) << 100, 200, 300);

  std::cout << "-------------------------------------------\n";
  std::cout << "Ground truth:\n";

  std::cout << "K = \n" << K << std::endl << std::endl;
  std::cout << "rvec = \n" << rvecDeg << std::endl << std::endl;
  std::cout << "t = \n" << t << std::endl << std::endl;

  // set up points on a plane
  std::vector<cv::Point3f> p3d{{0, 0, 10},
                               {100, 0, 10},
                               {0, 100, 10},
                               {100, 100, 10}};

  // project on both cameras
  std::vector<cv::Point2f> Q, P, S;

  cv::projectPoints(p3d,
                    cv::Mat1d::zeros(3, 1),
                    cv::Mat1d::zeros(3, 1),
                    K,
                    cv::Mat(),
                    Q);

  cv::projectPoints(p3d,
                    rvecDeg*CV_PI/180,
                    t,
                    K,
                    cv::Mat(),
                    P);

  // find homography
  cv::Mat H = cv::findHomography(Q, P);

  std::cout << "-------------------------------------------\n";
  std::cout << "Estimated H = \n" << H << std::endl << std::endl;


  // check by reprojection
  std::vector<cv::Point2f> P_(P.size());
  cv::perspectiveTransform(Q, P_, H);

  float sumError = 0;

  for (size_t i = 0; i < P.size(); i++) {
    sumError += cv::norm(P[i] - P_[i]);
  }

  std::cout << "-------------------------------------------\n";
  std::cout << "Average reprojection error = "
      << sumError/P.size() << std::endl << std::endl;


  // decompose using identity as internal parameters matrix
  std::vector<cv::Mat> Rs, Ts;
  cv::decomposeHomographyMat(H,
                             K,
                             Rs, Ts,
                             cv::noArray());

  std::cout << "-------------------------------------------\n";
  std::cout << "Estimated decomposition:\n\n";
  std::cout << "rvec = " << std::endl;
  for (auto R_ : Rs) {
    cv::Mat1d rvec;
    cv::Rodrigues(R_, rvec);
    std::cout << rvec*180/CV_PI << std::endl << std::endl;
  }

  std::cout << std::endl;

  std::cout << "t = " << std::endl;
  for (auto t_ : Ts) {
    std::cout << t_ << std::endl << std::endl;
  }

  return 0;
}

輸出結果：

-------------------------------------------
Ground truth:
K =
[100, 0, 320;
0, 100, 240;
0, 0, 1]

rvec =
[45;
12;
66]

t =
[100;
200;
300]

-------------------------------------------
Estimated H =
[0.04136041220427821, 0.04748763375951008, 358.5557917287962;
0.05074854454707714, 0.06137211243830468, 297.4585754092336;
8.294458769850147e-05, 0.0002294875562580223, 1]

-------------------------------------------
Average reprojection error = 0

-------------------------------------------
Estimated decomposition:

rvec =
[-73.21470385654712;
56.64668212487194;
82.09114210289061]

[-73.21470385654712;
56.64668212487194;
82.09114210289061]

[45.00005330430893;
12.00000697952995;
65.99998380038915]

[45.00005330430893;
12.00000697952995;
65.99998380038915]


t =
[10.76993852870029;
18.60689642878277;
30.62344129378435]

[-10.76993852870029;
-18.60689642878277;
-30.62344129378435]

[10.00001378255982;
20.00002581449634;
30.0000336510648]

[-10.00001378255982;
-20.00002581449634;
-30.0000336510648]

As you can see, there is correct estimation of rotation vector among the hypothesis, and there's a up-to-scale correct estimation of translation.

關於尺度問題，它和像素點的取值有關係。大家可以把那個3D 平面座標，放大/縮小10倍，觀察translation的區別

4.單應性矩陣與對極幾何的關係

這裏援引高翔的話，

單應性在 SLAM 中具重要意義。當特徵點共面，或者相機發生純旋轉的時候，基礎矩陣的自由度下降，這就出現了所謂的退化（ degenerate）。現實中的數據總包含一些噪聲。這時候如果我們繼續使用八點法求解基礎矩陣，基礎矩陣多餘出來的自由度將會主要由噪聲決定。爲了能夠避免退化現象造成的影響，通常我們會同時估計基礎矩陣 F 和單應矩陣H，選擇重投影誤差比較小的那個作爲最終的運動估計矩陣。

再談單應性矩陣及位姿分解

《Python進階》學習筆記

Leetcode 3161. 物塊放置查詢

leetcode 60 排列序列

一個docker容器暴露多個端口

微服務實踐之使用 Visual Studio 2022 調試Dapr 應用程序

wpf附加屬性理解 WPF附加屬性

NDT Matching 算法學習

蒙特卡洛馬爾可夫鏈學習小結

黑夜給了我黑色的眼睛，我卻用它研究CV【36kr報道】

商業模式新生代之免費商業模式

《Master Opencv...讀書筆記》非剛性人臉跟蹤 IV (終)

https://yachay.unat.edu.pe/blog/index.php?comment_area=format_blog&comment_component=blog&comment_co

linux以太網驅動總結