第7講2 --- 求解相機的運動

上一篇博客中學習了特徵提取和匹配的概念，並且調用OpenCV庫實現了ORB特徵的提取和匹配。

找到了匹配點後，我們希望能夠根據匹配的點對來估計相機的運動。一共有三種方法可以用來估計相機的運動，分別爲對極幾何約束、PnP、ICP。這三種方法分別可以用來處理不同情況下的相機的位姿估計。

當相機爲單目的時候，我們只知道2D的像素座標，因而問題是根據兩組2D點估計運動。該問題用對極幾何來解決。
如果有3D點及其在相機的投影位置，即3D-2D問題，也能估計相機的運動，該問題通過PnP來求解。
當相機爲雙目、RGB-D時，或者通過某種方法得到了距離信息，那麼問題就是根據兩組3D點估計運動，即3D-3D問題。該問題通常用ICP來解決。

1、對極約束

下面先來看一下兩個相鄰幀之間的匹配點有什麼關係？

以上圖爲例，我們設從 $I_{1}$ 到 $I_{2}$ 的運動爲 $R,\vec{t}$ ，兩個相機的中心分別爲 $O_{1},O_{2}$ 。

現在，假設 $I_{1}$ 中有一個特徵點 $p_{1}$ ，它在 $I_{2}$ 中對應的點爲 $p_{2}$

（我們知道這是通過特徵匹配得到的，如果匹配結果正確的話，就可以認爲 $p_{1},p_{2}$ 是同一個空間點在兩個成像平面上投影）

現在，我們假設匹配結果是正確的，然後就可以開始下面的數學推導了：

在 $I_{1}$ 的座標系下，設點的空間位置爲 $P=[X,Y,Z]^{T}$ ，

根據針孔相機模型可以知道，兩個像素點 $p_{1},p_{2}$ 的像素位置滿足：

$z_{1}p_{1}=KP$ ， $z_{2}p_{2}=K(RP+t)$

因爲使用的是齊次座標，所以在上式的左邊乘以任意一個非零常數也是成立的，則有：

$p_{1}=KP$ ， $p_{2}=K(RP+t)$

現在取兩個像素點在歸一化平面上的座標： $x_{1}=K^{-1}p_{1}$ ， $x_{2}=K^{-1}p_{2}$ ，代入上式中得：

$x_{2}=Rx_{1}+t$

然後兩邊同時左乘 $t^{\Lambda }$ ，注意： $t^{\Lambda }t=t\times t=0$ ，得：

$t^{\Lambda }x_{2}=t^{\Lambda }Rx_{1}$

然後，兩式同時左乘 $x_{2}^{T}$ ，得到：

$x_{2}^{T}t^{\Lambda }x_{2}=x_{2}^{T}t^{\Lambda }Rx_{1}$

因爲 $t^{\Lambda }x_{2}=t\times x_{2}$ 是一個和 $t,x_{2}$ 都垂直的向量，所以： $x_{2}^{T}t^{\Lambda }x_{2}$ =0，則有：

$x_{2}^{T}t^{\Lambda }Rx_{1}=0$

將 $x_{1},x_2$ 的值重新代入得到：

${\color{Red} p_{2}^{T}K^{-T}t^{\Lambda }RK^{-1}p_{1}=0}$

上式就稱之爲對極約束，它的幾何意義是共面。

我們把上式的中間兩個部分記爲兩個矩陣：基礎矩陣（Fundamental matrix）F、本質矩陣（Essential matrix）E

${\color{Red} E=t^{\Lambda }R}$ ， ${\color{Red} F=K^{-T}t^{\Lambda }RK^{-1}=K^{-T}EK^{-1}}$

於是對極約束進一步簡化爲下式：

有了上面的基礎之後，相機的位姿估計問題就可以分解爲下面兩步：

（1）根據相鄰幀配對點的像素位置求出E或者F

（2）根據E或者F求出R, t

注：由對極幾何的表達式也可以知道利用對極幾何僅根據配對的像素點位置就可以求相機運動，記爲 2D-2D 問題（瞎說的）

------------------------------------------------------本質矩陣E------------------------------------------------

上面說了，根據 E 或者 F 都可以求出R, t，E 和 F 之間相差的就是相機內參K，而相機的內參矩陣在SLAM問題中一般是已知的，所以我們經常使用形式更加簡單的本質矩陣E 來求解相機的運動。

根據定義，本質矩陣 $E=t^{\Lambda}R$ ，它是一個3*3的矩陣，內有9個未知數。另一方面，由於旋轉和平移各有3個自由度，故 $E=t^{\Lambda}R$ 一共有6個自由度，但由於尺度等價性，實際上只有5個自由度。具有5個自由度的事實，表明我們最少可以用5對點來求解。但是的內在性質是一種非線性性質，在求解線性方程時會帶來麻煩，因此，也可以只考慮它的尺度等價性，使用8對點來估計---------這就是經典的八點法。（ps: 我對這段話不是很理解）

接下來的問題就是如何根據估計到的本質矩陣，來分解出R和t了。這個過程是通過奇異值分解得到的。

------------------------------------------------------單應矩陣H------------------------------------------------

單應矩陣H，描述了兩個平面之間的映射關係。若場景中的特徵點都落在同一個平面上（比如牆、地面），則可以通過單應性來進行運動估計。這種情況在無人攜帶的俯視相機或者掃地機攜帶的頂視相機中比較常見。

下面從數學的角度說明一下什麼是單應矩陣：

考慮在圖像 $I_{1},I_{2}$ 上有一對匹配好的特徵點 $p_{1},p_{2}$ ，這些特徵點落在平面上，設這個平面的方程爲：

$n^{T}P+d=0$

整理一下，就是：

$-\frac{n^{T}P}{d}=1$

又因爲： $p_{2}=K(RP+t)$ ，可以得到：

$p_{2}=K(RP+t(-\frac{n^{T}P}{d}))=K(R-\frac{tn^{T}}{d})P=K(R-\frac{tn^{T}}{d})K^{-1}p_{1}$

令 $H=K(R-\frac{tn^{T}}{d})K^{-1}$ ，於是 $p_{2}=Hp_{1}$ （H就是所謂的單應矩陣）

自由度爲8的單應矩陣可以通過4對匹配特徵點算出（這些特徵點不能有三點共線的情況），然後通過H分解得到R，t。

計算出兩幅圖像之間的單應矩陣H之後，可以有如下用途：

（1）根據H分解出相機的運動R、t

（2）應用 $p_{2}=Hp_{1}$ 這個關係可以將一個視圖中的所有點變換到另一個圖的視角下的圖片。

實踐部分：下面看一下如何使用OpenCV計算基礎矩陣、本質矩陣、單應矩陣以及分解R、T和利用單應矩陣變換圖像的視角。

feature_extract_match.hpp

（這個文件中定義了特徵提取和特徵匹配函數，之後所有關於ORB特徵提取和匹配的操作都調用這個頭文件裏面的代碼）

#ifndef FEATURE_EXTRACT_MATCH_H
#define FEATURE_EXTRACT_MATCH_H

#include <iostream>
using namespace std;

//opencv
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/highgui/highgui.hpp>
using namespace cv;

/*
實現兩張圖像的特徵提取和匹配：
使用引用的目的是爲了直接將提取和匹配的結果保存在key_point_1，key_point_2，matches中
*/
void feature_extract_match(Mat img1, Mat img2, vector<KeyPoint> &key_point_1, vector<KeyPoint> &key_point_2, vector<DMatch> &matches)
{

    //創建ORB提取器、描述子提取器、描述子匹配器
    Ptr<FeatureDetector> orb_detectore = ORB::create(); 
    Ptr<DescriptorExtractor> orb_descriptor = ORB::create();
    Ptr<DescriptorMatcher> matcher = DescriptorMatcher::create("BruteForce-Hamming");

    //定義變量
    Mat descriptor_1, descriptor_2;
    vector<DMatch> matches_before_filter;
    Mat img_matches_before_filter, img_good_matches;

    //提取ORB關鍵點
    orb_detectore->detect(img1, key_point_1);
    orb_detectore->detect(img2, key_point_2);

    //計算描述子
    orb_descriptor->compute(img1, key_point_1, descriptor_1);
    orb_descriptor->compute(img2, key_point_2, descriptor_2);

    //特徵點匹配，匹配結果中會包含一些誤匹配的點
    matcher->match(descriptor_1, descriptor_2, matches_before_filter);

    //剔除誤匹配的點：漢明距離大於兩倍最小距離的點剔除
    double min_distance = 9999;
    for(int i = 0; i < descriptor_1.rows; i++)
    {
        if(matches_before_filter[i].distance < min_distance)
        {
            min_distance = matches_before_filter[i].distance;
        }
    }
    cout << "min_distance = " << min_distance << endl;
    
    //有的時候最小距離非常小，這個情況下取最小距離爲一個經驗值30
    for( int i = 0; i < descriptor_1.rows; i++ )
    {
        if(matches_before_filter[i].distance <= max(2*min_distance, 30.0))
        {
            matches.push_back( matches_before_filter[i] );
        }
    }

    drawMatches(img1, key_point_1, img2, key_point_2, matches, img_good_matches);
    drawMatches(img1, key_point_1, img2, key_point_2, matches_before_filter, img_matches_before_filter);
    // imshow("matches_before_filter", img_matches_before_filter);
    // imshow("goog_matches", img_good_matches);
    // waitKey(0);
}

#endif

estimation.hpp:

/*
在這裏，要接觸到OpenCV的兩個重要的模塊：
（1）features2d模塊：進行特徵提取和匹配要用到的模塊
（2）calib3d模塊：這是一個之前沒有接觸過的模塊，它是OpenCV實現的一個相機校準和姿態估計模塊
*/

#ifndef ESTIMATION_H
#define ESTIMATION_H

#include "feature_extract_match.hpp"

//opencv
#include <opencv2/calib3d/calib3d.hpp>

//估計相機的運動：返回R、t
void pose_estimation_2d2d(vector<KeyPoint> key_points_1, vector<KeyPoint> key_points_2, vector<DMatch> matches, Mat &R, Mat &t)
{
    //內參矩陣
    double fx, fy, cx, cy;
    Mat K = Mat::eye(3, 3, CV_64FC1);   //Mat::eye()返回一個指定大小和類型的恆定矩陣
    fx = 520.9;
    fy = 521.0;
    cx = 325.1;
    cy = 249.7;
    K.at<double>(0, 0) = fx;
    K.at<double>(0, 2) = cx;
    K.at<double>(1, 1) = fy;
    K.at<double>(1, 2) = cy;
    cout << "internal reference matrix = " << endl;
    cout << K << endl;

    //將所有的KeyPoint轉化爲Point2f
    vector<Point2f> points_1;
    vector<Point2f> points_2;
    for(int i = 0; i < matches.size(); i++)
    {
        points_1.push_back(key_points_1[matches[i].queryIdx].pt);   //pt屬性是KeyPoint的座標
        points_2.push_back(key_points_2[matches[i].trainIdx].pt);
    }

    //計算基礎矩陣
    Mat fundamental_matrix;
    fundamental_matrix = findFundamentalMat(points_1, points_2, CV_FM_8POINT);  //最後一個參數CV_FM_8POINT表示使用8點法計算基礎矩陣
    cout << "fundamental_matrix = " << endl;
    cout << fundamental_matrix << endl;

    //直接調用findEssentialMat()函數來求本質矩陣，findEssentialMat()有兩種原型，本次要調用的爲下面這一種
    //findEssentialMat(InputArray points1, InputArray points2, double focal = 1.0, Point2d pp = Point2d(0, 0), int method = RANSAC, double prob = 0.999, double threshold = 1.0, OutputArray mask = noArray());
    double focal = 521;
    Point2d pp(325.1, 249.7);
    Mat essential_matrix;
    essential_matrix = findEssentialMat(points_1, points_2, focal, pp);
    cout << "essential_matrix = " << endl;
    cout << essential_matrix << endl;

    //從本質矩陣恢復R、t
    //recoverPose(InputArray E, InputArray points1, InputArray points2, OutputArray R, OutputArray t, double focal = 1.0, Point2d pp = Point2d(0, 0), InputOutputArray = noArray());
    recoverPose(essential_matrix, points_1, points_2, R, t, focal, pp);
}

//計算單應矩陣
void homograph_estimation(vector<KeyPoint> key_points_1, vector<KeyPoint> key_points_2, vector<DMatch> matches, Mat &H)
{
    //將所有的KeyPoint轉換爲Point2f
    vector<Point2f> points_1, points_2;
    for(int i = 0; i < matches.size(); i++)
    {
        points_1.push_back(key_points_1[matches[i].queryIdx].pt);
        points_2.push_back(key_points_2[matches[i].trainIdx].pt);
    }

    //計算單應矩陣
    H = findHomography(points_1, points_2, RANSAC);
    cout << "H = " << endl;
    cout << H <<endl;
}

#endif

main.hpp:

#include "estimation.hpp"
#include "feature_extract_match.hpp"

//opencv，warpPerspective()在這個頭文件中
#include "opencv2/imgproc.hpp"

int main(int argc, char ** argv)
{
    Mat img1, img2;
    img1 = imread("../datas/1.png");
    img2 = imread("../datas/2.png");

    vector<KeyPoint> key_points_1, key_points_2;
    vector<DMatch> matches;
    feature_extract_match(img1, img2, key_points_1, key_points_2, matches);

    cout << "一共找到了： " << matches.size() << "個匹配點" << endl;

    Mat R, t;
    pose_estimation_2d2d(key_points_1, key_points_2, matches, R, t);
    cout << "R = " << endl;
    cout << R << endl;
    cout << "t = " << endl;
    cout << t << endl;

    Mat H;
    Mat img_warp_perspective;
    homograph_estimation(key_points_1, key_points_2, matches, H);
    warpPerspective(img1, img_warp_perspective, H, img2.size());    //利用單應矩陣進行轉換
    imshow("img1", img1);
    imshow("img2", img2);
    imshow("img_warp_perspective", img_warp_perspective);
    waitKey(0);

    return 0;
}

2、3D-2D --- PnP

PnP也是求解相機運動的一種方法，它描述了當知道了n個3D空間點以及它們對應的像素點（在成像平面上的投影點）時，如何估計相機的位姿。

在PnP方法中，最少只需要知道3個點對（額外還至少需要一個點來驗證結果）就可以估計相機運動。特徵點的3D位置可以由三角測量或者rgb-d相機的深度圖給出。因此，在雙目或者rgb-d的視覺里程計中，我們可以直接用PnP方法來估計相機運動。但是在單目視覺里程計中，必須先進行初始化，然後通過三角測量的方法得到深度才能使用PnP。

由於3D-2D方法不需要使用對極約束，並且只需要很少的匹配點就可以較好的估計相機的運動，因此它也是最重要的一種姿態估計方法。

PnP問題有很多種求解方法，例如：P3P、DLT、EPnP、UPnP等等，此外還可以用非線性優化的方式，構建最小二乘問題迭代求解，也就是BA。這裏暫時只學習其中的P3P方法。

給出一個博客，很好的參考資料：https://www.cnblogs.com/mafuqiang/p/8302663.html#commentform

P3P需要利用給定的3對點的幾何關係，它的輸入爲3對3D-2D匹配點。此外，它還需要一對驗證點，爲了從可能的解中選出正確的那一個。

下面看一下解決該問題的簡單思路：

如上圖一樣，A、B、C是已知的3D點，對應的2D點爲a、b、c，相機光心爲O。

（首先需要清楚的一點是P3P並不是直接根據3D-2D點求出相機的位姿矩陣，而是先求出對應的2D點在當前相機座標系下的3D座標，然後根據世界座標系下的3D座標和當前相機座標系下的3D座標來求解相機姿態的。-這應該是後面要講的ICP問題）

首先，從餘弦定理出發，可以得到下面的關係式：

$OA^{2}+OB^{2}-2*OA*OB\cos<a,b>=AB^{2}$

$OB^{2}+OC^{2}-2*OB*OC\cos<b,c>=BC^{2}$

$OA^{2}+OC^{2}-2*OA*OC\cos<a,c>=AC^{2}$

上面的三個式子同時除以 $OC^{2}$ ，並令 $x=\frac{OA}{OC},y=\frac{OB}{OC}$ ，得：

$x^{2}+y^{2}-2xy\cos<a,b>=\frac{AB^{2}}{OC^{2}}$

$y^{2}+1-2y\cos<b,c>=\frac{BC^{2}}{OC^{2}}$

$x^{2}+1-2x\cos<a,c>=\frac{AC^2}{OC^2}$

下面做一下換元： $v=\frac{AB^2}{OC^2},uv=\frac{BC^2}{OC^2},wv=\frac{AC^2}{OC^2}$ ，得到：

$x^{2}+y^{2}-2xy\cos<a,b>=v$

$y^{2}+1-2y\cos<b,c>=uv$

$x^{2}+1-2x\cos<a,c>=wv$

然後把第一個式子帶入下面兩個式子中可以得到：

$(1-u)y^2-ux^2-2\cos<b,c>y+2uxycox<a,b>+1=0$

$(1-w)x^2-wy^2-2\cos<a,c>+2wxy\cos<a,b>+1=0$

我們現在要弄清楚上面的兩個式子中，哪些變量是已知的：

（1） $u=\frac{BC^2}{AB^2},w=\frac{AC^2}{AB^2}$ 可以通過A、B、C的3D座標點直接算出來

（2） $\cos<a,b>,\cos<b,c>,\cos<a,c>$ 是可以算出來，上面給出來的博客中給出了計算方法

是未知的，隨着相機的移動會發生變化。解析的求上面這個二元二次方程是非常複雜的，需要用“吳消元法”求解。與分解本質矩陣E的情況類似，這個方程最多可能會得到四組解，但是我們會用驗證點（這就是前面爲什麼說需要至少一對額外的點）來計算最可能的解，得到A、B、C在相機座標系下的3D座標，然後根據3D-3D點對，求解R、t。

P3P中還存在的問題：

（1）P3P只利用了3對點的信息，當配對的點比較多的時候，難以利用更多的信息（我認爲這既是優點也是缺點）。

（2）如果3D-2D點存在無匹配的情況，那麼P3P算法就會失效。

在SLAM中，通常的做法是先使用P3P、EPnP等方法估計相機的位姿，然後構建最小二乘優化問題對估計值進行優化。

實踐部分：利用OpenCV已經集成好的求解方法進行求解。

先介紹兩個方法：

（1）solve() --- calib3d.hpp

(2)rodrigues() --- calib3d.hpp --- 將旋轉向量轉換爲選裝矩陣

/*
P3P要使用3D點，爲了避免單目初始化帶來的麻煩，使用深度圖提供的深度信息來構建3D點
*/

#include "feature_extract_match.hpp"
//opencv
#include <opencv2/core/core.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/calib3d/calib3d.hpp>

//像素座標轉換爲歸一化座標
Point2d pixel_to_cam(Point2d p, Mat K)
{
    return Point2d
        (
            (p.x - K.at<double>(0, 2)) / K.at<double>(0, 0),
            (p.y - K.at<double>(1, 2)) / K.at<double>(1, 1)
        );
}

int main(int argc, char **argv)
{
    Mat rgb_img1 = imread("../datas/1.png");
    Mat rgb_img2 = imread("../datas/2.png");

    vector<KeyPoint> key_points_1, key_points_2;
    vector<DMatch> matches;

    //進行特徵提取和匹配
    feature_extract_match(rgb_img1, rgb_img2, key_points_1, key_points_2, matches);

    Mat depth_img = imread("../datas/1_depth.png", CV_LOAD_IMAGE_UNCHANGED);    //深度圖爲16位無符號數，單通道圖像
    vector<Point3d> p_3d;
    vector<Point2d> p_2d;

    //內參矩陣
    Mat K = (Mat_<double>(3, 3) <<
        529.0, 0, 325.1,
        0, 521.0, 249.9,
        0, 0, 1
    );

    for(int i=0; i<matches.size(); i++)
    {
        //取出深度信息
        ushort d = depth_img.ptr<unsigned short>( (int)(key_points_1[matches[i].queryIdx].pt.y) ) [ (int)(key_points_1[matches[i].queryIdx].pt.x) ];
        if(d == 0)
        {
            continue;
        }
        float dd = d/1000.0;
        Point2d normalized_p = pixel_to_cam(key_points_1[matches[i].queryIdx].pt, K);

        p_3d.push_back( Point3d(normalized_p.x*dd, normalized_p.y*dd, dd) );
        p_2d.push_back( key_points_2[matches[i].trainIdx].pt );
    }

    cout << "the number of 3D-2D pairs = " << p_3d.size() << endl;

    //調用OpenCV中的函數求解P3P
    Mat r, t, R;
    solvePnP(p_3d, p_2d, K, Mat(), r, t, false, cv::SOLVEPNP_EPNP);//使用EPNP方法求解，返回的r是旋轉向量
    Rodrigues(r, R);//將旋轉向量轉爲旋轉矩陣

    cout << "R = " << endl;
    cout << R << endl;
    cout << "t = " << endl;
    cout << t << endl;

    return 0;
}

3、3D-3D --- ICP

看一下ICP的問題描述：

假設我們有一組匹配好的3D點： $P=\begin{Bmatrix} p_{1},p_{2},...,p_{n} \end{Bmatrix} ,P^{'}=\begin{Bmatrix} p^{'}_{1},p^{'}_{2},...,p^{'}_{n} \end{Bmatrix}$ ，然後需要找一個歐式變換R、t，使得：

$\forall i,p_{i}=Rp^{'}_{i}+t$ （注意：按照這樣的推導方式，得到是從第二幀到第一幀的R、t，如果要求從第一幀到第二幀的，則要進行逆變換）。（從對問題的描述中可以看到，3D-3D問題和相機是沒有關係的，因爲在激光SLAM中也會用到ICP來求解位姿）

和PnP類似，ICP的求解也有兩種方式：（1）利用線性代數的求解方法（主要是SVD）；（2）利用非線性優化方式求解（類似BA，暫時還沒有學，所以在這裏關注的是第一種方法）

-----------------------------------------------------------SVD求解ICP----------------------------------------------

根據上面描述的ICP問題，我們可以定義第對點的誤差項：

$e_{i}=p_{i}-(Rp_{i}^{'}+t)$

然後，構建最小二乘問題，求使得誤差平方和達到極小的:

$\min_{R,t} J=\frac{1}{2}\sum_{i=1}^{n}\begin{Vmatrix} (p_{i}-(Rp^{'}_{i}+t)) \end{Vmatrix}^{2}$

下面來推導上式的求解方法：

首先，定義兩組點的質心：

$p=\frac{1}{n}\sum_{i=1}^{n}(p_{i}),p^{'}=\frac{1}{n}\sum_{i=1}^{n}(p_{i}^{'})$

然後，對上面的誤差函數左如下處理：

注意到一個現象： $\sum_{i=1}^{n}(p_{i}-p)-R(p^{'}_{i}-p&{'})=0$ ，所以上面的這個誤差函數就變爲：

注意到上式中的左右兩項：左邊這一項只和R有關，右邊這一項和R、t都有關，並且它只和質心有關。因此，可以想到：只要我們通過第一項式子求出了R，令第二項式子爲0就可以求出t了。因此，可以總結出ICP的求解步驟：

1、計算兩組點的質心位置 $p.p^{'}$ ，然後計算每個點的去質心座標：

$q_{i}=p_{i}-p$

$q_{i}^{'}=p_{i}^{'}-p^{'}$

2、然後根據一下優化問題計算旋轉矩陣：

將計算好的去質心座標代入得到：

3、根據2中的R計算t：

下面看一下如何計算R：

注意：這裏省去了《視覺SLAM十四講》中的部分推導。

定義一個矩陣：，W是一個3*3的矩陣，需要對W進行SVD分解：

下面這個博客詳細的講了SVD：https://blog.csdn.net/zhongkejingwang/article/details/43053513

對W進行SVD分解得到：

是一個由奇異值組成的對角矩陣，對角線元素從大到小排列，而也均爲對角矩陣。

當W爲滿秩矩陣時， $R=UV^{T}$ ，然後利用R就可以求出t了。

下面用OpenCV實踐一下用SVD求解ICP的程序：

1）common_include.h中包含了公用的源文件，這樣可以避免每次都寫一大堆的include命令。

//一些都要include的頭文件放在common_include.h中，以免要寫很多次include指令

#ifndef COMMON_INCLUDE_H
#define COMMON_INCLUDE_H

#include <iostream>
using namespace std;

//opencv
#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/features2d/features2d.hpp>
#include <opencv2/calib3d/calib3d.hpp>
using namespace cv;

//eigen3
#include <Eigen/Core>
#include <Eigen/Geometry>
#include <Eigen/SVD>

#endif

2）pose_estimation.hpp定義了3d3d位姿估計的函數，可以把上面的幾個2d2d、3d2d的位姿估計都合到這個頭文件中來。

#ifndef POSE_EXTIMATION_H
#define POSE_EXTIMATION_H

#include "common_include.h"

//3D-3D 位姿估計
void pose_estimation_3d3d(vector<Point3f> points_1, vector<Point3f> points_2, Mat &R, Mat &t)
{
    //求質心
    Point3f center_points_1, center_points_2;
    int N = points_1.size();

    for(int i = 0; i < N; i++)
    {
        center_points_1 += points_1[i];
        center_points_2 += points_2[i];
    }

    center_points_1 /= N;
    center_points_2 /= N;

    //求每個點的去質心座標
    vector<Point3f> q_1, q_2;
    for(int i = 0; i < N; i++)
    {
        q_1.push_back(points_1[i] - center_points_1);
        q_2.push_back(points_2[i] - center_points_2);
    }

    //計算W
    Eigen::Matrix3d W = Eigen::Matrix3d::Zero();
    for(int i = 0; i < N; i++)
    {
        W += Eigen::Vector3d(q_1[i].x, q_1[i].y, q_1[i].z) * Eigen::Vector3d(q_2[i].x, q_2[i].y, q_2[i].z).transpose();
    }
    cout << "W = " << endl;
    cout << W << endl;

    //對W進行SVD分解
    Eigen::JacobiSVD<Eigen::Matrix3d> svd(W, Eigen::ComputeFullU|Eigen::ComputeFullV);
    Eigen::Matrix3d U = svd.matrixU();
    Eigen::Matrix3d V = svd.matrixV();
    cout << "U = " << endl;
    cout << U << endl;
    cout << "V = " << endl;
    cout << V << endl;

    Eigen::Matrix3d R_ = U * (V.transpose());
    Eigen::Vector3d t_ = Eigen::Vector3d(center_points_1.x, center_points_1.y, center_points_1.z) - R_ * Eigen::Vector3d(center_points_2.x, center_points_2.y, center_points_2.z);
    cout << "R_ = " << endl;
    cout << R << endl;
    cout << "t_ = " << endl;
    cout << t_ << endl;

    //Eigen::Matrix3d和opencv中的矩陣數據格式不是一致的
    R = (Mat_<double>(3, 3) <<
        R_(0,0), R_(0,1), R_(0,2),
        R_(1,0), R_(1,1), R_(1,2),
        R_(2,0), R_(2,1), R_(2,2)
    );

    t = (Mat_<double>(3, 1) <<
        t_(0, 0),
        t_(1, 0),
        t_(2, 0)
    );
}

#endif

3）test_estimation.cpp定義了測試程序，3d點是讀取深度圖得到的。

#include "common_include.h"
#include "pose_estimation.hpp"
#include "feature_extract_match.hpp"

//像素座標轉換爲歸一化座標
Point2d pixel_to_cam(Point2d p, Mat K);

int main ( int argc, char** argv )
{
    //-- 讀取圖像
    Mat rgb_img_1 = imread ("../datas/1.png");
    Mat rgb_img_2 = imread ("../datas/2.png");

    vector<KeyPoint> key_points_1, key_points_2;
    vector<DMatch> matches;
    feature_extract_match(rgb_img_1, rgb_img_2, key_points_1, key_points_2, matches);
    //cout<<"一共找到了"<<matches.size() <<"組匹配點"<<endl;

    // 建立3D點
    Mat depth_img_1 = imread ("../datas/1_depth.png", CV_LOAD_IMAGE_UNCHANGED);       // 深度圖爲16位無符號數，單通道圖像
    Mat depth_img_2 = imread ("../datas/2_depth.png", CV_LOAD_IMAGE_UNCHANGED);       // 深度圖爲16位無符號數，單通道圖像
    Mat K = ( Mat_<double> ( 3,3 ) << 520.9, 0, 325.1, 0, 521.0, 249.7, 0, 0, 1 );
    vector<Point3f> pts1, pts2;

    for (int i = 0; i < matches.size(); i++)
    {
        ushort d1 = depth_img_1.ptr<unsigned short> (int(key_points_1[matches[i].queryIdx].pt.y))[int(key_points_1[matches[i].queryIdx].pt.x)];
        ushort d2 = depth_img_2.ptr<unsigned short> (int(key_points_2[matches[i].trainIdx].pt.y))[int(key_points_2[matches[i].trainIdx].pt.x)];
        if ( d1==0 || d2==0 )   // bad depth
            continue;

        Point2d p1 = pixel_to_cam (key_points_1[matches[i].queryIdx].pt, K);
        Point2d p2 = pixel_to_cam (key_points_2[matches[i].trainIdx].pt, K);
        float dd1 = float ( d1 ) /5000.0;
        float dd2 = float ( d2 ) /5000.0;
        pts1.push_back ( Point3f(p1.x*dd1, p1.y*dd1, dd1) );
        pts2.push_back ( Point3f(p2.x*dd2, p2.y*dd2, dd2) );
    }

    cout<<"3d-3d pairs: "<<pts1.size() <<endl;
    
    Mat R, t;
    pose_estimation_3d3d ( pts1, pts2, R, t );
    cout << "ICP via SVD results: " << endl;
    cout << "R = " << R << endl;
    cout << "t = " << t << endl;

    return 0;
}

//像素座標轉換爲歸一化座標
Point2d pixel_to_cam(Point2d p, Mat K)
{
    return Point2d
        (
            (p.x - K.at<double>(0, 2)) / K.at<double>(0, 0),
            (p.y - K.at<double>(1, 2)) / K.at<double>(1, 1)
        );
}

上面就是要學習的2d-2d、3d-2d、3d-3d的位姿估計方法，求解PNP、ICP也是可以用優化的方法來求的，下一節會學習優化的內容，採用的優化庫是G2O庫。

第7講2 --- 求解相機的運動

1、對極約束

2、3D-2D --- PnP

3、3D-3D --- ICP

再談23種設計模式（3）：行爲型模式（學習筆記）

Power Automate Desktop 安裝完，登錄後老是提示one driver 錯誤

微前端學習筆記(4):從微前端到微模塊之EMP與hel-micro方案探索

微前端學習筆記（1）：微前端總體架構概述，從微服務發微

985 碩士程序員，空窗 4 個月沒有 Offer！

一文搞懂 Spring 循環依賴

賽博鬥地主——使用大語言模型扮演Agent智能體玩牌類遊戲。

VScode右鍵打開(添加到右鍵)

記一次 .NET某工控視覺自動化系統卡死分析

WindowsServer--SQL Server搭建主從同步實現讀寫分離 - 事務性分發

vscode配置mingw打造輕量級c++ IDE

Ubuntu系統使用常見問題

ubuntu16.04上製作軟件圖標

Mnist數據集介紹

第4講李羣和李代數2

Mac下配置sublime實現LaTeX

https://yachay.unat.edu.pe/blog/index.php?comment_area=format_blog&comment_component=blog&comment_co

linux以太網驅動總結