PNP algorithm

This algorithm turned out to be far more complicated than I expected, so I'll start from the theory and work through it alongside the OpenCV code; explaining it from the code alone would gloss over too much. I'll keep refining this post as time allows.

I recently needed the PnP algorithm. After reading quite a bit and searching around online, I found there are many ways to solve it, so here I'll go through the OpenCV source code and see how OpenCV implements it (I'm reading the C++ implementation in OpenCV 2.4).

The two most commonly used methods are P3P and EPnP, so I'll record in detail how OpenCV implements both of them. As for what the PnP algorithm is used for, I won't repeat that here; there are plenty of explanations online.

OpenCV source code

First, take a look at the API of the solvePnP function in OpenCV:

 C++: bool solvePnP(InputArray objectPoints, InputArray imagePoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess=false, int flags=ITERATIVE )

Parameters:

  • objectPoints – Array of object points in the object coordinate space, 3xN/Nx3 1-channel or 1xN/Nx1 3-channel, where N is the number of points. vector<Point3f> can also be passed here.
  • imagePoints – Array of corresponding image points, 2xN/Nx2 1-channel or 1xN/Nx1 2-channel, where N is the number of points. vector<Point2f> can also be passed here.
  • cameraMatrix – Input camera matrix A = [fx 0 cx; 0 fy cy; 0 0 1].
  • distCoeffs – Input vector of distortion coefficients (k_1, k_2, p_1, p_2[, k_3[, k_4, k_5, k_6]]) of 4, 5, or 8 elements. If the vector is NULL/empty, the zero distortion coefficients are assumed.
  • rvec – Output rotation vector (see Rodrigues() ) that, together with tvec , brings points from the model coordinate system to the camera coordinate system.
  • tvec – Output translation vector.
  • useExtrinsicGuess – If true (1), the function uses the provided rvec and tvec values as initial approximations of the rotation and translation vectors, respectively, and further optimizes them.
  • flags –

    Method for solving a PnP problem:

    • CV_ITERATIVE Iterative method is based on Levenberg-Marquardt optimization. In this case the function finds such a pose that minimizes reprojection error, that is the sum of squared distances between the observed projections imagePoints and the projected (using projectPoints() ) objectPoints .
    • CV_P3P Method is based on the paper of X.S. Gao, X.-R. Hou, J. Tang, H.-F. Chang “Complete Solution Classification for the Perspective-Three-Point Problem”. In this case the function requires exactly four object and image points.
    • CV_EPNP Method has been introduced by F.Moreno-Noguer, V.Lepetit and P.Fua in the paper “EPnP: Efficient Perspective-n-Point Camera Pose Estimation”.

The meaning of each parameter is spelled out above: objectPoints are the points in the world (object) coordinate system, imagePoints are the corresponding points in the image coordinate system, cameraMatrix is the camera intrinsic matrix, distCoeffs holds the lens distortion coefficients, rvec is the output rotation vector, and tvec is the output translation vector. Three methods are supported: P3P, EPnP, and ITERATIVE; after a quick usage sketch below, we will focus on the P3P case.
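As a quick illustration of how the function is called (a minimal sketch; the intrinsics and point values below are made up for demonstration):

#include <opencv2/core/core.hpp>
#include <opencv2/calib3d/calib3d.hpp>
#include <vector>

int main()
{
    // Hypothetical data: 4 known 3D points and their observed 2D projections.
    std::vector<cv::Point3f> objectPoints;
    objectPoints.push_back(cv::Point3f(0, 0, 0));
    objectPoints.push_back(cv::Point3f(1, 0, 0));
    objectPoints.push_back(cv::Point3f(0, 1, 0));
    objectPoints.push_back(cv::Point3f(1, 1, 0));

    std::vector<cv::Point2f> imagePoints;
    imagePoints.push_back(cv::Point2f(320, 240));
    imagePoints.push_back(cv::Point2f(420, 242));
    imagePoints.push_back(cv::Point2f(322, 140));
    imagePoints.push_back(cv::Point2f(421, 143));

    // Hypothetical intrinsics: fx = fy = 500, principal point at (320, 240).
    cv::Mat cameraMatrix = (cv::Mat_<double>(3, 3) << 500, 0, 320,
                                                      0, 500, 240,
                                                      0, 0, 1);
    cv::Mat distCoeffs = cv::Mat::zeros(4, 1, CV_64F); // assume no distortion

    cv::Mat rvec, tvec;
    // CV_P3P needs exactly 4 correspondences; CV_EPNP or CV_ITERATIVE accept more.
    bool ok = cv::solvePnP(objectPoints, imagePoints, cameraMatrix, distCoeffs,
                           rvec, tvec, false, CV_P3P);
    // On success, rvec/tvec describe the pose mapping object coordinates into the camera frame.
    return ok ? 0 : 1;
}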

The solvePnP source code:

bool cv::solvePnP( InputArray _opoints, InputArray _ipoints,
                  InputArray _cameraMatrix, InputArray _distCoeffs,
                  OutputArray _rvec, OutputArray _tvec, bool useExtrinsicGuess, int flags )
{
    Mat opoints = _opoints.getMat(), ipoints = _ipoints.getMat();
    int npoints = std::max(opoints.checkVector(3, CV_32F), opoints.checkVector(3, CV_64F));
    CV_Assert( npoints >= 0 && npoints == std::max(ipoints.checkVector(2, CV_32F), ipoints.checkVector(2, CV_64F)) );
    Mat cameraMatrix = _cameraMatrix.getMat(), distCoeffs = _distCoeffs.getMat();

    Mat rvec, tvec;
    if( flags != CV_ITERATIVE )
        useExtrinsicGuess = false;

    if( useExtrinsicGuess )
    {
        int rtype = _rvec.type(), ttype = _tvec.type();
        Size rsize = _rvec.size(), tsize = _tvec.size();
        CV_Assert( (rtype == CV_32F || rtype == CV_64F) &&
                   (ttype == CV_32F || ttype == CV_64F) );
        CV_Assert( (rsize == Size(1, 3) || rsize == Size(3, 1)) &&
                   (tsize == Size(1, 3) || tsize == Size(3, 1)) );
    }
    else
    {
        _rvec.create(3, 1, CV_64F);
        _tvec.create(3, 1, CV_64F);
    }
    rvec = _rvec.getMat();
    tvec = _tvec.getMat();

    if (flags == CV_EPNP)
    {
        cv::Mat undistortedPoints;
        cv::undistortPoints(ipoints, undistortedPoints, cameraMatrix, distCoeffs);
        epnp PnP(cameraMatrix, opoints, undistortedPoints);

        cv::Mat R;
        PnP.compute_pose(R, tvec);
        cv::Rodrigues(R, rvec);
        return true;
    }
    else if (flags == CV_P3P)
    {
        CV_Assert( npoints == 4);
        cv::Mat undistortedPoints;
        cv::undistortPoints(ipoints, undistortedPoints, cameraMatrix, distCoeffs);
        p3p P3Psolver(cameraMatrix);

        cv::Mat R;
        bool result = P3Psolver.solve(R, tvec, opoints, undistortedPoints);
        if (result)
            cv::Rodrigues(R, rvec);
        return result;
    }
    else if (flags == CV_ITERATIVE)
    {
        CvMat c_objectPoints = opoints, c_imagePoints = ipoints;
        CvMat c_cameraMatrix = cameraMatrix, c_distCoeffs = distCoeffs;
        CvMat c_rvec = rvec, c_tvec = tvec;
        cvFindExtrinsicCameraParams2(&c_objectPoints, &c_imagePoints, &c_cameraMatrix,
                                     c_distCoeffs.rows*c_distCoeffs.cols ? &c_distCoeffs : 0,
                                     &c_rvec, &c_tvec, useExtrinsicGuess );
        return true;
    }
    else
        CV_Error(CV_StsBadArg, "The flags argument must be one of CV_ITERATIVE or CV_EPNP");
    return false;
}

The first three lines of the function:

    Mat opoints = _opoints.getMat(), ipoints = _ipoints.getMat();//convert the object (world) points and the image points into Mat
    int npoints = std::max(opoints.checkVector(3, CV_32F), opoints.checkVector(3, CV_64F));//check that opoints consists of npoints 3-element CV_32F/CV_64F vectors
    CV_Assert( npoints >= 0 && npoints == std::max(ipoints.checkVector(2, CV_32F), ipoints.checkVector(2, CV_64F)) );// run the same check on ipoints and make sure both sets contain the same number of points
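In case checkVector is unfamiliar: it returns the number of elements if the Mat can be viewed as a list of vectors with the requested channel count and depth, and -1 otherwise. A minimal sketch with arbitrary values:

#include <opencv2/core/core.hpp>
#include <vector>
#include <cassert>

int main()
{
    // A std::vector<cv::Point3f> wraps into an Nx1, 3-channel CV_32F Mat...
    std::vector<cv::Point3f> pts;
    pts.push_back(cv::Point3f(0, 0, 0));
    pts.push_back(cv::Point3f(1, 2, 3));
    cv::Mat m1(pts);
    assert(m1.checkVector(3, CV_32F) == 2);  // seen as 2 three-element vectors

    // ...and an Nx3 single-channel Mat of the right depth also passes.
    cv::Mat m2 = cv::Mat::zeros(2, 3, CV_64F);
    assert(m2.checkVector(3, CV_64F) == 2);
    assert(m2.checkVector(3, CV_32F) == -1); // wrong depth, so -1
    return 0;
}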

And the next line:

    Mat cameraMatrix = _cameraMatrix.getMat(), distCoeffs = _distCoeffs.getMat();//convert these to Mat as well

P3P

Let's look at the P3P branch first:

 else if (flags == CV_P3P)
    {
        // require exactly 4 point pairs; CV_P3P uses the fourth pair to pick the best of the candidate solutions
        CV_Assert( npoints == 4);
        // undo the lens distortion on the image points; since our goal is to see how P3P itself works, we don't need to dwell on this step
        cv::Mat undistortedPoints;
        cv::undistortPoints(ipoints, undistortedPoints, cameraMatrix, distCoeffs);
        // construct a p3p solver object
        p3p P3Psolver(cameraMatrix);

        cv::Mat R;
        // the actual computation happens here!
        bool result = P3Psolver.solve(R, tvec, opoints, undistortedPoints);
        if (result)
            cv::Rodrigues(R, rvec);
        return result;
    }
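One detail that matters later: because undistortPoints is called here without the optional R and P arguments, its output is not in pixels but in normalized camera coordinates, roughly x_n = (u - cx)/fx and y_n = (v - cy)/fy with the distortion removed. A small sketch of that behaviour, with made-up intrinsics:

#include <opencv2/imgproc/imgproc.hpp>
#include <cstdio>
#include <vector>

int main()
{
    cv::Mat K = (cv::Mat_<double>(3, 3) << 500, 0, 320,
                                           0, 500, 240,
                                           0, 0, 1);
    cv::Mat dist = cv::Mat::zeros(4, 1, CV_64F); // assume zero distortion
    std::vector<cv::Point2f> in, out;
    in.push_back(cv::Point2f(420, 140));

    cv::undistortPoints(in, out, K, dist);
    // With zero distortion this prints roughly 0.2 -0.2, i.e.
    // (420 - 320) / 500 and (140 - 240) / 500.
    std::printf("%f %f\n", out[0].x, out[0].y);
    return 0;
}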

First, let's see how the p3p class is constructed:

p3p::p3p(cv::Mat cameraMatrix)
{
    // depth() can be thought of as the element data type of the matrix;
    // dispatch on whether the intrinsics are stored as float or double
    if (cameraMatrix.depth() == CV_32F)
        init_camera_parameters<float>(cameraMatrix);
    else
        init_camera_parameters<double>(cameraMatrix);
    init_inverse_parameters();
}

init_camera_parameters copies the known intrinsics into the p3p member variables cx, cy, fx and fy:

  void init_camera_parameters(const cv::Mat& cameraMatrix)
  {
    cx = cameraMatrix.at<T> (0, 2);  // camera principal point x offset
    cy = cameraMatrix.at<T> (1, 2);
    fx = cameraMatrix.at<T> (0, 0);  // focal length x
    fy = cameraMatrix.at<T> (1, 1);  // focal length y
  }

The init_inverse_parameters method:

Not much needs explaining here; it just precomputes the reciprocal focal lengths and the normalized principal point offsets.

void p3p::init_inverse_parameters()
{
    inv_fx = 1. / fx;
    inv_fy = 1. / fy;
    cx_fx = cx / fx;
    cy_fy = cy / fy;
}

Next, let's look at this part:

cv::Mat R;
bool result = P3Psolver.solve(R, tvec, opoints, undistortedPoints);

First, the solve method of the p3p class:

bool p3p::solve(cv::Mat& R, cv::Mat& tvec, const cv::Mat& opoints, const cv::Mat& ipoints)
{
    // local storage for the rotation matrix and the translation vector
    double rotation_matrix[3][3], translation[3];

    std::vector<double> points;
    // extract the point data according to the element types of the inputs
    if (opoints.depth() == ipoints.depth())
    {
        if (opoints.depth() == CV_32F)
            extract_points<cv::Point3f,cv::Point2f>(opoints, ipoints, points);
        else
            extract_points<cv::Point3d,cv::Point2d>(opoints, ipoints, points);
    }
    else if (opoints.depth() == CV_32F)
        extract_points<cv::Point3f,cv::Point2d>(opoints, ipoints, points);
    else
        extract_points<cv::Point3d,cv::Point2f>(opoints, ipoints, points);
    // the computation starts here
    bool result = solve(rotation_matrix, translation, points[0], points[1], points[2], points[3], points[4], points[5],
          points[6], points[7], points[8], points[9], points[10], points[11], points[12], points[13], points[14],
          points[15], points[16], points[17], points[18], points[19]);
    cv::Mat(3, 1, CV_64F, translation).copyTo(tvec);
    cv::Mat(3, 3, CV_64F, rotation_matrix).copyTo(R);
    return result;
}

The extract_points template method:

template <typename OpointType, typename IpointType>
  void extract_points(const cv::Mat& opoints, const cv::Mat& ipoints, std::vector<double>& points)
  {
      points.clear();
      points.resize(20);
      // note that this method packs the image points and the object points into a single vector:
      // each 2D-3D correspondence contributes 5 values (2 for the 2D point, 3 for the 3D point),
      // so 4 correspondences give 20 values in total.
      // The image coordinates are scaled by fx/fy and offset by cx/cy, i.e. the undistorted,
      // normalized coordinates produced earlier are mapped back to ideal pixel coordinates.
      for(int i = 0; i < 4; i++)
      {
          points[i*5] = ipoints.at<IpointType>(0,i).x*fx + cx;
          points[i*5+1] = ipoints.at<IpointType>(0,i).y*fy + cy;
          points[i*5+2] = opoints.at<OpointType>(0,i).x;
          points[i*5+3] = opoints.at<OpointType>(0,i).y;
          points[i*5+4] = opoints.at<OpointType>(0,i).z;
      }
  }

Now comes the key part.
Notice that the solve wrapper above declares two arrays:

double rotation_matrix[3][3], translation[3];

Then, from the call

bool result = solve(rotation_matrix, translation, points[0], points[1], points[2], points[3], points[4], points[5], points[6], points[7], points[8], points[9], points[10], points[11], points[12], points[13], points[14],points[15], points[16], points[17], points[18], points[19]);

we can find the matching overload from the argument list (solve is overloaded several times).
The call jumps to:

bool p3p::solve(double R[3][3], double t[3],
    double mu0, double mv0,   double X0, double Y0, double Z0,
    double mu1, double mv1,   double X1, double Y1, double Z1,
    double mu2, double mv2,   double X2, double Y2, double Z2,
    double mu3, double mv3,   double X3, double Y3, double Z3)
{
    double Rs[4][3][3], ts[4][3];
    // this calls into another solve overload,
    // using only the first three point pairs; the fourth pair is not passed in
    int n = solve(Rs, ts, mu0, mv0, X0, Y0, Z0,  mu1, mv1, X1, Y1, Z1, mu2, mv2, X2, Y2, Z2);

    if (n == 0)
        return false;

    int ns = 0;
    double min_reproj = 0;
    for(int i = 0; i < n; i++) {
        double X3p = Rs[i][0][0] * X3 + Rs[i][0][1] * Y3 + Rs[i][0][2] * Z3 + ts[i][0];
        double Y3p = Rs[i][1][0] * X3 + Rs[i][1][1] * Y3 + Rs[i][1][2] * Z3 + ts[i][1];
        double Z3p = Rs[i][2][0] * X3 + Rs[i][2][1] * Y3 + Rs[i][2][2] * Z3 + ts[i][2];
        double mu3p = cx + fx * X3p / Z3p;
        double mv3p = cy + fy * Y3p / Z3p;
        double reproj = (mu3p - mu3) * (mu3p - mu3) + (mv3p - mv3) * (mv3p - mv3);
        if (i == 0 || min_reproj > reproj) {
            ns = i;
            min_reproj = reproj;
        }
    }

    for(int i = 0; i < 3; i++) {
        for(int j = 0; j < 3; j++)
            R[i][j] = Rs[ns][i][j];
        t[i] = ts[ns][i];
    }

    return true;
}
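To restate the selection step above in equations: the three-point solver returns up to four candidate poses (R_i, t_i), and the fourth correspondence is used only to choose among them. Each candidate transforms and projects the fourth object point, and the candidate with the smallest squared reprojection error wins:

\begin{pmatrix} X_3' \\ Y_3' \\ Z_3' \end{pmatrix} = R_i \begin{pmatrix} X_3 \\ Y_3 \\ Z_3 \end{pmatrix} + t_i,
\qquad u_3' = c_x + f_x \frac{X_3'}{Z_3'}, \quad v_3' = c_y + f_y \frac{Y_3'}{Z_3'},
\qquad i^* = \arg\min_i \left[ (u_3' - mu_3)^2 + (v_3' - mv_3)^2 \right]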

Continuing into the three-point solver itself:

// the fourth point pair is not used in this overload
int p3p::solve(double R[4][3][3], double t[4][3],
    double mu0, double mv0,   double X0, double Y0, double Z0,
    double mu1, double mv1,   double X1, double Y1, double Z1,
    double mu2, double mv2,   double X2, double Y2, double Z2)
{
    double mk0, mk1, mk2;
    double norm;
   // inv_fx = 1. / fx;
   // inv_fy = 1. / fy;
   // cx_fx = cx / fx;
   // cy_fy = cy / fy;
   //mu0 = x
   //mv0 = y
   // project each image point onto the unit sphere, i.e. turn it into a unit bearing vector
    mu0 = inv_fx * mu0 - cx_fx;
    mv0 = inv_fy * mv0 - cy_fy;
    // norm of the viewing direction (mu0, mv0, 1) for the first point
    norm = sqrt(mu0 * mu0 + mv0 * mv0 + 1);
    // normalize: (mu0, mv0, mk0) is now a unit vector
    // pointing from the camera center through the first image point
    mk0 = 1. / norm; mu0 *= mk0; mv0 *= mk0;

    // likewise, normalize the second point (mu1, mv1)
    mu1 = inv_fx * mu1 - cx_fx;
    mv1 = inv_fy * mv1 - cy_fy;
    norm = sqrt(mu1 * mu1 + mv1 * mv1 + 1);
    mk1 = 1. / norm; mu1 *= mk1; mv1 *= mk1;

    // and the third point (mu2, mv2)
    mu2 = inv_fx * mu2 - cx_fx;
    mv2 = inv_fy * mv2 - cy_fy;
    norm = sqrt(mu2 * mu2 + mv2 * mv2 + 1);
    mk2 = 1. / norm; mu2 *= mk2; mv2 *= mk2;

    double distances[3];
    // Euclidean distance between 3D points 1 and 2
    distances[0] = sqrt( (X1 - X2) * (X1 - X2) + (Y1 - Y2) * (Y1 - Y2) + (Z1 - Z2) * (Z1 - Z2) );
    // Euclidean distance between 3D points 0 and 2
    distances[1] = sqrt( (X0 - X2) * (X0 - X2) + (Y0 - Y2) * (Y0 - Y2) + (Z0 - Z2) * (Z0 - Z2) );
    // Euclidean distance between 3D points 0 and 1
    distances[2] = sqrt( (X0 - X1) * (X0 - X1) + (Y0 - Y1) * (Y0 - Y1) + (Z0 - Z1) * (Z0 - Z1) );

    // Calculate angles
    double cosines[3];
    cosines[0] = mu1 * mu2 + mv1 * mv2 + mk1 * mk2;
    cosines[1] = mu0 * mu2 + mv0 * mv2 + mk0 * mk2;
    cosines[2] = mu0 * mu1 + mv0 * mv1 + mk0 * mk1;

    double lengths[4][3];
    int n = solve_for_lengths(lengths, distances, cosines);

    int nb_solutions = 0;
    for(int i = 0; i < n; i++) {
        double M_orig[3][3];

        M_orig[0][0] = lengths[i][0] * mu0;
        M_orig[0][1] = lengths[i][0] * mv0;
        M_orig[0][2] = lengths[i][0] * mk0;

        M_orig[1][0] = lengths[i][1] * mu1;
        M_orig[1][1] = lengths[i][1] * mv1;
        M_orig[1][2] = lengths[i][1] * mk1;

        M_orig[2][0] = lengths[i][2] * mu2;
        M_orig[2][1] = lengths[i][2] * mv2;
        M_orig[2][2] = lengths[i][2] * mk2;

        if (!align(M_orig, X0, Y0, Z0, X1, Y1, Z1, X2, Y2, Z2, R[nb_solutions], t[nb_solutions]))
            continue;

        nb_solutions++;
    }

    return nb_solutions;
}
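The walkthrough stops here for now. To put the pieces above in context (a brief note, not part of the OpenCV source): solve_for_lengths is where the core P3P system is solved. Writing l_0, l_1, l_2 for the unknown distances from the camera center to the three object points, d_ij for the known inter-point distances computed above, and θ_ij for the angles between the unit bearing vectors (whose cosines fill the cosines[] array), the law of cosines gives one constraint per pair of points:

d_{12}^2 = l_1^2 + l_2^2 - 2\, l_1 l_2 \cos\theta_{12}
d_{02}^2 = l_0^2 + l_2^2 - 2\, l_0 l_2 \cos\theta_{02}
d_{01}^2 = l_0^2 + l_1^2 - 2\, l_0 l_1 \cos\theta_{01}

This system has up to four physically meaningful solutions, which is why lengths is a 4x3 array; align then recovers a candidate (R, t) from each set of lengths, and the fourth point pair picks the final answer as shown earlier.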
