PCG (preconditioned conjugate gradient) for RCS (reduced camera system) in SLAM

1. Introduction

This post tries to reproduce the article Pushing the Envelope of Modern Methods for Bundle Adjustment.

The main contributions of this article are:

  • Using BLAS to rewrite the optimization algorithms.
  • Novel embedded point iterations (EPIs).
  • Using block-based preconditioned conjugate gradients.

1.1 Linear algebra software

I recommend reading these slides from the Stanford convex optimization course EE364B website.

  • BLAS offers three levels of basic matrix operations.
  • LAPACK offers higher-level linear algebra algorithms.
  • Both can be called from C++ (our main development language).
  • Eigen is another matrix computation library (it has a nicer interface, but is a little slower). See their Benchmark, and also How does Eigen compare to BLAS/LAPACK?
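
As a rough illustration of the three BLAS levels, here is a small MATLAB sketch. The variable names are my own, and the routine names in the comments (axpy, gemv, gemm) refer to the standard BLAS interface; which routine MATLAB actually dispatches to for each line is not something this sketch claims.

```matlab
% Sketch of the three BLAS levels, written as plain MATLAB expressions.
n = 4;
alpha = 2.0;
x = (1:n)';            % column vector [1; 2; 3; 4]
y = ones(n, 1);
A = magic(n);          % dense n-by-n matrix
B = eye(n);

% Level 1: vector-vector, O(n) work (e.g. axpy: y <- alpha*x + y)
y1 = alpha * x + y;

% Level 2: matrix-vector, O(n^2) work (e.g. gemv: y <- A*x)
y2 = A * x;

% Level 3: matrix-matrix, O(n^3) work (e.g. gemm: C <- A*B)
C = A * B;
```

Level 3 operations do the most arithmetic per memory access, which is why rewriting an algorithm in terms of them tends to pay off.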

Personally, I prefer Eigen, as it is easier to use and offers NEON acceleration (I focus on Android cellphone applications).

1.2 Preconditioned Conjugate Gradient

See my blog post on Conjugate Gradient for more details.

  • In the blog post above, I tested solving the SLAM Hessian system with PCG, but it did not work well.
  • But now we know (from this article) that PCG accelerates the solve when it is applied to the RCS (reduced camera system), the linear system obtained after the Schur complement, which marginalizes the feature point parameters. So we will go in this direction.
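
To make the Schur complement step concrete, here is a minimal MATLAB sketch on a toy system. Everything in it is an assumption for illustration: the sizes, the random stand-in for the Jacobian, and the block names B, E, C, v, w. In a real bundle adjustment problem the point block C is block diagonal (one 3x3 block per point), which makes it cheap to invert; the toy C here is dense.

```matlab
% Toy Schur complement: reduce a full system H*dx = g to the RCS.
nc = 12;                       % camera parameters (2 cameras x 6 DOF)
np = 9;                        % point parameters (3 points x 3 DOF)
J  = rand(40, nc + np);        % random stand-in for a Jacobian
H  = J' * J + eye(nc + np);    % symmetric positive definite "Hessian"
g  = rand(nc + np, 1);

B = H(1:nc, 1:nc);             % camera-camera block
E = H(1:nc, nc+1:end);         % camera-point block
C = H(nc+1:end, nc+1:end);     % point-point block
v = g(1:nc);
w = g(nc+1:end);

% Reduced camera system: S * dxc = r
S = B - E * (C \ E');
r = v - E * (C \ w);
dxc = S \ r;                   % camera update
dxp = C \ (w - E' * dxc);      % back-substitute for the points

dx_full = H \ g;               % reference: solve the full system directly
err = norm([dxc; dxp] - dx_full) / norm(dx_full);
fprintf('relative difference to full solve: %e\n', err);
```

The matrix S is the one that PCG is applied to in the rest of this post; the points are recovered afterwards by back-substitution.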

1.3 Prepare real SLAM data

  • We extract real SLAM data from the ORB_SLAM2 system (code on gitee).
  • One small RCS system with 33 camera frames (data on gitee).
    [Figure: the original matrix on the left; on the right, its sparsity pattern, obtained by setting all non-zero elements to 1.]
  • A larger RCS system with 230 camera frames (data on gitee).
  • An even larger RCS system with 710 camera frames. (In my later tests, I found the two sets above too small for my PC, and I don't know how to limit MATLAB's CPU usage, so I made a much larger set to better show the acceleration.)
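
The binarization used to show the sparsity pattern can be sketched as follows. The 4x4 matrix is a made-up stand-in, not the RCS data from gitee.

```matlab
% Binarize a matrix to expose its sparsity pattern.
A = sparse([4 1 0 0;
            1 4 0 2;
            0 0 3 0;
            0 2 0 5]);
P = double(A ~= 0);                 % 1 where A is non-zero, 0 elsewhere
fill_ratio = nnz(A) / numel(A);     % fraction of stored non-zeros
fprintf('non-zeros: %d, fill ratio: %.3f\n', nnz(A), fill_ratio);
% spy(A) draws the same pattern directly as a plot.
```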

2. PCG test

2.1 PCG Algorithm

  • PCG is the same as in my last blog post; the MATLAB code can be found here (it has been shown to be faster than the official MATLAB version).
  • Preconditioning uses a block-diagonal matrix (block size chosen to be 6, the DOF of a camera frame).
% Build the block-diagonal approximation of A and factor it.
time_start = cputime;
n = length(A);
A_approx = sparse(n, n);
m = 6;                                  % block size = DOF of one camera frame
for i = 1:n/m
    idx = ((i-1)*m+1):(i*m);            % index range of the i-th 6x6 block
    A_approx(idx, idx) = A(idx, idx);   % copy the diagonal block
end
L = chol(A_approx)';                    % lower-triangular factor, A_approx = L*L'
time_end = cputime;
tchol = time_end - time_start;
fprintf('\nCholesky factorization of A approx. Time taken: %e\n', tchol);

% Run my PCG with the preconditioner factors L and L'.
fprintf('\nStarting Mine PCG ...\n');
time_start = cputime;
[x,res,resvec] = mine_pcg(A,b,L, L', 1e-4,200);
time_end = cputime;
fprintf('Mine PCG done. With %d iterations\nTime taken: %e\n', ...
    length(res),time_end - time_start);
fprintf('\nTotal time includes Cholesky factorization: %e\n', tchol+time_end - time_start);
% Plot the relative residual per iteration on a log scale.
semilogy(resvec/norm(b), '.--'); hold on;
set(gca,'FontSize', 16, 'FontName', 'Times');
xlabel('cgiter'); ylabel('relres');
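
Since mine_pcg itself is not listed here, the following is a minimal sketch of what a PCG loop with a block-diagonal preconditioner M = L*L' looks like. It is the textbook algorithm on a random toy system, not necessarily identical to my implementation; the sizes, the tolerance, and the kron-based mask used to pick out the 6x6 diagonal blocks are all assumptions for illustration.

```matlab
% Minimal PCG sketch with a 6x6 block-diagonal preconditioner M = L*L'.
m = 6;  nb = 5;  n = m * nb;
G = rand(n);
A = G' * G + n * eye(n);               % toy SPD system (stand-in for the RCS)
A = (A + A') / 2;                      % remove any rounding asymmetry
b = rand(n, 1);

mask = kron(eye(nb), ones(m));         % 1 inside the diagonal blocks, 0 elsewhere
L = chol(A .* mask)';                  % lower Cholesky factor of the block diagonal

tol = 1e-10;  maxit = 200;
x = zeros(n, 1);
r = b;                                 % residual for the initial guess x = 0
z = L' \ (L \ r);                      % apply M^{-1} by two triangular solves
p = z;
rz = r' * z;
resvec = norm(r);
for k = 1:maxit
    Ap = A * p;
    alpha = rz / (p' * Ap);            % step length along p
    x = x + alpha * p;
    r = r - alpha * Ap;
    resvec(end+1) = norm(r);
    if resvec(end) <= tol * norm(b), break; end
    z = L' \ (L \ r);
    rz_new = r' * z;
    p = z + (rz_new / rz) * p;         % new search direction
    rz = rz_new;
end
relres = norm(A * x - b) / norm(b);
fprintf('PCG sketch: %d iterations, relres %e\n', k, relres);
```

Building the block diagonal with a mask avoids assigning into a sparse matrix inside a loop, which is a design choice for the sketch rather than something the timing above depends on.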

2.2 Direct solve

% Direct solve with backslash, timed for comparison.
fprintf('\nStarting dense direct solve ...\n');
time_start = cputime;
x_star_dense = A\b;
time_end = cputime;
% The residual check is done outside the timed region.
relres_dense = norm(A*x_star_dense - b)/norm(b);
fprintf('Relative residual: %e\n', relres_dense);
fprintf('Dense direct solve done.\nTime taken: %e\n',...
    time_end - time_start);

2.3 CG

  • My MATLAB implementation of CG can be found here.
  • It was tested to take exactly the same time as the official MATLAB version.
  • Unlike the paper, I used a Cholesky factorization.
% Run my CG without any preconditioner.
fprintf('\nStarting Mine CG ...\n');
time_start = cputime;
[x,res,resvec] = mine_cg(A,b,1e-4,200);
time_end = cputime;
fprintf('Mine CG done.  With %d iterations\nTime taken: %e\n',...
    length(res),time_end - time_start);
% Plot the relative residual per iteration on a log scale.
semilogy(resvec/norm(b), '.--'); hold on;
set(gca,'FontSize', 16, 'FontName', 'Times');
xlabel('cgiter'); ylabel('relres');
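
For readers who want to try the comparison with the built-in solver, here is a sketch that calls pcg once without a preconditioner (plain CG) and once with the block-diagonal factors. The toy matrix is an assumption: it is built as a well-conditioned SPD core with a different scale per 6x6 block, a case where a block-diagonal preconditioner would be expected to help. It says nothing about timings on the real RCS data.

```matlab
% Built-in pcg, with and without a 6x6 block-diagonal preconditioner.
m = 6;  nb = 20;  n = m * nb;
G = rand(n);
T = eye(n) + 0.5 * (G' * G) / norm(G' * G);    % well-conditioned SPD core
Sc = kron(diag(logspace(0, 2, nb)), eye(m));   % per-block scaling from 1 to 100
A = Sc * T * Sc;                               % badly scaled SPD toy system
A = (A + A') / 2;                              % remove any rounding asymmetry
b = rand(n, 1);

mask = kron(eye(nb), ones(m));                 % 1 inside the diagonal blocks
L = chol(A .* mask)';                          % M = L*L' is the block diagonal of A

tol = 1e-10;  maxit = n;
[x_cg,  flag_cg,  relres_cg,  it_cg]  = pcg(A, b, tol, maxit);         % plain CG
[x_pcg, flag_pcg, relres_pcg, it_pcg] = pcg(A, b, tol, maxit, L, L');  % PCG
fprintf('CG : flag %d, %d iterations\n', flag_cg, it_cg);
fprintf('PCG: flag %d, %d iterations\n', flag_pcg, it_pcg);
true_relres = norm(A * x_pcg - b) / norm(b);   % residual recomputed from scratch
```

A flag of 0 means pcg converged to the requested tolerance within maxit iterations; a non-zero flag for the plain CG run corresponds to the "Failed" outcome in the sense of not reaching the tolerance.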

2.4 Final Output


Algorithm       Full time (s)
Direct solve    1.2656
CG              Failed
PCG             0.125

In short, PCG gives about a 10x speedup over the direct solve (0.125 vs. 1.2656).
