Quasi-Newton Methods (Conjugate Direction Methods)

1. Introduction

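A quasi-Newton method can be understood as approximating the Hessian matrix iteratively. In essence, however, a quasi-Newton method is a conjugate direction method, so it is more fitting to understand it through conjugate directions.

The main content of this post comes from the book An Introduction to Optimization (Chinese edition: 《最優化導論》).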

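2. Newton's Method

Newton's method is described in detail in many places, so it is not repeated here.

2.1 No Convergence Guarantee

For a general nonlinear function, Newton's method is not guaranteed to converge to a minimizer from an arbitrary starting point. The iterates may oscillate near the minimizer, or even move farther and farther away from it. This means a suitable step size has to be chosen.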
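
The behaviour described above can be reproduced with a small one-dimensional sketch. The test function f(x) = sqrt(1 + x^2) and the name `newton_iterates` are illustrative choices, not taken from the book. For this function the pure Newton step (step size 1) simplifies to x_next = -x^3, so the starting point alone decides whether the iterates converge, oscillate, or diverge.

```python
import math


def newton_iterates(x0, steps):
    """Pure Newton iteration (step size 1) on f(x) = sqrt(1 + x^2)."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        grad = x / math.sqrt(1.0 + x * x)   # f'(x)
        hess = (1.0 + x * x) ** -1.5         # f''(x), positive everywhere
        x = x - grad / hess                  # analytically equal to -x**3
        xs.append(x)
    return xs


if __name__ == "__main__":
    print(newton_iterates(0.5, 4))   # |x0| < 1: shrinks toward the minimizer x = 0
    print(newton_iterates(1.0, 4))   # |x0| = 1: flips between about +1 and -1
    print(newton_iterates(1.5, 4))   # |x0| > 1: moves farther away each step
```

Even though f is convex with a unique minimizer at 0, the full Newton step only works from |x0| < 1, which is why a line search or damped step is normally added.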

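2.2 The Hessian Is Expensive to Compute

In addition, computing the Hessian matrix in Newton's method is expensive. This motivates quasi-Newton methods, which use a specially constructed approximating matrix in place of the Hessian.

3. Conjugate Direction Methods

Conjugate direction methods mainly solve for the minimizer of a quadratic function of n variables.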
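
For reference, the quadratic function meant here is conventionally written as below. The notation (Q, b, and the sign of the linear term) follows the usual textbook convention and is an assumption about the formula intended at this point.

```latex
f(x) = \frac{1}{2} x^{T} Q x - x^{T} b,
\qquad x \in \mathbb{R}^{n}, \quad Q = Q^{T} \succ 0
```

With this convention the gradient is g(x) = Qx - b, so minimizing f is the same as solving the linear system Qx = b.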

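By finding a set of directions d that are conjugate with respect to Q and then minimizing along each of these directions in turn, the minimizer is reached in at most n steps.

To make the conjugate directions easier to find, they can be generated iteratively, which is what the conjugate gradient method does.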
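
A minimal sketch of the conjugate gradient iteration for a quadratic, written in plain Python so it runs without extra packages. It assumes the convention f(x) = 0.5*x'Qx - x'b with Q symmetric positive definite, and the helper names (`matvec`, `dot`, `conjugate_gradient`) are hypothetical. Each new direction is the negative gradient plus a multiple beta of the previous direction, with beta chosen so that the new direction is Q-conjugate to the previous one.

```python
def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(Q, b, x0, tol=1e-12):
    """Minimize 0.5*x'Qx - x'b; returns the final x and the directions used."""
    x = list(x0)
    g = [q - bi for q, bi in zip(matvec(Q, x), b)]   # gradient g = Qx - b
    d = [-gi for gi in g]                             # first direction: steepest descent
    directions = []
    for _ in range(len(b)):                           # at most n steps
        if dot(g, g) ** 0.5 < tol:
            break
        Qd = matvec(Q, d)
        alpha = -dot(g, d) / dot(d, Qd)               # exact minimizer along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        directions.append(d)
        g = [q - bi for q, bi in zip(matvec(Q, x), b)]
        beta = dot(g, Qd) / dot(d, Qd)                # makes the next d Q-conjugate to d
        d = [-gi + beta * di for gi, di in zip(g, d)]
    return x, directions


if __name__ == "__main__":
    Q = [[3.0, 0.0, 1.0], [0.0, 4.0, 2.0], [1.0, 2.0, 3.0]]
    b = [3.0, 0.0, 1.0]
    x, directions = conjugate_gradient(Q, b, [0.0, 0.0, 0.0])
    print(x)                 # close to the solution of Qx = b, which is [1, 0, 0]
    print(len(directions))   # number of steps taken, at most n = 3
```

Only matrix-vector products with Q are needed, so Q never has to be inverted or factored.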

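To make this precise, we need to:

Define conjugate directions.
Show that updating x along conjugate directions converges to the minimizer.
Show that the directions constructed by the conjugate gradient method are conjugate directions.

3.1 Conjugate Directions

Conjugate directions are defined with respect to Q, the matrix that appears in the quadratic function above.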
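
In the standard formulation (written here in textbook notation, as an assumption about the formula intended at this point), directions d_0, d_1, ..., d_m in R^n are called Q-conjugate when every pair of distinct directions satisfies:

```latex
d_i^{T} Q \, d_j = 0 \qquad \text{for all } i \neq j
```

When Q is the identity matrix this reduces to ordinary orthogonality, so Q-conjugacy can be read as orthogonality in the inner product defined by Q.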

There is also an important lemma about conjugate directions.
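
The lemma usually stated at this point, and presumably the one meant here, is linear independence: if Q is symmetric positive definite and d_0, ..., d_k (k <= n - 1) are nonzero and Q-conjugate, then they are linearly independent. The argument fits in one line: multiply a vanishing linear combination by d_j' Q and use conjugacy.

```latex
\sum_{i=0}^{k} \alpha_i d_i = 0
\;\Longrightarrow\;
d_j^{T} Q \sum_{i=0}^{k} \alpha_i d_i = \alpha_j \, d_j^{T} Q \, d_j = 0
\;\Longrightarrow\;
\alpha_j = 0 \qquad (j = 0, \dots, k)
```

The last step uses d_j' Q d_j > 0, which holds because Q is positive definite and d_j is nonzero. As a consequence, n nonzero Q-conjugate directions form a basis of R^n.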

3.2 Updating Along Conjugate Directions Converges to the Minimizer
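
A small sketch of the basic conjugate direction algorithm, assuming the convention f(x) = 0.5*x'Qx - x'b and a set of Q-conjugate directions that is already given. The function name and the 2-by-2 data are illustrative. Along each direction d the exact minimizing step is alpha = -g'd / (d'Qd) with g = Qx - b, and after n such steps the minimizer is reached.

```python
def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_direction_method(Q, b, x0, directions):
    """Minimize 0.5*x'Qx - x'b by exact line minimization along each direction."""
    x = list(x0)
    for d in directions:
        g = [q - bi for q, bi in zip(matvec(Q, x), b)]   # gradient at the current x
        alpha = -dot(g, d) / dot(d, matvec(Q, d))        # exact step along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
    return x


if __name__ == "__main__":
    Q = [[4.0, 2.0], [2.0, 2.0]]
    b = [-1.0, 1.0]
    d0 = [1.0, 0.0]
    d1 = [-3.0 / 8.0, 3.0 / 4.0]          # chosen so that d0' Q d1 = 0
    print(dot(d0, matvec(Q, d1)))          # 0.0
    print(conjugate_direction_method(Q, b, [0.0, 0.0], [d0, d1]))  # [-1.0, 1.5]
```

The result [-1, 1.5] satisfies Qx = b, so two line minimizations are enough for this two-dimensional problem.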

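3.3 The Conjugate Gradient Method Produces Q-Conjugate Directions

This can be proved by mathematical induction; see the book for the proof. Here we only need to know that the directions d produced by the conjugate gradient method are indeed Q-conjugate.

3.4 Performance

The efficiency of conjugate direction methods lies between that of steepest descent and Newton's method.
For an n-dimensional quadratic problem, the solution is obtained in at most n steps.
The Hessian matrix does not need to be computed.
No n*n matrix needs to be stored, and no matrix inversion is required.

MATLAB example comparing direct solves, CG, and block-preconditioned CG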

%block preconditioning solution
clear all;
ex_blockprecond;

A_full = full(A);
fprintf('\nStarting dense direct solve ...\n');
time_start = cputime;
x_star_dense = A_full\b;
time_end = cputime;
relres_dense = norm(A*x_star_dense - b)/norm(b);
fprintf('Relative residual: %e\n', relres_dense);
fprintf('Dense direct solve done.\nTime taken: %e\n',...
time_end - time_start);

fprintf('\nStarting sparse direct solve ...\n');
time_start = cputime;
x_star_sparse = A\b;
time_end = cputime;
% use the sparse solution here, not the dense one computed earlier
relres_sparse = norm(A*x_star_sparse - b)/norm(b);
fprintf('Relative residual: %e\n', relres_sparse);
fprintf('Sparse direct solve done.\nTime taken: %e\n\n',...
time_end - time_start);

fprintf('\nStarting CG ...\n');
time_start = cputime;
[x,flag,relres,iter,resvec] = pcg(A,b,1e-4,200);
time_end = cputime;
fprintf('CG done. Status: %d\nTime taken: %e\n', flag,...
time_end - time_start);
figure; semilogy(resvec/norm(b), '.--'); hold on;
set(gca,'FontSize', 16, 'FontName', 'Times');
xlabel('cgiter'); ylabel('relres');

time_start = cputime;
L = chol(A_blk)';
time_end = cputime;
tchol = time_end - time_start;
fprintf('\nCholesky factorization of A blk. Time taken: %e\n', tchol);

fprintf('\nStarting CG with block preconditioning ...\n');
time_start = cputime;
[x,flag,relres,iter,resvec] = pcg(A,b,1e-8,200,L,L');
time_end = cputime;
fprintf('PCG done. Status: %d\nTime taken: %e\n', flag,...
time_end - time_start);

semilogy(resvec/norm(b), 'k.-'); hold on;
print('-depsc', 'ex_blockprecond_relres.eps');
fprintf('Total time block preconditioned PCG: %e\n',...
tchol+time_end - time_start);

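4. Quasi-Newton Methods

The main steps of a quasi-Newton method show that it is in essence a conjugate direction method: it constructs a sequence of conjugate directions and then updates the variable x along them with the conjugate direction method.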
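
For reference, the steps of a generic quasi-Newton iteration are usually written as below. The notation is the standard textbook one and is an assumption about the formulas intended at this point: H_k approximates the inverse of the Hessian, and g_k is the gradient at x_k.

```latex
\begin{aligned}
d_k &= -H_k \, g_k, \qquad g_k = \nabla f(x_k) \\
\alpha_k &= \arg\min_{\alpha \ge 0} f(x_k + \alpha d_k) \\
x_{k+1} &= x_k + \alpha_k d_k \\
H_{k+1} &= H_k + \text{a correction built from } \Delta x_k = x_{k+1} - x_k
           \text{ and } \Delta g_k = g_{k+1} - g_k
\end{aligned}
```

H_0 can be any symmetric positive definite matrix, commonly the identity. In the Ceres comments quoted later, Δx_k is called s_k and Δg_k is called y_k.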

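4.1 Quasi-Newton Methods Construct Q-Conjugate Directions

A quasi-Newton method is a conjugate direction method, so on an n-dimensional quadratic problem it reaches the solution within n iterations.
A quasi-Newton method constructs the conjugate directions by iterating on H. There are many ways to construct H; a few common ones are introduced below.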
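
The claim can be checked numerically with a small sketch. It assumes a quadratic f(x) = 0.5*x'Qx - x'b, an exact line search, H_0 = I, and the DFP correction as one possible way of updating H; all function names are hypothetical. The sketch records the search directions so that their Q-conjugacy can be inspected afterwards.

```python
def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def dfp_update(H, dx, dg):
    """DFP correction of H, the approximation of the inverse Hessian."""
    Hdg = matvec(H, dg)
    dx_dg = dot(dx, dg)
    dg_Hdg = dot(dg, Hdg)
    n = len(dx)
    return [[H[i][j] + dx[i] * dx[j] / dx_dg - Hdg[i] * Hdg[j] / dg_Hdg
             for j in range(n)] for i in range(n)]


def quasi_newton_quadratic(Q, b, x0, tol=1e-12):
    """Quasi-Newton iteration on 0.5*x'Qx - x'b with exact line search."""
    n = len(b)
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # H_0 = I
    x = list(x0)
    g = [q - bi for q, bi in zip(matvec(Q, x), b)]
    directions = []
    for _ in range(n):
        if dot(g, g) ** 0.5 < tol:
            break
        d = [-v for v in matvec(H, g)]                    # d_k = -H_k g_k
        alpha = -dot(g, d) / dot(d, matvec(Q, d))         # exact for a quadratic
        x_new = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [q - bi for q, bi in zip(matvec(Q, x_new), b)]
        dx = [a - c for a, c in zip(x_new, x)]
        dg = [a - c for a, c in zip(g_new, g)]
        H = dfp_update(H, dx, dg)
        directions.append(d)
        x, g = x_new, g_new
    return x, directions, H


if __name__ == "__main__":
    Q = [[3.0, 0.0, 1.0], [0.0, 4.0, 2.0], [1.0, 2.0, 3.0]]
    b = [3.0, 0.0, 1.0]
    x, directions, H = quasi_newton_quadratic(Q, b, [0.0, 0.0, 0.0])
    print("x:", x)
    for i in range(len(directions)):
        for j in range(i):
            print(i, j, dot(directions[i], matvec(Q, directions[j])))  # near zero
```

For a non-quadratic objective the exact step formula no longer applies and a numerical line search is needed, so this sketch only illustrates the quadratic case discussed here.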

4.2 Determining H_k - Rank-One Correction Formula
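
For reference, the rank-one correction is usually written as follows (standard textbook form, given here as an assumption about the formula intended at this point, with Δx_k = x_{k+1} - x_k and Δg_k = g_{k+1} - g_k):

```latex
H_{k+1} = H_k +
\frac{(\Delta x_k - H_k \Delta g_k)(\Delta x_k - H_k \Delta g_k)^{T}}
     {\Delta g_k^{T} (\Delta x_k - H_k \Delta g_k)}
```

The correction term has rank one, which gives the formula its name. The denominator can be zero or negative, so H_{k+1} is not guaranteed to stay positive definite.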

4.3 Determining H_k - DFP
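
For reference, the DFP update is usually written as follows (standard textbook form, given here as an assumption about the formula intended at this point, with Δx_k = x_{k+1} - x_k and Δg_k = g_{k+1} - g_k):

```latex
H_{k+1} = H_k
+ \frac{\Delta x_k \Delta x_k^{T}}{\Delta x_k^{T} \Delta g_k}
- \frac{(H_k \Delta g_k)(H_k \Delta g_k)^{T}}{\Delta g_k^{T} H_k \Delta g_k}
```

This is a rank-two correction. Multiplying both sides by Δg_k shows directly that H_{k+1} Δg_k = Δx_k.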

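4.4 Determining H_k - BFGS

Because the computation works with the inverse of H throughout, the BFGS derivation focuses only on the inverse of H (denoted B).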
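
For reference, the BFGS update written directly in terms of H_k, the matrix that multiplies the gradient in the search direction, usually takes the following form (standard textbook form, given here as an assumption about the formula intended at this point):

```latex
H_{k+1} = H_k
+ \left(1 + \frac{\Delta g_k^{T} H_k \Delta g_k}{\Delta g_k^{T} \Delta x_k}\right)
  \frac{\Delta x_k \Delta x_k^{T}}{\Delta x_k^{T} \Delta g_k}
- \frac{H_k \Delta g_k \Delta x_k^{T} + \left(H_k \Delta g_k \Delta x_k^{T}\right)^{T}}
       {\Delta g_k^{T} \Delta x_k}
```

With s_k = Δx_k, y_k = Δg_k and ρ_k = 1 / (s_k' y_k), this agrees term by term with the expanded update quoted from the Ceres comments in the next section.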

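4.5 BFGS in Ceres

The steps Ceres uses to implement BFGS can be found in the Ceres source code.
To understand the code, we first write BFGS in another form and show that it is equivalent to the expression above.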

      // Efficient O(num_parameters^2) BFGS update [2].
      //
      // Starting from dense BFGS update detailed in Nocedal [2] p140/177 and
      // using: y_k = delta_gradient, s_k = delta_x:
      //
      //   \rho_k = 1.0 / (s_k' * y_k)
      //   V_k = I - \rho_k * y_k * s_k'
      //   H_k = (V_k' * H_{k-1} * V_k) + (\rho_k * s_k * s_k')
      //
      // This update involves matrix, matrix products which naively O(N^3),
      // however we can exploit our knowledge that H_k is positive definite
      // and thus by defn. symmetric to reduce the cost of the update:
      //
      // Expanding the update above yields:
      //
      //   H_k = H_{k-1} +
      //         \rho_k * ( (1.0 + \rho_k * y_k' * H_k * y_k) * s_k * s_k' -
      //                    (s_k * y_k' * H_k + H_k * y_k * s_k') )
      //
      // Using: A = (s_k * y_k' * H_k), and the knowledge that H_k = H_k', the
      // last term simplifies to (A + A'). Note that although A is not symmetric
      // (A + A') is symmetric. For ease of construction we also define
      // B = (1 + \rho_k * y_k' * H_k * y_k) * s_k * s_k', which is by defn
      // symmetric due to construction from: s_k * s_k'.
      //
      // Now we can write the BFGS update as:
      //
      //   H_k = H_{k-1} + \rho_k * (B - (A + A'))

      // For efficiency, as H_k is by defn. symmetric, we will only maintain the
      // *lower* triangle of H_k (and all intermediary terms).

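As the Ceres comments above describe, the terms of the BFGS update are given names: rho_k = 1 / (s_k' * y_k), A = s_k * y_k' * H_k, and B = (1 + rho_k * y_k' * H_k * y_k) * s_k * s_k'.

The following code then follows naturally: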

      const double rho_k = 1.0 / delta_x_dot_delta_gradient;

      // Calculate: A = s_k * y_k' * H_k
      Matrix A = delta_x * (delta_gradient.transpose() *
                            inverse_hessian_.selfadjointView<Eigen::Lower>());

      // Calculate scalar: (1 + \rho_k * y_k' * H_k * y_k)
      const double delta_x_times_delta_x_transpose_scale_factor =
          (1.0 + (rho_k * delta_gradient.transpose() *
                  inverse_hessian_.selfadjointView<Eigen::Lower>() *
                  delta_gradient));
      // Calculate: B = (1 + \rho_k * y_k' * H_k * y_k) * s_k * s_k'
      Matrix B = Matrix::Zero(num_parameters_, num_parameters_);
      B.selfadjointView<Eigen::Lower>().
          rankUpdate(delta_x, delta_x_times_delta_x_transpose_scale_factor);

      // Finally, update inverse Hessian approximation according to:
      // H_k = H_{k-1} + \rho_k * (B - (A + A')).  Note that (A + A') is
      // symmetric, even though A is not.
      inverse_hessian_.triangularView<Eigen::Lower>() +=
          rho_k * (B - A - A.transpose());
    }

    *search_direction =
        inverse_hessian_.selfadjointView<Eigen::Lower>() *
        (-1.0 * current.gradient);
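
The equivalence of the two forms can also be checked numerically. The sketch below is not Ceres code: it is a plain-Python illustration with hypothetical function names and made-up test data, and it works on full matrices rather than mirroring the lower-triangle storage used with Eigen. It evaluates the product form V' * H * V + rho * s * s' and the expanded form H + rho * (B - (A + A')) on the same inputs.

```python
def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matmul(X, Y):
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def bfgs_product_form(H, s, y):
    """H_new = V' H V + rho s s', with V = I - rho y s'."""
    n = len(s)
    rho = 1.0 / dot(s, y)
    V = [[(1.0 if i == j else 0.0) - rho * y[i] * s[j] for j in range(n)]
         for i in range(n)]
    VtHV = matmul(matmul(transpose(V), H), V)
    return [[VtHV[i][j] + rho * s[i] * s[j] for j in range(n)] for i in range(n)]


def bfgs_expanded_form(H, s, y):
    """H_new = H + rho (B - (A + A')), A = s y' H, B = (1 + rho y' H y) s s'."""
    n = len(s)
    rho = 1.0 / dot(s, y)
    Hy = matvec(H, y)                      # equals (y' H)' because H is symmetric
    scale = 1.0 + rho * dot(y, Hy)
    return [[H[i][j] + rho * (scale * s[i] * s[j] - s[i] * Hy[j] - Hy[i] * s[j])
             for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    H = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.2], [0.0, 0.2, 3.0]]   # symmetric test matrix
    s = [1.0, 2.0, -1.0]                                        # plays the role of delta_x
    y = [0.5, 1.5, 0.3]                                         # plays the role of delta_gradient
    P = bfgs_product_form(H, s, y)
    E = bfgs_expanded_form(H, s, y)
    print(max(abs(P[i][j] - E[i][j]) for i in range(3) for j in range(3)))
```

The expanded form needs only matrix-vector products and outer products, which is where the O(num_parameters^2) cost mentioned in the Ceres comment comes from, whereas the product form multiplies full matrices.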
