Background: In gradient descent algorithms, the step size may be too large or too small, as shown in the figures below.
Backtracking line search
- Initialization: alpha (= 1), tau (decay rate)
- while f(x^t + alpha * p^t) > f(x^t)
      alpha = tau * alpha
  end
- Update: x^{t+1} = x^t + alpha * p^t
The update α = τα (starting from α = 1 and decaying gradually) keeps the step length from being too small: the loop stops at the first α that passes the test, so the accepted α stays as large as possible.
The test f(x^t + αp^t) > f(x^t) prevents steps that are too long relative to the decrease in f; in practice it is implemented via the Armijo condition.
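The plain backtracking loop above can be sketched in Python as follows (function and variable names are illustrative, not from the source):

```python
def backtracking_simple(f, x, p, alpha=1.0, tau=0.5):
    # Shrink alpha (alpha = tau * alpha) while the trial step still
    # increases f, then take the step x^{t+1} = x^t + alpha * p^t.
    while f(x + alpha * p) > f(x):
        alpha = tau * alpha
    return x + alpha * p

# Toy usage on f(x) = 0.5 x^2, whose gradient at x is x.
f = lambda x: 0.5 * x ** 2
x = 3.0
p = -x            # p = -gradient(x): the steepest-descent direction
x_new = backtracking_simple(f, x, p)
# Here alpha = 1 already lands at the minimizer 0, so no shrinking occurs.
```

Note this bare version only checks f(x + αp) > f(x); the Armijo condition below strengthens the test.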
Wolfe conditions
Armijo condition
f(x^t + α^t p^t) ≤ f(x^t) + α^t c_1 [g^t]^T p^t,
where g^t = ∇f(x^t) denotes the gradient (first derivative) of f; p^t denotes the search direction, which satisfies [g^t]^T p^t < 0 (remark: for gradient descent, p^t = −g^t).
In practice, c_1 is chosen quite small, say c_1 = 10^{-4}.
In the case that p = −g, the while-condition in the pseudocode above simplifies as follows: keep shrinking α while f(x − α∇f(x)) > f(x) − c_1 α ‖∇f(x)‖²₂.
*In the Boyd & Vandenberghe (B&V) book, typical choices are c_1 ∈ [0.01, 0.3] and τ ∈ [0.1, 0.8].
Intuition
Require the reduction in f to be at least a fixed fraction c_1 of the reduction promised by the first-order Taylor approximation of f at x^t.
Also known as the sufficient decrease condition: it requires α to decrease the objective function by a sufficient amount.
Curvature condition
The curvature condition rules out steps that are too small: ∇f(x^t + α^t p^t)^T p^t ≥ c_2 ∇f(x^t)^T p^t,
where c_2 ∈ (c_1, 1). Since ∇f(x^t)^T p^t < 0, the condition requires the new slope along p^t to be at least c_2 times the original slope, i.e. the slope must have flattened sufficiently.
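The two Wolfe conditions together can be checked as below; this sketch assumes c_1 = 10⁻⁴ and c_2 = 0.9 (common defaults for quasi-Newton directions, not specified in the source):

```python
import numpy as np

def satisfies_wolfe(f, grad_f, x, p, alpha, c1=1e-4, c2=0.9):
    """Check both Wolfe conditions for step length alpha.

    Armijo:    f(x + alpha*p) <= f(x) + c1 * alpha * grad(x)^T p
    Curvature: grad(x + alpha*p)^T p >= c2 * grad(x)^T p
    """
    gTp = np.dot(grad_f(x), p)   # slope at alpha = 0 (negative for descent)
    armijo = f(x + alpha * p) <= f(x) + c1 * alpha * gTp
    curvature = np.dot(grad_f(x + alpha * p), p) >= c2 * gTp
    return armijo and curvature

# On f(x) = 0.5 x^T x with p = -grad(x) = -x, the slope at step alpha
# is (1 - alpha) * grad^T p, so a tiny alpha fails the curvature
# condition while a moderate alpha passes both.
f = lambda x: 0.5 * np.dot(x, x)
grad_f = lambda x: x
x = np.array([2.0, -1.0])
p = -grad_f(x)
```

With c_2 = 0.9, any α < 0.1 is rejected here by the curvature condition alone, illustrating how it rules out small steps.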