ECE236C - Optimization Methods for Large-Scale Systems [notes in progress]

Source: http://www.seas.ucla.edu/~vandenbe/ee236c.html

 

Introduction

Outline

  1. First-order algorithms
  2. Decomposition and splitting
  3. Second-order algorithms for unconstrained optimization
  4. Interior-point methods for conic optimization

Gradient

Convexity

- ∇^2 f(x) ≻ 0 is sufficient for strict convexity but not necessary (e.g., f(x) = x^4 is strictly convex even though f''(0) = 0)
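A quick numerical sketch of the counterexample above: f(x) = x^4 has f''(0) = 0, yet the strict-convexity inequality f(tx + (1−t)y) < t f(x) + (1−t) f(y) holds for every pair of distinct points.

```python
import itertools

# f(x) = x^4: strictly convex, but its second derivative vanishes at 0,
# so a positive-definite Hessian is sufficient but not necessary.
def f(x):
    return x**4

# strict convexity: f(t*x + (1-t)*y) < t*f(x) + (1-t)*f(y)
# for all x != y and 0 < t < 1 (checked on a few sample points)
for x, y in itertools.combinations([-2.0, -0.5, 0.0, 1.0, 3.0], 2):
    for t in (0.25, 0.5, 0.75):
        lhs = f(t * x + (1 - t) * y)
        rhs = t * f(x) + (1 - t) * f(y)
        assert lhs < rhs
print("strict convexity inequality holds on all sampled pairs")
```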

- a differentiable function f is convex if and only if dom f is convex and (∇f(x) − ∇f(y))^T (x − y) ≥ 0 for all x, y ∈ dom f, i.e., the gradient ∇f : R^n → R^n is a monotone mapping
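The monotonicity of the gradient is easy to verify numerically on a one-dimensional convex example; the sketch below uses f(x) = exp(x), for which (f'(x) − f'(y))(x − y) = (e^x − e^y)(x − y) ≥ 0.

```python
import math
import random

# Sketch: the gradient of a convex function is a monotone mapping.
# For f(x) = exp(x), f'(x) = exp(x), so since exp is increasing,
# (f'(x) - f'(y)) * (x - y) >= 0 for all x, y.
random.seed(0)
for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    assert (math.exp(x) - math.exp(y)) * (x - y) >= 0
print("monotonicity verified on random samples")
```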

- Lipschitz continuity is a smoothness condition stronger than ordinary continuity. Intuitively, it bounds (measured in the dual norm) the rate at which the function can change — in effect, similar to placing an upper bound on the second derivative. The slope of a function satisfying the Lipschitz condition is bounded by a real number called the Lipschitz constant, which depends on the function.

- if ||∇f(x) − ∇f(y)||_* ≤ L||x − y|| for all x, y ∈ dom f, then f is L-smooth. Here the dual norm is ||z||_* = sup_{||v|| ≤ 1} z^T v, i.e., the largest value of the linear functional z^T v over all v in the unit ball. It follows easily that z^T v = ||v|| · z^T (v / ||v||) ≤ ||v|| · ||z||_* (a generalized Cauchy–Schwarz inequality).
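A concrete instance of L-smoothness, as a sketch: for the Euclidean norm (which is its own dual norm) and a quadratic f(x) = (1/2) x^T A x with symmetric A, the gradient is ∇f(x) = Ax, and the smallest valid Lipschitz constant is the largest eigenvalue of A. The matrix below is an arbitrary illustrative choice.

```python
import numpy as np

# For f(x) = 0.5 * x^T A x with symmetric A, grad f(x) = A x, and
# ||A x - A y||_2 <= L ||x - y||_2 holds with L = lambda_max(A)
# (the Euclidean norm is self-dual).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])          # symmetric positive definite (example)
L = np.linalg.eigvalsh(A).max()     # smallest valid Lipschitz constant

rng = np.random.default_rng(0)
for _ in range(200):
    x, y = rng.standard_normal(2), rng.standard_normal(2)
    lhs = np.linalg.norm(A @ x - A @ y)
    assert lhs <= L * np.linalg.norm(x - y) + 1e-12
print("Lipschitz bound holds with L =", L)
```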

- Convexity vs. strict convexity vs. strong convexity (progressively "more convex") (wiki)

- Using convexity and Lipschitz continuity, upper and lower bounds on f(x + δ) can be derived, each with a quadratic term; these bounds resemble a second-order Taylor approximation. From the upper bound, it can then be shown that with a suitably chosen step size, gradient descent decreases f at every iteration.
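The descent argument above can be sketched numerically. For an L-smooth f, the quadratic upper bound f(y) ≤ f(x) + ∇f(x)^T(y − x) + (L/2)||y − x||^2 with the step y = x − (1/L)∇f(x) yields the guaranteed per-iteration decrease f(x⁺) ≤ f(x) − (1/2L)||∇f(x)||^2. The quadratic objective and starting point below are illustrative choices, not from the notes.

```python
import numpy as np

# Demonstration on f(x) = 0.5 x^T A x - b^T x, which is L-smooth
# with L = lambda_max(A); gradient descent with step size 1/L must
# satisfy the descent inequality at every iteration.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([1.0, -1.0])

f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b
L = np.linalg.eigvalsh(A).max()

x = np.array([5.0, -5.0])            # arbitrary starting point
for _ in range(100):
    g = grad(x)
    x_next = x - g / L               # step size 1/L
    # guaranteed decrease from the quadratic upper bound:
    # f(x+) <= f(x) - ||g||^2 / (2L)
    assert f(x_next) <= f(x) - (g @ g) / (2 * L) + 1e-12
    x = x_next

print("final gradient norm:", np.linalg.norm(grad(x)))
```

After 100 such steps the iterate is essentially at the minimizer A⁻¹b, since each step contracts the gradient norm by a fixed factor for this quadratic.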
