1. Lipschitz gradient
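If $\nabla f(x)$ is $L$-Lipschitz continuous and $x^*$ is a minimizer of $f$ (so $\nabla f(x^*) = 0$), then we have
$$\frac{1}{2L}\|\nabla f(x)\|_2^2 \le f(x) - f(x^*) \le \frac{L}{2}\|x - x^*\|_2^2$$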
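The right-hand inequality holds because $\nabla f(x^*) = 0$:
$$f(x) \le f(x^*) + \nabla f(x^*)\cdot(x - x^*) + \frac{L}{2}\|x - x^*\|_2^2 = f(x^*) + \frac{L}{2}\|x - x^*\|_2^2$$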
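The left-hand inequality holds because the quadratic upper bound can be minimized over $y$ (the minimum is attained at $y = x - \frac{1}{L}\nabla f(x)$):
$$f(x^*) = \inf_y f(y) \le \inf_y \left[ f(x) + \nabla f(x)\cdot(y - x) + \frac{L}{2}\|y - x\|_2^2 \right] = f(x) - \frac{1}{2L}\|\nabla f(x)\|_2^2$$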
This says that a function with a Lipschitz gradient is bounded above by a quadratic function.
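As a quick sanity check, the two-sided bound above can be tested numerically. The sketch below assumes a hypothetical diagonal quadratic $f(x) = \frac{1}{2}\sum_i a_i x_i^2$ (chosen only for illustration, not taken from the text), for which $x^* = 0$, $f(x^*) = 0$ and $L = \max_i a_i$.

```python
import random

# Hypothetical test function: f(x) = 0.5 * sum(a_i * x_i^2), minimized at x* = 0.
A = [0.5, 2.0, 4.0]   # curvatures; the gradient is L-Lipschitz with L = max(A)
L = max(A)

def f(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(A, x))

def grad(x):
    return [a * xi for a, xi in zip(A, x)]

def sq_norm(v):
    return sum(vi * vi for vi in v)

def lipschitz_sandwich_holds(x, tol=1e-9):
    """Check (1/2L)||grad f(x)||^2 <= f(x) - f(x*) <= (L/2)||x - x*||^2 with x* = 0."""
    gap = f(x)                            # f(x*) = 0
    lower = sq_norm(grad(x)) / (2 * L)
    upper = 0.5 * L * sq_norm(x)
    return lower <= gap + tol and gap <= upper + tol

random.seed(0)
points = [[random.uniform(-5, 5) for _ in A] for _ in range(1000)]
all_ok = all(lipschitz_sandwich_holds(x) for x in points)
print(all_ok)
```

Both bounds become tight along the coordinate with the largest curvature, which is why a small tolerance is used in the comparison.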
2. Strong convexity
If $f(x)$ is $m$-strongly convex with minimizer $x^*$, then we have
$$\frac{m}{2}\|x - x^*\|_2^2 \le f(x) - f(x^*) \le \frac{1}{2m}\|\nabla f(x)\|_2^2$$
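The left-hand inequality holds because $\nabla f(x^*) = 0$:
$$f(x) \ge f(x^*) + \nabla f(x^*)\cdot(x - x^*) + \frac{m}{2}\|x - x^*\|_2^2 = f(x^*) + \frac{m}{2}\|x - x^*\|_2^2$$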
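The right-hand inequality holds because, for all $y$,
$$f(y) \ge f(x) + \nabla f(x)\cdot(y - x) + \frac{m}{2}\|y - x\|_2^2 \ge \inf_z \left[ f(x) + \nabla f(x)\cdot(z - x) + \frac{m}{2}\|z - x\|_2^2 \right] = f(x) - \frac{1}{2m}\|\nabla f(x)\|_2^2$$
so, taking $y = x^*$,
$$f(x^*) \ge f(x) - \frac{1}{2m}\|\nabla f(x)\|_2^2$$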
This says that a strongly convex function is bounded below by a quadratic function.
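The strong convexity bounds can be checked in the same way. The sketch below assumes a hypothetical one-dimensional function $f(x) = \frac{m}{2}x^2 + \log\cosh x$ (an illustration, not from the text); since $f''(x) = m + 1/\cosh^2 x \ge m$, it is $m$-strongly convex with $x^* = 0$ and $f(x^*) = 0$.

```python
import math
import random

# Hypothetical 1-D test function: f(x) = (m/2) x^2 + log(cosh(x)).
# f''(x) = m + 1/cosh(x)^2 >= m, so f is m-strongly convex, minimized at x* = 0.
m = 0.3

def f(x):
    return 0.5 * m * x * x + math.log(math.cosh(x))

def df(x):
    return m * x + math.tanh(x)

def strong_convexity_sandwich_holds(x, tol=1e-9):
    """Check (m/2)|x - x*|^2 <= f(x) - f(x*) <= (1/2m)|f'(x)|^2 with x* = 0."""
    gap = f(x)                       # f(x*) = 0
    lower = 0.5 * m * x * x
    upper = df(x) ** 2 / (2 * m)
    return lower <= gap + tol and gap <= upper + tol

random.seed(0)
samples = [random.uniform(-10, 10) for _ in range(1000)]
all_ok = all(strong_convexity_sandwich_holds(x) for x in samples)
print(all_ok)
```

The sample range is kept moderate so that `math.cosh` does not overflow.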
3. Co-coercivity of gradient
3.1 Lipschitz of gradient
If $f(x)$ is convex and $\frac{L}{2}\|x\|_2^2 - f(x)$ is convex (that is, $\nabla f(x)$ is $L$-Lipschitz continuous), then we have
$$0 \le (\nabla f(x) - \nabla f(y))\cdot(x - y) \le L\|x - y\|_2^2$$
The right-hand inequality can be rewritten as
$$\big((Lx - \nabla f(x)) - (Ly - \nabla f(y))\big)\cdot(x - y) \ge 0$$
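which says that
$$g(z) = \frac{L}{2}\|z\|_2^2 - f(z)$$
whose gradient $Lz - \nabla f(z)$ is monotone, is convex. Adding a linear term preserves convexity, so both of
$$g(z) + \nabla f(x)\cdot z = \frac{L}{2}\|z\|_2^2 - f_x(z), \qquad f_x(z) := f(z) - \nabla f(x)\cdot z$$
$$g(z) + \nabla f(y)\cdot z = \frac{L}{2}\|z\|_2^2 - f_y(z), \qquad f_y(z) := f(z) - \nabla f(y)\cdot z$$
are convex. Hence both $f_x(z)$ and $f_y(z)$ are convex with $L$-Lipschitz gradients. Since $\nabla f_x(x) = 0$ and $\nabla f_y(y) = 0$, they are minimized at $z = x$ and $z = y$ respectively. So, by the left-hand inequality of section 1 applied to $f_x$,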
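$$f(y) - f(x) - \nabla f(x)\cdot(y - x) = \big(f(y) - \nabla f(x)\cdot y\big) - \big(f(x) - \nabla f(x)\cdot x\big) = f_x(y) - f_x(x) \ge \frac{1}{2L}\|\nabla f_x(y)\|_2^2 = \frac{1}{2L}\|\nabla f(y) - \nabla f(x)\|_2^2$$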
Similarly, applying the same argument to $f_y$,
$$f(x) - f(y) - \nabla f(y)\cdot(x - y) \ge \frac{1}{2L}\|\nabla f(y) - \nabla f(x)\|_2^2$$
Adding these two inequalities, we get the co-coercivity of an $L$-Lipschitz gradient:
$$(\nabla f(x) - \nabla f(y))\cdot(x - y) \ge \frac{1}{L}\|\nabla f(y) - \nabla f(x)\|_2^2$$
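The monotonicity, Lipschitz, and co-coercivity inequalities of this subsection can be verified numerically on a non-separable example. The sketch below assumes a hypothetical function $f(x) = \log(1 + e^{w\cdot x})$ with a fixed vector $w$ (an illustration, not from the text). Its gradient is $\sigma(w\cdot x)\,w$, where $\sigma$ is the sigmoid, and it is $L$-Lipschitz with $L = \|w\|_2^2/4$.

```python
import math
import random

# Hypothetical test function: f(x) = log(1 + exp(w . x)), which is convex.
# Its gradient sigmoid(w . x) * w is L-Lipschitz with L = ||w||^2 / 4.
W = [1.0, -2.0, 0.5]
L = sum(wi * wi for wi in W) / 4.0

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def grad(x):
    s = 1.0 / (1.0 + math.exp(-dot(W, x)))   # sigmoid(w . x)
    return [s * wi for wi in W]

def cocoercive_at(x, y, tol=1e-9):
    """Check 0 <= (1/L)||g||^2 <= g . d <= L ||d||^2, with g = grad f(x) - grad f(y), d = x - y."""
    g = [gx - gy for gx, gy in zip(grad(x), grad(y))]
    d = [xi - yi for xi, yi in zip(x, y)]
    inner = dot(g, d)
    return dot(g, g) / L <= inner + tol and inner <= L * dot(d, d) + tol

random.seed(0)

def random_point():
    return [random.uniform(-5, 5) for _ in W]

all_ok = all(cocoercive_at(random_point(), random_point()) for _ in range(1000))
print(all_ok)
```

Since $\frac{1}{L}\|g\|_2^2 \ge 0$, the co-coercivity bound is a strengthening of the plain monotonicity inequality $0 \le g\cdot d$.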
3.2 Strong convexity
If $f(x)$ is $m$-strongly convex, then
$$h(x) = f(x) - \frac{m}{2}\|x\|_2^2$$
is convex.
Assume in addition that $\nabla f(x)$ is $M$-Lipschitz for some $M$ (strong convexity alone does not guarantee this; for the related theorem see the blog
http://blog.csdn.net/comeyan/article/details/50541596#2-strong-convex
). So we have
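$$\frac{M - m}{2}\|x\|_2^2 - h(x) = \frac{M}{2}\|x\|_2^2 - f(x)$$
which is convex because $\nabla f(x)$ is $M$-Lipschitz. So $h$ satisfies the assumptions of the last subsection with $L = M - m$ (assuming $M > m$), that is, $\nabla h(x)$ is $(M - m)$-Lipschitz continuous. The triangle inequality alone would only give the weaker constant $M + m$, which is not enough for the result below. From the last subsection, we know that
$$(\nabla h(x) - \nabla h(y))\cdot(x - y) \ge \frac{1}{M - m}\|\nabla h(x) - \nabla h(y)\|_2^2$$
$$\Rightarrow (\nabla f(x) - mx - \nabla f(y) + my)\cdot(x - y) \ge \frac{1}{M - m}\|\nabla f(x) - mx - \nabla f(y) + my\|_2^2$$
Expanding the squared norm and collecting the $(\nabla f(x) - \nabla f(y))\cdot(x - y)$ terms gives a factor $\frac{M + m}{M - m}$ on the left, so
$$(\nabla f(x) - \nabla f(y))\cdot(x - y) \ge \frac{1}{M + m}\|\nabla f(x) - \nabla f(y)\|_2^2 + \frac{mM}{M + m}\|x - y\|_2^2$$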
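The final inequality can also be checked numerically. The sketch below assumes a hypothetical function $f(x) = \frac{m}{2}\|x\|_2^2 + \log(1 + e^{w\cdot x})$ (an illustration, not from the text), which is $m$-strongly convex and has an $M$-Lipschitz gradient with $M = m + \|w\|_2^2/4$.

```python
import math
import random

# Hypothetical test function: f(x) = (m/2)||x||^2 + log(1 + exp(w . x)).
# It is m-strongly convex and its gradient is M-Lipschitz with M = m + ||w||^2 / 4.
W = [1.0, -2.0, 0.5]
m = 0.5
M = m + sum(wi * wi for wi in W) / 4.0

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def grad(x):
    s = 1.0 / (1.0 + math.exp(-dot(W, x)))   # sigmoid(w . x)
    return [m * xi + s * wi for xi, wi in zip(x, W)]

def strong_cocoercive_at(x, y, tol=1e-9):
    """Check g . d >= ||g||^2 / (M + m) + (m M / (M + m)) ||d||^2."""
    g = [gx - gy for gx, gy in zip(grad(x), grad(y))]
    d = [xi - yi for xi, yi in zip(x, y)]
    rhs = dot(g, g) / (M + m) + (m * M / (M + m)) * dot(d, d)
    return dot(g, d) + tol >= rhs

random.seed(0)

def random_point():
    return [random.uniform(-5, 5) for _ in W]

all_ok = all(strong_cocoercive_at(random_point(), random_point()) for _ in range(1000))
print(all_ok)
```

For this example the bound holds with equality when $x - y$ is orthogonal to $w$, so the comparison uses a small tolerance.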