For a second order differentiable univariate function $f(x)$, a stationary point with $f'(x) = 0$ is a local minimum if $f''(x) > 0$ and a local maximum if $f''(x) < 0$.
Similarly, for a second order differentiable multivariate function $f(\bm{x})$, a stationary point with $\nabla f(\bm{x}) = \bm{0}$ is a local minimum if the Hessian matrix $\nabla^2 f(\bm{x})$ is positive definite and a local maximum if it is negative definite.
In summary, for a second order differentiable function $f(\bm{x})$, the conditions for a local minimum (maximum) are that the gradient $\nabla f(\bm{x})$ equals $\bm{0}$ and the Hessian matrix $\nabla^2 f(\bm{x})$ is positive (negative) definite.
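To make the test concrete, here is a minimal NumPy sketch (the quadratic $f(x, y) = x^2 + 3y^2 - 2x$ and the point $(1, 0)$ are illustrative choices, not from the text) that checks whether the gradient vanishes and classifies the stationary point by the eigenvalues of the Hessian:

```python
import numpy as np

# A minimal sketch (illustrative example, not from the text): classify the stationary
# point of f(x, y) = x^2 + 3y^2 - 2x by checking the gradient and the Hessian.
def f(v):
    x, y = v
    return x**2 + 3 * y**2 - 2 * x

def grad(v):
    x, y = v
    return np.array([2 * x - 2, 6 * y])

def hessian(v):
    # The Hessian of this quadratic is constant
    return np.array([[2.0, 0.0], [0.0, 6.0]])

x_star = np.array([1.0, 0.0])            # candidate stationary point
assert np.allclose(grad(x_star), 0.0)    # gradient equals 0

eigvals = np.linalg.eigvalsh(hessian(x_star))
if np.all(eigvals > 0):
    print("positive definite Hessian -> local minimum at", x_star, "value", f(x_star))
elif np.all(eigvals < 0):
    print("negative definite Hessian -> local maximum")
else:
    print("indefinite or semidefinite Hessian -> test is inconclusive")
```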
In addition to using the definition to determine whether a function is convex, we can also use the following two necessary and sufficient conditions:
First Order Condition: Assume $\mathbb{R}_c$ is an open convex set in the Euclidean space $E_n$ and $f(\bm{x})$ is differentiable. If for any two points $\bm{x}_1 \in \mathbb{R}_c$ and $\bm{x}_2 \in \mathbb{R}_c$,
$$f(\bm{x}_2) \geq f(\bm{x}_1) + \nabla f(\bm{x}_1)^T(\bm{x}_2 - \bm{x}_1)$$
then $f(\bm{x})$ is a convex function. And if, for all $\bm{x}_2 \neq \bm{x}_1$,
$$f(\bm{x}_2) > f(\bm{x}_1) + \nabla f(\bm{x}_1)^T(\bm{x}_2 - \bm{x}_1)$$
then $f(\bm{x})$ is a strictly convex function.
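As a sanity check rather than a proof, the following sketch numerically verifies the first order inequality for the convex example $f(\bm{x}) = \|\bm{x}\|^2$ (an assumed example, not from the text) on random pairs of points:

```python
import numpy as np

# Sanity check (not a proof): sample random pairs of points and verify the first
# order condition for the convex function f(x) = ||x||^2, whose gradient is 2x.
rng = np.random.default_rng(0)
f = lambda x: float(x @ x)
grad_f = lambda x: 2 * x

for _ in range(1000):
    x1, x2 = rng.normal(size=(2, 3))          # two random points in R^3
    lhs = f(x2)
    rhs = f(x1) + grad_f(x1) @ (x2 - x1)      # value of the supporting hyperplane
    assert lhs >= rhs - 1e-12                 # first order condition holds
print("first order condition held on all sampled pairs")
```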
Second Order Condition: Assume $\mathbb{R}_c$ is an open convex set in the Euclidean space $E_n$ and $f(\bm{x})$ is second order differentiable. If its Hessian matrix is positive semidefinite,
$$\nabla^2 f(\bm{x}) \succeq 0$$
then $f(\bm{x})$ is a convex function. And if
$$\nabla^2 f(\bm{x}) \succ 0$$
then $f(\bm{x})$ is a strictly convex function.
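A brief SymPy sketch (the function $f(x, y) = e^x + e^y + x^2$ is an assumed example) that computes the Hessian symbolically and confirms it is positive definite everywhere, so the second order condition gives strict convexity:

```python
import sympy as sp

# Sketch: compute the Hessian of f(x, y) = exp(x) + exp(y) + x**2 symbolically and
# inspect its definiteness (illustrative function, not from the text).
x, y = sp.symbols("x y", real=True)
f = sp.exp(x) + sp.exp(y) + x**2

H = sp.hessian(f, (x, y))
print(H)                    # Matrix([[exp(x) + 2, 0], [0, exp(y)]])

# Both leading principal minors are positive for every real x, y, so the Hessian is
# positive definite everywhere and f is strictly convex by the second order condition.
print(sp.simplify(H[0, 0]), sp.simplify(H.det()))
```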
Consider the following nonlinear optimization problem,
$$\begin{aligned} \textit{(NLP)} \quad \min \quad & f(\bm{x}) \\ \text{s.t.} \quad & g_i(\bm{x}) \leq 0, \quad \forall i = 1, \ldots, m \\ & h_j(\bm{x}) = 0, \quad \forall j = 1, \ldots, p \end{aligned}$$
in which $f(\bm{x})$, $g(\bm{x})$, and $h(\bm{x})$ are differentiable. The constraint functions $g(\bm{x})$ and $h(\bm{x})$ are assumed to satisfy certain regularity conditions, also known as constraint qualifications (e.g., that $g(\bm{x})$ and $h(\bm{x})$ are affine functions).
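As a concrete instance, the tiny problem below (chosen purely for illustration, not from the text) fits the (NLP) form with a single inequality constraint and is solved with SciPy's SLSQP solver; note that SciPy's "ineq" constraints expect $\mathrm{fun}(\bm{x}) \geq 0$, so the sign of $g(\bm{x})$ is flipped:

```python
import numpy as np
from scipy.optimize import minimize

# Illustrative instance of the (NLP) form (assumed example, not from the text):
#   min  x1^2 + x2^2
#   s.t. g(x) = 1 - x1 - x2 <= 0
# SciPy's "ineq" constraints expect fun(x) >= 0, hence the sign flip on g.
f = lambda x: x[0]**2 + x[1]**2
g = lambda x: 1 - x[0] - x[1]

res = minimize(f, x0=np.zeros(2), method="SLSQP",
               constraints=[{"type": "ineq", "fun": lambda x: -g(x)}])
print(res.x)                # expected optimum is approximately [0.5, 0.5]
```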
The Lagrange function is,
$$\mathcal{L}(\bm{x}, \bm{\lambda}, \bm{\mu}) = f(\bm{x}) + \bm{\lambda}^T g(\bm{x}) + \bm{\mu}^T h(\bm{x})$$
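Continuing the illustrative problem above (min $x_1^2 + x_2^2$ subject to $1 - x_1 - x_2 \leq 0$, with no equality constraints), its Lagrangian is simply
$$\mathcal{L}(\bm{x}, \lambda) = x_1^2 + x_2^2 + \lambda\,(1 - x_1 - x_2).$$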
The Lagrangian dual function is,
$$d(\bm{\lambda}, \bm{\mu}) = \inf_{\bm{x}} \mathcal{L}(\bm{x}, \bm{\lambda}, \bm{\mu})$$
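The dual function can also be evaluated numerically by minimizing the Lagrangian over $\bm{x}$ at a fixed multiplier; the sketch below does this for the illustrative problem, for which $d(\lambda) = \lambda - \lambda^2/2$ analytically:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch: evaluate d(lambda) = inf_x L(x, lambda) numerically for the illustrative
# problem by minimizing the Lagrangian over x at each fixed multiplier value.
lagrangian = lambda x, lam: x[0]**2 + x[1]**2 + lam * (1 - x[0] - x[1])

def dual(lam):
    res = minimize(lagrangian, x0=np.zeros(2), args=(lam,))
    return res.fun

for lam in [0.0, 0.5, 1.0, 1.5, 2.0]:
    print(lam, dual(lam))   # analytically d(lambda) = lambda - lambda^2 / 2
```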
The Lagrangian dual problem is,
$$\max_{\bm{\lambda} \succeq \bm{0},\, \bm{\mu}} \; d(\bm{\lambda}, \bm{\mu})$$
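For the illustrative problem, the dual can be worked out in closed form. Setting $\nabla_{\bm{x}} \mathcal{L} = \bm{0}$ gives
$$2x_1 - \lambda = 0, \quad 2x_2 - \lambda = 0 \;\Rightarrow\; x_1 = x_2 = \tfrac{\lambda}{2}, \qquad d(\lambda) = \lambda - \tfrac{\lambda^2}{2}.$$
Maximizing over $\lambda \geq 0$ yields $\lambda^* = 1$ and $d(\lambda^*) = \tfrac{1}{2}$, which equals the primal optimum $f(\tfrac{1}{2}, \tfrac{1}{2}) = \tfrac{1}{2}$, so strong duality holds for this example.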
The Karush-Kuhn-Tucker (KKT) conditions can be described as: for a candidate optimum $\bm{x}^*$ with multipliers $\bm{\lambda}$ and $\bm{\mu}$,
$$\begin{aligned} & \nabla f(\bm{x}^*) + \sum_{i=1}^{m} \lambda_i \nabla g_i(\bm{x}^*) + \sum_{j=1}^{p} \mu_j \nabla h_j(\bm{x}^*) = \bm{0} && \text{(stationarity)} \\ & g_i(\bm{x}^*) \leq 0, \quad h_j(\bm{x}^*) = 0 && \text{(primal feasibility)} \\ & \lambda_i \geq 0 && \text{(dual feasibility)} \\ & \lambda_i \, g_i(\bm{x}^*) = 0 && \text{(complementary slackness)} \end{aligned}$$
The KKT conditions are necessary conditions for optimality. If $f(\bm{x})$ and the constraint functions $g(\bm{x})$ are convex, $h(\bm{x})$ is affine, and Slater's condition holds, then strong duality holds and the KKT conditions are both necessary and sufficient.
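For the illustrative problem, the pair $\bm{x}^* = (\tfrac{1}{2}, \tfrac{1}{2})$, $\lambda^* = 1$ found above can be checked against each KKT condition directly (a sketch, not part of the original text):

```python
import numpy as np

# Sketch: check each KKT condition at the candidate pair (x*, lambda*) = ([0.5, 0.5], 1.0)
# for the illustrative problem  min x1^2 + x2^2  s.t.  1 - x1 - x2 <= 0.
x_star = np.array([0.5, 0.5])
lam = 1.0

grad_f = 2 * x_star                       # gradient of the objective at x*
grad_g = np.array([-1.0, -1.0])           # gradient of g(x) = 1 - x1 - x2
g_val = 1 - x_star.sum()

assert np.allclose(grad_f + lam * grad_g, 0.0)   # stationarity
assert g_val <= 1e-12                            # primal feasibility
assert lam >= 0                                  # dual feasibility
assert abs(lam * g_val) <= 1e-12                 # complementary slackness
print("KKT conditions hold at x* =", x_star)
```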
For an unconstrained convex minimization problem, in a few special cases we can find a solution by analytically solving the optimality equation $\nabla f(\bm{x}) = \bm{0}$, but usually the problem must be solved by an iterative algorithm, such as gradient descent, the steepest descent method, or Newton's method.
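Below is a minimal sketch of plain gradient descent with a fixed step size on an assumed least-squares objective $f(\bm{x}) = \|A\bm{x} - \bm{b}\|^2$ (the matrix $A$ and vector $\bm{b}$ are arbitrary illustrative values); the result is compared with the analytic solution:

```python
import numpy as np

# Minimal sketch of gradient descent with a fixed step size on an assumed
# least-squares objective f(x) = ||A x - b||^2 (gradient: 2 A^T (A x - b)).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])

grad_f = lambda x: 2 * A.T @ (A @ x - b)

x = np.zeros(2)
step = 0.01                               # small enough for convergence here
for _ in range(5000):
    g = grad_f(x)
    if np.linalg.norm(g) < 1e-10:         # stop when grad f(x) is (numerically) 0
        break
    x = x - step * g

print(x, np.linalg.solve(A, b))           # iterate vs. the analytic solution [0.6, -0.8]
```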
For solving constrained convex minimization problems, one well-known interior-point method is the barrier method. Interior-point methods solve the KKT conditions by applying Newton's method to a sequence of equality constrained problems, or to a sequence of modified versions of the KKT conditions.
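The following sketch of the log-barrier method solves the illustrative problem by minimizing $t f(\bm{x}) - \log(-g(\bm{x}))$ for an increasing sequence of $t$; Nelder-Mead is used for the inner minimization purely for simplicity, whereas a real implementation would use Newton's method as described above:

```python
import numpy as np
from scipy.optimize import minimize

# Sketch of the log-barrier method on the illustrative problem
#   min x1^2 + x2^2   s.t.   g(x) = 1 - x1 - x2 <= 0.
# Each outer iteration minimizes  t*f(x) - log(-g(x))  from the previous solution,
# then increases t; m/t bounds the duality gap for the barrier method.
f = lambda x: x[0]**2 + x[1]**2
g = lambda x: 1 - x[0] - x[1]

def barrier_objective(x, t):
    if g(x) >= 0:                         # outside the barrier's domain
        return np.inf
    return t * f(x) - np.log(-g(x))

x = np.array([2.0, 2.0])                  # strictly feasible start: g(x) = -3 < 0
t, mu, m = 1.0, 10.0, 1                   # barrier parameter, growth factor, #inequalities
while m / t > 1e-8:
    res = minimize(barrier_objective, x, args=(t,), method="Nelder-Mead")
    x = res.x
    t *= mu
print(x)                                  # approaches the constrained optimum [0.5, 0.5]
```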
The following paragraph is from Prof. Boyd’s book, Convex Optimization,
We can view interior-point methods as another level in the hierarchy of convex optimization algorithms.
[1] Boyd, Stephen, and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.