9.1. Convex Optimization

Main references for this section are [5, 7, 10].

9.1.1. Convex Optimization Problems

Definition 9.1 (Convex optimization problem general form)

Let V be an n-dimensional real vector space. Let f:V→R be a convex function with S = domf. Let C ⊆ S ⊆ V be a convex set.

A mathematical optimization problem of the form

minimize f(x)
subject to x ∈ C

is known as a convex optimization problem.

In other words, a convex optimization problem minimizes a convex function over a convex set. We can write the optimal value of the minimization problem as

p∗ = inf{f(x) | x ∈ C}.

The set of optimal points for the minimization problem is

Xopt = {x ∈ C | f(x) = p∗}.

Note

  1. f is the objective function (or cost function) being minimized.

  2. x is the optimization variable.

  3. C is the feasible set. Any x∈C is a feasible point.

  4. The problem is feasible if C ≠ ∅.

  5. If C = ∅, then the problem is infeasible.

  6. If the problem is infeasible, then p∗ = ∞.

  7. If the problem is feasible, then p∗ < ∞.

  8. If p∗ = −∞, then the problem is unbounded below.

  9. We say that x∗ is an optimal point for the minimization problem if x∗ ∈ C and f(x∗) = p∗.

  10. If Xopt is nonempty, then we say that the optimal value p∗ is attained at some point.

  11. It is possible that p∗ is finite and yet the optimal set is empty.
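A quick numerical sketch of the last point (a toy example of ours, not from the text): the convex function f(x) = e^(−x) minimized over C = R has optimal value p∗ = 0, which is finite yet attained at no feasible point.

```python
import math

def f(x):
    # Convex function f(x) = exp(-x); its infimum over R is 0,
    # but no point x achieves f(x) = 0.
    return math.exp(-x)

# Sample f along the unbounded feasible set C = R.
values = [f(x) for x in range(0, 60, 10)]

assert all(v > 0 for v in values)  # every value stays strictly above p* = 0
assert f(50) < 1e-20               # yet f gets arbitrarily close to p* = 0
```

The optimal set Xopt is empty here even though p∗ = 0 is finite.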

Remark 9.1 (Closedness of the feasible set)

Some authors prefer to define convex optimization as the minimization of a convex function over a closed and convex set.

  1. We have not insisted that C be a closed set in the general definition of convex optimization problem.

  2. The closedness of C is an additional property which can provide additional guarantees on the existence of an optimal point. We shall introduce it whenever needed in the discussion below.

  3. Recall from Theorem 8.170 that a closed and convex set C is the intersection of all the halfspaces that contain it. Let {Ai}i∈I be the family of halfspaces that contain C.

  4. Then, each half space can be written as

    Ai = {x ∈ V | ⟨x, ai⟩ ≤ bi}.
  5. Thus, x ∈ C is equivalent to ⟨x, ai⟩ ≤ bi for every i ∈ I.

  6. This gives us the motivation to consider the feasible set in the form of intersection of sublevel sets of convex functions.

The majority of convex optimization problems can be transformed into a functional form in which the constraints are expressed as sublevel sets of convex functions and level sets of affine functions.
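The halfspace description above is easy to check numerically. In this sketch (the square C = [0, 1]² and its four halfspaces are our own illustrative choice, not from the text), membership x ∈ C is exactly the conjunction of the inequalities ⟨x, ai⟩ ≤ bi:

```python
# C = [0,1]^2 written as an intersection of halfspaces <x, a_i> <= b_i.
halfspaces = [
    ((-1.0, 0.0), 0.0),  # -x1 <= 0, i.e. x1 >= 0
    ((1.0, 0.0), 1.0),   #  x1 <= 1
    ((0.0, -1.0), 0.0),  # -x2 <= 0, i.e. x2 >= 0
    ((0.0, 1.0), 1.0),   #  x2 <= 1
]

def in_C(x):
    # x belongs to C iff every halfspace inequality holds.
    return all(a[0] * x[0] + a[1] * x[1] <= b for a, b in halfspaces)

assert in_C((0.5, 0.5))      # interior point of C
assert in_C((0.0, 1.0))      # boundary point of C
assert not in_C((1.2, 0.5))  # violates x1 <= 1
```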

9.1.1.1. Convex Optimization Standard Form

Definition 9.2 (Convex optimization problem standard form)

Let V be an n-dimensional real vector space. A mathematical optimization problem of the form

(9.1)
minimize f0(x)
subject to fi(x) ≤ 0, i = 1, …, m
           hj(x) = 0, j = 1, …, p

with optimization variable x∈V is called a convex optimization problem in standard form if

  1. The objective function f0:V→R is a convex function.

  2. The inequality constraint functions fi:V→R are convex functions for i = 1, …, m.

  3. The equality constraint functions hj:V→R are affine functions for j = 1, …, p.

Observation 9.1 (Discussion on the convex optimization standard form)

  1. The domain of the problem is given by

    D = domf0 ∩ ⋂_{i=1}^m domfi ∩ ⋂_{j=1}^p domhj.
  2. domhj = V for every j = 1, …, p since hj are affine functions.

  3. Thus,

    D = domf0 ∩ ⋂_{i=1}^m domfi.
  4. By definition, domfi are convex for i = 0, …, m. Hence D is convex.

  5. The feasible set C is given by

    C = domf0 ∩ ⋂_{i=1}^m fi⁻¹(−∞, 0] ∩ ⋂_{j=1}^p hj⁻¹(0).
  6. The sets fi⁻¹(−∞, 0] are sublevel sets of convex functions, hence they are also convex sets.

  7. By Theorem 4.211, the level sets of affine functions are affine sets.

  8. Thus, hj⁻¹(0) is an affine set. Hence, it is a convex set.

  9. Thus, C is an intersection of convex sets.

  10. Thus, C is a convex set.

  11. We note that we can rewrite the standard form as

    minimize f0(x)
    subject to x ∈ C

    to match with Definition 9.1.
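A tiny standard-form instance can make this concrete. The problem below (our own toy example, not from the text) minimizes f0(x) = x1² + x2² subject to the single affine equality x1 + x2 = 1, solved by projected gradient descent; the minimizer is (1/2, 1/2).

```python
def f0(x):
    # Convex objective f0(x) = x1^2 + x2^2.
    return x[0] ** 2 + x[1] ** 2

def grad_f0(x):
    return (2 * x[0], 2 * x[1])

def project(x):
    # Euclidean projection onto the affine set {x : x1 + x2 = 1}.
    shift = (x[0] + x[1] - 1.0) / 2.0
    return (x[0] - shift, x[1] - shift)

x = project((3.0, -2.0))       # start from a feasible point
for _ in range(200):
    g = grad_f0(x)
    step = (x[0] - 0.1 * g[0], x[1] - 0.1 * g[1])
    x = project(step)          # gradient step, then return to the feasible set

assert abs(x[0] - 0.5) < 1e-6 and abs(x[1] - 0.5) < 1e-6
assert abs(f0(x) - 0.5) < 1e-6
```

The equality constraint is handled here by exact projection, which is available because the constraint is affine; inequality constraints would need a more general projection or a solver.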

Remark 9.2 (Adding closedness of objective and constraint functions)

Suppose we add the additional requirement that the functions fi, i = 0, …, m, are closed.

  1. Recall from Definition 3.64 that a function is closed if all its sublevel sets are closed.

  2. Note that the domain of a closed function need not itself be closed; we additionally assume here that domfi is closed for i = 0, …, m.

  3. Thus, D, being a finite intersection of closed sets, is a closed set.

  4. Then, fi⁻¹(−∞, 0] are closed sets since fi are closed functions.

  5. Since V is finite dimensional, affine sets are closed. Hence hj⁻¹(0) is closed for every j = 1, …, p.

  6. Thus, C is an intersection of closed and convex sets.

  7. Thus, C is a closed and convex set.

  8. Recall from Theorem 3.92 that a function is closed if and only if it has a closed epigraph.

  9. Also, recall from Theorem 3.119 that a function is closed if and only if it is l.s.c. (lower semicontinuous).

  10. Further, recall from Theorem 8.172 that every convex function has a closure which is l.s.c. (hence closed) given by its lower semicontinuous hull and the closure of a convex function is also convex.

  11. Thus, if any of the fi for i = 0, …, m is convex but not closed, we can replace it by its lower semicontinuous hull to convert it into a closed convex function. This way, we can bring such problems to the standard form for convex optimization.
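As a concrete instance of the last point (an example of ours, written in LaTeX, not taken from the text), consider a convex but non-closed function on [0, 1] together with its lower semicontinuous hull:

```latex
% A convex but non-closed function f on [0,1] and its closure cl f.
f(x) =
\begin{cases}
  1, & x = 0, \\
  0, & 0 < x \le 1,
\end{cases}
\qquad
(\operatorname{cl} f)(x) = 0 \quad \text{for all } x \in [0, 1].
% f is convex, but the sublevel set \{x : f(x) \le 0\} = (0, 1]
% is not closed; lowering the value at x = 0 to the limiting value 0
% yields the l.s.c. hull, which is closed and convex.
```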

Remark 9.3 (Comparison of two forms)

We have seen two forms of describing convex optimization problems.

  1. Definition 9.1 describes it as minimizing a convex function over a convex set.

  2. Definition 9.2 describes it in the form of minimizing a convex objective function with convex inequality constraints and affine equality constraints.

  3. As seen in comments above, the second form can be converted into the first form.

  4. The first form is more general. It is very useful in establishing the theoretical properties of convex optimization problems.

  5. The second form is more useful from a practical perspective. Almost all real-life convex optimization problems can be reduced to this form. It is also easier to describe algorithms for solving convex optimization problems in terms of this form.

In the sequel, we will liberally use either form for proving theoretical results and developing algorithms.

Definition 9.3 (Concave function maximization problem)

Let V be an n-dimensional real vector space. A mathematical optimization problem of the form

(9.2)
maximize f0(x)
subject to fi(x) ≤ 0, i = 1, …, m
           hj(x) = 0, j = 1, …, p

with optimization variable x∈V is called a concave maximization problem if

  1. The objective function f0:V→R is a concave function.

  2. The inequality constraint functions fi:V→R are convex functions for i = 1, …, m.

  3. The equality constraint functions hj:V→R are affine functions for j = 1, …, p.

Remark 9.4 (Concave maximization as a convex optimization problem)

We note that maximizing f0 is the same as minimizing −f0. Further, if f0 is concave, then −f0 is convex. Thus, (9.2) is equivalent to the problem

minimize −f0(x)
subject to fi(x) ≤ 0, i = 1, …, m
           hj(x) = 0, j = 1, …, p

which is a convex optimization problem in standard form. With an abuse of notation, we shall also refer to a concave maximization program as a convex optimization program.
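The equivalence is easy to verify numerically. In this sketch (the concave objective f0(x) = −(x − 2)² over C = [0, 5] is our own toy choice), maximizing f0 and minimizing −f0 over a dense grid select the same optimal point:

```python
# Toy instance: maximize the concave f0(x) = -(x - 2)^2 over C = [0, 5],
# equivalently minimize the convex function -f0.
def f0(x):
    return -((x - 2.0) ** 2)

grid = [i / 1000.0 for i in range(5001)]   # dense grid over C = [0, 5]

x_max = max(grid, key=f0)                  # maximizer of the concave f0
x_min = min(grid, key=lambda x: -f0(x))    # minimizer of the convex -f0

# Both formulations pick the same optimal point x = 2.
assert abs(x_max - 2.0) < 1e-9
assert x_max == x_min
```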

9.1.1.2. Proper Convex Functions

It is useful to extend these definitions to proper convex functions as objective functions and/or inequality constraint functions.

Definition 9.4 (Optimization problem general form for proper convex functions)

Let V be an n-dimensional real vector space. Let f:V→(−∞, ∞] be a proper convex function with S = domf. Let X ⊆ V be a convex set.

A mathematical optimization problem of the form

minimize f(x)
subject to x ∈ X

is known as a convex optimization problem for proper convex functions.

In other words, this problem minimizes a proper convex function over a convex set. We can write the optimal value of the minimization problem as

p∗ = inf{f(x) | x ∈ X}.

The set of feasible points is given by C = X ∩ domf, which is a convex set.

The set of optimal points for the minimization problem is

Xopt = {x ∈ C | f(x) = p∗}.

Definition 9.5 (Convex optimization problem standard form for proper convex functions)

Let V be an n-dimensional real vector space. A mathematical optimization problem of the form

(9.3)
minimize f0(x)
subject to fi(x) ≤ 0, i = 1, …, m
           hj(x) = 0, j = 1, …, p

with optimization variable x∈V is called a convex optimization problem in standard form for proper convex functions if

  1. The objective function f0:V→(−∞, ∞] is a proper convex function.

  2. The inequality constraint functions fi:V→(−∞, ∞] are proper convex functions for i = 1, …, m.

  3. The equality constraint functions hj:V→R are affine functions for j = 1, …, p.

The reader can check that the definitions of problem domain, feasible set, optimal value, optimal solutions/points, etc. can be easily extended by using the effective domains of f0,…,fm.
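Extended-value functions are convenient computationally as well: the value +∞ outside the effective domain makes infeasible points lose automatically in a minimization. A sketch (our own toy, not from the text), with f(x) = x² on the effective domain [1, 3] and +∞ elsewhere, minimized over X = R:

```python
# A proper convex function taking the value +inf outside its
# effective domain [1, 3], minimized over X = R (truncated grid).
INF = float("inf")

def f(x):
    # f(x) = x^2 plus the indicator of [1, 3]; dom f = [1, 3].
    return x * x if 1.0 <= x <= 3.0 else INF

grid = [i / 100.0 for i in range(-500, 501)]  # grid over [-5, 5]

p_star = min(f(x) for x in grid)
x_star = min(grid, key=f)

# Points outside dom f never win; the minimum sits at the boundary x = 1.
assert p_star == 1.0
assert x_star == 1.0
```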

9.1.1.3. Local and Global Optima

Theorem 9.1 (Local minimum is a global minimum in convex optimization)

Let f:V→R be a convex function. Let C ⊆ domf be a convex set. Consider the problem of minimizing f over C. Let x∗ be locally optimal for f over C. Then, x∗ is globally optimal for f over C.

In other words,

f(y) ≥ f(x∗) ∀ y ∈ C.

Proof. Since x∗ is a local minimum of f over C, there exists r > 0 such that

f(x∗) = inf{f(z) | z ∈ B[x∗, r] ∩ C}.

In other words, f(x) ≥ f(x∗) for every x ∈ B[x∗, r] ∩ C.

  1. Let y ∈ C be any point such that y ≠ x∗.

  2. We need to show that f(y) ≥ f(x∗).

  3. Let t ∈ (0, 1] be such that z = x∗ + t(y − x∗) ∈ B[x∗, r]; any sufficiently small t > 0 works.

  4. Since C is convex, z ∈ C as x∗, y ∈ C and z is their convex combination.

  5. Thus, z ∈ B[x∗, r] ∩ C.

  6. By the local optimality condition

    f(x∗) ≤ f(z) = f(x∗ + t(y − x∗)) = f((1 − t)x∗ + ty) ≤ (1 − t)f(x∗) + tf(y).

    We used the fact that f is convex.

  7. Cancelling and rearranging the terms, we get tf(x∗) ≤ tf(y).

  8. Since t > 0, this reduces to f(x∗) ≤ f(y).

  9. Thus, x∗ is indeed globally optimal.
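The theorem has a practical consequence: a descent method on a convex problem cannot get trapped at a spurious local minimum. A numerical sketch (our own toy, not from the text), minimizing f(x) = (x − 1)² over C = [0, 3] by projected gradient descent from several starting points:

```python
# Every run converges to the same global minimum x* = 1,
# as Theorem 9.1 predicts for a convex problem.
def f(x):
    return (x - 1.0) ** 2

def project(x):
    # Euclidean projection onto the convex set C = [0, 3].
    return min(max(x, 0.0), 3.0)

def solve(x0, steps=200, lr=0.1):
    x = project(x0)
    for _ in range(steps):
        x = project(x - lr * 2.0 * (x - 1.0))  # gradient step, then project
    return x

solutions = [solve(x0) for x0 in (0.0, 3.0, 2.5, 0.2)]
assert all(abs(x - 1.0) < 1e-8 for x in solutions)
```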

Corollary 9.1 (Local minimum is a global minimum for proper convex functions)

Let f:V→(−∞, ∞] be a proper convex function. Let X be a convex subset of V. Consider the problem of minimizing f over X. Let x∗ be locally optimal for f over X. Then, x∗ is globally optimal for f over X.

In other words,

f(y) ≥ f(x∗) ∀ y ∈ X.

Proof. We note that the feasible set is C = X ∩ domf. Since both X and domf are convex, C is convex.

  1. If C = ∅, then there are no feasible points and hence no local/global optima.

  2. We now consider the case where C ≠ ∅.

  3. Following the argument in Theorem 9.1, if x∗ is locally optimal for f over C, then it is globally optimal for f over C.

  4. Consequently, it is globally optimal for f over X, since f(x) = ∞ > f(x∗) for every x ∈ X ∖ domf.

The argument can be modified to show that if f is strictly convex, then a locally optimal point for f is a strictly globally optimal point.

Theorem 9.2 (Local minimum is strict global minimum for strictly convex functions)

Let f:V→R be a strictly convex function. Let C ⊆ domf be a convex set. Let x∗ be locally optimal for f over C. Then, x∗ is strictly globally optimal for f over C.

In other words,

f(y) > f(x∗) ∀ y ∈ C.

Proof. There exists r > 0 such that f(x) ≥ f(x∗) for every x ∈ B[x∗, r] ∩ C.

  1. Let y ∈ C be any point such that y ≠ x∗.

  2. Let t ∈ (0, 1) be such that z = x∗ + t(y − x∗) ∈ B[x∗, r].

  3. Since C is convex, z ∈ C as x∗, y ∈ C and z is their convex combination.

  4. Thus, z ∈ B[x∗, r] ∩ C.

  5. Note that z = (1 − t)x∗ + ty. Thus, z is distinct from x∗ and y and lies on the line segment between them.

  6. By the local optimality condition

    f(x∗) ≤ f(z) = f((1 − t)x∗ + ty) < (1 − t)f(x∗) + tf(y).

    We used the fact that f is strictly convex.

  7. Cancelling and rearranging the terms, we get tf(x∗) < tf(y).

  8. Since t > 0, this reduces to f(x∗) < f(y).

  9. Thus, x∗ is indeed strictly globally optimal.

Corollary 9.2 (Local minimum is strict global minimum for proper strictly convex functions)

Let f:V→(−∞, ∞] be a proper strictly convex function. Let X be a convex subset of V. Let x∗ be locally optimal for f over X. Then, x∗ is strictly globally optimal for f over X.

In other words,

f(y) > f(x∗) ∀ y ∈ X.

9.1.1.4. Optimal Sets

The optimal sets of a convex optimization problem are also convex.

Theorem 9.3 (Optimal set is convex for a convex optimization problem)

Let f:V→R be a convex function. Let C ⊆ domf be a convex set. Let the optimal value for the minimization of f over C be given by

p∗ = inf{f(x) | x ∈ C}.

Let the optimal set (the set of optimal points) be given by

Xopt = {x ∈ C | f(x) = p∗}.

Then, Xopt is a convex set.

Proof. If Xopt is empty, then it is trivially convex. Now consider the case where Xopt is nonempty.

  1. Let x, y ∈ Xopt.

  2. Then, x, y ∈ C.

  3. Let t ∈ [0, 1].

  4. Let z = tx + (1 − t)y.

  5. Since C is convex, z ∈ C. Thus, z ∈ domf.

  6. Since f is convex,

    f(z) ≤ tf(x) + (1 − t)f(y) = tp∗ + (1 − t)p∗ = p∗.
  7. But p∗ = inf{f(x) | x ∈ C}.

  8. Thus, f(z) ≥ p∗.

  9. Combining the two inequalities, we get f(z) = p∗.

  10. Thus, z ∈ Xopt.

  11. Thus, for every x, y ∈ Xopt and every t ∈ [0, 1], z = tx + (1 − t)y ∈ Xopt.

  12. Thus, Xopt is indeed convex.
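An optimal set need not be a single point. In the following sketch (our own example, not from the text), the convex but not strictly convex function f(x) = max(|x| − 1, 0) attains its minimum p∗ = 0 on the whole interval [−1, 1], which is convex, as the theorem guarantees:

```python
# The optimal set of f(x) = max(|x| - 1, 0) over C = [-3, 3]
# is the interval [-1, 1] -- a convex set, not a single point.
def f(x):
    return max(abs(x) - 1.0, 0.0)

grid = [i / 100.0 for i in range(-300, 301)]   # grid over C = [-3, 3]
p_star = min(f(x) for x in grid)
x_opt = [x for x in grid if f(x) == p_star]

assert p_star == 0.0
assert x_opt[0] == -1.0 and x_opt[-1] == 1.0
# Convexity of the optimal set: on the grid it is one contiguous interval.
assert all(abs(b - a - 0.01) < 1e-9 for a, b in zip(x_opt, x_opt[1:]))
```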

Theorem 9.4 (Optimal points for minimization of strictly convex functions)

Let f:V→R be a strictly convex function. Let C ⊆ domf be a convex set. Then, the minimization problem

minimize f(x)
subject to x ∈ C

has at most one optimal point.

Proof. Let the optimal value for the minimization of f over C be given by

p∗ = inf{f(x) | x ∈ C}.

Let the optimal set (the set of optimal points) be given by

Xopt = {x ∈ C | f(x) = p∗}.

  1. By Theorem 9.3, Xopt is a convex set.

  2. If Xopt is empty or a singleton, there is nothing more to prove.

  3. For contradiction, assume that there are two distinct points x,y∈Xopt.

  4. We have p∗ = f(x) = f(y).

  5. Let z = (1/2)x + (1/2)y.

  6. Thus, it is a convex combination of x and y.

  7. By convexity of Xopt, z ∈ Xopt. Thus, f(z) = p∗.

  8. By strict convexity of f,

    f(z) < (1/2)f(x) + (1/2)f(y) = p∗.
  9. This contradicts the fact that p∗ is the optimal value for the minimization problem.

  10. Thus, Xopt must be either empty or a singleton.

  11. Thus, the minimization problem has at most one optimal point.
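Both sides of "at most one" can be seen numerically (the two functions below are our own illustrative choices, not from the text): x² has exactly one minimizer over R, while eˣ, also strictly convex, has infimum 0 but no minimizer at all.

```python
import math

grid = [i / 100.0 for i in range(-500, 501)]   # grid over [-5, 5]

# f(x) = x^2: strictly convex with exactly one optimal point, x* = 0.
minimizers = [x for x in grid if x * x == min(y * y for y in grid)]
assert minimizers == [0.0]

# g(x) = e^x: strictly convex, p* = 0, but no point attains it
# ("at most one" optimal point here means none).
values = [math.exp(x) for x in grid]
assert min(values) > 0.0                  # the infimum 0 is never attained
assert math.exp(-50.0) < min(values)      # smaller values always exist
```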

9.1.2. Concave Objective Functions

Theorem 9.5 (Minimization of concave function and relative interior)

Let V be an n-dimensional real normed linear space. Let f:V→R be a concave function with S = domf. Let C ⊆ S be a convex set. Consider the problem of minimizing f over C. Let the optimal value be given by

p∗ = inf{f(x) | x ∈ C}.

Let the set of optimal points for the minimization problem be given by

Xopt = {x ∈ C | f(x) = p∗}.

If Xopt contains a relative interior point of C, then f must be constant over C; i.e.,

Xopt=C.

Proof. Assume that x ∈ riC ∩ Xopt.

  1. Let y ∈ C be any vector distinct from x.

  2. Since x ∈ riC, by Theorem 8.144 there exists s > 1 such that z = x + (s − 1)(x − y) ∈ C. Here z is a point behind x on the line from y through x that belongs to C.

  3. Letting t = 1/s, we can rewrite this as

    x = tz + (1 − t)y.
  4. Thus, x is a convex combination of z and y.

  5. By concavity of f, we have

    f(x) ≥ tf(z) + (1 − t)f(y).
  6. Since x is an optimal point over C and z, y ∈ C, we have f(x) ≤ f(z) and f(x) ≤ f(y).

  7. Thus,

    f(x) ≥ tf(z) + (1 − t)f(y) ≥ tf(x) + (1 − t)f(x) = f(x).
  8. This can hold only if f(x) = tf(z) + (1 − t)f(y), which in turn implies that f(x) = f(z) = f(y).

  9. Thus, f must be constant over C.

Corollary 9.3 (Linear functions attain minimum only at boundary points)

Let V be an n-dimensional real normed linear space. Let f:V→R be a nonzero linear functional. Let C ⊆ V be a convex set. Then, f cannot attain a minimum at any relative interior point of C.

In other words, f can attain a minimum only at the relative boundary of C (if it does attain a minimum over C).

Proof. We note that every linear functional is also concave.

  1. Let x∈riC.

  2. Then, there exists r > 0 such that B(x, r) ∩ affC ⊆ C.

  3. Since f is a nonzero linear functional, f cannot be constant over B(x, r) ∩ affC.

  4. Thus, by Theorem 9.5, it cannot attain a minimum at x.
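A quick check of the corollary (the linear functional and the square below are our own toy choices, not from the text): minimizing f(x) = x1 + 2·x2 over C = [0, 1]², the minimum is attained only at the boundary point (0, 0), never in the interior.

```python
# Minimize a nonzero linear functional over the square C = [0, 1]^2.
def f(x):
    return x[0] + 2.0 * x[1]

n = 50
grid = [(i / n, j / n) for i in range(n + 1) for j in range(n + 1)]

p_star = min(f(x) for x in grid)
x_opt = [x for x in grid if f(x) == p_star]

def on_boundary(x):
    # Boundary of the square: at least one coordinate at 0 or 1.
    return x[0] in (0.0, 1.0) or x[1] in (0.0, 1.0)

assert p_star == 0.0
assert x_opt == [(0.0, 0.0)]              # unique minimizer at a vertex
assert all(on_boundary(x) for x in x_opt)  # never in the interior
```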