9.4. Basic Duality

9.4.1. The Geometry of the V⊕R Space

As we concern ourselves with the optimization of proper functions of the form f:V→(−∞,∞], it is pertinent to introduce some terminology for exploring the geometry of the space V⊕R. We recall that epif⊆V⊕R.

  1. The V part forms the horizontal axes for the product space V⊕R.

  2. The R part forms the vertical axis.

  3. We write the vectors in V⊕R as (x,t) where x∈V and t∈R.

  4. A hyperplane in V⊕R is associated with a nonzero normal vector of the form (a,b) where a∈V and b∈R.

    H={(x,t)∈V⊕R|⟨x,a⟩+tb=c}.

    Since the normal vector must be nonzero, at least one of a and b must be nonzero.

  5. If a specific point (x0,t0)∈H, then

    H={(x,t)∈V⊕R|⟨x,a⟩+tb=⟨x0,a⟩+t0b}.

9.4.1.1. Different Types of Hyperplanes

Definition 9.10 (Vertical, horizontal and nonvertical hyperplanes)

Let H be a hyperplane of V⊕R with a normal vector (a,b).

  1. The hyperplane is called horizontal if a=0.

  2. The hyperplane is called vertical if b=0.

  3. The hyperplane is called nonvertical if b≠0.

Let us see how these definitions can be interpreted.

9.4.1.2. Horizontal Hyperplanes

Example 9.2 (Horizontal hyperplanes)

Consider the case where a=0.

  1. The hyperplane description reduces to

    H={(x,t)∈V⊕R|tb=c}.
  2. It simplifies to

    H={(x,t)∈V⊕R|t=c/b}

    since b must be nonzero.

  3. Along the V axes, the points in set H can take any value, but along the vertical axis they must take the fixed value c/b.

  4. We can see that H is a hyperplane which is parallel to V×{0}.

  5. For the specific case where c=0, H=V×{0}.

  6. Hence they are called horizontal hyperplanes.

  7. Note that H intersects the vertical axis at the point (0,c/b).

9.4.1.3. Vertical Hyperplanes

Example 9.3 (Vertical hyperplanes)

Now consider the case where b=0.

  1. The hyperplane description reduces to

    H={(x,t)∈V⊕R|⟨x,a⟩=c}.
  2. The set Hv={x∈V|⟨x,a⟩=c} describes a hyperplane of V.

  3. H is constructed by allowing Hv to slide along the vertical axis, as any value is allowed in the last coordinate.

  4. Hence this is called a vertical hyperplane.

9.4.1.4. Nonvertical Hyperplanes

Observation 9.3 (Intersection of nonvertical hyperplane with vertical axis)

If a hyperplane H with the normal vector (a,b) is nonvertical (i.e., b≠0), then it intersects with the vertical axis at a unique point.

  1. Indeed the vertical axis is identified with the set of points

    L={(x,t)∈V⊕R|x=0}.
  2. The hyperplane is given by

    H={(x,t)∈V⊕R|⟨x,a⟩+tb=c}.
  3. Then the point of intersection is given by (0,t) where

    t=c/b.
  4. In particular, assume that (x,s)∈H.

  5. Then by definition of H,

    c=⟨x,a⟩+sb.
  6. Hence

    t=(1/b)⟨x,a⟩+s.
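This observation is easy to check numerically. The sketch below uses made-up values (the vectors a, x and the scalars b, c are assumptions, not from the text) in R²⊕R: a nonvertical hyperplane ⟨x,a⟩+tb=c with b≠0 crosses the vertical axis at the unique point (0,c/b).

```python
import numpy as np

# Illustrative values (assumed): normal (a, b) with b != 0, offset c.
a = np.array([2.0, -1.0])   # horizontal part of the normal
b = 0.5                     # vertical part; nonzero, so nonvertical
c = 3.0

t_cross = c / b             # the unique crossing level on the vertical axis

# Take any point (x, s) on the hyperplane and verify the identity
# t_cross = (1/b)<x, a> + s derived in the observation.
x = np.array([1.0, 1.0])
s = (c - a @ x) / b         # solve <x, a> + s*b = c for s
assert np.isclose(t_cross, (a @ x) / b + s)
print(t_cross)
```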

9.4.1.5. Vertical Lines

Definition 9.11 (Vertical line)

A vertical line in V⊕R is a set of the form {(x,t)|t∈R} where x∈V is a fixed vector.

  1. A vertical hyperplane is a union of vertical lines.

  2. A vertical halfspace is also a union of vertical lines.

  3. If f is a proper function, then its epigraph doesn’t contain a vertical line.

This leads us to ask whether the epigraph of a proper function is contained entirely in a closed halfspace corresponding to some nonvertical hyperplane.

9.4.1.6. Nonvertical Supporting Hyperplanes

Theorem 9.30 (Convex sets and nonvertical supporting hyperplanes)

Let C be a nonempty convex subset of V⊕R that contains no vertical lines. Then

  1. C is contained in a closed halfspace corresponding to a nonvertical hyperplane; i.e., there exists a vector a∈V, a nonzero scalar b∈R, and a scalar c∈R such that

    ⟨x,a⟩+tb≥c∀(x,t)∈C.
  2. If a point (x¯,t¯) doesn’t belong to clC, there exists a nonvertical hyperplane strongly separating (x¯,t¯) and C.

Proof. (1) We prove this claim by contradiction.

  1. Assume that every hyperplane such that C is contained in one of its closed half-spaces is vertical.

  2. Then every hyperplane such that clC is contained in one of its closed half-spaces is vertical.

  3. By Theorem 8.170, clC is the intersection of all closed half-spaces that contain it.

  4. Hence, we have

    clC=⋂i∈I{(x,t)|⟨x,ai⟩≥ci}

    where I is an index set for the family of hyperplanes that contain clC in their closed half-spaces above. Since these hyperplanes are vertical, bi=0 for every i∈I.

  5. Let (x~,t~)∈clC.

  6. Then the vertical line {(x~,t)|t∈R} is also contained in clC as it satisfies the expression above.

  7. Hence, the vector (0,1) is a direction of recession for clC. In fact, it belongs to the lineality space of clC, so (0,−1) is also a direction of recession for clC.

  8. By Theorem 8.150,

    riclC=riC

    since C is a convex set.

  9. By Theorem 8.182, the recession cones of clC and riC are equal.

  10. Hence (0,1) and (0,−1) are also directions of recession for riC.

  11. Hence for every (x~,t~)∈riC, the vertical line {(x~,t)|t∈R} is also contained in riC and hence in C.

  12. This contradicts the hypothesis that C doesn’t contain a vertical line.

(2) This follows from the strong separation theorem.

  1. Since (x¯,t¯)∉clC, by Theorem 8.169 there exists a hyperplane strongly separating clC and (x¯,t¯).

  2. If this hyperplane is nonvertical, then there is nothing more to do.

  3. We now consider the case where this strongly separating hyperplane is vertical.

  4. Let this separating hyperplane be given by

    H={(x,t)∈V⊕R|⟨x,a¯⟩=c¯}

    such that

    ⟨x,a¯⟩>c¯>⟨x¯,a¯⟩∀(x,t)∈clC.
  5. Our goal is to combine this vertical hyperplane with a suitably constructed nonvertical hyperplane so that the combined nonvertical hyperplane also strongly separates C and (x¯,t¯).

  6. By part (1), there exists a nonvertical hyperplane such that C is totally contained in one of its closed half-spaces.

  7. Hence there exists (a^,b^)∈V⊕R with b^≠0 and c^∈R such that

    ⟨x,a^⟩+tb^≥c^∀(x,t)∈clC.
  8. Multiply this inequality by some ϵ>0 and add it to the previous inequality to get

    ⟨x,a¯+ϵa^⟩+ϵtb^>c¯+ϵc^∀(x,t)∈clC,∀ϵ>0.
  9. Since c¯>⟨x¯,a¯⟩, it is possible to select a small enough ϵ>0 such that

    c¯+ϵc^>⟨x¯,a¯+ϵa^⟩+ϵt¯b^.
  10. Thus, for such a small enough ϵ>0, we have

    ⟨x,a¯+ϵa^⟩+ϵtb^>c¯+ϵc^>⟨x¯,a¯+ϵa^⟩+ϵt¯b^∀(x,t)∈clC.
  11. Letting a~=a¯+ϵa^, b~=ϵb^ and c~=c¯+ϵc^, we have

    ⟨x,a~⟩+tb~>c~>⟨x¯,a~⟩+t¯b~∀(x,t)∈clC.
  12. This describes the strongly separating nonvertical hyperplane between clC and (x¯,t¯).

Definition 9.12 (Upper closed halfspace)

Let H be a nonvertical hyperplane. The closed halfspace of H whose recession cone includes the vertical halfline {(0,t)|t≥0} is known as its upper closed halfspace.

If you are in the upper closed halfspace and keep going up, you will stay in the upper closed halfspace. If you go down, you will hit the hyperplane.

Definition 9.13 (Lower closed halfspace)

Let H be a nonvertical hyperplane. The closed halfspace of H whose recession cone includes the vertical halfline {(0,t)|t≤0} is known as its lower closed halfspace.

If you are in the lower closed halfspace and keep going down, you will stay in the lower closed halfspace. If you go up, you will hit the hyperplane.

9.4.2. Min Common/Max Crossing Duality

We introduce two simple optimization problems.

  1. Consider a set M in V⊕R.

  2. Assume that M intersects the vertical axis L={(0,t)|t∈R}.

  3. The min common point problem attempts to find the point (0,t)∈M∩L whose component t along the vertical axis is the minimum.

  4. Now consider the set of nonvertical hyperplanes such that M lies in their upper closed halfspaces.

  5. Each such hyperplane intersects the vertical axis.

  6. Such points are called the crossing points between the nonvertical hyperplane and the vertical axis.

  7. Consider the set of all such crossing points.

  8. The max crossing point problem attempts to find the point whose vertical component is the largest (highest).

  9. Let (0,p∗) be the minimum common point.

  10. Let (0,q∗) be the maximum crossing point.

  11. Here p∗ and q∗ denote the components along the vertical axis of the minimum common point and the maximum crossing point respectively.

  12. In general, the minimum common point lies above the maximum crossing point. We shall show this formally later.

  13. We call p∗ the minimum common level and q∗ the maximum crossing level.

  14. Then p∗≥q∗. This is known as weak duality.

  15. The gap p∗−q∗, which is a nonnegative quantity, is known as the duality gap.

  16. Under certain conditions p∗=q∗. In other words, if the set M meets specific conditions, then the optimal values of the min common point problem (the primal problem) and the max crossing point problem (the dual problem) are identical.

  17. When p∗=q∗, the duality gap is 0. This is known as strong duality.

  18. When strong duality holds, the min common point problem and the max crossing point problem are equivalent in the sense that they attain the same optimal value.

We are now ready to define these problems formally.

9.4.2.1. Min Common Problem

Definition 9.14 (Min common problem)

Given a set M⊆V⊕R, the min common problem is defined as

minimize p   subject to   (0,p)∈M

Its optimal value is denoted by p∗; i.e.,

p∗=inf(0,p)∈Mp.

9.4.2.2. Max Crossing Problem

Recall that a nonvertical hyperplane has a normal (a,b) such that b≠0. Then (a/b,1) is also a normal vector for this hyperplane. Hence one way to describe the set of nonvertical hyperplanes is as the set of hyperplanes with normal vectors of the form (a,1). Following Observation 9.3, if a nonvertical hyperplane H intersects the vertical axis at some (0,q), then

q=⟨x,a⟩+t

where (x,t)∈H. Thus, the hyperplane can be characterized by a and q as

Ha,q={(x,t)|⟨x,a⟩+t=q}.

The corresponding upper closed halfspace is given by

Ha,qu={(x,t)|⟨x,a⟩+t≥q}

since every point (0,t) on the vertical axis belonging to Ha,qu satisfies t≥q.

If the set M lies in the upper closed halfspace of a hyperplane characterized by a and q, then

⟨x,a⟩+t≥q∀(x,t)∈M.

Hence, the maximum crossing level q over all hyperplanes Ha,q with the same normal (a,1) is given by

q(a)=inf(x,t)∈M{⟨x,a⟩+t}.

The problem of maximizing the crossing level over all nonvertical hyperplanes is thus to maximize the crossing level q(a) over all a∈V.

Definition 9.15 (Max crossing problem)

Given a set M⊆V⊕R, the max crossing problem is defined as

maximize q(a)   subject to   a∈V

where q(a)=inf(x,t)∈M{⟨x,a⟩+t}. Its optimal value is denoted by q∗; i.e.,

q∗=supa∈Vq(a).

The function q is the cost function (objective function) of the max crossing problem. It turns out that q is concave and upper semicontinuous.

Theorem 9.31 (Concavity and upper semicontinuity of q)

The cost function q of the max crossing problem is concave and upper semicontinuous over V.

Proof. Concavity

  1. For each (x,t), the function a↦⟨x,a⟩+t is an affine function.

  2. Thus q is an infimum of a family of affine functions over (x,t)∈M.

  3. Hence q is concave.

Upper semicontinuity

  1. We note that −q(a)=sup(x,t)∈M{−⟨x,a⟩−t}.

  2. Hence −q is a supremum of affine functions.

  3. Affine functions are closed (Theorem 8.86).

  4. Pointwise supremum of closed functions is closed (Theorem 3.93).

  5. Hence −q is a closed function.

  6. By Theorem 3.119, −q is lower semicontinuous since it is closed.

  7. Hence q is upper semicontinuous.
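Concavity of q is easy to test numerically. The sketch below uses an assumed example set (not from the text): M=epif for f(x)=x², sampled on a grid, for which q(a)=inf over (x,t)∈M of ⟨x,a⟩+t equals −a²/4 analytically.

```python
import numpy as np

# Assumed example: M = epi f with f(x) = x^2, so the smallest t above
# each x is x^2 and q(a) = inf_{(x,t) in M} (a*x + t) = -a^2/4.
xs = np.linspace(-3, 3, 601)

def q(a):
    return np.min(a * xs + xs**2)

# concavity: q(lam*a1 + (1-lam)*a2) >= lam*q(a1) + (1-lam)*q(a2)
a1, a2, lam = -1.0, 2.0, 0.3
lhs = q(lam * a1 + (1 - lam) * a2)
rhs = lam * q(a1) + (1 - lam) * q(a2)
assert lhs >= rhs - 1e-12
print(lhs, rhs)
```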

9.4.2.3. Weak Duality

Theorem 9.32 (Weak duality theorem)

For the min common and max crossing problems we have

q∗≤p∗.

Proof. Recall that

q(a)=inf(x,t)∈M{⟨x,a⟩+t}.

Then

inf(x,t)∈M{⟨x,a⟩+t}≤inf(0,t)∈M{⟨0,a⟩+t}=inf(0,t)∈M{t}=p∗.

Thus we have

q(a)≤p∗∀a∈V.

Taking the supremum over a∈V on the L.H.S., we obtain

q∗=supa∈Vq(a)≤p∗.
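Weak duality holds even when the duality gap is strictly positive. The hedged sketch below uses an assumed example set (not from the text): M={(x,t): x∈[−1,1], t≥1−x²}, a nonconvex "epigraph" for which p∗=1 while q(a)=−|a| gives q∗=0.

```python
import numpy as np

# Assumed example set with a duality gap of 1: p* = 1, q* = 0.
xs = np.linspace(-1, 1, 401)
f = 1 - xs**2                              # lowest t above each x in M

p_star = float(f[np.argmin(np.abs(xs))])   # inf of t over (0, t) in M

def q(a):
    # max crossing level for normal (a, 1): inf over (x,t) in M
    return np.min(a * xs + f)

q_star = max(q(a) for a in np.linspace(-3, 3, 601))

assert q_star <= p_star                    # weak duality q* <= p*
print(p_star, q_star)
```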

9.4.2.4. Strong Duality

We are now interested in conditions under which there is no duality gap and p∗=q∗. We shall assume in general that the min common problem is feasible and p∗<∞.

Remark 9.6 (Optimal point of min common problem is a closure point)

Assume that the min common problem is feasible and let p∗<∞. Then the point (0,p∗) is a closure point of M.

  1. We have p∗=inf(0,t)∈M{t}.

  2. Thus for every ϵ>0 there exists (0,t)∈M such that t<p∗+ϵ.

  3. Thus for every ϵ>0, there exists a point (0,t)∈M such that ‖(0,t)−(0,p∗)‖<ϵ.

  4. Hence (0,p∗) is a closure point of M.

Observation 9.4 (Closed and convex M)

If M is closed and convex and admits a nonvertical supporting hyperplane at (0,p∗), then strong duality holds and p∗=q∗. Also, the optimal values of both the min common problem and the max crossing problem are attained.

  1. Since M is closed and (0,p∗) is a closure point of M, we have (0,p∗)∈M.

  2. Hence the optimal value of the min common problem is attained at (0,p∗).

  3. Let H be the nonvertical supporting hyperplane at (0,p∗).

  4. Then the intersection of H with the vertical axis is at (0,p∗).

  5. Hence q∗≥p∗.

  6. But by weak duality q∗≤p∗.

  7. Hence q∗=p∗.

  8. Clearly the optimal value of the max crossing problem is attained at (0,p∗) for the hyperplane H.

This is the most favorable case: strong duality holds and the optimal values of both problems are attained. We next provide a result that gives a necessary and sufficient condition for strong duality; however, it does not address the attainment of the optimal values.

9.4.2.5. First Min Common / Max Crossing Theorem

Theorem 9.33 (Min common/max crossing theorem I)

Consider the min common and max crossing problems. Assume the following:

  1. p∗<∞; i.e., the min common problem is feasible.

  2. The set

    M―={(x,t)∈V⊕R| there exists t¯∈R with t¯≤t and (x,t¯)∈M}

    is convex.

Then we have q∗=p∗ if and only if for every sequence {(xk,tk)} of M with xk→0, there holds

p∗≤lim infk→∞tk.

The set M― is an extension of the set M going upwards along the vertical axis: all points above M are included in M―. Equivalently, the direction (0,1) is added to the recession cone, so M― is unbounded along the (0,1) direction.

Proof. We first consider the trivial case where p∗=−∞.

  1. By weak duality (Theorem 9.32), q∗≤p∗.

  2. Hence q∗=−∞.

  3. Hence q(a)=−∞ for every a∈V.

  4. The conclusion follows trivially.

We now consider the general case where p∗∈R. First assume that p∗≤lim infk→∞tk holds for every sequence {(xk,tk)} of M with xk→0.

  1. Since M⊆M― and (0,p∗) is a closure point of M, (0,p∗) is also a closure point of M―.

  2. We first claim that the set M― doesn’t contain any vertical lines.

    1. For contradiction, assume that M― contains a vertical line.

    2. Then the set clM― also contains a vertical line.

    3. Then (0,−1) is a direction of recession of clM―.

    4. Hence (0,−1) is also a direction of recession of riM―.

    5. Since (0,p∗) is a closure point of M―, it is also a closure point of riM―.

    6. Hence, there exists a sequence {(xk,tk)} of riM― converging to (0,p∗).

    7. Since (0,−1) is a direction of recession of riM―, the sequence {(xk,tk−1)} belongs to riM―.

    8. Hence its limit (0,p∗−1)∈clM―.

    9. By definition of M―, for every k, there exists a point (xk,t¯k)∈M such that t¯k≤tk−1.

    10. Hence there exists a sequence {(xk,t¯k)} of M with t¯k≤tk−1 for all k so that lim infk→∞t¯k≤p∗−1.

    11. This contradicts the assumption, since xk→0 and lim infk→∞t¯k≤p∗−1<p∗.

  3. We next show that the vector (0,p∗−ϵ) does not belong to clM― for any ϵ>0.

    1. Assume for contradiction that for some ϵ>0, the vector (0,p∗−ϵ)∈clM―.

    2. Then there is a sequence {(xk,tk)} of M― converging to (0,p∗−ϵ).

    3. By definition of M―, for every k, there exists a point (xk,t¯k)∈M such that t¯k≤tk.

    4. Hence there exists a sequence {(xk,t¯k)} of M with t¯k≤tk for all k so that lim infk→∞t¯k≤p∗−ϵ.

    5. This contradicts the assumption, since xk→0 and lim infk→∞t¯k≤p∗−ϵ<p∗.

  4. Since M― does not contain any vertical lines and the vector (0,p∗−ϵ) doesn’t belong to clM―, by Theorem 9.30 there exists a nonvertical hyperplane strongly separating (0,p∗−ϵ) and M― for every ϵ>0.

  5. This hyperplane crosses the vertical axis at a unique vector (0,ξ) which must lie between (0,p∗−ϵ) and (0,p∗); i.e., p∗−ϵ≤ξ≤p∗.

  6. By definition of the max crossing problem, ξ≤q∗.

  7. Hence we have p∗−ϵ≤q∗≤p∗ for every ϵ>0.

  8. Since ϵ can be arbitrarily small, it follows that p∗=q∗.

Conversely, we assume that strong duality holds.

  1. Let {(xk,tk)} be any sequence of M such that xk→0.

  2. By definition of q(a),

    q(a)=inf(x,t)∈M{⟨x,a⟩+t}≤⟨xk,a⟩+tk∀k,∀a∈V.
  3. Taking the limit inferior as k→∞ on the R.H.S. and using xk→0, we obtain

    q(a)≤lim infk→∞tk∀a∈V.
  4. Taking the supremum on the L.H.S. over a∈V, we have

    p∗=q∗=supa∈Vq(a)≤lim infk→∞tk.

This result doesn’t guarantee the attainment of either the min common optimal point or the max crossing optimal point. The next result includes additional conditions which ensure that the optimal point of the max crossing problem is attained.

9.4.2.6. Second Min Common / Max Crossing Theorem

Theorem 9.34 (Min common/max crossing theorem II)

Consider the min common and max crossing problems. Assume the following:

  1. −∞<p∗.

  2. The set

    M―={(x,t)∈V⊕R| there exists t¯∈R with t¯≤t and (x,t¯)∈M}

    is convex.

  3. The set

    D={u∈V| there exists t∈R with (u,t)∈M―}

    contains the origin in its relative interior.

Then q∗=p∗ and the optimal solution set of the max crossing problem

Q∗={a∈V|q(a)=q∗}

has the form

Q∗=(affD)⊥+Q~

where Q~ is a nonempty, convex and compact set and (affD)⊥ is the orthogonal complement of affD (which is a subspace by assumption 3 since it contains the origin).

Furthermore, Q∗ is nonempty and compact if and only if D contains the origin in its interior.

Note that D is the set of all possible horizontal coordinates of points of M―: u∈D if and only if (u,t)∈M― for some t∈R, which holds if and only if (u,s)∈M for some s∈R. Since M― is convex and D is the projection of M― on V, D is also convex.

Proof. We first show that strong duality holds and Q∗ is nonempty, closed and convex.

  1. Condition (3) implies that there exists some (0,t)∈M―.

  2. By definition of M―, there exists some (0,t¯)∈M such that t¯≤t.

  3. Hence p∗<∞.

  4. Then, by condition (1), p∗∈R; i.e., the min common level is a real number.

  5. The vertical axis {(0,t)|t∈R} is contained in affM―.

  6. p∗ is the optimal min common value.

  7. If (0,p∗) were a relative interior point of M―, then, since the vertical line through it lies in affM―, the points (0,p∗−δ) for small δ>0 would also belong to M―, contradicting the optimality of p∗.

  8. Accordingly, (0,p∗)∉riM―.

  9. By Theorem 8.163, there exists a hyperplane that separates M― and the point (0,p∗) properly; i.e., it contains the point (0,p∗), contains M― in one of its half-spaces and doesn’t contain M― fully.

  10. Hence, there exists a vector (a,r) such that

    ⟨x,a⟩+tr≥p∗r∀(x,t)∈M―

    and

    sup(x,t)∈M―{⟨x,a⟩+tr}>p∗r.

    See Remark 8.10.

  11. Since for every (x¯,t¯)∈M, the set M― contains the half-line {(x¯,t)|t¯≤t}, it follows from the first inequality that r≥0 must hold.

  12. r=0 leads to a contradiction.

    1. Assume that r=0.

    2. Then the first inequality reduces to

      ⟨x,a⟩≥0∀x∈D.
    3. The linear functional ⟨x,a⟩ thus attains its minimum value 0 over the set D at the point 0∈D, which is a relative interior point of D by condition (3).

    4. Since D is convex (projection of the convex set M― on V) and the linear functional ⟨x,a⟩ (a concave function) attains its minimum at a relative interior point, by Theorem 9.5 the function must be constant over D.

    5. Hence we have

      ⟨x,a⟩=0∀x∈D.
    6. But this contradicts the second (strict) inequality of the proper separation result, since M― cannot be contained entirely inside the hyperplane.

    7. We arrive at a contradiction.

    8. Hence r≠0 and the separating hyperplane is nonvertical.

  13. By appropriate normalization, if necessary, we can assume that r=1.

  14. The proper separation inequalities simplify to

    ⟨x,a⟩+t≥p∗∀(x,t)∈M―

    and

    sup(x,t)∈M―{⟨x,a⟩+t}>p∗.
  15. We now note that

    p∗≤inf(x,t)∈M―{⟨x,a⟩+t}≤inf(x,t)∈M{⟨x,a⟩+t}=q(a)≤q∗.
  16. Since by weak duality we always have q∗≤p∗, all the inequalities in the previous relation must be equalities.

  17. Thus we have

    q(a)=q∗=p∗.
  18. Hence Q∗ is nonempty as a∈Q∗.

  19. We can also write Q∗ as

    Q∗={a∈V|q(a)≥q∗}.
  20. Since q is concave and upper semicontinuous, its superlevel sets are closed and convex.

  21. Hence Q∗, being a superlevel set of q, is closed and convex.

We next show that Q∗=(affD)⊥+Q~.

TBD

9.4.3. Minimax and Maximin Problems

We consider functions of the form ϕ:V⊕W→R with domϕ=X×Z, where V and W are real vector spaces, and X⊆V and Z⊆W are nonempty sets.

Definition 9.16 (Minimax problem)

A minimax problem takes the form

minimize supz∈Zϕ(x,z)   subject to   x∈X

Definition 9.17 (Maximin problem)

A maximin problem takes the form

maximize infx∈Xϕ(x,z)   subject to   z∈Z.

We next provide a few examples.

9.4.3.1. Zero Sum Games

Example 9.4 (Zero sum games)

We consider a two player game with the following design.

  1. Player A can choose one out of n possible moves.

  2. Player B can choose one out of m possible moves.

  3. Both players make their moves simultaneously.

  4. A∈Rn×m is a payoff matrix.

  5. If move i is selected by player A and move j is selected by player B, then A gives the specified amount aij to B.

  6. Note that aij∈R can be zero, positive or negative.

  7. The players use mixed strategy.

  8. Player A selects a probability distribution x=(x1,…,xn) over her n possible moves.

  9. Player B selects a probability distribution z=(z1,…,zm) over her m possible moves.

  10. Since the probability of selecting move i by player A and move j by player B is xizj, the expected amount to be paid by A to B is

    ∑ijxiaijzj=xTAz.
  11. Each player adopts a worst case viewpoint, whereby she optimizes her choice against the worst possible selection by the other player.

  12. Player A must minimize maxzxTAz so that she pays B as little as possible.

    1. Suppose A selects the strategy x.

    2. The amount she has to pay to B for B’s selection of strategy z is xTAz.

    3. The maximum amount she has to pay across all possible strategies chosen by B is maxzxTAz.

    4. By selecting x, her goal is to minimize the maximum payoff.

  13. Player B must maximize minxxTAz.

    1. Suppose B selects a strategy z.

    2. Then the payoff she gets by a strategy x of A is xTAz.

    3. The minimum she can get for her choice of z is minxxTAz.

    4. By selecting z her goal is to maximize the minimum payoff.

  14. Here X=Δn⊆Rn, the unit simplex of Rn.

  15. Similarly, Z=Δm⊆Rm, the unit simplex of Rm.

  16. ϕ(x,z)=xTAz.

  17. The worst case payoff for player A is

    infx∈Xsupz∈ZxTAz.
  18. The worst case payoff for player B is

    supz∈Zinfx∈XxTAz.
  19. Under suitable conditions, we can guarantee that

    supz∈Zinfx∈XxTAz=infx∈Xsupz∈ZxTAz.
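The two worst case values above can be compared numerically for a small game. The hedged sketch below grids the mixed strategies x=(p,1−p) and z=(q,1−q) for an assumed 2×2 payoff matrix ("matching pennies", not from the text), whose game value is 0.

```python
import numpy as np

# Assumed example payoff matrix (A pays B): matching pennies.
A = np.array([[1.0, -1.0],
              [-1.0, 1.0]])

ps = np.linspace(0, 1, 201)
P = np.stack([ps, 1 - ps], axis=1)      # rows: mixed strategies (p, 1-p)
payoff = P @ A @ P.T                    # payoff[i, j] = x_i^T A z_j

minimax = payoff.max(axis=1).min()      # inf_x sup_z x^T A z
maximin = payoff.min(axis=0).max()      # sup_z inf_x x^T A z

assert maximin <= minimax + 1e-12       # the minimax inequality
print(minimax, maximin)                 # both are ~0 for this game
```

For this game both values are attained at the uniform strategy (1/2,1/2), so the equality in item 19 holds.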

9.4.3.2. Lagrangian Functions

Example 9.5 (Lagrangian functions and duality theory)

Consider the optimization problem of the form

minimize f(x)   subject to   gi(x)≤0, i=1,…,m

where f,g1,…,gm:V→R are given objective and inequality constraint functions.

We construct the Lagrangian function

L(x,z)=f(x)+∑i=1mzigi(x)

with domL=X×Z where X=domf∩domg1∩⋯∩domgm and Z=R+m (the nonnegative orthant where z⪰0).

We construct the primal problem as below:

minimize supz⪰0L(x,z)   subject to   x∈X.
  1. Choose any x∈X.

  2. If any of the constraints gi(x)≤0 is violated, then gi(x)>0 and hence supz⪰0L(x,z)=∞; we can see this by letting zi→∞.

  3. If none of the constraints is violated, then zigi(x)≤0 for every z⪰0, and by picking z=0 we get supz⪰0L(x,z)=f(x).

  4. Hence the problem is equivalent to the original problem.

We can now construct the dual problem as below:

maximize infx∈XL(x,z)   subject to   z⪰0.

Thus the primal problem is a minimax problem and the dual problem is a maximin problem with the Lagrangian playing the role of ϕ.

Under suitable conditions, the two problems have equal optimal value.
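The primal/dual construction can be sketched numerically on a toy problem. The problem below is an assumed example (not from the text): minimize f(x)=x² subject to g(x)=1−x≤0, whose optimum is p∗=1 at x=1; the dual function is inf over x of L(x,z)=z−z²/4, maximized at z=2.

```python
import numpy as np

# Assumed toy problem: minimize x^2 subject to 1 - x <= 0.
# Lagrangian: L(x, z) = x^2 + z*(1 - x) with multiplier z >= 0.
xs = np.linspace(-3, 3, 6001)
zs = np.linspace(0, 5, 5001)

def L(x, z):
    return x**2 + z * (1 - x)

p_star = np.min(xs[xs >= 1] ** 2)            # primal optimal value
q_star = max(np.min(L(xs, z)) for z in zs)   # dual optimal value

assert abs(p_star - 1.0) < 5e-3
assert abs(q_star - 1.0) < 1e-3   # strong duality holds here: p* = q*
print(p_star, q_star)
```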

9.4.3.3. Minimax Equality

In the following, we will explore conditions under which

(9.4) supz∈Zinfx∈Xϕ(x,z)=infx∈Xsupz∈Zϕ(x,z)

and the infimum and the supremum are attained.

The key result is the saddle point theorem, which guarantees the equality in (9.4) as well as the attainment of the infimum/supremum, assuming convexity/concavity of ϕ and compactness of X and Z.

The compactness assumptions are restrictive. For example, in the duality theory, the Lagrange multipliers z⪰0 belong to the nonnegative orthant which is not compact.

The minimax theorem gives conditions guaranteeing the minimax equality (9.4), although it need not guarantee the attainment of the infimum and the supremum.

9.4.3.4. Minimax Inequality

Observation 9.5 (Minimax inequality)

We always have

(9.5) supz∈Zinfx∈Xϕ(x,z)≤infx∈Xsupz∈Zϕ(x,z).

In other words, the optimal value of the maximin problem is less than or equal to the optimal value of the minimax problem.

Proof. We note that for every z∈Z, we have

infx∈Xϕ(x,z)≤infx∈Xsupz∈Zϕ(x,z).

Taking the supremum over z∈Z on the L.H.S., we obtain the desired inequality.

Thus, in order to show (9.4), it is sufficient to show that the reverse inequality holds.
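For finite X and Z the inequality (9.5) reduces to a statement about a matrix of values, which can be checked directly (the random matrix below is an arbitrary assumed example):

```python
import numpy as np

# phi as a finite table: rows index x in X, columns index z in Z.
rng = np.random.default_rng(0)
phi = rng.normal(size=(8, 5))

minimax = phi.max(axis=1).min()   # inf_x sup_z phi(x, z)
maximin = phi.min(axis=0).max()   # sup_z inf_x phi(x, z)

assert maximin <= minimax          # (9.5) always holds
print(maximin, minimax)
```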

9.4.3.5. Saddle Points

Definition 9.18 (Saddle point)

A pair of vectors x∗∈X and z∗∈Z is called a saddle point of ϕ if

ϕ(x∗,z)≤ϕ(x∗,z∗)≤ϕ(x,z∗)∀x∈X,∀z∈Z.

Remark 9.7 (Saddle point inequalities)

The pair (x∗,z∗)∈V⊕W is a saddle point if and only if x∗∈X, z∗∈Z and

supz∈Zϕ(x∗,z)=ϕ(x∗,z∗)=infx∈Xϕ(x,z∗).

This equality further implies that

infx∈Xsupz∈Zϕ(x,z)≤supz∈Zϕ(x∗,z)=ϕ(x∗,z∗)=infx∈Xϕ(x,z∗)≤supz∈Zinfx∈Xϕ(x,z).

In short,

infx∈Xsupz∈Zϕ(x,z)≤ϕ(x∗,z∗)≤supz∈Zinfx∈Xϕ(x,z).

Combined with the minimax inequality (9.5), this shows that if a saddle point exists, then the minimax equality (9.4) holds.
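The classic saddle example can serve as a numeric illustration here. The sketch below uses the assumed example ϕ(x,z)=x²−z² on X=Z=[−1,1] (not from the text): the pair (0,0) is a saddle point and both worst case values equal ϕ(0,0)=0.

```python
import numpy as np

# Assumed example: phi(x, z) = x^2 - z^2 on gridded X = Z = [-1, 1].
xs = np.linspace(-1, 1, 201)
zs = np.linspace(-1, 1, 201)
phi = xs[:, None] ** 2 - zs[None, :] ** 2   # phi[i, j] = phi(x_i, z_j)

minimax = phi.max(axis=1).min()   # inf_x sup_z phi
maximin = phi.min(axis=0).max()   # sup_z inf_x phi

# saddle inequalities: phi(0, z) <= phi(0, 0) <= phi(x, 0)
assert np.all(-zs**2 <= 0.0) and np.all(xs**2 >= 0.0)
assert np.isclose(minimax, 0.0) and np.isclose(maximin, 0.0)
print(minimax, maximin)
```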

Theorem 9.35 (Saddle point = minimax equality)

A pair (x∗,z∗) is a saddle point of ϕ if and only if the minimax equality (9.4) holds, x∗ is an optimal solution of the minimax problem

minimize supz∈Zϕ(x,z)   subject to   x∈X

and z∗ is an optimal solution of the maximin problem

maximize infx∈Xϕ(x,z)   subject to   z∈Z.

Proof. Suppose that x∗ is an optimal solution of the minimax problem and z∗ is an optimal solution of the maximin problem.

  1. From the minimax problem we obtain

    infx∈Xsupz∈Zϕ(x,z)=supz∈Zϕ(x∗,z)≥ϕ(x∗,z∗).
  2. From the maximin problem we obtain

    supz∈Zinfx∈Xϕ(x,z)=infx∈Xϕ(x,z∗)≤ϕ(x∗,z∗).
  3. Combining these inequalities, we have

    supz∈Zinfx∈Xϕ(x,z)=infx∈Xϕ(x,z∗)≤ϕ(x∗,z∗)≤supz∈Zϕ(x∗,z)=infx∈Xsupz∈Zϕ(x,z).
  4. If the minimax equality (9.4) holds, then equality holds throughout above, giving us

    infx∈Xϕ(x,z∗)=ϕ(x∗,z∗)=supz∈Zϕ(x∗,z).
  5. Hence (x∗,z∗) is a saddle point of ϕ as per Remark 9.7.

Conversely, assume that (x∗,z∗) is a saddle point of ϕ.

  1. From Remark 9.7, we have

    infx∈Xsupz∈Zϕ(x,z)≤supz∈Zϕ(x∗,z)=ϕ(x∗,z∗)=infx∈Xϕ(x,z∗)≤supz∈Zinfx∈Xϕ(x,z).
  2. The minimax inequality (9.5) gives us

    supz∈Zinfx∈Xϕ(x,z)≤infx∈Xsupz∈Zϕ(x,z).
  3. Thus, all the inequalities in the previous relationship must be equalities. We have

    infx∈Xsupz∈Zϕ(x,z)=supz∈Zϕ(x∗,z)=ϕ(x∗,z∗)=infx∈Xϕ(x,z∗)=supz∈Zinfx∈Xϕ(x,z).
  4. Hence the minimax equality holds.

  5. It also implies that x∗ is an optimal solution of the minimax problem and z∗ is an optimal solution of the maximin problem.

Observation 9.6 (Saddle points as a Cartesian product)

When the set of saddle points is nonempty, it can be written as a Cartesian product X∗×Z∗, where X∗ is the set of optimal solutions of the minimax problem and Z∗ is the set of optimal solutions of the maximin problem.

In other words, x∗ and z∗ can be chosen independently from the sets X∗ and Z∗ respectively to form a saddle point. This is a direct consequence of Theorem 9.35.

If the minimax equality (9.4) does not hold, then there is no saddle point, even if the minimax and maximin problems have optimal solutions.

9.4.4. Min Common/Max Crossing Framework for Minimax

Recall that the minimax problem is given by

infx∈Xsupz∈Zϕ(x,z).

We add a linear perturbation to ϕ and introduce a function ψ:W→R― as

ψ(u)=infx∈Xsupz∈Z{ϕ(x,z)−⟨u,z⟩}.

We can see that

ψ(0)=infx∈Xsupz∈Zϕ(x,z).

The linear perturbation term impacts the optimal value of the minimax problem. We shall show that if ψ changes in a “regular” manner, then the minimax equality is guaranteed.
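The perturbed function ψ can be evaluated numerically. The sketch below uses the assumed example ϕ(x,z)=x²−z² on X=Z=[−1,1] (not from the text); for it one can check analytically that ψ(u)=u²/4 for |u|≤2, so ψ(0)=0 recovers the minimax value.

```python
import numpy as np

# Assumed example: phi(x, z) = x^2 - z^2 on gridded X = Z = [-1, 1].
xs = np.linspace(-1, 1, 201)
zs = np.linspace(-1, 1, 201)

def psi(u):
    # inf_x sup_z { phi(x, z) - u*z } evaluated on the grids
    inner = xs[:, None] ** 2 - zs[None, :] ** 2 - u * zs[None, :]
    return inner.max(axis=1).min()

for u in (0.0, 0.5, 1.0):
    assert np.isclose(psi(u), u**2 / 4, atol=1e-3)
print(psi(0.0))   # the unperturbed minimax value
```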

9.4.4.1. Framework Definition

Definition 9.19 (Min common / max crossing framework for minimax problem)

We define the set M required for the min common/max crossing framework as

M=epiψ

where ψ:W→R― is given by

(9.6) ψ(u)=infx∈Xsupz∈Z{ϕ(x,z)−⟨u,z⟩}.
  1. Recall that the min common value is given by

    p∗=inf(0,p)∈Mp.
  2. Note that for u=0, M contains all the points (0,t) such that ψ(0)≤t.

  3. In particular, (0,ψ(0))∈M, and if (0,t)∈M then ψ(0)≤t.

  4. Hence p∗ is given by

    p∗=ψ(0)=infx∈Xsupz∈Zϕ(x,z).
  5. Since M is an epigraph, the sets M and M― are identical in the min common/max crossing framework.

  6. If ψ is convex, then M=epiψ is convex, as required in several results of the min common/max crossing framework.

The corresponding max crossing problem is given by:

maximize q(a)   subject to   a∈W
  1. We note that

    (9.7) q(a)=inf(u,t)∈epiψ{⟨u,a⟩+t}=inf{⟨u,a⟩+t|ψ(u)≤t}=infu∈W{ψ(u)+⟨u,a⟩}.
  2. Its optimal value is denoted by q∗; i.e.,

    q∗=supa∈Wq(a).

9.4.4.2. Connection

Observation 9.7 (Connection between minimax equality and min common/max crossing framework)

  1. By substituting the definition of ψ(u) into the expression for q(a), we obtain

    q(a)=infu∈W{ψ(u)+⟨u,a⟩}=infu∈W{infx∈Xsupz∈Z{ϕ(x,z)−⟨u,z⟩}+⟨u,a⟩}=infu∈Winfx∈Xsupz∈Z{ϕ(x,z)+⟨a−z,u⟩}.
  2. In particular, for every a∈Z, if we set z=a in this relation, we can see that

    infx∈Xϕ(x,a)≤q(a)∀a∈Z.
  3. From the weak duality principle, we have q∗≤p∗.

  4. We have established that p∗=ψ(0).

  5. We can now see that

    supz∈Zinfx∈Xϕ(x,z)=supa∈Zinfx∈Xϕ(x,a)≤supa∈Zq(a)≤supa∈Wq(a)=q∗≤p∗=ψ(0)=infx∈Xsupz∈Zϕ(x,z).
  6. This is nothing but the minimax inequality (9.5).

  7. We can see that if the minimax equality (9.4) holds, then all inequalities in the previous relation turn into equalities and we have q∗=p∗.

  8. In other words, if the minimax equality holds, then the optimal values of the min common and max crossing problems are equal.

9.4.4.3. Convexity of ψ

Lemma 9.2 (Convexity of ϕ w.r.t. x and convexity of ψ)

Let X be a nonempty convex subset of V and Z be a nonempty subset of W. Let ϕ:V⊕W→R be a function with domϕ=X×Z. Assume that for each z∈Z, the function ϕ(⋅,z):V→R is convex. Then the function ψ as defined in (9.6) is convex.

Proof. Recall that

ψ(u)=infx∈Xsupz∈Z{ϕ(x,z)−⟨u,z⟩}.
  1. Fix some z∈Z.

  2. Consider the function fz(x,u)=ϕ(x,z)−⟨u,z⟩.

  3. For each z∈Z, fz is jointly convex in (x,u): ϕ(⋅,z) is convex by hypothesis and u↦−⟨u,z⟩ is linear.

  4. Taking the pointwise supremum over z∈Z, the function

    F(x,u)=supz∈Z{ϕ(x,z)−⟨u,z⟩} for x∈X, with F(x,u)=∞ for x∉X,

    is also convex (jointly in x and u).

  5. We now have

    ψ(u)=infx∈VF(x,u).
  6. Since partial minimization preserves convexity, ψ is convex.

9.4.4.4. Minimax Equality Strong Duality Equivalence Conditions

Lemma 9.3 (Closedness and convexity of βˆ’Ο• w.r.t. z and minimax equality)

Let X be a nonempty convex subset of V and Z be a nonempty subset of W. Let Ο•:VβŠ•Wβ†’R be a function with domΟ•=XΓ—Z. Assume that for each x∈X, the function βˆ’Ο•(x,β‹…):Wβ†’R is closed and convex. Then the function q:Wβ†’R― given by

q(a)=inf(u,t)∈epiψ{⟨u,a⟩+t},a∈W,

where ψ is as defined in (9.6), satisfies

q(a)={infx∈XΟ•(x,a)a∈Z;βˆ’βˆžaβˆ‰Z.

Furthermore, we have qβˆ—=pβˆ— if and only if the minimax equality (9.4) holds.

Proof. We shall show the following one by one.

  1. q(a)β‰₯infx∈XΟ•(x,a)βˆ€a∈Z.

  2. q(a)≀infx∈XΟ•(x,a)βˆ€a∈Z.

  3. q(a)=βˆ’βˆžβˆ€aβˆ‰Z.

(1)

  1. We have already established in (9.7) that

    q(a)=infu∈W{ψ(u)+⟨u,a⟩}.
  2. By using the definition of ψ, we further established that

    q(a)=infu∈Winfx∈Xsupz∈Z{Ο•(x,z)+⟨aβˆ’z,u⟩}.
  3. By rearranging the order of infimum operations, we have

    q(a)=infx∈Xinfu∈Wsupz∈Z{Ο•(x,z)+⟨aβˆ’z,u⟩}.
  4. For any a∈Z we have

    supz∈Z{Ο•(x,z)+⟨aβˆ’z,u⟩}β‰₯Ο•(x,a)+⟨aβˆ’a,u⟩=Ο•(x,a)βˆ€x∈X,βˆ€u∈W.
  5. This in turn implies that

    q(a)β‰₯infx∈XΟ•(x,a)βˆ€a∈Z.

(2)

  1. Let rx:W→R be given by

    rx(z)=βˆ’Ο•(x,z).
  2. By hypothesis rx is closed and convex.

  3. Let a∈Z.

  4. Fix some x∈X.

  5. Since rx is a closed and convex function, epirx is a closed and convex set.

  6. Since a∈Z, the point (a,rx(a))∈epirx.

  7. For some Ο΅>0, consider the point (a,rx(a)βˆ’Ο΅).

  8. Clearly (a,rx(a)βˆ’Ο΅)βˆ‰epirx.

  9. By definition of rx, rx(z) is finite for all z∈Z, Z is nonempty and epirx is closed.

  10. Since rx(z) is finite for all z∈Z, the epigraph doesn’t contain any vertical lines.

  11. Hence, due to Theorem 9.30, there exists a nonvertical hyperplane that strongly separates the point (a,rx(a)βˆ’Ο΅) from epirx.

  12. Hence there exists a normal vector (u,1) and a scalar c such that

    ⟨a,u⟩+(rx(a)βˆ’Ο΅)<c<⟨u,z⟩+rx(z)βˆ€z∈Z.
  13. Substituting rx(z)=βˆ’Ο•(x,z), we get

    ⟨a,u⟩+(βˆ’Ο•(x,a)βˆ’Ο΅)<⟨u,zβŸ©βˆ’Ο•(x,z).
  14. Rearranging, we get

    Ο•(x,z)+⟨aβˆ’z,u⟩<Ο•(x,a)+Ο΅βˆ€z∈Z.
  15. Taking the supremum over z∈Z, we have

    supz∈Z{Ο•(x,z)+⟨aβˆ’z,u⟩}≀ϕ(x,a)+Ο΅.
  16. Taking the infimum over u∈W on the L.H.S. (note that u was obtained from the separating hyperplane and depends on Ο΅, but the infimum does not), and then letting ϡ↓0, we obtain

    infu∈Wsupz∈Z{Ο•(x,z)+⟨aβˆ’z,u⟩}≀ϕ(x,a).
  17. Taking infimum over x∈X on the L.H.S., we obtain

    q(a)=infx∈Xinfu∈Wsupz∈Z{Ο•(x,z)+⟨aβˆ’z,u⟩}≀infx∈XΟ•(x,a)

    as desired.

(3)

  1. Take any aβˆ‰Z.

  2. Fix an arbitrary x∈X.

  3. Consider a sequence {tk} such that tkβ†’βˆž.

  4. Since aβˆ‰Z, (a,tk)βˆ‰epirx for every k.

  5. Again applying Theorem 9.30, for each k, there exists a nonvertical hyperplane strongly separating (a,tk) from epirx.

  6. Hence for each k, there exists a normal vector (uk,1) such that

    ⟨a,uk⟩+tk<⟨z,ukβŸ©βˆ’Ο•(x,z)βˆ€z∈Z.
  7. Rearranging, we have

    Ο•(x,z)+⟨aβˆ’z,uk⟩<βˆ’tk,βˆ€z∈Z,βˆ€k.
  8. Thus, we have

    infu∈Wsupz∈Z{Ο•(x,z)+⟨aβˆ’z,u⟩}≀supz∈Z{Ο•(x,z)+⟨aβˆ’z,uk⟩}β‰€βˆ’tkβˆ€k.
  9. Taking the limit on the R.H.S. as kβ†’βˆž, we can see that

    infu∈Wsupz∈Z{Ο•(x,z)+⟨aβˆ’z,u⟩}=βˆ’βˆžβˆ€x∈X.
  10. Finally, taking the infimum over x∈X, we can see that

    q(a)=βˆ’βˆž.

By Observation 9.7, if the minimax equality holds, then qβˆ—=pβˆ—. For the converse, assume that qβˆ—=pβˆ—. Then

infx∈Xsupz∈ZΟ•(x,z)=ψ(0)=pβˆ—=qβˆ—=supa∈Wq(a)=supz∈Zq(z)=supz∈Zinfx∈XΟ•(x,z).

Thus the minimax equality holds.
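
The formula for q in this lemma can be verified numerically on an example meeting its hypotheses (the example Ο• and grids are our own choices, not from the text): for Ο•(x,z) = x² + xz βˆ’ z² on X = Z = [βˆ’1, 1], βˆ’Ο•(x,β‹…) is closed and convex for each x, and the crossing value q(a) computed from ψ matches infx∈XΟ•(x,a) for a∈Z.

```python
import numpy as np

# phi(x, z) = x^2 + x*z - z^2 on X = Z = [-1, 1]; -phi(x, .) is a
# closed convex quadratic for each x (an illustrative choice).
xs = np.linspace(-1, 1, 201)
zs = np.linspace(-1, 1, 201)
us = np.linspace(-3, 3, 481)

phi = xs[:, None]**2 + xs[:, None] * zs[None, :] - zs[None, :]**2
psi = np.array([(phi - u * zs[None, :]).max(axis=1).min() for u in us])

for a in (-1.0, -0.5, 0.0, 0.5, 1.0):
    q_a = (psi + us * a).min()              # q(a) = inf_u { psi(u) + <u, a> }
    direct = (xs**2 + xs * a - a**2).min()  # inf_{x in X} phi(x, a)
    assert abs(q_a - direct) < 1e-2, (a, q_a, direct)
```

The u grid is chosen wide enough to contain the minimizing u for every tested a; the tolerance absorbs the grid discretization error.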

9.4.5. Minimax TheoremsΒΆ

9.4.5.1. First TheoremΒΆ

Theorem 9.36 (Minimax theorem I)

Let X be a nonempty convex subset of V and Z be a nonempty subset of W. Let Ο•:VβŠ•Wβ†’R be a function with domΟ•=XΓ—Z. Assume that for each z∈Z, the function Ο•(β‹…,z):Vβ†’R is convex, and for each x∈X, the function βˆ’Ο•(x,β‹…):Wβ†’R is closed and convex. Assume further that

infx∈Xsupz∈ZΟ•(x,z)<∞.

Then, the minimax equality (9.4) holds; i.e.,

supz∈Zinfx∈XΟ•(x,z)=infx∈Xsupz∈ZΟ•(x,z),

if and only if the function ψ as defined in (9.6) is lower semicontinuous at u=0; i.e.,

ψ(0)≀lim infkβ†’βˆžΟˆ(uk)

for every sequence {uk} with uk→0.

Proof. The proof consists of establishing the correspondence between the conditions in this result and the conditions in the first min common/max crossing theorem (Theorem 9.33).

  1. We choose the set M as described in Definition 9.19, to be the epigraph of the function ψ.

    M=M―=epiψ={(u,t)∈WβŠ•R|ψ(u)≀t}.
  2. We have shown in Definition 9.19 that pβˆ—=ψ(0)=infx∈Xsupz∈ZΟ•(x,z).

  3. By hypothesis, pβˆ—<∞.

  4. Hence the first assumption of Theorem 9.33 is satisfied.

  5. Following Lemma 9.2, ψ is convex.

  6. Hence M=epiψ is also convex.

  7. Hence the second assumption of Theorem 9.33 is satisfied.

  8. Finally, the condition

    ψ(0)≀lim infkβ†’βˆžΟˆ(uk)

    is equivalent to the condition of Theorem 9.33 that

    pβˆ—β‰€lim infkβ†’βˆžtk

    holds true for every sequence {(uk,tk)}βŠ†M with ukβ†’0.

    1. In one direction, suppose ψ(0)≀lim infkβ†’βˆžΟˆ(uk) holds for every sequence {uk} with ukβ†’0. Let {(uk,tk)}βŠ†M with ukβ†’0. Since tkβ‰₯ψ(uk) for every k, we have

      lim infkβ†’βˆžtkβ‰₯lim infkβ†’βˆžΟˆ(uk)β‰₯ψ(0)=pβˆ—.
    2. Conversely, given any sequence {uk} with ukβ†’0, pick tkβ‰₯ψ(uk) with tk arbitrarily close to ψ(uk) (with tkβ†’βˆ’βˆž if ψ(uk)=βˆ’βˆž). Then {(uk,tk)}βŠ†M, and pβˆ—β‰€lim infkβ†’βˆžtk yields ψ(0)=pβˆ—β‰€lim infkβ†’βˆžΟˆ(uk).
  9. Following Theorem 9.33, this condition holds if and only if pβˆ—=qβˆ—.

  10. Since βˆ’Ο•(x,β‹…) is closed and convex for each x∈X, it follows from Lemma 9.3 that pβˆ—=qβˆ— if and only if the minimax equality holds.
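
A small numerical sketch of a case where the hypotheses of the theorem hold and the minimax equality is satisfied (the example Ο• and the grids are our own choices, not from the text):

```python
import numpy as np

# phi(x, z) = x^2 + x*z - z^2 on X = Z = [-1, 1]: convex in x for
# each z, closed and concave in z for each x (illustrative example).
xs = np.linspace(-1, 1, 401)
zs = np.linspace(-1, 1, 401)
phi = xs[:, None]**2 + xs[:, None] * zs[None, :] - zs[None, :]**2

minimax = phi.max(axis=1).min()   # inf_x sup_z phi(x, z)
maximin = phi.min(axis=0).max()   # sup_z inf_x phi(x, z)

assert maximin <= minimax + 1e-12     # the minimax inequality always holds
assert abs(minimax - maximin) < 1e-6  # equality holds for this phi
```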

9.4.5.2. Second TheoremΒΆ

We can adapt the argument of the first minimax theorem to include conditions along the lines of the second min common/max crossing theorem (Theorem 9.34).

Theorem 9.37 (Minimax theorem II)

Let X be a nonempty convex subset of V and Z be a nonempty subset of W. Let Ο•:VβŠ•Wβ†’R be a function with domΟ•=XΓ—Z. Assume that for each z∈Z, the function Ο•(β‹…,z):Vβ†’R is convex, and for each x∈X, the function βˆ’Ο•(x,β‹…):Wβ†’R is closed and convex. Assume further that

βˆ’βˆž<infx∈Xsupz∈ZΟ•(x,z)

and that 0 lies in the relative interior of the effective domain of the function ψ as defined in (9.6). Then, the minimax equality (9.4) holds; i.e.,

supz∈Zinfx∈XΟ•(x,z)=infx∈Xsupz∈ZΟ•(x,z),

and the supremum over Z in the left hand side is finite and is attained. Furthermore, the set of z∈Z attaining this supremum is compact if and only if 0 lies in the interior of the effective domain of ψ.

9.4.6. Saddle Point TheoremsΒΆ

  1. From the first minimax theorem (Theorem 9.36), we can see that minimax equality is satisfied if and only if the function ψ is lower semicontinuous at u=0.

  2. If ψ is closed, then it will be lower semicontinuous.

  3. The proof of Lemma 9.2 shows that ψ can be written as a partial minimization of F

    ψ(u)=infx∈VF(x,u).

    where F:VβŠ•Wβ†’(βˆ’βˆž,∞] is given by

    F(x,u)={supz∈Z{Ο•(x,z)βˆ’βŸ¨u,z⟩}x∈X;∞xβˆ‰X.
  4. The results in Partial Minimization and Closedness can be used to guarantee the closedness of ψ.

  5. These results can also guarantee that the infimum of F over x is attained.

  6. In particular, ψ(0) will be finite, guaranteeing that the optimal value of the minimax problem is finite.

Definition 9.20 (Auxiliary functions for the minimax problem)

Let X be a nonempty convex subset of V and Z be a nonempty subset of W. Let Ο•:VβŠ•Wβ†’R be a function with domΟ•=XΓ—Z.

For each z∈Z, the function Ξ·z:Vβ†’(βˆ’βˆž,∞] is defined as

Ξ·z(x)={Ο•(x,z)x∈X;∞xβˆ‰X.

For each x∈X, the function ρx:Wβ†’(βˆ’βˆž,∞] is defined as

ρx(z)={βˆ’Ο•(x,z)z∈Z;∞zβˆ‰Z.

The function Ξ·:Vβ†’(βˆ’βˆž,∞] is defined as

η(x)=supz∈Zηz(x),x∈V.

The function ρ:Wβ†’(βˆ’βˆž,∞] is defined as

ρ(z)=supx∈Xρx(z),z∈W.

The following remarks are in order.

  1. If ηz is closed and convex for each z∈Z, then η is closed and convex due to Theorem 8.114.

  2. If ρx is closed and convex for each x∈X, then ρ is closed and convex due to Theorem 8.114.

  3. By definition, Ξ·(x)>βˆ’βˆž for every x∈X.

  4. η can be proper only if η(x)<∞ for some x∈X.

  5. Equivalently, for the optimal minimax value

    infx∈Xsupz∈ZΟ•(x,z)<∞

    must hold for Ξ· to be proper.

  6. Similarly, ρ(z)>βˆ’βˆž for every z∈Z.

  7. ρ can be proper only if ρ(z)<∞ for some z∈Z.

  8. This is possible only if

    infz∈Zsupx∈X(βˆ’Ο•(x,z))<∞.
  9. Equivalently, for the optimal maximin value

    βˆ’βˆž<supz∈Zinfx∈XΟ•(x,z)

    must hold true.

  10. The set of minimizers of Ξ· is the set of optimal points for the minimax problem, denoted by Xβˆ—.

  11. The set of minimizers of ρ is the set of optimal points for the maximin problem, denoted by Zβˆ—.
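
A minimal tabulation of Ξ· and ρ for an illustrative Ο• of our own choosing (not from the text), locating the minimax- and maximin-optimal points:

```python
import numpy as np

# phi(x, z) = x^2 + x*z - z^2 on X = Z = [-1, 1] (illustrative).
xs = np.linspace(-1, 1, 401)
zs = np.linspace(-1, 1, 401)
phi = xs[:, None]**2 + xs[:, None] * zs[None, :] - zs[None, :]**2

eta = phi.max(axis=1)      # eta(x) = sup_z eta_z(x) = sup_z phi(x, z)
rho = (-phi).max(axis=0)   # rho(z) = sup_x rho_x(z) = sup_x -phi(x, z)

x_star = xs[eta.argmin()]  # minimizer of eta: minimax-optimal point
z_star = zs[rho.argmin()]  # minimizer of rho: maximin-optimal point

assert abs(x_star) < 1e-9 and abs(z_star) < 1e-9   # optimum at (0, 0)
```

Here both Ξ· and ρ are proper closed convex functions with compact sublevel sets, so their minimizers exist, in line with the results that follow.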

9.4.6.1. Minimax Equality and Attainment of Minimax SolutionΒΆ

In the following results, we provide conditions under which the minimax equality holds and the set of optimal solutions for the minimax problem is nonempty.

Theorem 9.38 (Compact sublevel sets of Ξ·)

Assume the following:

  1. ηz is closed and convex for every z∈Z.

  2. ρx is closed and convex for every x∈X.

  3. The optimal minimax value satisfies

    infx∈Xsupz∈ZΟ•(x,z)<∞.
  4. The sublevel sets of Ξ· are compact.

Then the minimax equality (9.4) holds and the set of optimal points for the minimax problem Xβˆ— is nonempty and compact.

Proof.

  1. Under the assumptions above, the function Ξ· is proper, closed and convex.

  2. By definition F(x,u)>βˆ’βˆž for every x,u.

  3. Note that F(x,0)=Ξ·(x).

  4. Since η is proper, hence there exists x∈X such that F(x,0)<∞.

  5. Hence F is also proper.

  6. F is also closed and convex.

    1. Fix some z∈Z.

    2. Consider the function fz(x,u)=Ξ·z(x)βˆ’βŸ¨u,z⟩.

    3. Ξ·z is closed and convex for every z by hypothesis.

    4. Hence fz is closed and convex for every z.

    5. Taking the pointwise supremum over z∈Z, F is closed and convex.

  7. The sets

    {x|F(x,0)≀t}={x|Ξ·(x)≀t}

    are the sublevel sets of Ξ· which are compact for every t by hypothesis.

  8. Since Ξ· is proper, we can select a scalar t for which the corresponding sublevel set of Ξ· is nonempty.

  9. Hence, due to Theorem 8.120, the function ψ which is a partial minimization of F over x is proper, closed and convex.

  10. Since ψ is closed, it is lower semicontinuous.

  11. In particular, ψ is l.s.c. at u=0.

  12. Hence due to Theorem 9.36, the minimax equality holds.

  13. Since Xβˆ— is the set of minimizers of Ξ·, and Ξ· is proper and closed with compact sublevel sets, Xβˆ— is nonempty and compact due to Weierstrass’ theorem (Theorem 7.9).

Theorem 9.39 (Recession and constancy space of Ξ·)

Assume the following:

  1. ηz is closed and convex for every z∈Z.

  2. ρx is closed and convex for every x∈X.

  3. The optimal minimax value satisfies

    infx∈Xsupz∈ZΟ•(x,z)<∞.
  4. The recession cone and constancy space of the function Ξ· are equal.

Then the minimax equality (9.4) holds and the set of optimal points for the minimax problem Xβˆ— is nonempty.

Theorem 9.40 (F domain as a set of linear inequalities)

Assume the following:

  1. ηz is closed and convex for every z∈Z.

  2. ρx is closed and convex for every x∈X.

  3. The optimal minimax value satisfies

    infx∈Xsupz∈ZΟ•(x,z)<∞.
  4. The function

    F(x,u)={supz∈Z{Ο•(x,z)βˆ’βŸ¨u,z⟩}x∈X;∞xβˆ‰X

    has the form

    F(x,u)={F―(x,u)(x,u)∈C;∞(x,u)βˆ‰C

    where F― is a proper, closed, and convex function on VβŠ•W and C is specified by linear inequalities; i.e.,

    C={(x,u)|Ax+Buβͺ―b},

    where A,B are matrices and b is a vector.

  5. Every common direction of recession of C and F― is a direction along which F― is constant.

Then the minimax equality (9.4) holds and the set of optimal points for the minimax problem Xβˆ— is nonempty.

Theorem 9.41 (Quadratic form Ο•)

Assume the following:

  1. V=Rn and W=Rm.

  2. The function Ο• has a quadratic form

    Ο•(x,z)=xTQx+cTx+zTMxβˆ’zTRzβˆ’dTz

    where Q and R are symmetric matrices, M is a matrix, and c and d are vectors.

  3. Z=Rm.

  4. X is a set of the form

    X={x|xTQjx+ajTx+bj≀0,j=1,…,r}

    where Qj are symmetric positive semidefinite matrices, aj are vectors and bj are scalars.

  5. ηz is closed and convex for every z∈Z.

  6. ρx is closed and convex for every x∈X.

  7. The optimal minimax value satisfies

    βˆ’βˆž<infx∈Xsupz∈ZΟ•(x,z)<∞.

Then the minimax equality (9.4) holds and the set of optimal points for the minimax problem Xβˆ— is nonempty.
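
A one-dimensional instance of this quadratic setting can be sketched numerically; the particular Q, M, R, c, d, X, and the truncation of Z = R to a bounded grid are our own illustrative choices, not from the text.

```python
import numpy as np

# phi(x, z) = x^2 + 2*x*z - z^2: n = m = 1, Q = R = 1, M = 2,
# c = d = 0.  X = {x : x^2 - 1 <= 0} = [-1, 1] (Q_1 = 1, a_1 = 0,
# b_1 = -1) and Z = R, truncated to [-2, 2] for the grid; the inner
# suprema below are attained well inside the truncation.
xs = np.linspace(-1, 1, 401)
zs = np.linspace(-2, 2, 801)
phi = xs[:, None]**2 + 2 * xs[:, None] * zs[None, :] - zs[None, :]**2

minimax = phi.max(axis=1).min()   # sup_z attained at z = x, value 2 x^2
maximin = phi.min(axis=0).max()

assert abs(minimax) < 1e-9 and abs(maximin) < 1e-9   # equality, value 0
```

The minimax equality holds with value 0, attained at x = z = 0, consistent with the theorem.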

9.4.6.2. Existence of Saddle PointsΒΆ

Theorem 9.42 (Compact sublevel sets of η and ρ)

Assume the following:

  1. ηz is closed and convex for every z∈Z.

  2. ρx is closed and convex for every x∈X.

  3. Either

    infx∈Xsupz∈ZΟ•(x,z)<∞

    or

    βˆ’βˆž<supz∈Zinfx∈XΟ•(x,z)

    holds true.

  4. The sublevel sets of η and ρ are compact.

Then the set of saddle points of Ο• is nonempty and compact.

Proof. First assume that infx∈Xsupz∈ZΟ•(x,z)<∞.

  1. By applying Theorem 9.38, the minimax equality holds and Xβˆ— is nonempty and compact.

  2. Hence infx∈Xsupz∈ZΟ•(x,z) is finite.

  3. Due to minimax equality:

    βˆ’βˆž<supz∈Zinfx∈XΟ•(x,z)=infx∈Xsupz∈ZΟ•(x,z)<∞.
  4. We reverse the roles of x and z and the sign of Ο• and apply Theorem 9.38 again to show that Zβˆ— is a nonempty and compact set.

  5. The set of saddle points is the Cartesian product of Xβˆ— and Zβˆ—.

  6. Since both Xβˆ— and Zβˆ— are nonempty and compact, Xβˆ—Γ—Zβˆ— is also nonempty and compact.

Now assume that βˆ’βˆž<supz∈Zinfx∈XΟ•(x,z).

  1. Then infz∈Zsupx∈X(βˆ’Ο•(x,z))<∞.

  2. We can then reverse the roles of x and z in the preceding argument.

Theorem 9.43 (Recession cones and constancy spaces of η and ρ)

Assume the following:

  1. ηz is closed and convex for every z∈Z.

  2. ρx is closed and convex for every x∈X.

  3. Either

    infx∈Xsupz∈ZΟ•(x,z)<∞

    or

    βˆ’βˆž<supz∈Zinfx∈XΟ•(x,z)

    holds true.

  4. The recession cones and constancy spaces of η and ρ are equal to each other:

    RΞ·=LΞ· and Rρ=Lρ.

Then the set of saddle points of Ο• is nonempty.

9.4.6.3. Saddle Point TheoremΒΆ

The analysis in this section culminates in the following result.

Theorem 9.44 (Saddle point theorem)

Assume that Ξ·z is closed and convex for every z∈Z and ρx is closed and convex for every x∈X. Then the set of saddle points of Ο• is nonempty and compact under any of the following conditions.

  1. X and Z are compact.

  2. Z is compact and there exists a vector zβ€•βˆˆZ and a scalar c such that the sublevel set

    {x∈X|Ο•(x,z―)≀c}

    is nonempty and compact.

  3. X is compact and there exists a vector xβ€•βˆˆX and a scalar c such that the superlevel set

    {z∈Z|Ο•(x―,z)β‰₯c}

    is nonempty and compact.

  4. There exist vectors xβ€•βˆˆX and zβ€•βˆˆZ and a scalar c such that the sets

    {x∈X|Ο•(x,z―)≀c} and {z∈Z|Ο•(x―,z)β‰₯c}

    are nonempty and compact.
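
As a concrete instance of condition 1 (X and Z compact), consider a bilinear Ο•(x,z)=xTAz on the probability simplices of R2; the matching-pennies matrix A below is our own illustrative choice. Ο• is affine in each argument, hence Ξ·z and ρx are closed and convex, and a saddle point exists at the uniform mixed strategies.

```python
import numpy as np

# Matching-pennies game: phi(x, z) = x^T A z on the probability
# simplices of R^2 (our illustrative choice).  Both simplices are
# compact and phi is bilinear, so condition 1 applies; the saddle
# point is the pair of uniform mixed strategies with value 0.
A = np.array([[1.0, -1.0], [-1.0, 1.0]])

ps = np.linspace(0, 1, 501)                  # x = (p, 1 - p)
qs = np.linspace(0, 1, 501)                  # z = (q, 1 - q)
X = np.stack([ps, 1 - ps], axis=1)
Z = np.stack([qs, 1 - qs], axis=1)
phi = X @ A @ Z.T                            # phi[i, j] = x_i^T A z_j

minimax = phi.max(axis=1).min()
maximin = phi.min(axis=0).max()
p_star = ps[phi.max(axis=1).argmin()]
q_star = qs[phi.min(axis=0).argmax()]

assert abs(minimax) < 1e-9 and abs(maximin) < 1e-9
assert abs(p_star - 0.5) < 1e-9 and abs(q_star - 0.5) < 1e-9
```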