8.2. Convex SetsΒΆ

Throughout this section, we assume that V is a real vector space. Wherever necessary, it is equipped with a norm β€–β‹…β€–:Vβ†’R or a real inner product βŸ¨β‹…,β‹…βŸ©:VΓ—Vβ†’R. It is also equipped with a metric d(x,y)=β€–xβˆ’yβ€–.

8.2.1. Line SegmentsΒΆ

Definition 8.5 (Line segment)

Let x1 and x2 be two points in V. Points of the form

y=(1βˆ’ΞΈ)x1+ΞΈx2 where 0≀θ≀1

form a (closed) line-segment between x1 and x2. The closed line segment is denoted by [x1,x2].

[x1,x2]β‰œ{(1βˆ’ΞΈ)x1+ΞΈx2|0≀θ≀1}.

Similarly, we define an open line segment as:

(x1,x2)β‰œ{(1βˆ’ΞΈ)x1+ΞΈx2|0<ΞΈ<1}.

The half-open segment (x1,x2] is defined as:

(x1,x2]β‰œ{(1βˆ’ΞΈ)x1+ΞΈx2|0<θ≀1}.

The half-open segment [x1,x2) is defined as:

[x1,x2)β‰œ{(1βˆ’ΞΈ)x1+ΞΈx2|0≀θ<1}.

8.2.2. Convex SetsΒΆ

Definition 8.6 (Convex set)

A set CβŠ†V is convex if the line segment between any two points in C lies in C. i.e.

ΞΈx1+(1βˆ’ΞΈ)x2∈Cβˆ€x1,x2∈C and 0≀θ≀1.

The empty set is vacuously convex. The entire vector space V is convex since it contains all the line segments between any pair of points in the space.

../_images/pic_convex_sets.png

Fig. 8.1 Examples of convex setsΒΆ

Observation 8.2

Since a convex set contains the line segment between any two points, any line segment is convex by definition.

Geometrically, a convex set has no holes, hollows, pits or dimples. A set is convex if from each point in the set, it is possible to see every other point without having the line of sight pass outside the set.

Example 8.1 (Real line)

On the real line R, the empty set, sing points, intervals (closed, open, half open), half lines, and the entire real line are convex sets. There are no other convex sets possible.

Theorem 8.7

Any linear subspace is convex.

Proof. Let E be a linear subspace of V. Then E is closed under addition and scalar multiplication. Thus, for any x,y∈E and 0≀t≀1,

tx+(1βˆ’t)y∈E.

Thus, E is convex.

Theorem 8.8

Any affine set is convex.

Proof. Let CβŠ†V be an affine set. By definition, for any x,y∈C and any t∈R, tx+(1βˆ’t)y∈C. It is valid in particular for 0≀t≀1. Thus, C is convex.

Theorem 8.9

Any hyperplane is convex since it is affine.

Theorem 8.10

Half spaces are convex.

Proof. Consider H+ defined as:

H+={x:⟨a,x⟩β‰₯b}

Let x,y∈H+. Then:

⟨a,x⟩β‰₯b and βŸ¨a,y⟩β‰₯b.

For some 0≀t≀1:

⟨a,tx+(1βˆ’t)y⟩=t⟨a,x⟩+(1βˆ’t)⟨a,y⟩β‰₯tb+(1βˆ’t)b=b.

Thus, x+(1βˆ’t)y∈H+. Analogous proofs apply for other types of half spaces.

Theorem 8.11 (Convex set as convex combination of itself)

Let C be a nonempty subset of V. If C is convex then for every t1,t2β‰₯0, we have

(t1+t2)C=t1C+t2C.

In particular, if t1,t2β‰₯0 such that t1+t2=1, then

C=t1C+t2C.

Proof. The statement (t1+t2)CβŠ†t1C+t2C is valid even for sets which are not convex.

  1. Let x∈t1+t2)C.

  2. Then there exist y∈C such that x=t1y+t2y.

  3. Hence t1y∈t1C and t2y∈t2C.

  4. Hence x=t1y+t2y∈t1C+t2C.

We now show the converse.

  1. Let x∈t1C+t2C.

  2. Then there exist x1,x2∈C such that

    x=t1x1+t2x2.
  3. If t1=t2=0, then x=0 and x∈(0+0)C.

  4. Now assume that t1+t2>0.

  5. By the convexity of C,

    y=t1t1+t2x1+t2t1+t2x2∈C.
  6. Hence x=(t1+t2)y∈(t1+t2)C.

  7. Hence t1C+t2CβŠ†(t1+t2)C.

Together, we have

(t1+t2)C=t1C+t2C.

Theorem 8.12 (Convex set as union of line segments)

Let C be a convex subset of V. Then C is the union of all the closed line segments connecting the points of the set. In other words

C=⋃x,y∈C[x,y].

Proof. Let D=⋃x,y∈C[x,y].

If C is empty, then D is also empty. Hence there is nothing to prove. If C={x} is a singleton, then D consists of the union of a single line segment

[x,x]={x}.

So C=D. We now consider the case where C consists of more than one point.

We first show that CβŠ†D.

  1. Let x∈C.

  2. Then [x,x]={x} is a line segment of C.

  3. Hence x∈[x,x]βŠ†D.

  4. Hence CβŠ†D.

We now show the converse.

  1. Let z∈D.

  2. Then there exists x,y∈C such that z∈[x,y].

  3. Then by convexity of C, [x,y]βŠ†C.

  4. Hence z∈[x,y]βŠ†C.

  5. Hence DβŠ†C.

Together, C=D.

8.2.3. RaysΒΆ

Definition 8.7 (Ray)

A ray R is defined as

Rβ‰œ{x0+tv|tβ‰₯0}

where v≠0 indicates the direction of ray and x0 is the base or origin of ray.

Theorem 8.13

A ray is convex.

Proof. Let a ray be given as:

R={x0+tv|tβ‰₯0}.

Let u,v∈R. Thus, there is tu,tvβ‰₯0 such that:

u=x0+tuv and v=x0+tvv.

Now, for some 0≀r≀1,

ru+(1βˆ’r)v=r(x0+tuv)+(1βˆ’r)(x0+tvv)=x0+(rtu+(1βˆ’r)tv)v.

Since rtu+(1βˆ’r)tvβ‰₯0, hence ru+(1βˆ’r)v∈R.

8.2.4. BallsΒΆ

Theorem 8.14

An open ball B(a,r) is convex for any norm ‖⋅‖:V→R.

Proof. Recall that an open ball in a normed linear space is defined as:

B(a,r)={x∈V|β€–xβˆ’aβ€–<r}.

Let x,y∈B(a,r) and let 0≀t≀1. Then,

tx+(1βˆ’t)yβˆ’a=t(xβˆ’a)+(1βˆ’t)(yβˆ’a).

By triangle inequality:

β€–t(xβˆ’a)+(1βˆ’t)(yβˆ’a)‖≀‖t(xβˆ’a)β€–+β€–(1βˆ’t)(yβˆ’a)β€–=tβ€–xβˆ’aβ€–+(1βˆ’t)β€–yβˆ’aβ€–<tr+(1βˆ’t)r=r.

Thus,

β€–tx+(1βˆ’t)yβˆ’aβ€–<r⟹tx+(1βˆ’t)y∈B(a,r).

Theorem 8.15

A closed ball B[a,r] is convex for any norm ‖⋅‖:V→R.

Proof. Recall that a closed ball in a normed linear space is defined as:

B[a,r]={x∈V|β€–xβˆ’a‖≀r}.

Let x,y∈B(a,r) and let 0≀t≀1. Then,

tx+(1βˆ’t)yβˆ’a=t(xβˆ’a)+(1βˆ’t)(yβˆ’a).

By triangle inequality:

β€–t(xβˆ’a)+(1βˆ’t)(yβˆ’a)‖≀‖t(xβˆ’a)β€–+β€–(1βˆ’t)(yβˆ’a)β€–=tβ€–xβˆ’aβ€–+(1βˆ’t)β€–yβˆ’a‖≀tr+(1βˆ’t)r=r.

Thus,

β€–tx+(1βˆ’t)yβˆ’a‖≀r⟹tx+(1βˆ’t)y∈B[a,r].

8.2.5. Convex CombinationsΒΆ

Definition 8.8 (Convex combination)

We call a point of the form ΞΈ1x1+β‹―+ΞΈkxk, where ΞΈ1+β‹―+ΞΈk=1 and ΞΈiβ‰₯0,i=1,…,k, a convex combination of the points x1,…,xk.

It is like a weighted average of the points xi. A weights ΞΈi in a convex combination can be interpreted as probabilities or proportions.

Example 8.2 (Center of mass)

Consider a system of particles pi, i=1,…,n each with mass mi and location in space as xi. The center of mass x satisfies the equation:

βˆ‘i=1nmi(xiβˆ’x)=0.

Solving this equation gives us:

x=βˆ‘i=1nmimxi

where m=βˆ‘i=1nmi.

If we assign ΞΈi=mim, we notice that ΞΈiβ‰₯0 and βˆ‘i=1nΞΈi=1. We can now write the center of mass as:

x=βˆ‘i=1nΞΈixi

which is a convex combination of the locations xi where ΞΈi gives the proportion of contribution of each particle according to its mass.

Remark 8.2 (Convex combinations and unit simplex)

We recall that the unit simplex in Rn is given by

Ξ”n={t∈Rn|⟨t,1⟩=1,tβͺ°0}={t∈Rn|t1+β‹―+tn=1,t1,…,tnβ‰₯0}.

Thus, the coefficients for convex combinations of n points are drawn from Ξ”n.

Theorem 8.16 (Closure under convex combinations)

A set is convex if and only if it contains all convex combinations of its points.

Let V be a real vector space and C be a subset of V. Then, C is convex if and only if for any m∈N, for any x1,…,xm∈C, and for every tβˆˆΞ”m, t1x1+β‹―+tmxm∈C.

Proof. We know that a set C is convex is equivalent to saying that it contains all 2 point convex combinations; i.e, for any x1,x2∈C, t1,t2β‰₯0 and t1+t2=1,

t1x1+t2x2∈C.

We first show that if C is convex, it contains all its (finite) convex combination by induction.

  1. By definition C contains all its 2 point convex combinations.

  2. As induction hypothesis, assume that C contains all convex combinations of mβˆ’1 or fewer points where m>2.

  3. Consider a convex combination of m points

    βˆ‘i=1mtixi

    where xi∈C, tiβ‰₯0, βˆ‘ti=1.

  4. Since βˆ‘t1=1, hence at least one of ti<1.

  5. Without loss of generality, assume tm<1.

  6. Note that tm<1 means that 1βˆ’tm>0.

  7. Define y=βˆ‘i=1mβˆ’1tiβ€²xi where tiβ€²=ti1βˆ’tm.

  8. Note that tiβ€²β‰₯0. Also, βˆ‘i=1mβˆ’1tiβ€²=1 since βˆ‘i=1mβˆ’1ti=1βˆ’tm.

  9. Thus, y is an mβˆ’1 point convex combination of C.

  10. By induction hypothesis, y∈C.

  11. Now, (1βˆ’tm)y=βˆ‘i=1mβˆ’1tixi.

  12. Hence, x=(1βˆ’tm)y+tmxm.

  13. It is a 2 point convex combination of y and xm.

  14. Since both y,xm∈C, hence x∈C.

  15. Thus, C contains all its m point convex combinations.

For the converse, note that if C contains all its convex combinations, then it contains, in particular, all its two point convex combinations. Hence, C is convex.

Theorem 8.17

A convex combination of convex combinations is a convex combination.

Proof. Let SβŠ†V. Note that S is arbitrary (no convexity assumed).

  1. Consider n points yi, i=1,…,n described as below.

  2. Let yi=βˆ‘j=1mjti,jxi,j be convex combinations of mj points:

    • xi,1,…,xi,mj∈S.

    • ti,jβ‰₯0.

    • βˆ‘j=1mjti,j=1.

  3. Consider the convex combination y=βˆ‘i=1nriyi.

    • riβ‰₯0.

    • βˆ‘ri=1.

  4. We need to show that y is a convex combination of points of S.

Towards this:

y=βˆ‘i=1nriyi=βˆ‘i=1nriβˆ‘j=1mjti,jxi,j=βˆ‘i=1nβˆ‘j=1mjriti,jxi,j.

Consider the terms:

si,j=riti,j.

Since riβ‰₯0 and ti,jβ‰₯0, hence si,jβ‰₯0.

Now, consider their sum:

βˆ‘i=1nβˆ‘j=1mjsi,j=βˆ‘i=1nβˆ‘j=1mjriti,j=βˆ‘i=1nriβˆ‘j=1mjti,j=βˆ‘i=1nri=1

Thus, βˆ‘i,jsi,j=1.

Hence,

y=βˆ‘i,jsi,jxi,j

is a convex combination of points of S.

8.2.6. Convex HullΒΆ

Definition 8.9 (Convex hull)

The convex hull of an arbitrary set SβŠ†V denoted as conv⁑(S), is the set of all convex combinations of points in S.

conv⁑(S)={ΞΈ1x1+β‹―+ΞΈkxk|xk∈S,ΞΈiβ‰₯0,i=1,…,k,ΞΈ1+β‹―+ΞΈk=1}.
../_images/pic_convex_hull_random_2d_points.png

Fig. 8.2 The convex hull of a set of random points on the 2D planeΒΆ

Property 8.1 (Convexity of convex hull)

The convex hull conv⁑(S) of a set S is always convex.

Proof. Let x,y∈conv⁑(S), t∈[0,1] and the point z=tx+(1βˆ’t)y. We need to show that z∈conv⁑(S).

  1. x,y are convex combinations of points of S.

  2. z is a convex combination of x and y.

  3. Hence, z is a convex combination of convex combinations of points in S.

  4. By Theorem 8.17, a convex combination of convex combinations is a convex combination.

  5. Thus, z is a convex combination of points of S.

  6. But conv⁑(S) contains all convex combinations of points of S by definition.

  7. Hence, z∈conv⁑(S).

  8. Thus, conv⁑(S) is convex.

Property 8.2 (Affine hull of convex hull)

Let V be a finite dimensional real vector space. Let S be a nonempty subset of V. Let C=conv(S). Then

affS=affC.

In other words, the affine hull of a set and its convex hull are identical.

Proof. By using a translation argument if necessary, without loss of generality, we assume that 0∈S.

  1. Then both affS and affC are linear subspaces.

  2. Since SβŠ†C, hence affSβŠ†affC.

  3. For the converse, assume that m=affC.

  4. Let x1,…,bxm∈C be m linearly independent vectors spanning affC.

  5. Then for every x∈affC, there exist scalars t1,…,tm so that

    x=βˆ‘i=1mtixi.
  6. By definition of convex hull, each xi∈C is a convex combination of points in S.

  7. Hence every x∈affC is a linear combination of points in S.

  8. Hence affCβŠ†affS.

Theorem 8.18

The convex hull of a set S is the smallest convex set containing it. In other words, let C be any convex set such that SβŠ†C. Then conv⁑(S)βŠ†C.

Proof. Let C be a convex set such that SβŠ†C.

  1. Let x∈conv⁑(S).

  2. Then, x is a convex combination of points of S.

  3. C is convex and SβŠ†C.

  4. Hence, C contains every convex combination of points of S.

  5. Thus, in particular x∈C.

  6. Since x∈conv⁑(S) was arbitrary, hence conv⁑(S)βŠ†C.

We could have started as defining the convex hull of S being the smallest convex set containing S and arrived at the conclusion that conv⁑(S) contains all convex combinations of S. Some authors prefer to define conv⁑(S) as the smallest convex set containing S. Both definitions are equivalent.

8.2.6.1. CarathΓ©odory TheoremΒΆ

Theorem 8.19 (CarathΓ©odory theorem)

Let V be an n-dimensional real vector space. Let SβŠ†V. Let x∈conv⁑(S).

Then, there exists a set of n+1 points x0,…,xn∈S such that

x∈conv⁑({x0,…,xn});

i.e., there exists a tβˆˆΞ”n+1 such that

x=βˆ‘i=0ntixi.

Proof. We note that x∈conv⁑(S).

  1. Thus, there exists a set of k+1 points in S and tβˆˆΞ”k+1 such that

    x=βˆ‘i=0ktixi.
  2. We can assume ti>0 for all i=0,…,k since otherwise, we can drop the vectors corresponding to the zero coefficients from the convex combination.

  3. If k≀n, there is nothing to prove.

  4. Hence, consider the case where k>n.

  5. We now describe a process which can reduce the number of points in the convex combination by one.

  6. The k vectors x1βˆ’x0,…,xkβˆ’x0 are linearly dependent as k>n and V is n-dimensional.

  7. Thus, there is a nontrivial linear combination of these vectors

    r1(x1βˆ’x0)+β‹―+rk(xkβˆ’x0)=0.
  8. Let r0=βˆ’r1βˆ’β‹―βˆ’rk. Then, we have a nontrivial combination

    βˆ‘i=0krixi=0

    with βˆ‘ri=0.

  9. In particular, there exists at least one index j for which rj<0.

  10. Let Ξ±β‰₯0.

  11. Then,

    x=βˆ‘i=0ktixi+Ξ±βˆ‘i=0krixi=βˆ‘i=0k(ti+Ξ±ri)xi

    with βˆ‘(ti+Ξ±ri)=βˆ‘ti+Ξ±βˆ‘ri=1.

  12. Thus, it is a convex combination for x if ti+Ξ±riβ‰₯0 for every i=0,…,k.

  13. Let

    Ξ±=mini|ri<0{βˆ’tiri}.
  14. Ξ± is well defined since there is at least one rj<0. Let j be the index for which Ξ± was obtained.

  15. Then, tj+Ξ±rj=0.

  16. Also, we can see that ti+Ξ±riβ‰₯0 for all i=0,…,k.

  17. Thus, we have found a convex combination for x where the coefficient for xj is 0.

  18. Thus, we have obtained a convex combination for x with kβˆ’1 points.

  19. Repeating this process up to kβˆ’n times, we can obtain a convex combination of x consisting of n+1 or less points.

8.2.7. DimensionΒΆ

Definition 8.10 (Dimension of a convex set)

The dimension of a convex set is defined to be the dimension of its affine hull.

If C is a convex set, then:

dimC=dimaffC.

Recall that the dimension of an affine set is equal to the dimension of the linear subspace associated with it (Definition 4.180).

  1. A circle will have a dimension of 2 even if it is in R3.

  2. A sphere will have a dimension of three.

8.2.8. SimplicesΒΆ

Theorem 8.20 (Convex hull of a finite set of points)

Let S={x0,…,xm} be a finite set of points of V. Then, conv⁑(S) consists of all the points of the form

t0x0+…tmxm,t0β‰₯0,…,tmβ‰₯0,βˆ‘i=0mti=1.

In Rn, this is known as a polytope.

A Simplex is a convex hull of a finite set of affine independent points. The simplex provides a powerful coordinate system for the points within it in terms of barycentric coordinates.

Definition 8.11 (k-simplex)

Let k+1 points v0,…,vk∈V be affine independent.

The simplex determined by them is given by

C=conv⁑{v0,…,vk}={t0v0+β‹―+tkvk|tβͺ°0,1Tt=1}

where t=[t1,…,tk]T and 1 denotes a vector of appropriate size (k) with all entries one.

In other words, C is the convex hull of the set {v0,…,vk}.

A simplex is a convex set since it is a convex hull of its vertices. k stands for the dimension of the simplex. Recall that the dimension of a convex set is the dimension of its affine hull.

Example 8.3 (Simplex examples)

In Rn:

  • A 0-simplex is a point.

  • A 1-simplex is a line segment (2 points).

  • A 2-simplex is a triangle (3 points).

  • A 3-simplex is a tetrahedron (4 points).

  • A 4-simplex is a 5-cell (5 points).

Theorem 8.21 (Barycentric coordinates)

Each point of a k simplex is uniquely expressible as a convex combination of the vertices.

Proof. Let C=conv⁑{v0,v1,…,vk}.

  1. Let v∈C.

  2. Then, v=βˆ‘i=0ktivi with tiβ‰₯0 and βˆ‘ti=1.

  3. For contradiction, assume there was another representation: v=βˆ‘i=0krivi with riβ‰₯0 and βˆ‘ri=1.

  4. Then,

    βˆ‘i=0ktivi=βˆ‘i=0kriviβŸΉβˆ‘i=0k(tiβˆ’ri)vi=0.
  5. But {v0,v1,…,vk} are affine independent.

  6. Hence, ti=ri.

  7. Thus, the representation is unique.

Definition 8.12 (Simplex midpoint)

The point βˆ‘i=0k1k+1vi in a simplex C=conv⁑{v0,…,vk} is known as its midpoint or barycenter.

Theorem 8.22

The dimension of a convex set C is the maximum of the dimensions of the various simplices included in C.

Proof. We need to show that there is a simplex SβŠ‚C such that dimS=dimC.

  1. Let A be any finite affine independent subset of C.

  2. Since C is convex, hence AβŠ†C⟹conv⁑(A)βŠ†C.

  3. Thus, C contains the simplices constructed from any set of finite affine independent points in C.

  4. Thus, if A={v0,…,vk} is a set of k+1 affine independent points of C, then conv⁑(A)βŠ†C implies that k≀dimC.

  5. Thus, if S is a k-simplex such that SβŠ†C, then dimS=k≀dimC.

  6. Let m be the maximum of the dimensions of the various simplices contained in C.

  7. Then, there exist affine independent points v0,…,vm∈C such that the simplex S=conv⁑{v0,…,vm}βŠ†C.

  8. Let M be the affine hull of S; i.e. M=affS.

  9. Then, dimM=m and MβŠ†affC.

  10. If Cβˆ–M were nonempty, then there would be an element v∈Cβˆ–M which would be affine independent of {v0,…,vm}.

  11. That would lead to a set of m+2 affine independent points in C. That would mean that C contains a simplex of dimension m+1. A contradiction.

  12. Hence, Cβˆ–M=βˆ….

  13. Thus, CβŠ†M.

  14. Since affC is the smallest affine set that contains C, hence affC=M.

  15. Thus, dimC=m.

8.2.9. Symmetric ReflectionsΒΆ

The symmetric reflection of a convex set is convex since convexity is preserved under scalar multiplication. See Theorem 8.30 below.

If a symmetric convex set contains a nonzero vector x, then it contains the entire line segment between βˆ’x and x.

8.2.10. Infinite Convex CombinationsΒΆ

We can generalize convex combinations to include infinite sums.

Theorem 8.23

Let ΞΈ1,ΞΈ2,… satisfy

ΞΈiβ‰₯0,i=1,2,…,βˆ‘i=1∞θi=1,

and let x1,x2,β‹―βˆˆC, where CβŠ†V is convex. Then

βˆ‘i=1∞θixi∈C,

if the series converges.

We can generalize it further to density functions.

Theorem 8.24

Let p:Vβ†’R satisfy p(x)β‰₯0 for all x∈C and

∫Cp(x)dx=1

Then

∫Cp(x)xdx∈C

provided the integral exists.

Note that p above can be treated as a probability density function if we define p(x)=0βˆ€x∈Vβˆ–C.

8.2.11. Convexity Preserving OperationsΒΆ

In the following, we will discuss several operations which transform a convex set into another convex set, and thus preserve convexity.

Understanding these operations is useful for determining the convexity of a wide variety of sets.

Usually, it is easier to prove that a set is convex by showing that it is obtained by a convexity preserving operation from a convex set compared to directly verifying the convexity property i.e.

tx1+(1βˆ’t)x2∈Cβˆ€x1,x2∈C,t∈[0,1].

8.2.11.1. Intersection and UnionΒΆ

Theorem 8.25 (Intersection of convex sets)

If S1 and S2 are convex sets then S1∩S2 is convex.

Proof. Let x1,x2∈S1∩S2. We have to show that

tx1+(1βˆ’t)x2∈S1∩S2,βˆ€t∈[0,1].

Since S1 is convex and x1,x2∈S1, hence

tx1+(1βˆ’t)x2∈S1,βˆ€t∈[0,1].

Similarly

tx1+(1βˆ’t)x2∈S2,βˆ€t∈[0,1].

Thus

tx1+(1βˆ’t)x2∈S1∩S2,βˆ€t∈[0,1].

which completes the proof.

We can generalize it further.

Theorem 8.26 (Intersection of arbitrary collection of convex sets)

Let {Ai}i∈I be a family of sets such that Ai is convex for all i∈I. Then ∩i∈IAi is convex.

Proof. Let x1,x2 be any two arbitrary elements in ∩i∈IAi.

x1,x2∈∩i∈IAi⟹x1,x2∈Aiβˆ€i∈I⟹tx1+(1βˆ’t)x2∈Aiβˆ€t∈[0,1]βˆ€i∈I since Ai is convex βŸΉtx1+(1βˆ’t)x2∈∩i∈IAi.

Hence ∩i∈IAi is convex.

Corollary 8.1 (Arbitrary intersection of closed half spaces)

Let I be an index set. Let ai∈V and bi∈R for every i∈I. Then, the set:

C={x∈V|⟨x,aiβŸ©β‰€biβˆ€i∈I}

is convex.

Proof. Since each of the half spaces is convex, hence so is their intersection.

This result is applicable for open half spaces and hyperplanes too too. It also applies for a mixture of hyperplanes and half-spaces.

Corollary 8.2

The solution set of a system of linear equations and inequalities in Rn is convex.

Proof. We proceed as follows:

  • The solution set of each linear equation is a hyperplane.

  • The solution set of each linear inequality is a half-space (closed or open).

  • The solution set of a system of linear equations and inequalities is the intersection of these hyperplanes and half-spaces.

  • Each hyperplane and each half-space is convex.

  • Hence, their intersection is convex.

Theorem 8.27 (Intersection and union of two sets)

Let C1 and C2 be convex in V. Then, the largest convex set contained in both is C1∩C2. And, the smallest convex set containing both is conv⁑(C1βˆͺC2).

Proof. Let C be a convex set contained in both C1 and C2. Then, CβŠ†C1∩C2. But C1∩C2 is convex (Theorem 8.25). Hence, C1∩C2 is the largest convex set contained in both C1 and C2.

Let C be a convex set which contains both C1 and C2. Then, C1βˆͺC2βŠ†C. The smallest convex set containing C1βˆͺC2 is its convex hull given by conv⁑(C1βˆͺC2) (Theorem 8.18).

Theorem 8.28 (Intersection and union of arbitrary sets)

Let I be an index set and F={Ci}i∈I be a family of convex sets in V. Then, the largest convex set contained in every set of F is:

β‹‚i∈ICi.

And, the smallest convex set containing every set of F is

conv⁑(⋃i∈ICi).

Proof. Let C be a convex set contained in every set of F. Then, CβŠ†β‹‚i∈ICi. But β‹‚i∈ICi is convex (Theorem 8.26). Hence, β‹‚i∈ICi is the largest convex set contained in every set of F.

Let C be a convex set which contains every set of F. Then, ⋃i∈ICiβŠ†C. The smallest convex set containing every set of F is its convex hull given by conv⁑(⋃i∈ICi) (Theorem 8.18).

8.2.11.2. Affine FunctionsΒΆ

Let us start with some simple results.

Theorem 8.29 (Convexity and translation)

Convexity is preserved under translation.

C (a subset of V) is convex if and only if C+a is convex for every a∈V.

Proof. Let CβŠ†V.

  1. Assume C is convex.

  2. Then, for every x,y∈C and every t∈[0,1], tx+(1βˆ’t)y∈C.

  3. Let a∈V.

  4. Let u,v∈C+a.

  5. Then, u=x+a and v=y+a for some x,y∈C.

  6. Then,

    tu+(1βˆ’t)v=t(x+a)+(1βˆ’t)(y+a)=tx+(1βˆ’t)y+a.
  7. But tx+(1βˆ’t)y∈C since C is convex.

  8. Then, tx+(1βˆ’t)y+a∈C+a.

  9. Thus, tu+(1βˆ’t)v∈C+a.

  10. Thus, C+a is convex.

We can follow the same argument in the opposite direction to establish that C+a is convex implies C is convex.

Theorem 8.30 (Convexity and scalar multiplication)

Convexity is preserved under scalar multiplication.

C (a subset of V) is convex if and only if αC is convex for every α∈R.

Proof. Let CβŠ†V.

  1. Assume C is convex.

  2. Let α∈R.

  3. Let u,y∈αC.

  4. Then, u=αx and v=αy for some x,y∈C.

  5. Let t∈[0,1].

  6. tu+(1βˆ’t)v=Ξ±(tx+(1βˆ’t)y).

  7. But tx+(1βˆ’t)y∈C since C is convex.

  8. Hence, Ξ±(tx+(1βˆ’t)y) in Ξ±C.

  9. Hence, tu+(1βˆ’t)v∈αC.

  10. Thus, Ξ±C is convex.

Similar argument in opposite direction establishes that Ξ±C is convex implies C is convex.

Recall that an affine function f:V→E from a real vector space V to another real vector space E is a function which satisfies

f(tx+(1βˆ’t)y)=tf(x)+(1βˆ’t)f(y)

for every t∈R.

Recall from Theorem 4.195 that an affine function can be written as a linear transformation followed by a translation:

f(x)=T(x)+b

where T is a linear operator.

Example 8.4

An affine function f:Rn→Rm takes the form of a matrix multiplication plus a vector addition:

f(x)=Ax+b

where A∈RmΓ—n and b∈Rm.

Theorem 8.31 (Image under affine function)

Let SβŠ†V be convex and f:Vβ†’E be an affine function. Then the image of S under f given by

f(S)={f(x)|x∈S}

is a convex set.

Proof. We proceed as follows:

  1. Let u,v∈f(S).

  2. Then, u=f(x) and v=f(y) for some x,y∈S.

  3. Let 0≀t≀1.

  4. Then, z=tx+(1βˆ’t)y∈S since S is convex.

  5. Since f is affine, hence

    f(z)=f(tx+(1βˆ’t)y)=tf(x)+(1βˆ’t)f(y)=tu+(1βˆ’t)v.
  6. Since z∈S, hence f(z)=tu+(1βˆ’t)v∈f(S).

  7. We have shown that for any u,v∈f(S) and any 0≀t≀1, tu+(1βˆ’t)v∈f(S).

  8. Thus, f(S) is convex.

It applies in the reverse direction also.

Theorem 8.32 (Inverse image under affine function)

Let f:Vβ†’E be affine and SβŠ†E be convex. Then the inverse image of S under f given by

fβˆ’1(S)={x∈V|f(x)∈S}

is convex.

Proof. Denote R=fβˆ’1(S). We need to show that if S is convex then R is convex too.

We proceed as follows:

  1. Let x,y∈R.

  2. Let u=f(x) and v=f(y).

  3. u,v∈S.

  4. Let 0≀t≀1.

  5. Then, w=tu+(1βˆ’t)v∈S since S is convex.

  6. Let z=tx+(1βˆ’t)y.

  7. Since f is affine, hence

    w=tu+(1βˆ’t)v=tf(x)+(1βˆ’t)f(y)=f(tx+(1βˆ’t)y)=f(z).
  8. Since w∈S, hence z∈R as w=f(z).

  9. We have shown that for any x,y∈R and any 0≀t≀1, tx+(1βˆ’t)y∈R.

  10. Thus, R is convex.

Example 8.5 (Affine functions preserving convexity)

Let S∈Rn be convex.

  • For some α∈R, Ξ±S is convex. This is the scaling operation.

  • For some a∈Rn, S+a is convex. This is the translation operation.

  • Let n=m+k. Then, let Rn=RmΓ—Rk. A vector x∈S can be written as x=(x1,x2) where x1∈Rm and x2∈Rk. Then

    T={x1∈Rm|(x1,x2)∈S for some x2∈Rk}

    is convex. This is the projection operation. It projects vectors from Rn to Rm by dropping last k entries.

Example 8.6 (System of linear equations)

Consider the system of linear equations Ax=y where A∈RmΓ—n.

If y ranges over a convex set, then the corresponding set of solutions also ranges over a convex set due to Theorem 8.32.

The nonnegative orthant R+n is a convex set.

Let Y=R+m+a for some a∈Rm; i.e.,

Y={y|yβͺ°a}.

Then, Aβˆ’1Y is the set of vectors satisfying the inequality

Axβͺ°a.

Thus, the solution set of a system of linear inequalities of the form Axβͺ°a is convex.

Now, if X=R+n, then AX is the set of vectors y∈Rm such that the equation Ax=y has a nonnegative solution (xβͺ°0). Since X is convex, so is AX.

Theorem 8.33 (Orthogonal projection of convex set)

The orthogonal projection of a convex set C on a subspace V is another convex set.

Proof. We recall that orthogonal projection is a linear mapping and thus an affine function. By Theorem 8.31, image of a convex set under an affine function is convex. Hence proved.

8.2.11.3. Set AdditionΒΆ

Theorem 8.34 (Convexity and set addition)

Let C1 and C2 be two convex subsets of V. Then C1+C2 is convex.

Proof. We proceed as follows:

  1. Let x,y∈C1+C2.

  2. Then, x=x1+x2 for some x1∈C1 and some x2∈C2.

  3. Similarly, y=y1+y2 for some y1∈C1 and some y2∈C2.

  4. Let 0≀t≀1.

  5. Then:

    tx+(1βˆ’t)y=t(x1+x2)+(1βˆ’t)(y1+y2)=tx1+(1βˆ’t)y1+tx2+(1βˆ’t)y2.
  6. But, z1=tx1+(1βˆ’t)y1∈C1 since C1 is convex.

  7. Similarly, z2=tx2+(1βˆ’t)y2∈C2 since C2 is convex.

  8. Hence, tx+(1βˆ’t)y=z1+z2∈C1+C2.

  9. Thus, C1+C2 is convex.

One way to think geometrically about set addition is as the union of all translates of C1 given by C1+x as x varies over C2.

Theorem 8.35

A set C is convex if and only if

(1βˆ’t)C+tC=Cβˆ€t∈[0,1].

Proof. Assume C is convex:

  1. (1βˆ’t)C+tC={tx+(1βˆ’t)y|x,y∈C}.

  2. Thus, (1βˆ’t)C+tCβŠ†C.

  3. For every x∈C, (1βˆ’t)x∈(1βˆ’t)C and tx∈tC.

  4. Thus, (1βˆ’t)x+tx=x∈(1βˆ’t)C+tC.

  5. Thus, CβŠ†(1βˆ’t)C+tC.

  6. Combining, we get (1βˆ’t)C+tC=C.

Assume (1βˆ’t)C+tC=C for every t∈[0,1].

  1. Let x,y∈C and t∈[0,1].

  2. Then, (1βˆ’t)x∈(1βˆ’t)C and ty∈tC.

  3. Hence, (1βˆ’t)x+ty∈(1βˆ’t)C+tC=C.

  4. Thus, C is convex.

Theorem 8.36 (Convexity and linear combination)

Convexity is preserved under linear combinations.

Let C1,…,Ck be convex. Let t1,…,tk∈R. Then, their linear combination:

C=t1C1+β‹―+tkCk

is convex.

Proof. Due to Theorem 8.30, tiCi are convex for i=1,…,k.

By (finite) repeated application of Theorem 8.34, their sum is also convex.

Theorem 8.37 (Nonnegative scalar multiplication distributive law)

Let C be convex and t1,t2β‰₯0. Then

(t1+t2)C=t1C+t2C.

Proof. From Theorem 4.18, we know that:

(t1+t2)CβŠ†t1C+t2C.

We now show that t1C+t2CβŠ†(t1+t2)C.

  1. If both t1=t2=0, then we have trivial equality.

  2. If either of t1 or t2 is 0, then also we have trivial equality.

  3. Now, consider the case t1,t2>0.

  4. Define t=t1+t2>0 and r=t1t.

  5. Then, 1βˆ’r=t2t.

  6. Then, since C is convex, hence rC+(1βˆ’r)CβŠ†C.

  7. Multiplying by t on both sides, we get: t1C+t2CβŠ†(t1+t2)C.

For the special case of t1=r and t2=1βˆ’r with r∈[0,1], we get:

C=rC+(1βˆ’r)C.

Some implications are C+C=2C, C+C+C=3C and so forth if C is convex.

Theorem 8.38 (Convex combinations over arbitrary unions)

Let I be an index set and F={Ci}i∈I be a family of convex sets in V. Let C be given as:

C=conv⁑(⋃i∈ICi);

i.e., C is the convex hull of the union of the family of sets F. Then,

C=⋃{βˆ‘i∈ItiCi}

where the union is taken over all finite convex combinations (i.e. over all nonnegative choices of ti such that only finitely many ti are nonzero and they add up to 1).

Proof. We proceed as follows:

  1. Let x∈C.

  2. Then, x is a convex combination of elements in ⋃i∈ICi.

  3. Thus, x=βˆ‘i=1maiyi where yiβˆˆβ‹ƒi∈ICi, aiβ‰₯0 and βˆ‘ai=1.

  4. Drop all the terms from x where ai=0.

  5. If yi,yj belong to some same Ck, then, we can replace aiyi+ajyj with some ay=aiyi+ajyj where a=ai+aj. Note that, with these assumptions, y=aiai+ajyi+ajai+ajyj is a convex combination of yi and yj, hence y∈Ck since Ck is convex.

  6. Thus, terms from a single Ck in x can be reduced by a single term.

  7. Thus, we can simplify x such that

    x=βˆ‘j=1pbjxj

    such that each xj belongs to a different Cij with ij∈I, bj>0 and βˆ‘bj=1.

  8. Thus, C is a union of finite convex combinations of the form

    b1Ci1+β‹―+bpCip

    where i1,…,ip∈I are different indices and bj>0, βˆ‘bj=1.

  9. This is the same set as ⋃{βˆ‘i∈ItiCi} except for notational differences.

Corollary 8.3 (Convex hull of union)

Let C1 and C2 be convex sets. Let C be the convex hull of their union:

C=conv⁑(C1βˆͺC2).

Then,

C=⋃t∈[0,1][(1βˆ’t)C1+tC2].

8.2.11.4. Partial AdditionΒΆ

Recall the notion of direct sum of two vector spaces V and W over R given by VβŠ•W.

Theorem 8.39 (Partial addition on convex sets)

Let V and W be real vector spaces. Let VβŠ•W be their direct sum. Let C1 and C2 be convex sets in VβŠ•W. Let C be the set of vectors x=(y,z) such that there exist vectors z1,z2∈W with (y,z1)∈C1, (y,z2)∈C2 and z1+z2=z. Then, C is a convex set in VΓ—W.

Proof. Let (y,z)∈C such that there exist vectors z1,z2∈W with (y,z1)∈C1, (y,z2)∈C2 and z1+z2=z.

Let (yβ€²,zβ€²)∈C such that there exist vectors z1β€²,z2β€²βˆˆW with (yβ€²,z1β€²)∈C1, (yβ€²,z2β€²)∈C2 and z1β€²+z2β€²=zβ€².

Let t∈[0,1]. Consider the vector (yβ€³,zβ€³)=t(y,z)+(1βˆ’t)(yβ€²,zβ€²).

  1. yβ€³=ty+(1βˆ’t)yβ€².

  2. zβ€³=tz+(1βˆ’t)zβ€²=t(z1+z2)+(1βˆ’t)(z1β€²+z2β€²).

  3. Let z1β€³=tz1+(1βˆ’t)z1β€².

  4. Let z2β€³=tz2+(1βˆ’t)z2β€².

  5. Since (y,z1),(yβ€²,z1β€²)∈C1 and C1 is convex, hence (yβ€³,z1β€³)∈C1.

  6. Since (y,z2),(yβ€²,z2β€²)∈C2 and C2 is convex, hence (yβ€³,z2β€³)∈C2.

  7. But then, we note that zβ€³=z1β€³+z2β€³.

  8. Thus, (yβ€³,z1β€³)∈C1 and (yβ€³,z2β€³)∈C2 implies that (yβ€³,zβ€³)∈C.

  9. Thus, C is convex.

We can write a version of the theorem above for Rn.

Corollary 8.4 (Partial addition on convex sets in Euclidean space)

Let C1 and C2 be convex sets in Rn+m. Let C be the set of vectors x=(y,z) such that there exist vectors z1,z2∈Rm with (y,z1)∈C1, (y,z2)∈C2 and z1+z2=z. Then, C is a convex set in Rn+m.

The relationship between C and C1,C2 is known as partial addition.

  • When V={0} we are left with the result that C1,C2 convex implies C1+C2 is convex.

  • When W={0}, we are left with the result that C1,C2 convex implies C1∩C2 is convex.

  • In between, we have a spectrum of results where for a vector in C, part of the representation must be common between C1 and C2 while the remaining part must be the sum of corresponding parts of vectors in C1 and C2.

  • In other words, if a vector space can be decomposed as a direct sum of two subspaces, then we have intersection or representation in one subspace while addition in the other.

  • This partial addition (binary) operation is commutative as well as associative.

Partial additions appear naturally in convex cones in VβŠ•R generated by a convex set in V. See Observation 8.3 and discussion thereafter.

8.2.11.5. Cartesian Product/Direct SumΒΆ

Theorem 8.40 (Direct sum of convex sets)

Let V and W be real vector spaces. Let CβŠ†V and DβŠ†W be convex subsets of V and W respectively. Then, CβŠ•D is a convex subset of VβŠ•W.

More generally, if V1,…,Vk are real vector spaces and CiβŠ†Vi are convex subsets for i=1,…,k, then C=C1βŠ•β‹―βŠ•Ck is convex in the direct sum of vector spaces V1βŠ•β‹―βŠ•Vk.

Proof. If either C or D is empty, then CβŠ•D is empty, hence convex. We shall thus assume that both C and D are nonempty.

  1. Let z1,z2∈CβŠ•D and t∈(0,1).

  2. Then, z1=(x1,y1) and z2=(x2,y2) such that x1,x2∈C and y1,y2∈D.

  3. Since C and D are convex, hence x=tx1+(1βˆ’t)x2∈C and y=ty1+(1βˆ’t)y2∈D.

  4. Now,

    z=tz1+(1βˆ’t)z2=t(x1,y1)+(1βˆ’t)(x2,y2)=(tx1+(1βˆ’t)x2,ty1+(1βˆ’t)y2)=(x,y).
  5. Since x∈C and y∈D, hence z=(x,y)∈CβŠ•D.

  6. Thus, CβŠ•D is closed under convex combination.

  7. Thus, CβŠ•D is convex.

The generalization for multiple real vector spaces is easily verifiable through induction.

8.2.11.6. ProjectionΒΆ

Theorem 8.41 (Projection of a direct sum)

Let V and W be real vector spaces. Let CβŠ†V and DβŠ†W. Assume that CβŠ•D is a convex subset of VβŠ•W. Then, C and D are convex subsets of V and W respectively.

More generally, if V1,…,Vk are real vector spaces and CiβŠ†Vi are subsets for i=1,…,k, such that C=C1βŠ•β‹―βŠ•Ck is convex in the direct sum of vector spaces V1βŠ•β‹―βŠ•Vk; then Ci are convex subsets of Vi for i=1,…,k.

Proof. Consider the case of two vector spaces V and W.

  1. Let x1,x2∈C and t∈(0,1).

  2. Pick any y∈D.

  3. Then, (x1,y),(x2,y)∈CβŠ•D.

  4. Since CβŠ•D is convex, hence

    t(x1,y)+(1βˆ’t)(x2,y)=(tx1+(1βˆ’t)x2,y)∈CβŠ•D.
  5. Thus, tx1+(1βˆ’t)x2∈C.

  6. Thus, C is convex.

  7. Similarly D is also convex.

The argument can be extended by mathematical induction for multiple vector spaces.

Theorem 8.42 (Projection of a convex set)

Let V and W be real vector spaces. Let CβŠ†VβŠ•W be a convex set of VβŠ•W.

For every x∈V, define

Dx={y∈W|(x,y)∈C}.

Then Dx is convex for every x∈V.

Similarly, if for every y∈W, we define

Ey={x∈V|(x,y)∈C}

then Ey is convex for every y∈W.

Proof. If Dx is empty then it is convex vacuously. Hence assume that Dx is nonempty.

  1. Then for every y∈Dx there exists (x,y)∈C.

  2. Let u,v∈Dx and t∈[0,1].

  3. Then (x,u)∈C and (x,v)∈C.

  4. Since C is convex hence t(x,u)+(1βˆ’t)(x,v)∈C.

  5. i.e., (x,tu+(1βˆ’t)v)∈C.

  6. This implies that tu+(1βˆ’t)v∈Dx.

  7. Hence Dx is convex.

The argument for the convexity of Ey is identical.

8.2.12. Extreme PointsΒΆ

Definition 8.13 (Extreme points of convex sets)

Let VV be a real vector space and let C be a subset of V.

A point x∈S is called an extreme point of S if there do not exist x1,x2∈S with x1β‰ x2 and t∈(0,1) such that

x=tx1+(1βˆ’t)x2.

In other words, x cannot be expressed as a nontrivial convex combination of two different points in S.

The set of extreme points of a set S is denoted by extS.

Example 8.7 (Extreme points)

  1. Let C=[0,1]βŠ†R. Then, 0 and 1 are extreme points of C.

  2. Let C=(0,1)βŠ†R. C doesn’t have any extreme point.

  3. In a triangle, the three vertices are extreme points.

  4. In a convex polytope, all the vertices are extreme points.

A more intricate example of the set of extreme points for the set P={x∈Rn|Ax=b,xβͺ°0} is discussed in Theorem 8.59.