4.6. Inner Product Spaces

Inner products are a generalization of the notion of dot product. We restrict our attention to real vector spaces and complex vector spaces. Thus, the field F can be either R or C.

4.6.1. Inner Product

Definition 4.88 (Inner product)

An inner product over an F-vector space V is any map ⟨·,·⟩ : V × V → F mapping (v_1, v_2) ↦ ⟨v_1, v_2⟩ that satisfies the following properties:

  1. [Positive definiteness]

    ⟨v, v⟩ ≥ 0 and ⟨v, v⟩ = 0 ⟺ v = 0.
  2. [Conjugate symmetry]

    ⟨v_1, v_2⟩ = ⟨v_2, v_1⟩― ∀ v_1, v_2 ∈ V.
  3. [Linearity in the first argument]

    ⟨αv, w⟩ = α⟨v, w⟩ ∀ v, w ∈ V, ∀ α ∈ F;
    ⟨v_1 + v_2, w⟩ = ⟨v_1, w⟩ + ⟨v_2, w⟩ ∀ v_1, v_2, w ∈ V.

Theorem 4.72 (Scaling in second argument)

Let ⟨·,·⟩ : V × V → F be an inner product. Then

⟨v, αw⟩ = α― ⟨v, w⟩.

Proof. We proceed as follows:

⟨v, αw⟩ = ⟨αw, v⟩― = (α ⟨w, v⟩)― = α― ⟨w, v⟩― = α― ⟨v, w⟩.

Theorem 4.73 (Distribution in second argument)

Let ⟨·,·⟩ : V × V → F be an inner product. Then for any v, x, y ∈ V:

⟨v, x + y⟩ = ⟨v, x⟩ + ⟨v, y⟩.

Proof. We proceed as follows:

⟨v, x + y⟩ = ⟨x + y, v⟩― = (⟨x, v⟩ + ⟨y, v⟩)― = ⟨x, v⟩― + ⟨y, v⟩― = ⟨v, x⟩ + ⟨v, y⟩.

Theorem 4.74 (Inner product with zero)

Let ⟨·,·⟩ : V × V → F be an inner product. Then,

⟨0, v⟩ = ⟨v, 0⟩ = 0 ∀ v ∈ V.

Proof. We proceed as follows:

⟨u,v⟩=⟨u+0,v⟩=⟨u,v⟩+⟨0,v⟩.

By cancelling terms, we get:

⟨0,v⟩=0.

Using the conjugate symmetry, we get:

⟨v,0⟩=⟨0,vβŸ©β€•=0―=0.
  • Linearity in the first argument extends to arbitrary linear combinations:

⟨∑ α_i v_i, w⟩ = ∑ α_i ⟨v_i, w⟩.
  • Similarly, we have conjugate linearity in the second argument for arbitrary linear combinations:

⟨v, ∑ α_i w_i⟩ = ∑ α_i― ⟨v, w_i⟩.

Example 4.21

The standard inner product on Rn is defined as:

⟨x, y⟩ = ∑_{i=1}^{n} x_i y_i.

This is often called the dot product or scalar product.

Example 4.22

The standard inner product on Cn is defined as:

⟨x, y⟩ = ∑_{i=1}^{n} x_i y_i―.
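
A quick numerical sketch of both standard inner products, assuming NumPy is available. Note that the convention above conjugates the second argument, so the complex case is computed directly rather than via np.vdot (which conjugates its first argument):

```python
import numpy as np

# Standard inner product on R^n: <x, y> = sum_i x_i y_i
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 0.5])
print(np.dot(x, y))  # 1*4 + 2*(-1) + 3*0.5 = 3.5

# Standard inner product on C^n: <x, y> = sum_i x_i * conj(y_i)
# (the second argument is conjugated, matching the convention above)
u = np.array([1 + 2j, 3 - 1j])
v = np.array([2 - 1j, 1j])
ip_uv = np.sum(u * np.conj(v))
ip_vu = np.sum(v * np.conj(u))
print(ip_uv, np.conj(ip_vu))  # conjugate symmetry: <u, v> == conj(<v, u>)
```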

Example 4.23

Let x,y∈R2. Define:

⟨x, y⟩ = x_1 y_1 − x_2 y_1 − x_1 y_2 + 4 x_2 y_2.

Now:

  1. ⟨x, x⟩ = (x_1 − x_2)² + 3 x_2². Thus, x = 0 ⟺ ⟨x, x⟩ = 0. Thus, it is positive definite.

  2. ⟨y, x⟩ = y_1 x_1 − y_2 x_1 − y_1 x_2 + 4 y_2 x_2 = ⟨x, y⟩. It is symmetric.

  3. We can also verify that it is linear in the first argument.

Thus, it satisfies all the properties of an inner product.

Note that, in matrix notation, we can write this inner product as:

⟨x, y⟩ = [x_1  x_2] A [y_1  y_2]^T,

where the matrix

A = [1, −1; −1, 4]

(rows separated by semicolons) is positive definite. Its trace is 5 and its determinant is 3. Its eigenvalues are approximately 4.303 and 0.697.
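
A small check of this example, assuming NumPy; the matrix A and its reported trace, determinant and eigenvalues are from the text, while the test vectors and the helper name ip are arbitrary choices for illustration:

```python
import numpy as np

A = np.array([[1.0, -1.0],
              [-1.0, 4.0]])

def ip(x, y):
    # <x, y> = x^T A y
    return x @ A @ y

x = np.array([2.0, -1.0])
y = np.array([0.5, 3.0])

print(np.isclose(ip(x, y), ip(y, x)))   # symmetry
print(ip(x, x) > 0)                     # positive for this nonzero x
print(np.trace(A), np.linalg.det(A))    # 5.0, 3.0 (up to rounding)
print(np.linalg.eigvalsh(A))            # approx [0.697, 4.303]
```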

Example 4.24

Let C^{n×n} be the space of n×n complex matrices. For any A = (a_{jk}) and B = (b_{jk}) in C^{n×n}, we define the inner product as:

⟨A, B⟩ = ∑_{j,k} a_{jk} b_{jk}―.

It can be easily seen that:

⟨A, B⟩ = tr(A B^H) = tr(B^H A)

where B^H is the conjugate transpose of B and tr computes the trace of a matrix (the sum of its diagonal entries).
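
A numerical sanity check of the equivalence between the entrywise sum and the trace forms; this is a sketch assuming NumPy, with random complex test matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

# entrywise definition: <A, B> = sum_{j,k} a_jk * conj(b_jk)
ip_entrywise = np.sum(A * np.conj(B))

# trace forms: tr(A B^H) and tr(B^H A)
ip_trace_1 = np.trace(A @ B.conj().T)
ip_trace_2 = np.trace(B.conj().T @ A)

print(np.isclose(ip_entrywise, ip_trace_1), np.isclose(ip_trace_1, ip_trace_2))
```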

Example 4.25

Let C^{n×1} be the space of n-dimensional column vectors. Let Q be an arbitrary n×n invertible matrix over C.

For any x, y ∈ C^{n×1}, define

⟨x, y⟩ = y^H Q^H Q x.

We identify the 1×1 matrix on the R.H.S. with its single entry, a complex number in C. This is a valid inner product.

When Q = I, the identity matrix, the inner product reduces to:

⟨x, y⟩ = y^H x.

This is the standard inner product on the space of column vectors.
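
A sketch of this construction, assuming NumPy; Q is a random complex matrix (invertible with probability one) and ip_Q is a hypothetical helper name:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
Q = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))

def ip_Q(x, y):
    # <x, y> = y^H Q^H Q x
    return y.conj() @ Q.conj().T @ Q @ x

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

print(np.isclose(ip_Q(x, y), np.conj(ip_Q(y, x))))       # conjugate symmetry
print(ip_Q(x, x).real > 0, abs(ip_Q(x, x).imag) < 1e-9)  # <x, x> is real and positive

# with Q = I it reduces to the standard inner product y^H x
I = np.eye(n)
print(np.isclose(y.conj() @ I.conj().T @ I @ x, np.sum(np.conj(y) * x)))
```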

Theorem 4.75

For complex inner products, the inner product is completely determined by its real part.

This statement may be confusing. Let us unpack what it means. Let

⟨x,y⟩=Re⟨x,y⟩+iIm⟨x,y⟩.

Then, computing the inner product involves computing both the real part and the imaginary part. What the statement means is that if we know how to compute Re⟨x,y⟩ for any x, y ∈ V, then we can use the same method to compute Im⟨x,y⟩ too, just with different inputs. See below.

Proof. Let

⟨x,y⟩=Re⟨x,y⟩+iIm⟨x,y⟩.

For any complex number z=x+iy∈C, we have:

Re(−iz) = Re(−i(x + iy)) = Re(y − ix) = y = Im(z).

Since ⟨x,y⟩ is a complex number, we have:

Im⟨x,y⟩ = Re(−i⟨x,y⟩) = Re⟨x, iy⟩.

Thus,

⟨x,y⟩=Re⟨x,y⟩+iRe⟨x,iy⟩.
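
A quick numerical check of this identity on the standard inner product of C^n; a sketch assuming NumPy, where ip is a hypothetical helper implementing the convention that conjugates the second argument:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def ip(a, b):
    # standard inner product on C^n, conjugating the second argument
    return np.sum(a * np.conj(b))

lhs = ip(x, y)
rhs = ip(x, y).real + 1j * ip(x, 1j * y).real  # Re<x, y> + i Re<x, iy>
print(np.isclose(lhs, rhs))
```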

4.6.2. Real Inner Product

From the perspective of convex analysis, the general inner product is not very useful. We prefer a special class of inner products whose value is always real. This is applicable on vector spaces where the field of scalars is R.

Definition 4.89 (Real inner product)

A real inner product over an R-vector space V is any map ⟨·,·⟩ : V × V → R mapping (v_1, v_2) ↦ ⟨v_1, v_2⟩ that satisfies the following properties:

  1. [Positive definiteness]

    ⟨v, v⟩ ≥ 0 and ⟨v, v⟩ = 0 ⟺ v = 0.
  2. [Symmetry]

    ⟨v_1, v_2⟩ = ⟨v_2, v_1⟩ ∀ v_1, v_2 ∈ V.
  3. [Linearity in the first argument]

    ⟨αv, w⟩ = α⟨v, w⟩ ∀ v, w ∈ V, ∀ α ∈ R;
    ⟨v_1 + v_2, w⟩ = ⟨v_1, w⟩ + ⟨v_2, w⟩ ∀ v_1, v_2, w ∈ V.
  • A real inner product is always real valued, no matter whether the underlying vectors are real or complex.

  • Since the real inner product is symmetric and linear in the first argument, it is linear in the second argument too:

⟨v, ∑ α_i w_i⟩ = ∑ α_i ⟨v, w_i⟩.

Example 4.26 (A real inner product for Cn over R)

In this example, we are dealing with n-tuples of complex numbers in Cn with the field of scalars being R. It can be easily checked that Cn over R is a vector space.

Let z_1 = a_1 + i b_1 and z_2 = a_2 + i b_2 be complex numbers:

z_1 z_2― = (a_1 + i b_1)(a_2 − i b_2) = a_1 a_2 + b_1 b_2 + i(b_1 a_2 − a_1 b_2).

Then

Re(z_1 z_2―) = a_1 a_2 + b_1 b_2.
  1. Re(z z―) = a² + b² for z = a + ib; it is positive definite, i.e., Re(z z―) = 0 ⟺ z = 0 + i0.

  2. Re(z_1 z_2―) = Re(z_2 z_1―); it is symmetric.

  3. For any α ∈ R, Re(α z_1 z_2―) = α Re(z_1 z_2―). Thus, it is linear in the first argument.

Now, for any x,y∈Cn, define:

⟨x, y⟩ = Re(∑_{i=1}^{n} x_i y_i―).

Following the argument above, it is a real inner product on Cn.

Interestingly, if u∈Cn is identified with v∈R2n by stacking the real and imaginary parts, then the real inner product defined above for Cn is nothing but the standard inner product for R2n.
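
This identification is easy to check numerically; a sketch assuming NumPy, with the real and imaginary parts stacked as described above:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# real inner product on C^n over R
ip_real = np.sum(x * np.conj(y)).real

# identify each vector in C^n with the vector in R^{2n} that stacks
# its real part on top of its imaginary part
xs = np.concatenate([x.real, x.imag])
ys = np.concatenate([y.real, y.imag])

print(np.isclose(ip_real, xs @ ys))  # standard inner product on R^{2n}
```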

While the presentation in rest of the section will be based on the general conjugate symmetric inner product, it will be easy to extrapolate the results for the special case of real inner products.

4.6.3. Inner Product Space

Definition 4.90 (Inner product space / Pre-Hilbert space)

An F-vector space V equipped with an inner product ⟨,⟩:VΓ—Vβ†’F is known as an inner product space or a pre-Hilbert space.

4.6.4. Orthogonality

Orthogonality is the generalization of the notion of perpendicularity from elementary geometry.

Definition 4.91 (Orthogonal vectors)

Any two vectors u,v∈V are called orthogonal to each other if ⟨u,v⟩=0.

We write uβŠ₯v if u and v are orthogonal to each other.

Definition 4.92 (Set of orthogonal vectors)

A set of non-zero vectors {v1,…,vp} is called orthogonal or pairwise orthogonal if

⟨vi,vj⟩=0 if iβ‰ jβˆ€1≀i,j≀p.

Theorem 4.76 (Orthogonality implies independence)

A set of orthogonal vectors is linearly independent.

Proof. Let v_1, …, v_n be a set of orthogonal vectors. Suppose there is a linear combination:

α_1 v_1 + ⋯ + α_n v_n = 0.

Taking the inner product of both sides with v_j, we get:

⟨α_1 v_1 + ⋯ + α_n v_n, v_j⟩ = ⟨0, v_j⟩ ⟺ 0 + ⋯ + α_j ⟨v_j, v_j⟩ + ⋯ + 0 = 0 ⟺ α_j ⟨v_j, v_j⟩ = 0 ⟺ α_j = 0,

where the last step uses the fact that ⟨v_j, v_j⟩ > 0 since v_j ≠ 0.

Thus, the only linear combination that equals zero is the trivial one. Thus, the vectors are linearly independent.

4.6.5. Norm Induced by Inner Product

Definition 4.93 (Norm induced by inner product)

Every inner product ⟨·,·⟩ : V × V → F on a vector space V induces a norm ‖·‖ : V → R given by:

‖v‖ = √⟨v, v⟩ ∀ v ∈ V.

We shall justify that this function satisfies all the properties of a norm later. But before that, let us examine some implications of this definition which are useful in their own right.

Note that it is easy to see that ‖⋅‖ is positive definite; i.e., ‖0‖=0 and ‖v‖>0 if v≠0.

Also, it is positively homogeneous, since:

‖αv‖ = √⟨αv, αv⟩ = √(α α― ⟨v, v⟩) = |α| √⟨v, v⟩ = |α| ‖v‖.

Theorem 4.77 (Pythagoras theorem)

If u ⊥ v then

‖u + v‖² = ‖u‖² + ‖v‖².

Proof. Expanding:

‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩ = ‖u‖² + ‖v‖²

where we used the fact that ⟨u, v⟩ = ⟨v, u⟩ = 0 since u ⊥ v.

Theorem 4.78 (Cauchy-Schwarz inequality)

For any u, v ∈ V:

|⟨u, v⟩| ≤ ‖u‖ ‖v‖.

The equality holds if and only if u and v are linearly dependent.

Proof. If either u = 0 or v = 0, then the equality holds. So suppose that neither of them is the zero vector. In particular, v ≠ 0 means ‖v‖ > 0.

Define

w = (⟨u, v⟩ / ‖v‖²) v.

Then,

⟨w, u − w⟩ = ⟨(⟨u, v⟩ / ‖v‖²) v, u − (⟨u, v⟩ / ‖v‖²) v⟩
= (⟨u, v⟩ / ‖v‖²) ⟨v, u − (⟨u, v⟩ / ‖v‖²) v⟩
= (⟨u, v⟩ / ‖v‖²) (⟨v, u⟩ − (⟨u, v⟩― / ‖v‖²) ⟨v, v⟩)
= (⟨u, v⟩ / ‖v‖²) (⟨v, u⟩ − ⟨v, u⟩)
= 0.

Thus, w ⊥ (u − w). Therefore, by the Pythagorean theorem,

‖u‖² = ‖u − w + w‖² = ‖u − w‖² + ‖w‖² ≥ ‖w‖²
= ‖(⟨u, v⟩ / ‖v‖²) v‖² = |⟨u, v⟩ / ‖v‖²|² ‖v‖² = |⟨u, v⟩|² / ‖v‖².

Multiplying both sides by ‖v‖², we obtain:

‖u‖² ‖v‖² ≥ |⟨u, v⟩|².

Taking square roots on both sides,

|⟨u, v⟩| ≤ ‖u‖ ‖v‖.

In the derivation above, the equality holds if and only if

0 = u − w = u − (⟨u, v⟩ / ‖v‖²) v

which means that u and v are linearly dependent.

Conversely, if u and v are linearly dependent, then u = αv for some α ∈ F, and

w = (⟨αv, v⟩ / ‖v‖²) v = (α ⟨v, v⟩ / ‖v‖²) v = αv = u

giving us u − w = 0. Hence, the equality holds.
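
A numerical illustration of the inequality and of the equality case for linearly dependent vectors; a sketch assuming NumPy, where ip and norm are hypothetical helpers:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
u = rng.standard_normal(n) + 1j * rng.standard_normal(n)
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)

def ip(a, b):
    return np.sum(a * np.conj(b))

def norm(a):
    return np.sqrt(ip(a, a).real)

# |<u, v>| <= ||u|| ||v||, strict for generic u, v
print(abs(ip(u, v)) <= norm(u) * norm(v))

# equality when u and v are linearly dependent
w = (2 - 3j) * v
print(np.isclose(abs(ip(w, v)), norm(w) * norm(v)))
```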

Theorem 4.79 (Inner product induced norm justification)

The function ‖·‖ : V → R induced by the inner product ⟨·,·⟩ : V × V → F as defined in Definition 4.93 is indeed a norm.

Proof. We need to verify that ‖·‖ so defined is indeed a norm. We have already shown that it is positive definite and positively homogeneous. We now show the triangle inequality, with the help of the Cauchy-Schwarz inequality proved above:

‖u + v‖² = ⟨u + v, u + v⟩
= ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩
= ‖u‖² + ⟨u, v⟩ + ⟨u, v⟩― + ‖v‖²
= ‖u‖² + 2 Re⟨u, v⟩ + ‖v‖²
≤ ‖u‖² + 2 |⟨u, v⟩| + ‖v‖²
≤ ‖u‖² + 2 ‖u‖ ‖v‖ + ‖v‖²
= (‖u‖ + ‖v‖)².

Taking square roots on both sides, we obtain:

‖u + v‖ ≤ ‖u‖ + ‖v‖.

Thus, ‖·‖ is indeed a norm.

We recap the sequence of results to emphasize the logical flow:

  1. We started with just the definition of β€–β‹…β€– in Definition 4.93.

  2. We proved positive definiteness from the definition itself.

  3. We proved positive homogeneity also from the definition itself.

  4. We proved Pythagoras theorem utilizing previously established results for inner products.

  5. We proved the Cauchy-Schwarz inequality using positive definiteness, positive homogeneity and the Pythagoras theorem.

  6. We proved the triangle inequality using the Cauchy-Schwarz inequality.

Theorem 4.80 (Inner product space to metric space)

Every inner product space is a normed space. Hence it is also a metric space.

Proof. An inner product induces a norm which makes the vector space a normed space. A norm induces a metric which makes the vector space a metric space.

4.6.6. Hilbert Spaces

Definition 4.94 (Hilbert space)

An inner product space V that is complete with respect to the metric induced by the norm induced by its inner product is called a Hilbert space.

In other words, V is a Hilbert space if every Cauchy sequence of V converges in V.

4.6.7. Orthonormality

Definition 4.95 (Set of orthonormal vectors)

A set of non-zero vectors {e_1, …, e_p} is called orthonormal if

⟨e_i, e_j⟩ = 0 if i ≠ j, ∀ 1 ≤ i, j ≤ p;
⟨e_i, e_i⟩ = 1 ∀ 1 ≤ i ≤ p;

i.e., ⟨e_i, e_j⟩ = δ(i, j).

In other words, the vectors are unit norm (‖e_i‖ = 1) and pairwise orthogonal (e_i ⊥ e_j whenever i ≠ j).

Since orthonormal vectors are orthogonal, hence they are linearly independent.

Definition 4.96 (Orthonormal basis)

A set of orthonormal vectors forms an orthonormal basis for its span.

Theorem 4.81 (Expansion of a vector in an orthonormal basis)

Let {e1,…,en} be an orthonormal basis for V. Then, any v∈V can be written as:

v=⟨v,e1⟩e1+β‹―+⟨v,en⟩en.

Proof. Since {e_1, …, e_n} forms a basis for V, every v ∈ V can be written as:

v = α_1 e_1 + ⋯ + α_n e_n

where α_1, …, α_n ∈ F.

Taking the inner product with e_j on both sides, we get:

⟨v, e_j⟩ = α_1 ⟨e_1, e_j⟩ + ⋯ + α_n ⟨e_n, e_j⟩.

Since ⟨ei,ej⟩=δ(i,j), hence the above reduces to:

⟨v,ej⟩=αj.

Theorem 4.82 (Norm of a vector in an orthonormal basis)

Let {e1,…,en} be an orthonormal basis for V. For any v∈V, let its expansion in the orthonormal basis be:

v = α_1 e_1 + ⋯ + α_n e_n.

Then,

‖v‖² = |α_1|² + ⋯ + |α_n|² = ∑_{i=1}^{n} |α_i|².

Proof. Expanding the expression for the norm squared:

‖v‖² = ⟨v, v⟩ = ⟨α_1 e_1 + ⋯ + α_n e_n, α_1 e_1 + ⋯ + α_n e_n⟩ = ∑_{i=1}^{n} ∑_{j=1}^{n} ⟨α_i e_i, α_j e_j⟩ = ∑_{i=1}^{n} ∑_{j=1}^{n} α_i α_j― ⟨e_i, e_j⟩ = ∑_{i=1}^{n} |α_i|²,

since ⟨e_i, e_j⟩ = δ(i, j).
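
The two results above can be checked numerically: the coefficients ⟨v, e_i⟩ reconstruct v, and their squared magnitudes sum to ‖v‖². A sketch assuming NumPy, with an orthonormal basis of R^3 obtained from a QR factorization:

```python
import numpy as np

rng = np.random.default_rng(5)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthonormal columns
e = [Q[:, i] for i in range(3)]

v = rng.standard_normal(3)

# expansion coefficients alpha_i = <v, e_i>
alphas = [v @ ei for ei in e]
v_rebuilt = sum(a * ei for a, ei in zip(alphas, e))
print(np.allclose(v, v_rebuilt))                    # v = sum_i <v, e_i> e_i

# norm identity: ||v||^2 = sum_i |alpha_i|^2
print(np.isclose(v @ v, sum(a ** 2 for a in alphas)))
```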

Here are some interesting questions:

  • Can a basis in an inner product space be converted into an orthonormal basis?

  • Does a finite dimensional inner product space have an orthonormal basis?

  • Does every finite dimensional subspace of an inner product space have an orthonormal basis?

The answer to these questions is yes. We provide a constructive answer by the Gram-Schmidt algorithm described in the next section.

4.6.8. The Gram-Schmidt Algorithm

The Gram-Schmidt algorithm (described below) constructs an orthonormal basis for the span of an arbitrary set of linearly independent vectors.

Algorithm 4.1 (The Gram-Schmidt algorithm)

Inputs v1,v2,…,vn, a set of linearly independent vectors

Outputs e1,e2,…,en, a set of orthonormal vectors

  1. w_1 = v_1.

  2. e_1 = w_1 / ‖w_1‖.

  3. For j = 2, …, n:

    1. w_j = v_j − ∑_{i=1}^{j−1} ⟨v_j, e_i⟩ e_i.

    2. e_j = w_j / ‖w_j‖.
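
A minimal NumPy sketch of Algorithm 4.1 (classical Gram-Schmidt, following the steps above directly; in floating point arithmetic the modified variant or a QR factorization is usually preferred numerically). The function name gram_schmidt is a hypothetical choice:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent vectors (Algorithm 4.1)."""
    es = []
    for v in vectors:
        # w_j = v_j - sum_{i < j} <v_j, e_i> e_i
        w = v - sum((v @ np.conj(e)) * e for e in es)
        es.append(w / np.linalg.norm(w))
    return es

# quick check on a random basis of R^4
rng = np.random.default_rng(6)
vs = [rng.standard_normal(4) for _ in range(4)]
E = np.column_stack(gram_schmidt(vs))
print(np.allclose(E.T @ E, np.eye(4)))  # columns are orthonormal
```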

Theorem 4.83 (Justification for Gram-Schmidt algorithm)

Let v1,v2,…,vn be linearly independent. The Gram-Schmidt algorithm described above generates a set of orthonormal vectors.

Moreover, for each j=1,…,n, the set e1,…,ej is an orthonormal basis for the subspace: span{v1,…,vj}.

Proof. We prove this by mathematical induction. Consider the base case for j=1.

  1. w1=v1.

  2. e_1 = w_1 / ‖w_1‖ = v_1 / ‖v_1‖.

  3. Thus, β€–e1β€–=1.

  4. span{e1}=span{v1} because e1 is a nonzero scalar multiple of v1.

Now, assume that the set e1,…,ejβˆ’1 is an orthonormal basis for span{v1,…,vjβˆ’1}.

  1. Thus, span{e1,…,ejβˆ’1}=span{v1,…,vjβˆ’1}.

  2. Since vj is linearly independent from v1,…,vjβˆ’1, hence vjβˆ‰span{v1,…,vjβˆ’1}.

  3. Thus, vjβˆ‰span{e1,…,ejβˆ’1}.

  4. Hence, w_j = v_j − ∑_{i=1}^{j−1} ⟨v_j, e_i⟩ e_i ≠ 0. If it were 0, then v_j would be linearly dependent on e_1, …, e_{j−1}.

  5. Thus, β€–wjβ€–>0.

  6. Thus, ej=wjβ€–wjβ€– is well-defined.

  7. Also, β€–ejβ€–=1 by construction, thus, ej is unit-norm.

  8. Note that w_j is orthogonal to e_1, …, e_{j−1}. For any 1 ≤ k < j, we have:

    ⟨w_j, e_k⟩ = ⟨v_j − ∑_{i=1}^{j−1} ⟨v_j, e_i⟩ e_i, e_k⟩ = ⟨v_j, e_k⟩ − ∑_{i=1}^{j−1} ⟨v_j, e_i⟩ ⟨e_i, e_k⟩ = ⟨v_j, e_k⟩ − ⟨v_j, e_k⟩ ⟨e_k, e_k⟩ = ⟨v_j, e_k⟩ − ⟨v_j, e_k⟩ = 0,

    since e_1, …, e_{j−1} are orthonormal.

  9. Thus, for any 1 ≤ k < j:

    ⟨e_j, e_k⟩ = ⟨w_j / ‖w_j‖, e_k⟩ = ⟨w_j, e_k⟩ / ‖w_j‖ = 0.
  10. Thus, ej is orthogonal to e1,…,ejβˆ’1.

  11. Since, all of them are unit norm, hence, e1,…,ejβˆ’1,ej are indeed orthonormal.

We also need to show that span{e1,…,ej}=span{v1,…,vj}.

  1. Note that wj∈span{vj,e1,…,ejβˆ’1}=span{v1,…,vj} since span{e1,…,ejβˆ’1}=span{v1,…,vjβˆ’1} by inductive hypothesis.

  2. Thus, ej∈span{v1,…,vj} since ej is just scaled wj.

  3. Thus, span{e1,…,ej}βŠ†span{v1,…,vj}.

  4. For the converse, by definition vj=wj+βˆ‘i=1jβˆ’1⟨vj,ei⟩ei.

  5. Hence, vj∈span{wj,e1,…,ejβˆ’1}=span{e1,…,ej}.

  6. Thus, span{v1,…,vj}βŠ†span{e1,…,ej}.

  7. Thus, span{e1,…,ej}=span{v1,…,vj} must be true.

Theorem 4.84 (Existence of orthonormal basis)

Every finite dimensional inner product space has an orthonormal basis.

Proof. This is a simple application of the Gram-Schmidt algorithm.

  1. Every finite dimensional vector space has a finite basis.

  2. Every finite basis can be turned into an orthonormal basis by the Gram-Schmidt algorithm.

  3. Thus, we have an orthonormal basis.

Corollary 4.13

Every finite dimensional subspace of an inner product space has an orthonormal basis.

4.6.9. Orthogonal Complements

Definition 4.97 (Orthogonal complement)

Let S be a subset of an inner product space V. The orthogonal complement of S is the set of all vectors in V that are orthogonal to every element of S. It is denoted by SβŠ₯.

SβŠ₯β‰œ{v∈V|vβŠ₯sβˆ€s∈S}.

Definition 4.98 (Orthogonal complement of a vector)

Let a∈V. The orthogonal complement of a is the set of all vectors in V that are orthogonal to a. It is denoted by aβŠ₯.

aβŠ₯β‰œ{v∈V|vβŠ₯a}.

Observation 4.2

aβŠ₯ is just a notational convenience.

aβŠ₯={a}βŠ₯=(span{a})βŠ₯.

Theorem 4.85 (Orthogonal complement is a linear subspace)

If V is an inner product space and SβŠ†V, then SβŠ₯ is a subspace.

Proof. To verify that SβŠ₯ is a subspace, we need to check the following.

  1. It contains the zero vector.

  2. It is closed under vector addition.

  3. It is closed under scalar multiplication.

We proceed as follows:

  1. ⟨0,s⟩=0 holds for any s∈S. Thus, 0∈SβŠ₯.

  2. Let u,v∈SβŠ₯. Then,

    1. ⟨u,s⟩=0 and ⟨v,s⟩=0 for every s∈S.

    2. Thus, ⟨u+v,s⟩=⟨u,s⟩+⟨v,s⟩=0+0=0 for every s∈S.

    3. Thus, u+v∈SβŠ₯.

  3. Similarly, if v∈SβŠ₯, then ⟨αv,s⟩=α⟨v,s⟩=0 for every s∈S.

Thus, SβŠ₯ is a subspace of V.

Observation 4.3

The orthogonal complement of the inner product space V is its trivial subspace containing just the zero vector.

VβŠ₯={0}.

Theorem 4.86 (Orthogonal complement and basis)

If S is a subspace of V, then to show that some vector u∈SβŠ₯, it is sufficient to show that u is orthogonal to all the vectors in some basis of S.

Specifically, if S is a finite dimensional subspace of V and B={v1,…,vm} is a basis for S, then

SβŠ₯={x|xβŠ₯vi,i=1,…,m}.

Proof. Let B be a basis for S (finite or infinite).

Then, for any s∈S:

s = ∑_{i=1}^{p} α_i e_i

where α_i ∈ F and e_i ∈ B.

Now, if u is orthogonal to every vector in B, then

⟨s, u⟩ = ∑_{i=1}^{p} α_i ⟨e_i, u⟩ = 0.

Thus, uβŠ₯s. Since s was arbitrarily chosen from S, hence u∈SβŠ₯.

Now, assume S to be finite dimensional and B={v1,…,vm} to be a basis of S. Let

T={x|xβŠ₯vi,i=1,…,m}.

We first show that SβŠ₯βŠ†T.

  1. Let v∈SβŠ₯.

  2. Then, vβŠ₯s for every s∈S.

  3. In particular, vβŠ₯vi for i=1,…,m since BβŠ‚S.

  4. Thus, v∈T.

  5. Thus, SβŠ₯βŠ†T.

We next show that TβŠ†SβŠ₯.

  1. Let x∈T.

  2. Then, xβŠ₯vi for i=1,…,m.

  3. But then, for any s∈S

    ⟨s, x⟩ = ⟨∑_{i=1}^{m} t_i v_i, x⟩ = ∑_{i=1}^{m} t_i ⟨v_i, x⟩ = 0

    since s = ∑_{i=1}^{m} t_i v_i is a linear combination of vectors in B.

  4. Thus, xβŠ₯s for every s∈S.

  5. Thus, x∈SβŠ₯.

  6. Thus, TβŠ†SβŠ₯.

Combining:

SβŠ₯=T={x|xβŠ₯vi,i=1,…,m}.

Theorem 4.87 (Orthogonal decomposition)

Let V be an inner product space and S be a finite dimensional subspace of V. Then, every v∈V can be written uniquely in the form:

v=vβˆ₯+vβŠ₯

where vβˆ₯∈S and vβŠ₯∈SβŠ₯.

Proof. Let e1,…,ep be an orthonormal basis for S.

Define:

(4.5)  v∥ ≜ ∑_{i=1}^{p} ⟨v, e_i⟩ e_i.

And

vβŠ₯=vβˆ’vβˆ₯.

By construction, vβˆ₯∈span{e1,…,ep}=S.

Now, for every 1 ≤ i ≤ p:

⟨v⊥, e_i⟩ = ⟨v − v∥, e_i⟩ = ⟨v, e_i⟩ − ⟨v∥, e_i⟩ = ⟨v, e_i⟩ − ⟨v, e_i⟩ = 0.

Thus, v⊥ ∈ S⊥.

We have shown the existence of a decomposition of a vector v into components which belong to S and S⊥. Next, we need to show that the decomposition is unique.

For contradiction, assume there was another decomposition:

v=uβˆ₯+uβŠ₯

such that uβˆ₯∈S and uβŠ₯∈SβŠ₯.

Then,

vβˆ₯+vβŠ₯=v=uβˆ₯+uβŠ₯

gives us:

w=vβˆ₯βˆ’uβˆ₯=uβŠ₯βˆ’vβŠ₯.

Thus, w∈S as well as w∈SβŠ₯. But then, wβŠ₯w giving us:

⟨w,w⟩=0=⟨vβˆ₯βˆ’uβˆ₯,vβˆ₯βˆ’uβˆ₯⟩=β€–vβˆ₯βˆ’uβˆ₯β€–2.

This is possible only if vβˆ₯βˆ’uβˆ₯=0, thus, vβˆ₯=uβˆ₯. Consequently, uβŠ₯=vβŠ₯ too.

Thus,

v=vβˆ₯+vβŠ₯

is a unique decomposition.
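
The decomposition (via equation (4.5)) can be illustrated numerically; a sketch assuming NumPy, with S spanned by two orthonormal vectors in R^4 obtained from a QR factorization:

```python
import numpy as np

rng = np.random.default_rng(7)
Q, _ = np.linalg.qr(rng.standard_normal((4, 2)))   # orthonormal basis of S
e = [Q[:, 0], Q[:, 1]]

v = rng.standard_normal(4)

# v_parallel = sum_i <v, e_i> e_i  (equation (4.5)); v_perp = v - v_parallel
v_par = sum((v @ ei) * ei for ei in e)
v_perp = v - v_par

print(np.allclose(v, v_par + v_perp))
print(all(abs(v_perp @ ei) < 1e-12 for ei in e))   # v_perp is orthogonal to S
```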

Corollary 4.14 (Intersection between a subspace and its complement)

If S is a finite dimensional subspace of an inner product space V, then

S∩SβŠ₯={0}.

In other words, the only vector common between S and its orthogonal complement is the zero vector.

Theorem 4.88 (Vector space as direct sum)

If S is a finite dimensional subspace of an inner product space V, then

V=SβŠ•SβŠ₯.

In other words, V is a direct sum of S and its orthogonal complement.

Proof. From Corollary 4.14, the intersection between S and SβŠ₯ is the zero vector. Thus, by Definition 4.47, the direct sum between the two spaces SβŠ•SβŠ₯ is well defined.

By Theorem 4.87, every vector v∈V can be uniquely decomposed as

v=vβˆ₯+vβŠ₯

where vβˆ₯∈S and vβŠ₯∈SβŠ₯.

Thus, V ⊆ S ⊕ S⊥.

Also, since both S and S⊥ are subspaces of V, we have S ⊕ S⊥ ⊆ V. Hence,

V = S ⊕ S⊥.

Theorem 4.89 (Dimension of vector space as direct sum)

Let V be a finite dimensional inner product space. If S is a subspace of V, then

dimV=dimS+dimSβŠ₯.

Proof. Since V is finite dimensional, hence both S and SβŠ₯ are finite dimensional subspaces of V.

By Theorem 4.88

V=SβŠ•SβŠ₯.

Then, due to Theorem 4.22

dimV=dimS+dimSβŠ₯.

Theorem 4.90 (Orthogonal complement of orthogonal complement)

Let V be a finite dimensional inner product space. Let S be a subspace of V and let SβŠ₯ be its orthogonal complement. Then

(SβŠ₯)βŠ₯=S.

In other words, in a finite dimensional space, the orthogonal complement of orthogonal complement is the original subspace itself.

Note that this result is valid only for finite dimensional spaces since in that case both S and SβŠ₯ are finite dimensional.

Proof. Since V is finite dimensional, hence both S and SβŠ₯ are finite dimensional.

(SβŠ₯)βŠ₯={v∈V|uβŠ₯vβˆ€u∈SβŠ₯}.

We shall first show that SβŠ†(SβŠ₯)βŠ₯.

  1. Let s∈S.

  2. Then, by definition, sβŠ₯uβˆ€u∈SβŠ₯.

  3. Thus, s∈(SβŠ₯)βŠ₯.

  4. Thus, SβŠ†(SβŠ₯)βŠ₯.

We now show that (SβŠ₯)βŠ₯βŠ†S.

  1. Let u∈(SβŠ₯)βŠ₯.

  2. By Theorem 4.88, V=SβŠ•SβŠ₯ since S is a finite dimensional subspace of V.

  3. Thus, u=v+w such that v∈S and w∈SβŠ₯.

  4. Since uβˆ’v=w, hence uβˆ’v∈SβŠ₯.

  5. We have already shown above that SβŠ†(SβŠ₯)βŠ₯. Hence v∈(SβŠ₯)βŠ₯.

  6. Thus, uβˆ’v=w∈(SβŠ₯)βŠ₯ since both u and v belong to (SβŠ₯)βŠ₯.

  7. Thus, uβˆ’v∈(SβŠ₯)βŠ₯∩SβŠ₯ as w∈SβŠ₯ by orthogonal decomposition above.

  8. But, by Corollary 4.14 SβŠ₯∩(SβŠ₯)βŠ₯={0} since (SβŠ₯)βŠ₯ is the orthogonal complement of SβŠ₯ and SβŠ₯ is finite dimensional.

  9. Thus, uβˆ’v=0.

  10. Thus, u=v.

  11. Thus, u∈S.

  12. Since u was an arbitrary element of (SβŠ₯)βŠ₯, hence (SβŠ₯)βŠ₯βŠ†S.

Combining the two:

(SβŠ₯)βŠ₯=S.

Theorem 4.91 (n-1 dimensional subspaces)

Let V be a finite dimensional inner product space with dimV=n. Let S be an nβˆ’1 dimensional subspace of V. Then, there exists a nonzero vector b∈V such that

S={x∈V|xβŠ₯b}.

In other words, the nβˆ’1 dimensional subspaces are the sets of the form {x|xβŠ₯b} where bβ‰ 0.

Proof. Let S be nβˆ’1 dimensional. Then, from Theorem 4.89

dimV=dimS+dimSβŠ₯.

This gives us dimSβŠ₯=nβˆ’(nβˆ’1)=1.

Since SβŠ₯ is one dimensional, we can choose a non-zero vector b∈SβŠ₯ as its basis. Since V is finite dimensional, hence

S=(SβŠ₯)βŠ₯.

Thus, S consists of vectors which are orthogonal to a basis of SβŠ₯. Thus,

S={x∈V|xβŠ₯b}.

4.6.10. Orthogonal Projection

Recall that a projection operator P:V→V is an operator which satisfies P2=P.

The range of P is given by

R(P)={v∈V|v=Px for some x∈V}.

The null space of P is given by

N(P)={v∈V|Pv=0}.

Definition 4.99 (Orthogonal projection operator)

A projection operator P:V→V over an inner product space V is called orthogonal projection operator if its range R(P) and the null space N(P) as defined above are orthogonal to each other; i.e.

rβŠ₯nβˆ€r∈R(P),βˆ€n∈N(P).

Theorem 4.92 (Orthogonal projection operator for a subspace)

Let S be a finite dimensional subspace of V. Let {e1,…,ep} be an orthonormal basis of S. Let the operator PS:Vβ†’V be defined as:

PSvβ‰œvβˆ₯

where

v=vβˆ₯+vβŠ₯

is the unique orthogonal decomposition of v w.r.t. the subspace S as defined in Theorem 4.87. Then,

  1. PSv=βˆ‘i=1p⟨v,ei⟩ei.

  2. For any v∈V, vβˆ’PSvβŠ₯S.

  3. PS is a linear map.

  4. PS is the identity map when restricted to S; i.e., PSs=sβˆ€s∈S.

  5. R(PS)=S.

  6. N(PS)=SβŠ₯.

  7. PS2=PS.

  8. For any v∈V, β€–PSv‖≀‖vβ€–.

  9. For any v∈V and s∈S:

    β€–vβˆ’PSv‖≀‖vβˆ’sβ€–

    with equality if and only if s=PSv.

PS is indeed an orthogonal projection onto S.

Proof. For the sake of brevity, we abbreviate P=PS.

Following (4.5), indeed:

Pv = ∑_{i=1}^{p} ⟨v, e_i⟩ e_i.

For any v∈V (due to Theorem 4.87):

vβˆ’Pv=vβˆ’vβˆ₯=vβŠ₯.

Since vβŠ₯∈SβŠ₯ hence vβˆ’PvβŠ₯S.

[Linear map]

  1. Let u,v∈V.

  2. Let u=uβˆ₯+uβŠ₯ and v=vβˆ₯+vβŠ₯.

  3. Consider u+v=(uβˆ₯+vβˆ₯)+(uβŠ₯+vβŠ₯).

  4. Then, uβˆ₯+vβˆ₯∈S and uβŠ₯+vβŠ₯∈SβŠ₯.

  5. Since, the orthogonal decomposition is unique, hence P(u+v)=uβˆ₯+vβˆ₯=Pu+Pv.

  6. Similarly, for α∈F, Ξ±u=Ξ±uβˆ₯+Ξ±uβŠ₯.

  7. With Ξ±uβˆ₯∈S and Ξ±uβŠ₯∈SβŠ₯, P(Ξ±u)=Ξ±uβˆ₯=Ξ±Pu.

Thus, P is a linear map.

For any s∈S, we can write it as s=s+0. With s∈S and 0∈SβŠ₯, we have: Ps=s.

[Range]

  1. Since P maps v to a component in S, hence R(P)βŠ†S.

  2. Since for every s∈S, there is v∈S such that Pv=s (specifically v=s), hence SβŠ†R(P).

  3. Combining R(P)=S.

[Null space]

  1. Let v∈N(P). Write v=vβˆ₯+vβŠ₯.

  2. Then, Pv=vβˆ₯=0 as v is in the null space of P.

  3. Hence, v=vβŠ₯∈SβŠ₯.

  4. Thus, N(P)βŠ†SβŠ₯.

  5. Now, let v∈SβŠ₯.

  6. We can write v as v=0+v where 0∈S and v∈SβŠ₯.

  7. Thus, Pv=0.

  8. Thus, SβŠ₯βŠ†N(P).

  9. Combining, SβŠ₯=N(P).

[P2=P]

  1. For any v∈V, we have, Pv=vβˆ₯.

  2. Since vβˆ₯∈S, hence Pvβˆ₯=vβˆ₯.

  3. Thus, P2v=Pvβˆ₯=vβˆ₯=Pv.

  4. Since v was arbitrary, hence, P2=P.

[β€–Pv‖≀‖vβ€–]

  1. We have v=vβˆ₯+vβŠ₯=Pv+vβŠ₯.

  2. By Pythagoras theorem: β€–vβ€–2=β€–Pvβ€–2+β€–vβŠ₯β€–2.

  3. Thus, β€–vβ€–2β‰₯β€–Pvβ€–2.

  4. Taking square root on both sides: β€–Pv‖≀‖vβ€–.

[β€–vβˆ’Pv‖≀‖vβˆ’sβ€–]

  1. Let v∈V and s∈S.

  2. Note that Pv∈S hence Pvβˆ’s∈S.

  3. By definition vβˆ’Pv∈SβŠ₯.

  4. Thus, vβˆ’PvβŠ₯Pvβˆ’s.

  5. We have: vβˆ’s=(vβˆ’Pv)+(Pvβˆ’s).

  6. Applying Pythagoras theorem:

    β€–vβˆ’sβ€–2=β€–vβˆ’Pvβ€–2+β€–Pvβˆ’sβ€–2β‰₯β€–vβˆ’Pvβ€–2.
  7. Taking square root on both sides:

    β€–vβˆ’Pv‖≀‖vβˆ’sβ€–.
  8. Equality holds if and only if β€–Pvβˆ’sβ€–2=0 if and only if Pv=s.

In order to show that P is an orthogonal projection, we need to show that:

  1. P is a projection operator.

  2. rβŠ₯nβˆ€r∈R(P),βˆ€n∈N(P).

We have shown that:

  1. P2=P. Hence P is a projection operator.

  2. R(P)=S and N(P)=SβŠ₯.

  3. By definition, for any r∈S and n∈SβŠ₯, rβŠ₯n.

  4. Thus, P is an orthogonal projection operator.

Theorem 4.93 (Orthogonal projectors are self-adjoint)

A projection operator is orthogonal if and only if it is self-adjoint.

Example 4.27 (Orthogonal projection on a line)

Consider a unit norm vector u ∈ R^N.
Thus, u^T u = 1.

Consider

P_u = u u^T.

Now

P_u² = (u u^T)(u u^T) = u (u^T u) u^T = u u^T = P_u.

Thus P_u is a projection operator.

Now,

P_u^T = (u u^T)^T = u u^T = P_u.

Thus P_u is self-adjoint. Hence, P_u is an orthogonal projection operator.

Also,

P_u u = (u u^T) u = u (u^T u) = u.

Thus P_u leaves u intact; i.e., the projection of u onto u is u itself.

Let v ∈ u⊥, i.e., ⟨u, v⟩ = 0.

Then,

P_u v = (u u^T) v = u (u^T v) = u ⟨u, v⟩ = 0.

Thus P_u annihilates all vectors orthogonal to u.

Any vector x ∈ R^N can be broken down into two components

x = x∥ + x⊥

such that ⟨u, x⊥⟩ = 0 and x∥ is collinear with u.

Then,

P_u x = u u^T x∥ + u u^T x⊥ = x∥.

Thus P_u retains the projection of x onto u, given by x∥.
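
A NumPy sketch of this example; u and x are arbitrary test vectors:

```python
import numpy as np

rng = np.random.default_rng(8)
u = rng.standard_normal(5)
u = u / np.linalg.norm(u)                  # unit norm vector

P_u = np.outer(u, u)                       # P_u = u u^T

x = rng.standard_normal(5)
print(np.allclose(P_u @ P_u, P_u))         # idempotent: a projection
print(np.allclose(P_u, P_u.T))             # symmetric: self-adjoint
print(np.allclose(P_u @ u, u))             # leaves u intact
print(np.allclose(P_u @ x, (u @ x) * u))   # maps x to its component along u
```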

Example 4.28 (Projections over the column space of a matrix)

Let A ∈ R^{M×N} with N ≤ M be a matrix given by

A = [a_1  a_2  …  a_N]

where a_i ∈ R^M are its columns, which are linearly independent.

The column space of A is given by

C(A) = {Ax | x ∈ R^N} ⊆ R^M.

It can be shown that A^T A is invertible.

Consider the operator

P_A = A (A^T A)^{−1} A^T.

Now,

P_A² = A (A^T A)^{−1} A^T A (A^T A)^{−1} A^T = A (A^T A)^{−1} A^T = P_A.

Thus P_A is a projection operator.

P_A^T = (A (A^T A)^{−1} A^T)^T = A ((A^T A)^{−1})^T A^T = A (A^T A)^{−1} A^T = P_A.

Thus P_A is self-adjoint.

Hence P_A is an orthogonal projection operator onto the column space of A.
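
A NumPy sketch of this example; A is a random tall matrix whose columns are almost surely independent:

```python
import numpy as np

rng = np.random.default_rng(9)
M, N = 6, 3
A = rng.standard_normal((M, N))

P_A = A @ np.linalg.inv(A.T @ A) @ A.T     # P_A = A (A^T A)^{-1} A^T

x = rng.standard_normal(M)
z = rng.standard_normal(N)
print(np.allclose(P_A @ P_A, P_A))         # projection
print(np.allclose(P_A, P_A.T))             # self-adjoint
print(np.allclose(P_A @ (A @ z), A @ z))   # fixes every vector in C(A)
print(np.allclose(A.T @ (x - P_A @ x), 0)) # residual is orthogonal to C(A)
```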

4.6.11. Parallelogram Identity

Theorem 4.94 (Parallelogram identity)

2‖x‖₂² + 2‖y‖₂² = ‖x + y‖₂² + ‖x − y‖₂²  ∀ x, y ∈ V.

Proof. Expanding:

β€–x+yβ€–22=⟨x+y,x+y⟩=⟨x,x⟩+⟨y,y⟩+⟨x,y⟩+⟨y,x⟩.

Also:

β€–xβˆ’yβ€–22=⟨xβˆ’y,xβˆ’y⟩=⟨x,x⟩+⟨y,yβŸ©βˆ’βŸ¨x,yβŸ©βˆ’βŸ¨y,x⟩.

Thus,

β€–x+yβ€–22+β€–xβˆ’yβ€–22=2(⟨x,x⟩+⟨y,y⟩)=2β€–xβ€–22+2β€–yβ€–22.

When the inner product is real valued, the following identity is quite useful.

Theorem 4.95 (Parallelogram identity for real inner product)

⟨x, y⟩ = (1/4)(‖x + y‖₂² − ‖x − y‖₂²)  ∀ x, y ∈ V.

Proof. Expanding:

β€–x+yβ€–22=⟨x+y,x+y⟩=⟨x,x⟩+⟨y,y⟩+⟨x,y⟩+⟨y,x⟩.

Also,

β€–xβˆ’yβ€–22=⟨xβˆ’y,xβˆ’y⟩=⟨x,x⟩+⟨y,yβŸ©βˆ’βŸ¨x,yβŸ©βˆ’βŸ¨y,x⟩.

Thus,

β€–x+yβ€–22βˆ’β€–xβˆ’yβ€–22=2(⟨x,y⟩+⟨y,x⟩)=4⟨x,y⟩

since for real inner products

⟨x,y⟩=⟨y,x⟩.

4.6.12. Polarization Identity

When the inner product is complex valued, the polarization identity is quite useful.

Theorem 4.96 (Polarization identity for complex inner product)

⟨x, y⟩ = (1/4)(‖x + y‖₂² − ‖x − y‖₂² + i‖x + iy‖₂² − i‖x − iy‖₂²)  ∀ x, y ∈ V.

Proof. Expanding

β€–x+yβ€–22=⟨x+y,x+y⟩=⟨x,x⟩+⟨y,y⟩+⟨x,y⟩+⟨y,x⟩.

Also,

β€–xβˆ’yβ€–22=⟨xβˆ’y,xβˆ’y⟩=⟨x,x⟩+⟨y,yβŸ©βˆ’βŸ¨x,yβŸ©βˆ’βŸ¨y,x⟩.

And,

β€–x+iyβ€–22=⟨x+iy,x+iy⟩=⟨x,x⟩+⟨iy,iy⟩+⟨x,iy⟩+⟨iy,x⟩.

And,

β€–xβˆ’iyβ€–22=⟨xβˆ’iy,xβˆ’iy⟩=⟨x,x⟩+⟨iy,iyβŸ©βˆ’βŸ¨x,iyβŸ©βˆ’βŸ¨iy,x⟩.

Thus,

‖x + y‖₂² − ‖x − y‖₂² + i‖x + iy‖₂² − i‖x − iy‖₂²
= 2⟨x, y⟩ + 2⟨y, x⟩ + 2i⟨x, iy⟩ + 2i⟨iy, x⟩
= 2⟨x, y⟩ + 2⟨y, x⟩ + 2⟨x, y⟩ − 2⟨y, x⟩
= 4⟨x, y⟩.
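
Both the real identity of Theorem 4.95 and the complex polarization identity above can be verified numerically; a sketch assuming NumPy, with the standard inner products on R^n and C^n:

```python
import numpy as np

rng = np.random.default_rng(10)
n = 5
sq = lambda z: np.sum(np.abs(z) ** 2)      # squared norm ||.||^2

# real case: <x, y> = (||x+y||^2 - ||x-y||^2) / 4
x, y = rng.standard_normal(n), rng.standard_normal(n)
print(np.isclose(x @ y, (sq(x + y) - sq(x - y)) / 4))

# complex case: <x, y> = (||x+y||^2 - ||x-y||^2 + i||x+iy||^2 - i||x-iy||^2) / 4
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)
ip = np.sum(x * np.conj(y))
rhs = (sq(x + y) - sq(x - y) + 1j * sq(x + 1j * y) - 1j * sq(x - 1j * y)) / 4
print(np.isclose(ip, rhs))
```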