5.2. Differentiation in Banach Spaces

We introduce the concept of differentiation in Banach spaces. Recall that Banach spaces are normed linear spaces that are complete.

5.2.1. Gateaux Differential

Definition 5.14 (Directional derivative)

Let $X$ and $Y$ be Banach spaces. Let $f : X \to Y$ be a function with $S = \operatorname{dom} f$. The directional derivative of $f$ at $x \in \operatorname{int} S$ in the direction $h \in X$ where $h \neq 0$, denoted by $f'(x; h)$, is given by

$$ f'(x; h) \triangleq \lim_{t \to 0^+} \frac{f(x + t h) - f(x)}{t} $$

whenever the limit exists. This is also known as the Gateaux differential. By convention, $f'(x; 0_X) = 0_Y$. This is consistent with the definition above.

  • There is no single directional derivative at a point $x$.

  • The directional derivative depends on the direction $h$.

  • In one dimension, there are two directional derivatives at each $x$ up to positive scaling: one for $h > 0$ and one for $h < 0$.

  • In two or more dimensions, there are infinitely many directional derivatives.

  • The directional derivative is a one dimensional calculation along the direction $h$.

  • It is usually easy to compute the directional derivative even when the space $X$ is infinite dimensional.
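The definition can be tried out numerically. The following is a minimal sketch, assuming the scalar case $X = Y = \mathbb{R}$ and a small fixed step $t$ in place of the limit. The helper name `directional_derivative` and the test function $f(x) = x^3$ are illustrative choices that do not appear in the text.

```python
def directional_derivative(f, x, h, t=1e-6):
    """Approximate f'(x; h) by the one-sided quotient (f(x + t*h) - f(x)) / t."""
    return (f(x + t * h) - f(x)) / t


def cube(x):
    return x ** 3


x = 1.0
right = directional_derivative(cube, x, 1.0)   # direction h = +1
left = directional_derivative(cube, x, -1.0)   # direction h = -1
print(right, left)  # close to 3 and -3: one value per direction
```

Because the step $t$ is fixed rather than sent to zero, the printed values only approximate the limit; a smaller $t$ reduces the truncation error but increases rounding error.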

Definition 5.15 (Gateaux differentiability)

Let $X$ and $Y$ be Banach spaces. Let $f : X \to Y$ be a function with $S = \operatorname{dom} f$. Let $U \subseteq S$ be an open set. We say that $f$ is Gateaux differentiable at $x \in U$ if the Gateaux differential $f'(x; h)$ exists for every direction $h \in X$.

Accordingly, we can define a bounded operator $T_x : X \to Y$ given by

$$ T_x(h) \triangleq \lim_{t \to 0^+} \frac{f(x + t h) - f(x)}{t} \quad \forall h \in X. $$

The operator $T_x$ is called the Gateaux derivative of $f$ at $x$.

Example 5.17 (Gateaux differential of exponential function)

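Let $f(x) = e^x$. Then,

$$ f'(x; h) = \lim_{t \to 0^+} \frac{e^{x + t h} - e^x}{t} = e^x \lim_{t \to 0^+} \frac{e^{t h} - 1}{t} = e^x \lim_{t \to 0^+} \frac{t h}{t} = h e^x. $$

We note that the Gateaux derivative depends linearly on $h$.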

Theorem 5.7 (Gateaux differential nonnegative homogeneity)

The Gateaux differential of a function $f : X \to Y$ is nonnegative homogeneous in the sense that

$$ f'(x; \alpha h) = \alpha f'(x; h) $$

for every $\alpha \in \mathbb{R}_+$ and every $h \in X$.

However, the Gateaux differential may not be additive. Thus, the Gateaux differential may fail to be linear.
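Both statements can be observed on a concrete function. The sketch below assumes $X = \mathbb{R}^2$, $Y = \mathbb{R}$ and takes $f$ to be the Euclidean norm at the origin, where $f'(0; h) = \| h \|_2$; the function choice, the helper names `norm` and `gateaux`, and the fixed step are assumptions made for illustration.

```python
import math


def norm(v):
    """Euclidean norm on R^2."""
    return math.hypot(v[0], v[1])


def gateaux(f, x, h, t=1e-6):
    """One-sided difference quotient approximating f'(x; h) on R^2."""
    shifted = (x[0] + t * h[0], x[1] + t * h[1])
    return (f(shifted) - f(x)) / t


origin = (0.0, 0.0)
h1, h2 = (1.0, 0.0), (0.0, 1.0)
alpha = 3.0

# Nonnegative homogeneity: f'(0; alpha*h1) agrees with alpha * f'(0; h1).
scaled = gateaux(norm, origin, (alpha * h1[0], alpha * h1[1]))
homogeneous_gap = abs(scaled - alpha * gateaux(norm, origin, h1))

# Additivity fails: f'(0; h1 + h2) = sqrt(2), but f'(0; h1) + f'(0; h2) = 2.
summed = gateaux(norm, origin, (h1[0] + h2[0], h1[1] + h2[1]))
additive_gap = abs(summed - (gateaux(norm, origin, h1) + gateaux(norm, origin, h2)))
print(homogeneous_gap, additive_gap)
```

The first gap is zero up to rounding, while the second stays near $2 - \sqrt{2}$ no matter how small the step is.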

Example 5.18 (Gateaux differential of absolute value function)

Let $f(x) = |x|$. Then, the Gateaux differentials are given by

$$ f'(x; h) = \begin{cases} \dfrac{h x}{|x|} & x \neq 0; \\ |h| & x = 0. \end{cases} $$

We note that the Gateaux differential of $f$ exists everywhere. However, the Gateaux differential depends on $h$ in a nonlinear way at $x = 0$. At $x \neq 0$, the Gateaux differential depends linearly on $h$.
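The piecewise formula can be compared against the difference quotient directly. This is a small sketch under the assumption of a fixed step $t = 10^{-6}$; the helper names `gateaux` and `abs_differential` are hypothetical.

```python
def gateaux(f, x, h, t=1e-6):
    """One-sided difference quotient approximating f'(x; h) on R."""
    return (f(x + t * h) - f(x)) / t


def abs_differential(x, h):
    """Closed form from the example: h*x/|x| for x != 0 and |h| at x = 0."""
    return h * x / abs(x) if x != 0 else abs(h)


# Largest disagreement between the quotient and the closed form on a few points.
max_error = 0.0
for x in (-2.0, 0.0, 3.0):
    for h in (-1.5, 1.5):
        max_error = max(max_error, abs(gateaux(abs, x, h) - abs_differential(x, h)))
print(max_error)

# At x = 0 the map h -> |h| is not additive: |1| + |-1| = 2 while |1 + (-1)| = 0.
at_zero_sum = abs_differential(0.0, 1.0) + abs_differential(0.0, -1.0)
at_zero_of_sum = abs_differential(0.0, 1.0 + (-1.0))
```

Since $|x|$ is piecewise linear, the quotient matches the closed form up to rounding for any sufficiently small step.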

Example 5.19 (Gateaux differential of square function)

Let $f(x) = x^2$. Then, the Gateaux differential is given by

$$ f'(x; h) = \lim_{t \to 0^+} \frac{f(x + t h) - f(x)}{t} = \lim_{t \to 0^+} \frac{x^2 + t^2 h^2 + 2 x t h - x^2}{t} = 2 x h. $$

We note that the Gateaux differential is linear w.r.t. $h$.

Example 5.20 (Gateaux differential of linear functional)

Let $f(x) = a^T x$ where $a \in \mathbb{R}^n$ is a given fixed vector. Then,

$$ f'(x; h) = \lim_{t \to 0^+} \frac{a^T x + t a^T h - a^T x}{t} = a^T h. $$

We note that the Gateaux differential is linear w.r.t. $h$.

Example 5.21 (Gateaux differential of simple quadratic)

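Let $f(x) = x^T A x$ where $A \in \mathbb{S}^n$ is a given symmetric matrix. Then,

$$ f'(x; h) = \lim_{t \to 0^+} \frac{(x + t h)^T A (x + t h) - x^T A x}{t} = \lim_{t \to 0^+} \frac{t^2 h^T A h + 2 t h^T A x}{t} = 2 h^T A x = 2 x^T A h. $$

We note that the Gateaux differential is linear w.r.t. $h$.

In particular, if $f(x) = x^T x$, then $f'(x; h) = 2 h^T x = 2 x^T h$.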
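As a sanity check on the quadratic example, the closed form $2 x^T A h$ can be compared with a difference quotient. The sketch below assumes $n = 2$ and uses an arbitrarily chosen symmetric matrix, point, and direction; the helper names `quad` and `bilinear` are illustrative.

```python
def quad(A, x):
    """Quadratic form x^T A x for a square matrix A given as nested lists."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def bilinear(A, x, h):
    """Bilinear form x^T A h."""
    n = len(x)
    return sum(x[i] * A[i][j] * h[j] for i in range(n) for j in range(n))


A = [[2.0, 1.0], [1.0, 3.0]]   # symmetric, chosen for illustration
x = [1.0, -2.0]
h = [0.5, 4.0]
t = 1e-6

shifted = [xi + t * hi for xi, hi in zip(x, h)]
quotient = (quad(A, shifted) - quad(A, x)) / t   # finite-step approximation
closed_form = 2.0 * bilinear(A, x, h)            # 2 x^T A h from the example
print(quotient, closed_form)
```

The two numbers differ by roughly $t \, h^T A h$, which is the second-order term that the limit removes.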

Theorem 5.8 (Gateaux differential of a constant function)

The Gateaux differential of a constant function is zero.

Theorem 5.9 (Gateaux differential sum rule)

The Gateaux differential distributes over sums.

Let $f, g : X \to Y$ both have Gateaux differentials at $x$ in the direction $h$. Then,

$$ (f + g)'(x; h) = f'(x; h) + g'(x; h). $$

Also,

$$ (f - g)'(x; h) = f'(x; h) - g'(x; h). $$

Theorem 5.10 (Gateaux differential product rule)

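Let $f, g : X \to Y$ both be Gateaux differentiable at $x \in \operatorname{int}(\operatorname{dom} f \cap \operatorname{dom} g)$. Let $h$ be their (pointwise) product function given by

$$ h(x) = f(x) g(x) $$

with $\operatorname{dom} h = \operatorname{dom} f \cap \operatorname{dom} g$. Then,

$$ h'(x; h) = (f g)'(x; h) = f'(x; h) g(x) + g'(x; h) f(x). $$

Here the $h$ in the second argument denotes the direction, not the product function.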
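The sum and product rules can be checked numerically on scalar functions. The sketch below assumes $X = Y = \mathbb{R}$, picks $f = \sin$ and $g = \exp$ purely for illustration, and replaces the limit by a fixed small step; `gateaux` is a hypothetical helper name.

```python
import math


def gateaux(f, x, h, t=1e-6):
    """One-sided difference quotient approximating f'(x; h) on R."""
    return (f(x + t * h) - f(x)) / t


f, g = math.sin, math.exp
x, h = 0.3, 2.0   # point and direction

# Sum rule: (f + g)'(x; h) versus f'(x; h) + g'(x; h).
sum_lhs = gateaux(lambda u: f(u) + g(u), x, h)
sum_rhs = gateaux(f, x, h) + gateaux(g, x, h)

# Product rule: (f g)'(x; h) versus f'(x; h) g(x) + g'(x; h) f(x).
prod_lhs = gateaux(lambda u: f(u) * g(u), x, h)
prod_rhs = gateaux(f, x, h) * g(x) + gateaux(g, x, h) * f(x)
print(sum_lhs - sum_rhs, prod_lhs - prod_rhs)
```

Both differences are small; the product-rule gap is of the order of the step $t$ because the quotient still carries the cross term that vanishes in the limit.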

Theorem 5.11 (Gateaux differential chain rule)

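Let $f : X \to Y$ and $g : Y \to Z$ be functions. Let $h : X \to Z$ be the composition of $f$ and $g$ given by $h = g \circ f$. Let $U \subseteq \operatorname{dom} h$ be an open set. Let $x \in U$. Assume that $f$ is Gateaux differentiable at $x$ and $g$ is Gateaux differentiable at $f(x)$. Then,

$$ h'(x; h) = g'(f(x); f'(x; h)) \quad \forall h \in X. $$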

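We recall the little-o notation. We say that a quantity $q$ is $o(t)$ if

$$ \lim_{t \to 0^+} \frac{q}{t} = 0. $$

For vector valued functions, a quantity $q$ is $o(t)$ if

$$ \lim_{t \to 0^+} \frac{q}{t} = 0 $$

or, equivalently,

$$ \lim_{t \to 0^+} \frac{\| q \|}{t} = 0. $$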

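Proof. If $f$ is Gateaux differentiable at $x$, then

$$ f'(x; h) = \lim_{t \to 0^+} \frac{f(x + t h) - f(x)}{t} \quad \forall h \in X. $$

In terms of little-o notation,

$$ f(x + t h) = f(x) + t f'(x; h) + o(t). $$

Similarly, if $g$ is Gateaux differentiable at $y$, then

$$ g(y + s u) = g(y) + s g'(y; u) + o(s). $$

Now,

$$
\begin{aligned}
h'(x; h) &= (g \circ f)'(x; h) \\
&= \lim_{t \to 0^+} \frac{g(f(x + t h)) - g(f(x))}{t} \\
&= \lim_{t \to 0^+} \frac{g(f(x) + t f'(x; h) + o(t)) - g(f(x))}{t} \\
&= \lim_{t \to 0^+} \frac{g(f(x) + t (f'(x; h) + t^{-1} o(t))) - g(f(x))}{t} \\
&= \lim_{t \to 0^+} \frac{g(f(x)) + t g'(f(x); f'(x; h) + t^{-1} o(t)) + o(t) - g(f(x))}{t} \\
&= \lim_{t \to 0^+} \frac{t g'(f(x); f'(x; h) + t^{-1} o(t)) + o(t)}{t} \\
&= \lim_{t \to 0^+} \left[ g'(f(x); f'(x; h) + t^{-1} o(t)) + t^{-1} o(t) \right] \\
&= g'(f(x); f'(x; h)).
\end{aligned}
$$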

Example 5.22 (Chain rule for square of inner product)

Consider the function $h(x) = (x^T x)^2$.

  1. Define $g(t) = t^2$.

  2. Define $f(x) = x^T x$.

  3. Then $h = g \circ f$.

  4. We have $f'(x; h) = 2 h^T x$.

  5. We have $g'(y; u) = 2 y u$.

  6. Thus,

    $$ g'(f(x); f'(x; h)) = 2 f(x) f'(x; h) = 2 (x^T x)(2 h^T x) = 4 (h^T x)(x^T x). $$

We can compute the same thing using the product rule.

  1. We note that $h(x) = f(x) f(x)$.

  2. Applying the product rule:

    $$ h'(x; h) = f'(x; h) f(x) + f'(x; h) f(x) = 2 f'(x; h) f(x) = 2 (2 h^T x)(x^T x) = 4 (h^T x)(x^T x). $$
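Both derivations give $4 (h^T x)(x^T x)$, which can be compared with a difference quotient. The sketch below assumes $X = \mathbb{R}^3$ with an arbitrarily chosen point and direction; the direction is named `d` in the code to avoid clashing with the composite function, and the helper names are illustrative.

```python
def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def composite(x):
    """The function (x^T x)^2 from the example."""
    return dot(x, x) ** 2


x = [1.0, 2.0, -1.0]
d = [0.5, -1.0, 2.0]   # direction, called h in the text
t = 1e-6

shifted = [xi + t * di for xi, di in zip(x, d)]
quotient = (composite(shifted) - composite(x)) / t   # finite-step approximation
closed_form = 4.0 * dot(d, x) * dot(x, x)            # 4 (h^T x)(x^T x)
print(quotient, closed_form)
```

With these inputs $h^T x = -3.5$ and $x^T x = 6$, so the closed form evaluates to $-84$, and the quotient agrees up to an error of order $t$.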

5.2.2. Fréchet Derivative

Definition 5.16 (Fréchet differentiability)

Let $X$ and $Y$ be Banach spaces. Let $f : X \to Y$ be a function with $S = \operatorname{dom} f$. Let $U \subseteq S$ be an open set. We say that $f$ is Fréchet differentiable at $x \in U$ if there is a bounded and linear operator $T_x : X \to Y$ given by

$$ T_x(h) = \lim_{t \to 0^+} \frac{f(x + t h) - f(x)}{t} \quad \forall h \in X. $$

The operator $T_x$ is called the Fréchet derivative of $f$ at $x$.

We note that $T_x$ depends on $x$.

Remark 5.1 (Fréchet differentiability alternate forms)

By definition, if $f$ is Fréchet differentiable at $x$, then it is Gateaux differentiable at $x$. Since $T_x$ is linear, we can write it as

$$ T_x(h) = A h $$

emphasizing the fact that the essential part of $T_x$ doesn’t depend on $h$. $A$ may still depend on $x$.

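Using the little-o notation, we can write

$$ f(x + t h) = f(x) + t T_x(h) + o(t) = f(x) + t A h + o(t). $$

If we set $t h = y$, then $t \to 0$ if and only if $y \to 0$. In particular, $\| y \|_X = t \| h \|_X$, so for a fixed direction $h \neq 0$ a quantity is $o(t)$ exactly when it is $o(\| y \|_X)$. Now,

$$
\begin{aligned}
& f(x + y) = f(x) + A y + o(t) \\
\iff & f(x + y) - f(x) - A y = o(t) = o(\| y \|_X) \\
\iff & \lim_{\| y \|_X \to 0} \frac{\| f(x + y) - f(x) - A y \|_Y}{\| y \|_X} = 0 \\
\iff & \lim_{y \to 0} \frac{\| f(x + y) - f(x) - A y \|_Y}{\| y \|_X} = 0 \\
\iff & \lim_{y \to 0} \frac{\| f(x + y) - f(x) - T_x(y) \|_Y}{\| y \|_X} = 0.
\end{aligned}
$$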

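Therefore $f : X \to Y$ is Fréchet differentiable at $x \in U$ if and only if there is a bounded linear operator $T_x : X \to Y$ such that

$$ \lim_{y \to 0} \frac{\| f(x + y) - f(x) - T_x(y) \|_Y}{\| y \|_X} = 0 $$

where the limit is taken over $y \in X$ with $y \neq 0$.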

It is worthwhile to compare this definition to the definition of differentiability of $f : \mathbb{R}^n \to \mathbb{R}^m$ in Definition 5.1. If we put $z = x + y$, we can rewrite the condition as

$$ \lim_{z \to x} \frac{\| f(z) - f(x) - T_x(z - x) \|_Y}{\| z - x \|_X} = 0. $$

Thus, $T_x$ plays the same role as the Jacobian matrix $D f(x)$ in (5.1).
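The limit condition can be illustrated numerically. The sketch below assumes $X = \mathbb{R}^2$, $Y = \mathbb{R}$, $f(x) = x^T x$ and the candidate derivative $T_x(y) = 2 x^T y$, and samples $y$ on circles of shrinking radius; the sampling scheme and all names are assumptions made for illustration, and a finite sample can only suggest the limit, not prove it.

```python
import math


def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))


def f(x):
    return dot(x, x)


def T(x, y):
    """Candidate derivative of f at x applied to y: 2 x^T y."""
    return 2.0 * dot(x, y)


x = [1.0, -2.0]
ratios = []
for k in range(1, 6):
    r = 10.0 ** (-k)
    worst = 0.0
    # Sample 16 directions on the circle of radius r around the origin.
    for i in range(16):
        angle = 2.0 * math.pi * i / 16
        y = [r * math.cos(angle), r * math.sin(angle)]
        shifted = [x[0] + y[0], x[1] + y[1]]
        remainder = abs(f(shifted) - f(x) - T(x, y))
        worst = max(worst, remainder / r)
    ratios.append(worst)
print(ratios)  # each entry is roughly the radius r, so the ratio shrinks with r
```

Taking the worst ratio over many directions at each radius reflects that the limit is over all $y \to 0$ rather than along a single fixed direction.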

Theorem 5.12 (Existence of Fréchet derivative)

The Fréchet derivative of a function f exists at a point x=a if and only if all Gateaux differentials of f at x are continuous functions of x at x=a.

Theorem 5.13 (Uniqueness of Fréchet derivative)

If the Fréchet derivative of a function f exists at a point x=a then it is unique.

5.2.3. Gradient

Definition 5.17 (Gradient)

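Let $V$ be a Hilbert space. Let $f : V \to \mathbb{R}$ be a real valued function. Let $S = \operatorname{dom} f$ and $U \subseteq S$ be an open set. Assume that $f$ is Fréchet differentiable at $x \in U$. Then, the Fréchet derivative $T_x : V \to \mathbb{R}$ is a bounded linear functional.

The gradient of $f$ at $x$ is denoted by $\nabla f(x)$. It is the element $\nabla f(x) \in V$ satisfying

$$ \langle h, \nabla f(x) \rangle = T_x(h) \quad \forall h \in V. $$

Such an element exists and is unique by the Riesz representation theorem.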
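In $V = \mathbb{R}^n$ with the standard inner product, the defining identity can be checked numerically. The sketch below assumes $n = 2$ and $f(x) = x^T A x$ with $A = \begin{bmatrix} 2 & 1 \\ 1 & 3 \end{bmatrix}$, for which $T_x(h) = 2 x^T A h$; the central-difference helper `numerical_gradient` is a hypothetical name and only approximates the gradient.

```python
def f(x):
    """x^T A x with A = [[2, 1], [1, 3]] written out explicitly."""
    return 2.0 * x[0] ** 2 + 2.0 * x[0] * x[1] + 3.0 * x[1] ** 2


def numerical_gradient(func, x, step=1e-6):
    """Central differences along the coordinate directions e_1, ..., e_n."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += step
        backward[i] -= step
        grad.append((func(forward) - func(backward)) / (2.0 * step))
    return grad


x = [1.0, -2.0]
h = [0.5, 4.0]
grad = numerical_gradient(f, x)
inner = sum(hi * gi for hi, gi in zip(h, grad))   # <h, grad f(x)>
print(grad, inner)
```

For these inputs $2 A x = (0, -10)$ and $2 x^T A h = -40$, so the computed inner product should agree with $T_x(h)$ up to rounding.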