Hahn–Banach theorem


The Hahn–Banach theorem is a central tool in functional analysis. It allows the extension of bounded linear functionals defined on a subspace of some vector space to the whole space, and it also shows that there are "enough" continuous linear functionals defined on every normed vector space to make the study of the dual space "interesting". Another version of the Hahn–Banach theorem is known as the Hahn–Banach separation theorem or the hyperplane separation theorem, and has numerous uses in convex geometry.

History

The theorem is named for the mathematicians Hans Hahn and Stefan Banach, who proved it independently in the late 1920s. The special case of the theorem for the space of continuous functions on an interval was proved earlier (in 1912) by Eduard Helly,[1] and a more general extension theorem, the M. Riesz extension theorem, from which the Hahn–Banach theorem can be derived, was proved in 1923 by Marcel Riesz.[2]

The first Hahn–Banach theorem was proved by Eduard Helly in 1912, who showed that certain linear functionals defined on a subspace of a certain type of normed space had an extension of the same norm. Helly did this by first proving that a one-dimensional extension exists (where the linear functional has its domain extended by one dimension) and then using induction. In 1927, Hahn defined general Banach spaces and used Helly's technique to prove a norm-preserving version of the Hahn–Banach theorem for Banach spaces (where a bounded linear functional on a subspace has a bounded linear extension of the same norm to the whole space). In 1929, Banach, who was unaware of Hahn's result, generalized it by replacing the norm-preserving version with the dominated extension version that uses sublinear functions. Whereas Helly's proof used mathematical induction, Hahn and Banach both used transfinite induction.[3]

The Hahn–Banach theorem arose from attempts to solve infinite systems of linear equations. This is needed to solve problems such as the moment problem, whereby given all the potential moments of a function one must determine if a function having these moments exists, and, if so, find it in terms of those moments. Another such problem is the Fourier cosine series problem, whereby given all the potential Fourier cosine coefficients one must determine if a function having those coefficients exists, and, again, find it if so.

Riesz and Helly solved the problem for certain classes of spaces (such as L^p([0, 1]) and C([a, b])) where they discovered that the existence of a solution was equivalent to the existence and continuity of certain linear functionals. In effect, they needed to solve the following problem:[3]

(The vector problem) Given a collection (f_i)_{i ∈ I} of bounded linear functionals on a normed space X and a collection of scalars (c_i)_{i ∈ I}, determine if there is an x ∈ X such that f_i(x) = c_i for all i ∈ I.

To solve this, if X is reflexive then it suffices to solve the following dual problem:[3]

(The functional problem) Given a collection (x_i)_{i ∈ I} of vectors in a normed space X and a collection of scalars (c_i)_{i ∈ I}, determine if there is a bounded linear functional f on X such that f(x_i) = c_i for all i ∈ I.

Riesz went on to define L^p([0, 1]) (1 < p < ∞) in 1910 and the ℓ^p spaces in 1913. While investigating these spaces he proved a special case of the Hahn–Banach theorem. Helly also proved a special case of the Hahn–Banach theorem in 1912. In 1910, Riesz solved the functional problem for some specific spaces and in 1912, Helly solved it for a more general class of spaces. It wasn't until 1932 that Banach, in one of the first important applications of the Hahn–Banach theorem, solved the general functional problem. The following theorem states the general functional problem and characterizes its solution.[3]

Theorem[3] (The functional problem) — Let X be a real or complex normed space, I a non-empty set, (c_i)_{i ∈ I} a family of scalars, and (x_i)_{i ∈ I} a family of vectors in X.

There exists a continuous linear functional f on X such that f(x_i) = c_i for all i ∈ I if and only if there exists a K > 0 such that for any choice of scalars (s_i)_{i ∈ I} where all but finitely many s_i are 0, we necessarily have

|∑_{i ∈ I} s_i c_i| ≤ K ‖∑_{i ∈ I} s_i x_i‖.

One can use the above theorem to deduce the Hahn–Banach theorem.[3] If X is reflexive, then this theorem solves the vector problem.
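The "only if" direction of the theorem above is a direct computation, sketched here in LaTeX notation. The choice K = ‖f‖ is one admissible constant, not the only one, and the bound it produces, |∑ s_i c_i| ≤ K ‖∑ s_i x_i‖, is the standard form of the condition.

```latex
% Necessity: assume a continuous linear f with f(x_i) = c_i exists.
% Only finitely many s_i are nonzero, so every sum below is finite.
\begin{align*}
\Bigl|\sum_{i \in I} s_i c_i\Bigr|
  &= \Bigl|\sum_{i \in I} s_i f(x_i)\Bigr|
   = \Bigl|f\Bigl(\sum_{i \in I} s_i x_i\Bigr)\Bigr| \\
  &\le \|f\| \, \Bigl\|\sum_{i \in I} s_i x_i\Bigr\|.
\end{align*}
% Hence K = \|f\| works when f \neq 0, and any K > 0 works when f = 0.
```

The converse is where the Hahn–Banach theorem enters: the inequality makes ∑ s_i x_i ↦ ∑ s_i c_i a well-defined linear functional of norm at most K on the span of the x_i, which can then be extended to X.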

Hahn–Banach theorem

Theorem (Hahn–Banach)[3][4] — Set 𝕂 to be either ℝ or ℂ and let X be a 𝕂-vector space. If f : M → 𝕂 is a 𝕂-linear functional on a 𝕂-linear subspace M and p : X → ℝ is a nonnegative, sublinear function such that

|f(m)| ≤ p(m)     for all m ∈ M,

then there exists a 𝕂-linear F : X → 𝕂 such that

F(m) = f(m)     for all m ∈ M,
|F(x)| ≤ p(x)     for all x ∈ X.

The extension F is in general not uniquely specified by f and the proof gives no explicit method as to how to find F.
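A small finite-dimensional illustration of this non-uniqueness follows. The space, subspace, and functional are chosen here purely for the example and do not come from the cited sources.

```latex
% Illustrative choice (an assumption for this example):
%   X = \mathbb{R}^2, \quad p(x_1, x_2) = |x_1| + |x_2|,
%   M = \{(t, 0) : t \in \mathbb{R}\}, \quad f(t, 0) = t.
% Every linear extension of f to X has the form
\[
  F_c(x_1, x_2) = x_1 + c\,x_2, \qquad c \in \mathbb{R}.
\]
% If |c| \le 1, then
\[
  |F_c(x_1, x_2)| \le |x_1| + |c|\,|x_2| \le p(x_1, x_2),
\]
% while if |c| > 1, then |F_c(0, 1)| = |c| > 1 = p(0, 1).
% So the dominated extensions are exactly the F_c with c \in [-1, 1].
```

In this example every c in [−1, 1] gives a valid extension, so f alone does not single one out.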

It is possible to relax slightly the subadditivity condition on p, requiring only that for all x, y ∈ X and all scalars a and b satisfying |a| + |b| ≤ 1,

p(ax + by) ≤ |a| p(x) + |b| p(y).[5]

It is further possible to relax the positive homogeneity and the subadditivity conditions on p, requiring only that p is convex.[6]

The Mizar project has completely formalized and automatically checked the proof of the Hahn–Banach theorem in the HAHNBAN file.[7]

Proof

In the complex case, the ℂ-linearity assumptions demand that M = N + Ni for some real vector space N. Moreover, for every vector x ∈ N, f(ix) = i f(x). Thus the real part of a linear functional already determines the behavior of the linear functional as a whole, and proving the real case will suffice.[3]
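The reduction to the real case can be written out as follows. This is a sketch of the usual argument, and the bound in the last step assumes p(e^{iθ}x) = p(x) for all real θ, which holds for a seminorm but is an extra assumption for a general sublinear p. The names u and U are introduced here for illustration.

```latex
% Write u = \operatorname{Re} f, a real-linear functional on M.
% From f(ix) = i f(x):
\begin{align*}
u(ix) &= \operatorname{Re}\bigl(i f(x)\bigr) = -\operatorname{Im} f(x), \\
f(x)  &= u(x) - i\,u(ix).
\end{align*}
% Given a real-linear extension U of u with U \le p on X, set
\[
  F(x) = U(x) - i\,U(ix).
\]
% F is complex-linear (F(ix) = i F(x)) and extends f. For the bound,
% pick \theta with e^{i\theta} F(x) = |F(x)|; then, assuming
% p(e^{i\theta} x) = p(x),
\[
  |F(x)| = F(e^{i\theta} x) = U(e^{i\theta} x) \le p(e^{i\theta} x) = p(x).
\]
```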

First, we note Helly's initial result:[3] if M has codimension 1, then Hahn-Banach is easy.

Lemma[3] (One-dimensional dominated extension theorem) — Let X be a real vector space, p : X → ℝ a sublinear function, f : M → ℝ a linear functional on a proper vector subspace M ⊊ X such that f ≤ p on M (i.e. f(m) ≤ p(m) for all m ∈ M), and x ∈ X a vector not in M. There exists a linear extension F : M ⊕ ℝx → ℝ of f to M ⊕ ℝx = span { M, x } such that F ≤ p on M ⊕ ℝx.

Proof —

To prove this lemma, let m, n ∈ M. By the linearity of f and the subadditivity of p,

−p(−x − n) − f(n) ≤ p(m + x) − f(m).

In particular, let

a = sup_{n ∈ M} [−p(−x − n) − f(n)], and b = inf_{m ∈ M} [p(m + x) − f(m)].

Then we conclude "the decisive inequality"[3] that a ≤ b. So let c ∈ [a, b] and define F(m + rx) = f(m) + rc; then for r > 0, the choice c ≤ b and the positive homogeneity of p give

F(m + rx) = f(m) + rc ≤ p(m + rx).

The case r < 0 is similar, using a ≤ c.
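For readers who want the intermediate steps, the lemma's two computations can be spelled out as below. This is a sketch using only the hypotheses of the lemma (f linear with f ≤ p on M, p sublinear); the case split on the sign of r is the standard way to organize it.

```latex
% Step 1: the decisive inequality. For m, n \in M,
\begin{align*}
f(m) - f(n) = f(m - n) &\le p(m - n) \\
  &= p\bigl((m + x) + (-x - n)\bigr) \\
  &\le p(m + x) + p(-x - n),
\end{align*}
% which rearranges to -p(-x - n) - f(n) \le p(m + x) - f(m), so a \le b.
%
% Step 2: F \le p, where F(m + rx) = f(m) + rc and a \le c \le b.
% Case r > 0: use c \le b with m replaced by m/r, then multiply by r:
\[
  rc \le r\,p\bigl(\tfrac{m}{r} + x\bigr) - r f\bigl(\tfrac{m}{r}\bigr)
      = p(m + rx) - f(m).
\]
% Case r < 0: use c \ge a with n = m/r, then multiply by r < 0:
\[
  rc \le -r\,p\bigl(-x - \tfrac{m}{r}\bigr) - r f\bigl(\tfrac{m}{r}\bigr)
      = p(m + rx) - f(m).
\]
% Case r = 0 is the hypothesis f \le p on M.
```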

Now apply Zorn's lemma: the possible extensions of f are partially ordered by extension of each other, so there is a maximal extension F. By the codimension-1 result, if F is not defined on all of X, then it can be further extended. Thus F must be defined everywhere, as claimed.
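The hypothesis of Zorn's lemma, that every chain has an upper bound, is not spelled out above. A sketch of the usual verification, with the notation (N, g) for a partial extension introduced here for illustration:

```latex
% Poset: pairs (N, g) with M \subseteq N \subseteq X a subspace,
% g : N \to \mathbb{R} linear, g|_M = f, and g \le p on N.
% Order: (N_1, g_1) \preceq (N_2, g_2) iff N_1 \subseteq N_2 and g_2|_{N_1} = g_1.
% Upper bound of a nonempty chain \{(N_\alpha, g_\alpha)\}:
\[
  N = \bigcup_\alpha N_\alpha, \qquad
  g(y) = g_\alpha(y) \ \text{ whenever } y \in N_\alpha .
\]
% N is a subspace because the N_\alpha are nested; g is well defined
% because the g_\alpha agree on overlaps, and g \le p on N.
```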

In locally convex spaces

In the above form, the functional to be extended must already be bounded by a sublinear function. In some applications, this might come close to begging the question. However, in locally convex spaces, any continuous linear functional is already bounded by a continuous seminorm, which is sublinear. One thus has

Continuous extensions on locally convex spaces[3] — Let X be a locally convex topological vector space over 𝕂 (either ℝ or ℂ), M a vector subspace of X, and f a continuous linear functional on M. Then f has a continuous linear extension to all of X. If the topology on X arises from a norm, then the norm of f is preserved by this extension.
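In the normed case, the norm-preservation claim follows from the dominated version by a specific choice of p. A short sketch, where ‖f‖_M denotes the operator norm of f on M (notation introduced here):

```latex
% X normed, f bounded on M with norm \|f\|_M. Take the sublinear function
\[
  p(x) = \|f\|_M \, \|x\|, \qquad x \in X .
\]
% Then |f(m)| \le p(m) on M, so Hahn--Banach gives a linear F extending f with
\[
  |F(x)| \le \|f\|_M \, \|x\| \quad \text{for all } x \in X,
  \qquad \text{i.e. } \|F\| \le \|f\|_M .
\]
% Conversely F extends f, so the supremum defining \|F\| runs over a
% larger set: \|F\| \ge \|f\|_M. Hence \|F\| = \|f\|_M.
```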

In category-theoretic terms, the field 𝕂 is an injective object in the category of locally convex vector spaces.

Relation to axiom of choice

The above proof uses Zorn's lemma, which is equivalent to the axiom of choice. It is now known (see below) that the ultrafilter lemma (or equivalently, the Boolean prime ideal theorem), which is slightly weaker than the axiom of choice, is actually strong enough.

The Hahn–Banach theorem is equivalent to the following:[8]

(∗): On every Boolean algebra B there exists a "probability charge", that is: a nonconstant finitely additive map from B into [0, 1].

(The Boolean prime ideal theorem is equivalent to the statement that there are always nonconstant probability charges which take only the values 0 and 1.)

In Zermelo–Fraenkel set theory, one can show that the Hahn–Banach theorem is enough to derive the existence of a non-Lebesgue measurable set.[9] Moreover, the Hahn–Banach theorem implies the Banach–Tarski paradox.[10]

For separable Banach spaces, D. K. Brown and S. G. Simpson proved that the Hahn–Banach theorem follows from WKL₀, a weak subsystem of second-order arithmetic that takes a form of Kőnig's lemma restricted to binary trees as an axiom. In fact, they prove that under a weak set of assumptions, the two are equivalent, an example of reverse mathematics.[11][12]

"Geometric Hahn–Banach" (the Hahn–Banach separation theorems)

The key element of the Hahn–Banach theorem is fundamentally a result about the separation of two convex sets: {−p(−x − n) − f(n) : n ∈ M}, and {p(m + x) − f(m) : m ∈ M}. This sort of argument appears widely in convex geometry,[13] optimization theory, and economics. Lemmas to this end derived from the original Hahn–Banach theorem are known as the Hahn–Banach separation theorems.[14][15]

Theorem[14] — Let X be a real locally convex topological vector space and let A and B be non-empty convex subsets. If Int A ≠ ∅ and B ∩ Int A = ∅ then there exists a continuous linear functional f on X such that sup f(A) ≤ inf f(B) and f(a) < inf f(B) for all a ∈ Int A (such an f is necessarily non-zero).

Often one assumes that the convex sets have additional structure, e.g. that they are open or compact. In that case, the conclusion can be substantially strengthened:

Theorem[3][16] — Let X be a real topological vector space and choose A, B convex non-empty disjoint subsets of X.

  • If A is open then A and B are separated by a (closed) hyperplane. Explicitly, this means that there exists a continuous linear map f : X → 𝕂 and s ∈ ℝ such that f(a) < s ≤ f(b) for all a ∈ A, b ∈ B. If both A and B are open then the right-hand side may be taken strict as well.
  • If X is locally convex, A is compact, and B closed, then A and B are strictly separated: there exists a continuous linear map f : X → 𝕂 and s, t ∈ ℝ such that f(a) < t < s < f(b) for all a ∈ A, b ∈ B.

If X is complex, then the same claims hold, but for the real part of f.
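A concrete instance of the first case (A open) in the plane may help fix ideas. The sets and the functional are chosen here for illustration only.

```latex
% Illustrative choice (an assumption for this example), in X = \mathbb{R}^2:
%   A = \{x : x_1^2 + x_2^2 < 1\}  (open, convex),
%   B = \{x : x_1 \ge 1\}          (closed, convex, disjoint from A).
% Take f(x_1, x_2) = x_1 and s = 1. Then
\[
  f(a) = a_1 \le \sqrt{a_1^2 + a_2^2} < 1 = s \le b_1 = f(b)
  \qquad \text{for all } a \in A,\ b \in B .
\]
% The separating hyperplane is the line \{x : x_1 = 1\}.
```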

One important corollary is known as the Geometric Hahn–Banach theorem or Mazur's theorem.

Theorem (Mazur)[17] — Let M be a vector subspace of the topological vector space X. Suppose K is a non-empty convex open subset of X with K ∩ M = ∅. Then there is a closed hyperplane (codimension-1 vector subspace) N ⊆ X that contains M, but remains disjoint from K.

To see that Mazur's theorem follows from the Hahn-Banach separation theorems, note that M is convex and apply the first bullet. Mazur's theorem clarifies that vector subspaces (even ones that are not closed) can be characterized by linear functionals.

Corollary[18] (Separation of a subspace and an open convex set) — Let X be a locally convex vector space, M a vector subspace, and U a non-empty open convex subset disjoint from M. Then there exists a continuous linear functional f on X such that f(m) = 0 for all m ∈ M and Re f > 0 on U.

Supporting hyperplanes

Since points are trivially convex, geometric Hahn–Banach implies that functionals can detect the boundary of a set. In particular, let X be a real topological vector space and A ⊆ X be convex with Int A ≠ ∅. If a_0 ∈ A ∖ Int A then there is a functional that is vanishing at a_0, but supported on the interior of A.[14]

Call a normed space X smooth if at each point x in its unit ball there exists a unique closed supporting hyperplane to the unit ball at x. Köthe showed in 1983 that a normed space is smooth at a point x if and only if the norm is Gateaux differentiable at that point.[3]

Balanced or disked neighborhoods

Let U be a convex balanced neighborhood of 0 in a locally convex topological vector space X and suppose x ∈ X is not an element of U. Then there exists a continuous linear functional f on X such that

sup |f(U)| ≤ |f(x)|.[3]

Applications

The Hahn–Banach theorem is the first sign of an important philosophy in functional analysis: to understand a space, one should understand its continuous functionals.

For example, linear subspaces are characterized by functionals: if X is a normed vector space with linear subspace M (not necessarily closed) and if z is an element of X not in the closure of M, then there exists a continuous linear map f : X → 𝕂 with f(x) = 0 for all x in M, f(z) = 1, and ‖f‖ = dist(z, M)⁻¹. (To see this, note that dist(·, M) is a sublinear function.) Moreover, if z is an element of X, then there exists a continuous linear map f : X → 𝕂 such that f(z) = ‖z‖ and ‖f‖ ≤ 1. This implies that the natural injection J from a normed space X into its double dual X′′ is isometric.
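The isometry claim for J is a two-line consequence of the norming functional just described. A sketch, writing X′ for the continuous dual and f_0 for the norming functional (both names introduced here):

```latex
% J : X \to X'' is evaluation: (Jx)(f) = f(x) for f \in X'.
% Upper bound, straight from the definition of the operator norm:
\[
  \|Jx\| = \sup_{\|f\| \le 1} |f(x)| \le \sup_{\|f\| \le 1} \|f\|\,\|x\| \le \|x\| .
\]
% Lower bound, from Hahn--Banach: pick f_0 with f_0(x) = \|x\| and \|f_0\| \le 1,
\[
  \|Jx\| \ge |f_0(x)| = \|x\| .
\]
% Hence \|Jx\| = \|x\| for every x \in X, so J is an isometry.
```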

That last result also suggests that the Hahn–Banach theorem can often be used to locate a "nicer" topology in which to work. For example, many results in functional analysis assume that a space is Hausdorff or locally convex. However, suppose X is a topological vector space, not necessarily Hausdorff or locally convex, but with a nonempty, proper, convex, open set M. Then geometric Hahn-Banach implies that there is a hyperplane separating M from any other point. In particular, there must exist a nonzero functional on X — that is, the continuous dual space X* is non-trivial.[3][19] Considering X with the weak topology induced by X*, then X becomes locally convex; by the second bullet of geometric Hahn-Banach, the weak topology on this new space separates points. Thus X with this weak topology becomes Hausdorff. This sometimes allows some results from locally convex topological vector spaces to be applied to non-Hausdorff and non-locally convex spaces.

Partial differential equations

The Hahn–Banach theorem is often useful when one wishes to apply the method of a priori estimates. Suppose that we wish to solve the linear differential equation Pu = f for u, with f given in some Banach space X. If we have control on the size of u in terms of ‖f‖_X and we can think of u as a bounded linear functional on some suitable space of test functions g, then we can view f as a linear functional by adjunction: (f, g) = (u, P*g). At first, this functional is only defined on the image of P, but using the Hahn–Banach theorem, we can try to extend it to the entire codomain X. The resulting functional is often defined to be a weak solution to the equation.
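One common way to make this outline precise is sketched below. Everything named here is an assumption for the sake of illustration and is not taken from the text above: D is a space of test functions, P^* is the formal adjoint of P, Y is a normed space containing P^*(D), and C is the constant in an assumed a priori estimate. In this formulation the functional is first defined on the image of the adjoint.

```latex
% Assumptions (hypothetical, for illustration): D is a space of test
% functions, P^* is the formal adjoint of P, Y is a normed space
% containing P^*(D), and the a priori estimate
\[
  |(f, g)| \le C \, \|P^* g\|_Y \qquad \text{for all } g \in D
\]
% holds. Define \ell on the subspace P^*(D) \subseteq Y by
\[
  \ell(P^* g) = (f, g).
\]
% Well defined: if P^* g_1 = P^* g_2, the estimate applied to g_1 - g_2
% gives (f, g_1) = (f, g_2). Bounded: |\ell(P^* g)| \le C \|P^* g\|_Y.
% Hahn--Banach extends \ell to u \in Y' with \|u\| \le C, and by construction
\[
  (u, P^* g) = (f, g) \qquad \text{for all } g \in D,
\]
% which is the weak form of Pu = f.
```

The extension step is non-constructive, so this argument yields existence of a weak solution and a bound on its size, but no formula for it.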

Characterizing reflexive Banach spaces

Theorem[20] — A real Banach space is reflexive if and only if every pair of non-empty disjoint closed convex subsets, one of which is bounded, can be strictly separated by a hyperplane.

Example from Fredholm theory

To illustrate an actual application of the Hahn–Banach theorem, we will now prove a result that follows almost entirely from the Hahn–Banach theorem.

Proposition — Suppose X is a Hausdorff locally convex TVS over the field 𝕂 and Y is a vector subspace of X that is TVS-isomorphic to 𝕂^I for some set I. Then Y is a closed and complemented vector subspace of X.

Proof —

Since 𝕂^I is a complete TVS so is Y, and since any complete subset of a Hausdorff TVS is closed, Y is a closed subset of X. Let f = (f_i)_{i ∈ I} : Y → 𝕂^I be a TVS isomorphism, so that each f_i : Y → 𝕂 is a continuous surjective linear functional. By the Hahn–Banach theorem, we may extend each f_i to a continuous linear functional F_i : X → 𝕂 on X. Let F := (F_i)_{i ∈ I} : X → 𝕂^I, so F is a continuous linear surjection such that its restriction to Y is F|_Y = (F_i|_Y)_{i ∈ I} = (f_i)_{i ∈ I} = f. It follows that if we define P := f⁻¹ ∘ F : X → Y then the restriction to Y of this continuous linear map P|_Y : Y → Y is the identity map 𝟙_Y on Y, for P|_Y = f⁻¹ ∘ F|_Y = f⁻¹ ∘ f = 𝟙_Y. So in particular, P is a continuous linear projection onto Y (i.e. P ∘ P = P). Thus Y is complemented in X and X = Y ⊕ ker P in the category of TVSs. ∎

One may use the above result to show that every closed vector subspace of ℝ^ℕ is complemented and either finite dimensional or else TVS-isomorphic to ℝ^ℕ.

Generalizations

General template

There are now many other versions of the Hahn–Banach theorem. The general template for the various versions of the Hahn–Banach theorem presented in this article is as follows:

X is a vector space, p is a sublinear function on X (possibly a seminorm), M is a vector subspace of X (possibly closed), and f is a linear functional on M satisfying |f| ≤ p on M (and possibly some other conditions). One then concludes that there exists a linear extension F of f to X such that |F| ≤ p on X (possibly with additional properties).

For seminorms

Hahn–Banach for seminorms[21][22] — If M is a vector subspace of X, p is a seminorm on M, and q is a seminorm on X such that p ≤ q|_M, then there exists a seminorm P on X such that P|_M = p and P ≤ q.

A proof runs as follows:

Lemma[3] — Let M be a vector subspace of a real or complex vector space X, let D be an absorbing disk in X, and let f be a linear functional on M such that |f| ≤ 1 on M ∩ D. Then there exists a linear functional F on X extending f such that |F| ≤ 1 on D.

Let S be the convex hull of { m ∈ M : p(m) ≤ 1 } ∪ { x ∈ X : q(x) ≤ 1 }. Note that S is an absorbing disk in X, and call its Minkowski functional P. Then p = P on M and P ≤ q on X.

Geometric separation

Hahn–Banach sandwich theorem[3] — Let S be any subset of a real vector space X, let p be a sublinear function on X, and let f : S → ℝ be any map. If there exist positive numbers a and b such that for all x, y ∈ S,

then there exists a linear functional F on X such that F ≤ p on X and f ≤ F on S.

Maximal linear extension

Theorem[3] (Andenaes, 1970) — Let M be a vector subspace of a real vector space X, p be a sublinear function on X, f be a linear functional on M such that f ≤ p on M, and let S be any subset of X. Then there exists a linear functional F on X that extends f, satisfies F ≤ p on X, and is (pointwise) maximal in the following sense: if G is a linear functional on X extending f and satisfying G ≤ p on X, then G ≥ F implies that G = F on S.

Vector valued Hahn–Banach

Theorem[3] — Let X and Y be vector spaces over the same field, M be a vector subspace of X, and f : M → Y be a linear map. Then there exists a linear map F : X → Y that extends f.

For nonlinear functions

The following theorem of Mazur–Orlicz (1953) is equivalent to the Hahn–Banach theorem.

Mazur–Orlicz theorem[23] — Let T be any set, r : T → ℝ be any real-valued map, X be a real or complex vector space, v : T → X be any map, and p be a sublinear function on X. Then the following are equivalent:

  1. there exists a real-valued linear functional F on X such that F ≤ p on X and r ≤ F ∘ v on T;
  2. for any positive integer n, any sequence s_1, ..., s_n of non-negative real numbers, and any sequence t_1, ..., t_n of elements of T, ∑_{i=1}^n s_i r(t_i) ≤ p(∑_{i=1}^n s_i v(t_i)).
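The easy implication, from condition 1 to condition 2, is a three-step chain of inequalities, sketched here. The inequality it arrives at, ∑ s_i r(t_i) ≤ p(∑ s_i v(t_i)), is the standard form of the Mazur–Orlicz condition; the converse is the substantive direction and is not reproduced.

```latex
% (1) => (2): assume F is linear with F \le p on X and r \le F \circ v on T.
% For s_1, \dots, s_n \ge 0 and t_1, \dots, t_n \in T:
\begin{align*}
\sum_{i=1}^n s_i\, r(t_i)
  &\le \sum_{i=1}^n s_i\, F\bigl(v(t_i)\bigr)
     && \text{since } s_i \ge 0 \text{ and } r \le F \circ v \\
  &= F\Bigl(\sum_{i=1}^n s_i\, v(t_i)\Bigr)
     && \text{by linearity of } F \\
  &\le p\Bigl(\sum_{i=1}^n s_i\, v(t_i)\Bigr)
     && \text{since } F \le p .
\end{align*}
```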

The following theorem characterizes when any scalar function on X (not necessarily linear) has a continuous linear extension to all of X.

Theorem (The extension principle[24]) — Let f be a scalar-valued function on a subset S of a topological vector space X. Then there exists a continuous linear functional F on X extending f if and only if there exists a continuous seminorm p on X such that

|∑_{i=1}^n a_i f(s_i)| ≤ p(∑_{i=1}^n a_i s_i)

for all positive integers n and all finite sequences (a_i)_{i=1}^n of scalars and (s_i)_{i=1}^n of elements of S.
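Both directions of the extension principle are short once the condition is read as |∑ a_i f(s_i)| ≤ p(∑ a_i s_i). A sketch follows, in which the choice p = |F| for the necessity direction is one convenient seminorm, not the only possible one.

```latex
% Necessity: if F is a continuous linear functional on X with F|_S = f,
% then p(x) = |F(x)| is a continuous seminorm on X, and
\[
  \Bigl|\sum_{i=1}^n a_i f(s_i)\Bigr|
  = \Bigl|F\Bigl(\sum_{i=1}^n a_i s_i\Bigr)\Bigr|
  = p\Bigl(\sum_{i=1}^n a_i s_i\Bigr).
\]
% Sufficiency (sketch): the inequality makes
%   \sum a_i s_i \mapsto \sum a_i f(s_i)
% a well-defined linear functional on span(S) dominated by p,
% which the Hahn--Banach theorem then extends to X with |F| \le p.
```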

Converse

Let X be a topological vector space. A vector subspace M of X has the extension property if any continuous linear functional on M can be extended to a continuous linear functional on X, and we say that X has the Hahn–Banach extension property (HBEP) if every vector subspace of X has the extension property.[25]

The Hahn–Banach theorem guarantees that every Hausdorff locally convex space has the HBEP. For complete metrizable topological vector spaces there is a converse, due to Kalton: every complete metrizable TVS with the Hahn–Banach extension property is locally convex.[25] On the other hand, a vector space X of uncountable dimension, endowed with the finest vector topology, is a topological vector space with the Hahn–Banach extension property that is neither locally convex nor metrizable.[25]

A vector subspace M of a TVS X has the separation property if for every element x of X such that x ∉ M, there exists a continuous linear functional f on X such that f(x) ≠ 0 and f(m) = 0 for all m ∈ M. Clearly, the continuous dual space of a TVS X separates points on X if and only if { 0 } has the separation property. In 1992, Kakol proved that for any infinite-dimensional vector space X, there exist TVS-topologies on X that do not have the HBEP despite having enough continuous linear functionals for the continuous dual space to separate points on X. However, if X is a TVS then every vector subspace of X has the extension property if and only if every vector subspace of X has the separation property.[25]

See also

References

  1. O'Connor, John J.; Robertson, Edmund F., "Hahn–Banach theorem", MacTutor History of Mathematics archive, University of St Andrews.
  2. See M. Riesz extension theorem. According to Gårding, L. (1970). "Marcel Riesz in memoriam". Acta Math. 124 (1): I–XI. doi:10.1007/bf02394565. MR 0256837, the argument was known to Riesz already in 1918.
  3. Narici & Beckenstein 2011, pp. 177–220.
  4. Rudin 1991, Th. 3.2.
  5. Reed & Simon 1980.
  6. Schechter 1996.
  7. HAHNBAN file.
  8. Schechter, Eric. Handbook of Analysis and its Foundations. p. 620.
  9. Foreman, M.; Wehrung, F. (1991). "The Hahn–Banach theorem implies the existence of a non-Lebesgue measurable set" (PDF). Fundamenta Mathematicae. 138: 13–19. doi:10.4064/fm-138-1-13-19.
  10. Pawlikowski, Janusz (1991). "The Hahn–Banach theorem implies the Banach–Tarski paradox". Fundamenta Mathematicae. 138: 21–22. doi:10.4064/fm-138-1-21-22.
  11. Brown, D. K.; Simpson, S. G. (1986). "Which set existence axioms are needed to prove the separable Hahn–Banach theorem?". Annals of Pure and Applied Logic. 31: 123–144. doi:10.1016/0168-0072(86)90066-7.
  12. Simpson, Stephen G. (2009), Subsystems of second order arithmetic, Perspectives in Logic (2nd ed.), Cambridge University Press, ISBN 978-0-521-88439-6, MR 2517689.
  13. Harvey, R.; Lawson, H. B. (1983). "An intrinsic characterisation of Kähler manifolds". Invent. Math. 74 (2): 169–198. doi:10.1007/BF01394312. S2CID 124399104.
  14. Zălinescu, C. (2002). Convex analysis in general vector spaces. River Edge, NJ: World Scientific Publishing Co., Inc. pp. 5–7. ISBN 981-238-067-1. MR 1921556.
  15. Gabriel Nagy, Real Analysis lecture notes.
  16. Brezis, Haim (2011). Functional Analysis, Sobolev Spaces, and Partial Differential Equations. New York: Springer. pp. 6–7.
  17. Trèves 2006, p. 184.
  18. Narici & Beckenstein 2011, p. 195.
  19. Schaefer & Wolff 1999, p. 47.
  20. Narici & Beckenstein 2011, p. 212.
  21. Wilansky 2013, pp. 18–21.
  22. Narici & Beckenstein 2011, p. 150.
  23. Narici & Beckenstein 2011, pp. 177–220.
  24. Edwards 1995, pp. 124–125.
  25. Narici & Beckenstein 2011, pp. 225–273.

Bibliography