Lecture notes, etc., for
Math 55b: Honors Real and Complex Analysis
(Spring [2016-]2017)
If you find a mistake, omission, etc., please
let me know
by e-mail.
The orange balls
mark our current location in the course,
and the current problem set.
New this term: CAs’
TeX notes of class lectures (thanks, A&W!).
Office hours will be in the Lowell House Dining Hall as in Math 55a,
or by appointment.
The first office hours are Tuesday, 31 January, from 8:00 to 10:00 PM.
Afterwards (starting 7 February) it will be Tuesdays 8:00 to 9:30
till further notice.
CA office hours will be in Leverett Dining Hall,
Monday 8-10 PM (concurrent with Math Night) and Wednesday 8-10 PM.
Section times and rooms:
Andy, Friday 3-4 in SC 104;
Wyatt, Wednesday 3-4 in SC 411.
[SC = Science Center]
Our first topic is the topology of metric spaces,
a fundamental tool of modern mathematics
that we shall use mainly as a key ingredient in our rigorous development
of differential and integral calculus
over R and C.
To supplement the treatment in Rudin's textbook,
I wrote up 20-odd pages of notes in six sections;
copies will be distributed in class, and you also may view them
and print out copies in advance from the PDF files linked below.
[Some of the explanations, as of notations such as
f (·) and the triangle inequality
in C, will not be necessary; they were needed
when this material was the initial topic of Math 55a,
and it doesn't feel worth the effort to delete them now that it's been
moved to 55b. Likewise for the comment about the Euclidean distance
at the top of page 2 of the initial handout on
“basic definitions and examples”.]
Metric Topology I
Basic definitions and examples:
the metric spaces R^n
and other product spaces; isometries; boundedness and function spaces
The “sup metric” on X^S
is sometimes also called the “uniform metric” because
d(f, g)≤r
is equivalent to a bound
d(f(s), g(s))≤r
for all s in S that is “uniform”
in the sense that it's independent of the choice of s.
Likewise for the sup metric on the space of bounded functions
from S to an arbitrary metric space X
(see the next paragraph).
If S is an infinite set and
X is an unbounded metric space then
we can't use our definition of X^S
as a metric space because
sup_{s∈S} d_X(f(s), g(s))
might be infinite. But the bounded functions from S
to X do constitute a metric space under the same
definition of d_{X^S}. A function
is said to be “bounded” if its image is a bounded set.
You should check that
d_{X^S}(f, g)
is in fact finite for bounded f and g.
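To make the definition concrete, here is a minimal sketch of the sup metric for real-valued functions. It assumes a finite sample S standing in for the index set, so the supremum is just a maximum; the names sup_metric, d_R, f, g are hypothetical and chosen only for this illustration.

```python
def sup_metric(f, g, S, d_X):
    """Sup (uniform) distance between f and g over a finite index set S."""
    return max(d_X(f(s), g(s)) for s in S)


def d_R(x, y):
    """The usual metric on R."""
    return abs(x - y)


def f(s):
    return s * s


def g(s):
    return s


S = [k / 100 for k in range(101)]   # finite stand-in for [0, 1]
r = sup_metric(f, g, S, d_R)        # largest pointwise distance over S
```

The bound d(f(s), g(s)) ≤ r then holds for every s in S at once, which is the “uniform” feature described above.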
Now that metric topology is in 55b, not 55a, the following
observation can be made: if X is R
or C, the bounded functions in
X^S constitute a vector space,
and the sup metric comes from a norm on that vector space:
d(f, g) =
||f−g||
where the norm ||·|| is defined by
||f|| = sup_{s∈S} |f(s)|.
Likewise for the bounded functions from S to any normed
vector space. Such spaces will figure in our development of real
analysis (and in your further study of analysis beyond Math 55).
The “Proposition” on page 3 of the first topology handout
can be extended as follows:
iv) For every point p of X
there exists a real number M such that
d(p, q) < M
for all q of E.
In other words, for every p in X
there exists an open ball about p that contains E.
Do you see why this is equivalent to (i), (ii), and (iii)?
Metric Topology II
Open and closed sets, and related notions
Metric Topology III
Introduction to functions and continuity
Metric Topology IV
Sequences and convergence, etc.
Metric Topology V
Compactness and sequential compactness
Metric Topology VI
Cauchy sequences and related notions
(completeness, completions, and a third formulation of compactness)
Here is a more direct proof of the theorem that
a continuous map
f : X → Y
between metric spaces is uniformly continuous if X
is compact.
Assume not. Then there exists
ε > 0
such that for all δ > 0
there are some points
p, q in X
such that
d(p, q) < δ
but
d(f (p), f (q)) ≥ ε.
For each n = 1, 2, 3, …,
choose p_n, q_n
that satisfy those inequalities for δ = 1/n.
Since X is assumed (sequentially) compact, we can extract
a subsequence {p_{n_i}}
of {p_n}
that converges to some p in X.
But then {q_{n_i}}
converges to the same p, because d(p_{n_i}, q_{n_i}) < 1/n_i → 0. Hence both
f(p_{n_i})
and f(q_{n_i})
converge to f(p) by continuity of f,
which contradicts the fact that
d(f(p_{n_i}), f(q_{n_i})) ≥ ε
for each i.
Our next topic is differential calculus of vector-valued
functions of one real variable, building on Chapter 5 of Rudin.
You may have already seen “little oh” and “big Oh” notations.
For functions f, g on the same space,
“f = O(g)” means that
g is a nonnegative real-valued function,
f takes values in a normed vector space,
and there exists a real constant M such that
|f(x)|≤Mg(x)
for all x. The notation
“f = o(g)”
is used in connection with a limit; for instance,
“f(x) = o(g(x))
as x approaches x_0” indicates that
f, g are vector- and real-valued
functions as above on some neighborhood of x_0,
and that for each ε>0 there is a neighborhood
of x_0 such that
|f(x)|≤εg(x)
for all x in the neighborhood. Thus
f '(x_0) = a
means the same as
“f(x) = f(x_0) + a(x−x_0) + o(|x−x_0|)
as x approaches x_0”,
with no need to exclude the case x = x_0.
Rudin in effect uses this approach when proving the Chain Rule (5.5).
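A quick numerical illustration of the little-oh formulation, as a sketch only: for f = sin at x_0 = 1 with a = cos(1), the ratio |f(x_0+h) − f(x_0) − ah| / |h| should shrink as h does. The helper name remainder_ratio and the choice of f and x_0 are assumptions made for the example.

```python
import math


def remainder_ratio(f, x0, a, h):
    """|f(x0+h) - f(x0) - a*h| / |h|, which tends to 0 exactly when f'(x0) = a."""
    return abs(f(x0 + h) - f(x0) - a * h) / abs(h)


x0 = 1.0
a = math.cos(x0)  # the derivative of sin at x0
# Evaluate the ratio for h = 10^-1, ..., 10^-5.
ratios = [remainder_ratio(math.sin, x0, a, 10.0 ** (-k)) for k in range(1, 6)]
```

Replacing a by any other constant makes the ratio approach a nonzero limit instead, which is one way to see that the value of a is forced.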
Apropos the Chain Rule: as far as I can see we don’t need continuity
of f at any point except x
(though that hypothesis will usually hold in any application).
All that’s needed is that x has some relative neighborhood
N in [a,b] such that
f (N) is contained
in I. Also, it is necessary that f
map [a,b] to R,
but g can take values in any normed vector space.
The derivative of f /g
can be obtained from the product rule,
together with the derivative of 1/g
— which in turn can be obtained from the Chain Rule together
with the derivative of the single function 1/x.
[Also, if you forget the quotient-rule formula, you can reconstruct
it from the product rule by differentiating both sides of
f = g · (f /g)
and solving for (f /g)';
but this is not a proof unless you have some other argument to show that
the derivative exists in the first place.]
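Written out, the reconstruction in the bracketed remark goes as follows. This assumes, as the remark stresses, that (f/g)' is already known to exist, and also that g is nonzero at the point in question.

```latex
f = g \cdot \frac{f}{g}
\;\Longrightarrow\;
f' = g' \cdot \frac{f}{g} + g \cdot \Bigl(\frac{f}{g}\Bigr)'
\;\Longrightarrow\;
\Bigl(\frac{f}{g}\Bigr)'
  = \frac{1}{g}\Bigl(f' - \frac{g' f}{g}\Bigr)
  = \frac{f' g - f g'}{g^{2}} .
```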
Once we do multivariate differential calculus,
we'll see that the derivatives of
f +g, f −g,
fg, f /g
could also be obtained in much the same way
that we showed the continuity of those functions,
by combining the multivariate Chain Rule
with the derivatives of the specific functions
x+y, x−y,
xy, x/y
of two variables x,y.
As Rudin notes at the end of this chapter, differentiation can also
be defined for vector-valued functions of one real variable. As Rudin
does not note, the vector space can even be infinite-dimensional,
provided that it is normed; and the basic algebraic properties of the
derivative listed in Thm. 5.3 (p.104) can be adapted to this generality,
e.g., the formula
(fg)' = f 'g + fg'
still holds if f, g
take values in normed vector spaces U, V
and multiplication is interpreted as a continuous bilinear map from
U × V
to some other normed vector space W.
Rolle’s Theorem is the special case
f (b) = f (a)
of Rudin’s Theorem 5.10; as you can see it is in effect the key step
in his proof of Theorem 5.9, and thus of 5.10 as well.
We omit 5.12 (continuity of derivatives) and 5.13 (L’Hôpital’s Rule).
In 5.12, see p.94 for Rudin’s notion of “simple discontinuity”
(or “discontinuity of the first kind”) vs.
“discontinuity of the second kind”, but please don’t
use those terms in your problem sets or other mathematical writing,
since they’re not widely known.
In Rudin’s proof of L’Hôpital’s Rule (5.13),
why can he assume that g(x)
does not vanish for any x in (a,b),
and that the denominator
g(x)−g(y)
in equation (18) is never zero?
NB The norm does not have to come from an inner product structure.
Often this does not matter because we work in finite dimensional
vector spaces, where all norms are equivalent, and changing to
an equivalent norm does not affect the definition of the derivative.
The one exception to this is Thm. 5.19 (p.113) where one needs the
norm exactly rather than up to a constant factor. This theorem still
holds for a general norm but requires an additional argument.
The key ingredient of the proof is this: given a nonzero vector
z in a vector space V, we want a continuous
functional w on V such that
||w|| = 1 and w(z) = |z|.
If V is an inner product space (finite-dimensional or not),
the inner product with z / |z|
provides such a functional w.
But this approach does not work in general. The existence of
such w is usually proved as a corollary of the Hahn-Banach
theorem. When V is finite dimensional, w can
be constructed by induction on the dimension of V.
To deal with the general case one must also invoke the Axiom of Choice
in its usual guise of Zorn’s Lemma.
We next start on univariate integral calculus,
largely following Rudin, chapter 6.
The following gives some motivation for the definitions there.
(And yes, it’s the same
Riemann (1826–1866) who gave number theorists
like me the Riemann zeta function and the Riemann Hypothesis.)
The Riemann-sum approach to integration goes back to the
“method of exhaustion” of classical Greek geometry,
in which the area of a plane figure (or the volume of a region in space)
is bounded below and above by finding subsets and supersets
that are finite unions of disjoint rectangles (or boxes).
The lower and upper Riemann sums adapt this idea
to the integrals of functions which may be negative as well as positive
(recall that one of the weaknesses of geometric Greek mathematics is that
the ancient Greeks had no concept of negative quantities
— nor, for that matter, of zero).
You may have encountered the quaint technical term “quadrature”,
used in some contexts as a synonym for “integration”.
This too is an echo of the geometrical origins of integration.
“Quadrature” literally means “squaring”, meaning not
“multiplying by itself” but “constructing a square of
the same size as”; this in turn is equivalent to
“finding the area of”,
as in the phrase “squaring the circle”.
For instance, Greek geometry contains a theorem equivalent
to the integration of x^2 dx,
a result called the “quadrature of the parabola”.
The proof is tantamount to the evaluation of lower and upper Riemann sums
for the integral of x^2 dx.
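For a concrete feel, here is a small sketch that evaluates the lower and upper Riemann sums of x^2 on [0,1] exactly, using rational arithmetic. The interval [0,1] and the equal-width partition are choices made for this example, and the function name is hypothetical.

```python
from fractions import Fraction


def riemann_sums_x2(n):
    """Lower and upper Riemann sums of x^2 on [0, 1] for n equal subintervals."""
    h = Fraction(1, n)
    # x^2 is increasing on [0, 1], so the inf on each piece is at the left
    # endpoint and the sup is at the right endpoint.
    lower = sum((i * h) ** 2 * h for i in range(n))
    upper = sum((i * h) ** 2 * h for i in range(1, n + 1))
    return lower, upper


lower, upper = riemann_sums_x2(1000)
```

The two sums differ by exactly 1/n and squeeze the value 1/3 between them.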
An alternative explanation of the upper and lower Riemann sums,
and of “partitions” and “refinements”
(Definitions 6.1 and 6.3 in Rudin),
is that they arise by repeated application of the following two
axioms describing the integral (see for instance
L.Gillman’s expository paper
in the American Mathematical Monthly (Vol.100 #1, 16–25)):
- For any a,b,c (with a < b < c),
the integral of a function from a to c
is the sum of its integrals
from a to b and from b to c;
- If a function f on [a,b] takes values in
[m,M] then its integral from a to b
is in
[m(b−a), M(b−a)]
(again assuming a < b).
The latter axiom is a consequence of the following two: the integral of
a constant function from a to b is that constant
times the length b−a
of the interval [a,b];
and if f ≤ g on some interval
then the integral of f over that interval
does not exceed the integral of g.
Note that again all these axioms arise naturally from an interpretation
of the integral as a “signed area”.
The (Riemann-)Stieltjes integral,
with dα in place of dx,
is then obtained by replacing each
Δx = b−a by
Δα = α(b)−α(a).
In Theorem 6.12, property (a) says the integrable functions form
a vector space, and the integral is a linear transformation;
property (d) says it’s a bounded transformation relative to the
sup norm, with operator norm at most
Δα = α(b)−α(a)
(indeed it’s not hard to show that the operator norm equals
Δα = α(b)−α(a));
and (b) and (c) are the axioms noted above. Property (e) almost
says the integral is linear as a function of α —
do you see why “almost”?
Recall the “integration by parts” identity: fg
is an integral of f dg + g df.
The Stieltjes integral
is a way of making sense of this identity even when f
and/or g is not continuously differentiable. To be sure,
some hypotheses on f and g must still
be made for the Stieltjes integral of f dg to make sense.
Rudin specifies one suitable system of such hypotheses in Theorem 6.22.
Here’s a version of Riemann-Stieltjes integrals
that works cleanly for integrating bounded functions from
[a,b] to any complete normed vector space.
[corrected 27.ii.17 to fix a minor typo:
“m→∞”,
not “m→0”]
Riemann-Stieltjes integration by parts: Suppose both
f and g are increasing functions on
[a,b]. For any partition
a = x_0 < … < x_n = b
of the interval, write
f(b)g(b)
− f(a)g(a)
as the telescoping sum of
f(x_i)g(x_i) −
f(x_{i−1})g(x_{i−1})
from i=1 to n. Now rewrite the i-th summand as
f(x_i)
(g(x_i)−g(x_{i−1})) +
g(x_{i−1})
(f(x_i)−f(x_{i−1})).
[Naturally it is no accident that this identity resembles the
one used in the familiar proof of the formula for the derivative of
fg !] Summing this over i yields the upper
Riemann-Stieltjes sum for the integral of f dg plus
the lower R.-S. sum for the integral of g df. Therefore:
if one of these integrals exists, so does the other, and their sum is
f (b)g(b)
− f (a)g(a).
[Cf. Rudin, page 141, Exercise 17.]
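The telescoping identity can be checked exactly on any partition. The sketch below does so for two increasing functions chosen purely for illustration (f(x) = x^3 and g(x) = x^2 + x on [0,2]); the names parts_sums, f, g, xs are hypothetical.

```python
from fractions import Fraction


def parts_sums(f, g, xs):
    """Upper R.-S. sum of f dg and lower R.-S. sum of g df on the partition xs.

    Assumes f and g are increasing, so that sup f on [x0, x1] is f(x1)
    and inf g on [x0, x1] is g(x0).
    """
    pieces = list(zip(xs, xs[1:]))
    upper_f_dg = sum(f(x1) * (g(x1) - g(x0)) for x0, x1 in pieces)
    lower_g_df = sum(g(x0) * (f(x1) - f(x0)) for x0, x1 in pieces)
    return upper_f_dg, lower_g_df


def f(x):
    return x ** 3          # increasing on [0, 2]


def g(x):
    return x * x + x       # increasing on [0, 2]


xs = [Fraction(k, 7) for k in range(15)]   # partition of [0, 2] into 14 pieces
upper_f_dg, lower_g_df = parts_sums(f, g, xs)
boundary = f(xs[-1]) * g(xs[-1]) - f(xs[0]) * g(xs[0])
```

The sum upper_f_dg + lower_g_df equals boundary for every partition, not just in the limit; that is what lets the existence of one integral force the existence of the other.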
Most of Chapter 7 of Rudin we’ve covered already
in the topology lectures and problem sets. For more counterexamples
along the lines of the first section of that chapter, see
Counterexamples in Analysis by B.R.Gelbaum and J.M.H.Olmsted
— there are two copies in the Science Center library (QA300.G4).
Concerning Thm. 7.16, be warned that it can easily fail for
“improper integrals” on infinite intervals.
It is often very useful
to bring a limit or an infinite sum within an integral sign,
but this procedure requires justification beyond Thm. 7.16.
We’ll cover most of the new parts of Chapter 7:
- One counterexample, 7.4;
- Weierstrass M, 7.10, extended to
vector-valued functions;
- uniform convergence and ∫ (7.16,
again in vector-valued setting, with the target space V
normed and complete);
- uniform convergence and d/dx
(7.17, also taking values in a complete normed vector
space V, which for us is either finite-dimensional or
an inner-product space since we’re not proving Hahn-Banach); and
- another counterexample, 7.18.
We’ll then outline the
Stone-Weierstrass theorem,
which is the one major result of Chapter 7 we haven’t seen yet.
We then proceed to power series and the exponential
and logarithmic functions in Chapter 8.
We omit most of the discussion of Fourier series (185–192),
an important topic (which used to be the concluding topic of Math 55b),
but one that alas cannot be accommodated given the mandates of
the curricular review. We’ll encounter a significant special case
in the guise of Laurent expansions of an analytic function on a disc.
See these notes (part 1,
part 2) from 2002-3 on
Hilbert space for a fundamental context for Fourier
series and much else (notably much of quantum mechanics),
which is also what we’ll use to give one proof of the
Müntz-Szász theorem
on uniform approximation on [0,1] by linear combinations of arbitrary powers.
[Yes, if I were to rewrite these notes now I would not have to define
separability, because we already did that in the course of developing
the general notion of compactness.]
We also postpone discussion of Euler’s Beta and Gamma integrals
(also in Chapter 8) so that we can use multivariate integration to
give a more direct proof of the formula relating them.
The result concerning the convergence of alternating series
is stated and proved on pages 70-71 of Rudin (Theorem 3.42).
The original Weierstrass approximation theorem (7.26 in Rudin)
can be reduced to the uniform approximation of the single function
|x| on [−1,1].
From this function we can construct
an arbitrary piecewise linear continuous function, and such
piecewise linear functions uniformly approximate any continuous function
on a closed interval. To get at |x|,
we’ll rewrite it as
[1−(1−x^2)]^{1/2},
and use the power series for (1−X)^{1/2}.
This power series (and more generally the power series for
(1−X)^A)
is the first part of Exercise 22 for Chapter 8, on p.201;
we outline another approach in
PS6, Problem 12
(under the assumption of the standard formula for differentiating
x^r with respect to x,
which as we note there is not too hard for r rational).
We need (1−x)^{1/2}
to be approximated by its power series uniformly on the
closed interval [−1,1] (or at least [0,1]);
but fortunately this too follows from the proof
of Abel’s theorem (8.2, pages 174-5).
Actually this is a subtler result than we need, since
the X^n coefficient
of the power series for (1−X)^{1/2} is negative
for every n>0. If a power series f(X)
has radius of convergence 1 and all but finitely many of its nonzero
coefficients have the same sign, then it is easily shown that the sum of
the coefficients converges if and only if f(X)
has a finite limit as X approaches 1, in which case the sum
equals that limit and the power series converges uniformly on
[0,1].
That’s all we need because clearly (1−X)^{1/2}
extends to a continuous function on [0,1].
(For an alternative approach to uniformly approximating
|x|, see exercise 23 on p.169.)
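Here is a rough numerical sketch of this approximation of |x|: it sums the first N+1 terms of the series for (1−X)^{1/2} at X = 1−x^2, generating the coefficients by the recurrence c_n = c_{n−1}(2n−3)/(2n) that comes from the binomial coefficients. The function name, the sample grid, and the values of N are assumptions made for the example.

```python
def abs_approx(x, N):
    """Polynomial in x of degree 2N: partial sum of the series for (1 - X)^(1/2) at X = 1 - x^2."""
    X = 1.0 - x * x
    c, power, total = 1.0, 1.0, 1.0     # start from c_0 = 1
    for n in range(1, N + 1):
        c *= (2 * n - 3) / (2 * n)      # c_n = c_{n-1} (2n-3)/(2n); negative for n >= 1
        power *= X
        total += c * power
    return total


grid = [k / 200 - 1 for k in range(401)]   # sample points in [-1, 1], including 0
err_50 = max(abs(abs_approx(x, 50) - abs(x)) for x in grid)
err_800 = max(abs(abs_approx(x, 800) - abs(x)) for x in grid)
```

The worst error sits at x = 0 (that is, X = 1) and decays only like a constant times N^{−1/2}, so convergence is uniform but slow.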
Rudin’s notion of an “algebra” of functions is almost
a special case of what we called an
“algebra over F” in 55a
(with F = R
or C as usual), except that Rudin
does not require his algebras to have a unit (else he wouldn’t
have to impose the “vanish on no point” condition).
The notion can be usefully abstracted to a
“normed algebra over F”,
which is an algebra together with a vector space norm
||·|| satisfying
||xy|| ≤ ||x|| ||y||
for all x and y in the algebra.
Among other things this leads to the Stone-Čech theorem.
In the first theorem of Chapter 8,
Rudin obtains the termwise differentiability of a power series at any
|x| < R
by applying Theorem 7.17. That’s nice, but we’ll want to use
the same result in other contexts, notably over C,
where the mean value theorem does not apply. So we instead give
an argument that works in any complete field with an absolute value
— this includes R, C,
and other examples such as the field
Q_p
of p-adic numbers. If the sum of
c_n x^n
converges for some nonzero x with
|x| = R, then any x
satisfying |x| < R has
a neighborhood that is still contained in
{y : |y| < R}.
So if f (x) is the sum of
that series, then for y ≠ x
in that neighborhood we may form the usual quotient
(f (x)−f (y))
/ (x−y) and expand it termwise, then let
y→x and recover the expected
power series for f '(x)
using the Weierstrass M test (Theorem 7.10).
An alternative derivation of formula (26) on p.179:
differentiate the power series (25) termwise (now that we know
it works also over C) to show
E(z) = dE(z)/dz;
then for any fixed w the difference
E(w+z)
− E(w) E(z)
is an analytic function of z that equals its own derivative and vanishes
at z = 0, and is thus zero everywhere.
In algebraic terms, identities (26) and (27) say that E gives
group homomorphisms from
(R, +) to R*
and from (C, +) to C*.
Theorem 8.6 includes the assertion that in the real case
this map has image the positive reals, and trivial kernel;
so there is a well-defined inverse function from the multiplicative
group of positive reals back to (R, +);
and that’s the logarithm function. In the complex case,
we shall soon see that the image is all of C*,
but the kernel is no longer trivial (in fact, ker(exp) consists of
the integer multiples of 2πi), which means that more care
will be needed if we want to define and use logarithms of
complex numbers.
Small error in Rudin: the argument on p.180 that
“Since E is strictly increasing and differentiable on
[the real numbers], it has an inverse function L which is
also strictly increasing and differentiable …” is not quite correct:
consider the strictly increasing and differentiable function taking
x to x^3. What's the
correct statement? (Hint: the Chain Rule tells you what the
derivative of the inverse function must be.)
In any case, we have deliberately omitted the univariate
Inverse Function Theorem in anticipation of the multivariate setting where
the Inverse and Implicit Function Theorems are equivalent. However,
if there is a differentiable inverse function then
we know its derivative from the Chain Rule.
So if L'(y) exists then it equals
1/y; this together with L(1)=0
gives us the integral formula
L(y)
= ∫_1^y dx/x
(via the Fundamental Theorem of Calculus), and then we can define
L(y) by this formula,
and differentiate to prove that it is in fact
the inverse function of E for y>0.
The same approach identifies tan^{−1}(y) with
∫_0^y
dx/(x^2+1)
once we have constructed the sine and cosine functions
(Rudin’s “S ” and “C ”)
and checked that the derivative of their ratio tan(x)
is tan^2(x) + 1.
This yields the power-series expansion
tan^{−1}(y) =
y
− y^3/3
+ y^5/5
− y^7/7
+ − · · ·
for |y|<1 (be sure you understand how to derive this from the
formula for the derivative of tan^{−1} !), and thus also
π/4 = tan^{−1}(1) = 1 − 1/3 + 1/5 − 1/7 + 1/9 − 1/11
+ − · · ·
(why?).
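As a sanity check on the alternating series for π/4, the sketch below compares consecutive partial sums, which should bracket the limit because the terms alternate in sign and decrease in absolute value. The function name and the number of terms are arbitrary choices for the illustration.

```python
import math


def leibniz_partial(n_terms):
    """Partial sum 1 - 1/3 + 1/5 - ... with n_terms terms."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n_terms))


under = leibniz_partial(1000)   # last term is subtracted: an underestimate
over = leibniz_partial(1001)    # last term is added: an overestimate
```

The gap between the two is 1/2001, so a thousand terms pin down π/4 only to about three decimal places; the series is elegant but converges slowly.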
Notes:
- You can also check that 1/(x^2+1) is
(1/(x−i)
− 1/(x+i))
/ (2i),
and that the corresponding linear combination of
log(x±i)
seems to agree with tan^{−1}(x),
though I don’t think we are quite in position yet
to make rigorous sense of this route to
∫ dx/(x^2+1).
- The power series for sin, cos, and tan^{−1},
and the alternating series for π/4, long predate 19th-century calculus.
They are often named for Leibniz (1646–1716) or
James Gregory (1638–1675),
but were already known centuries earlier
(together with some computational applications, including
the evaluation of π as 4 tan^{−1}(1))
to the mathematicians of the
Kerala school,
and “are believed to have been discovered by Madhava of Sangamagrama
(c. 1350 – c. 1425)” according to
the “Madhava series” page
on Wikipedia.
Similarly we get
∫_0^y
dx/(1−x^2)^{1/2}
= sin^{−1} y
for |y| ≤ 1; note that this is the
“principal value” of
sin^{−1} y
(i.e., the choice in [−π/2, π/2]), and that for
y = ±1 the integral is “improper”
and must be interpreted as a limit
lim_{y→1−}
or lim_{y→(−1)+} .
Likewise
∫
dx/(x^2±1)^{1/2}
leads to inverse functions of the
hyperbolic trigonometric functions
sinh(x)
= (e^x − e^{−x}) / 2
and cosh(x)
= (e^x + e^{−x}) / 2.
These basic indefinite integrals, together with elementary changes of
variable and integrations by parts, suffice to obtain any indefinite
integral one is likely to encounter in a first-year calculus class.
As far as I can tell the final inequality “≤ 2” in
Rudin’s (50) can just as easily be “≤ 1”,
because if we have found a choice of y that makes
C(y) negative then
C must already vanish somewhere between 0 and y.
For that matter, we can find such y
directly from the power series for C(y): we calculate
cos(2) < 1 − 2^2/2! + 2^4/4!
= −1/3 < 0
(the omitted terms − 2^6/6! + 2^8/8!
etc. pair up to a negative sum);
this yields an explicit upper bound of 4 on π.
Likewise if x^2 ≤ 2 then
cos(x) > 0, so
π^2 > 8.
It is “well known” that in fact π^2
is less than but rather close to 10;
this one-page note explains this fact
if you believe that ζ(2) = π^2/6
([Euler 1734]
— a famous theorem of which we shall give at least one proof before
the semester’s end). For much better estimates, integrate
(x−x^2)^4
dx/(x^2+1) from 0 to 1
and note that
½ ≤ 1/(x^2+1) ≤ 1.
☺
[Published by D. P. Dalzell in 1944, as I learned from the
replies to this MathOverflow question, where you can also find
further information about this nifty proof and some related mathematics.]
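The estimate can be carried out exactly in a few lines. The sketch below relies on the standard evaluation of this integral as 22/7 − π (obtained by polynomial division by x^2+1; stated here as an assumption rather than derived), and uses the bounds ½ ≤ 1/(x^2+1) ≤ 1 to trap π. All names are hypothetical.

```python
from fractions import Fraction
import math


def integral_x4_1mx4():
    """Exact integral of (x - x^2)^4 = x^4 (1 - x)^4 over [0, 1]."""
    # Expand x^4 (1 - x)^4 = x^4 - 4x^5 + 6x^6 - 4x^7 + x^8 and integrate termwise.
    coeffs = {4: 1, 5: -4, 6: 6, 7: -4, 8: 1}
    return sum(Fraction(c, k + 1) for k, c in coeffs.items())


J = integral_x4_1mx4()
# Since 1/2 <= 1/(x^2+1) <= 1 on [0, 1], we get J/2 <= 22/7 - pi <= J.
pi_lo = Fraction(22, 7) - J
pi_hi = Fraction(22, 7) - J / 2
```

This gives 1979/630 < π < 3959/1260, an interval of width 1/1260.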
We can now prove our claims about the image and kernel of the exponential
homomorphism exp: (C, +) → C*.
We have seen in effect that the restriction of this map to the imaginary axis
{iy : y ∈ R}
has image the unit circle and kernel 2πiZ.
Write an arbitrary complex number z as
x+iy. Then
exp z = e^x e^{iy}, so
|exp z| =
|e^x| |e^{iy}|
= e^x.
Thus every nonzero complex w is in exp(C): write
w = |w| (w/|w|);
then the first factor is a positive real number, so of the form
e^x for some real x,
and the second factor lies on the unit circle, so is
e^{iy} for various real y;
so we have written w = exp(x+iy).
Conversely, |e^z| = 1 iff
x = 0, that is, iff z is
on the imaginary axis; so ker(exp) is contained in the imaginary axis,
and we already know that e^{iy} = 1
iff y is a multiple of 2π, QED.
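The surjectivity argument is constructive, and can be sketched with the standard cmath module: take x = log|w| and let y be any argument of w/|w|. The function name exp_preimage and the branch parameter k are hypothetical, introduced only to show that preimages differ by multiples of 2πi.

```python
import cmath
import math


def exp_preimage(w, k=0):
    """Return z = x + iy with exp(z) = w for nonzero complex w; k selects the branch."""
    if w == 0:
        raise ValueError("0 is not in the image of exp")
    x = math.log(abs(w))                      # e^x = |w|
    y = cmath.phase(w) + 2 * math.pi * k      # e^{iy} = w / |w|
    return complex(x, y)


w = -3 + 4j
z0 = exp_preimage(w)
z1 = exp_preimage(w, k=1)
```

Both z0 and z1 map to w, and z1 − z0 = 2πi is an element of the kernel; this is exactly the ambiguity that a complex logarithm has to resolve.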
We have already given the proof of the Fundamental Theorem of Algebra
(Rudin 8.8) as an application of the fact that a continuous
real-valued function on a compact set attains its minimum
(though at that time we didn’t yet officially prove that for
k = 1, 2, 3, … the k-th
power map on the unit circle is surjective). Section 8.9 is an
introduction to Fourier series;
while we must alas skip most of Fourier analysis in Math 55b,
we can still use our work so far to easily prove the following:
If f : R/2πZ
→ C
is a continuous function whose Fourier coefficients
a_n :=
(2π)^{−1} ∫_{R/2πZ}
exp(−inx) f(x) dx
satisfy
∑_n |a_n| < ∞,
then f equals its Fourier series.
Proof : the difference is a continuous function
all of whose Fourier coefficients vanish;
applying Stone-Weierstrass to the real and imaginary parts,
we see that this difference can be uniformly approximated by
“trigonometric polynomials”
(finite linear combinations of cos(nx) and sin(nx)), etc.
[This special case of Stone-Weierstrass is also a theorem of
Fejér, who obtained an explicit sequence of trigonometric
polynomials converging to the function; see the first few pages of
Körner’s Fourier Analysis. NB it’s not
true that the partial sums of the Fourier series of every
continuous function converge to it pointwise, let alone uniformly!]
A nice example is
f (x)
= B_k(2πx)
for each k≥2, where
B_k is the kth Bernoulli polynomial
(for x in [0,1], and extended to R by periodicity);
this yields the values of ζ(k) for
k = 2, 4, 6, …, and much else in addition.
Parseval’s identity
for such functions follows as well:
(2π)^{−1} ∫_{R/2πZ}
|f(x)|^2 dx
= ∑_{n∈Z}
|a_n|^2.
We next begin multivariate differential calculus,
starting at the middle of Rudin Chapter 9 (since the first part
of that chapter is for us a review of linear algebra —
but you might want to read through the material on norms of linear maps
and related topics in pages 208–9).
Again, Rudin works with functions from open subsets of
R^n
to R^m,
but most of the discussion works equally well with the target space
R^m
replaced by an arbitrary normed vector space V.
If we want to allow arbitrary
normed vector spaces for the domain of f,
we’ll usually have to require that the derivative f '
be a continuous linear map, or equivalently that its norm
||f '|| =
sup_{|v|≤1} |f '(v)|
be finite.
As in the univariate case,
proving the Mean Value Theorem in the multivariate context
(Theorem 9.19) requires either that V have an inner-product
norm, or the use of the Hahn-Banach theorem
to construct suitable functionals on V. Once this is done,
the key Theorem 9.21 can also be proved for functions to V,
and without first doing the case m=1. To do this,
first prove the result in the special case when each
D_j f(x)
vanishes; then reduce to this case by subtracting from f
the linear map from R^n
to V indicated by the partial derivatives
D_j f(x).
The Inverse function theorem (9.24) is a special case
of the Implicit function theorem (9.28), and its
proof amounts to specializing the proof of the implicit function
theorem. But Rudin proves the Implicit theorem as a special
case of the Inverse theorem, so we have to do Inverse first.
(NB for these two theorems we will assume
that our target space is finite-dimensional;
how far can you generalize to infinite-dimensional spaces?)
Note that Rudin’s statement of the contraction principle
(Theorem 9.23 on p.220)
is missing the crucial hypothesis that X be nonempty!
The end of the proof of 9.24 could be simplified if Rudin allowed
himself the full use of the hypothesis that f
is continuously differentiable on E,
not just at a: differentiability of the
inverse function g at
b = f(a)
is easy given Rudin’s construction of g;
differentiability at any other point
f(x) follows,
since x might as well be a,
and then the derivative is continuous because
g and f ' are.
The proof of the second part of the implicit function theorem,
which asserts that the implicit function g not only
exists but is also continuously differentiable with derivative
at b given by formula (58) (p.225), can be done
more easily using the chain rule, since g has been
constructed as the composition of the following three functions:
first, send y to
(0, y); then,
apply the inverse function
F^{−1};
finally, project the resulting vector
(x,y)
to x. The first and last of these three functions
are linear, so certainly C^1; and the continuous
differentiability of F^{−1} comes from
the inverse function theorem.
Here’s an approach to D_{ij} = D_{ji}
that works for a C^2 function to an arbitrary
normed space. As in Rudin (see p.235) we reduce to the case of
a function of two variables, and define u and Δ.
Assume first that D_{21} f vanishes
at (a,b). Then use the Fundamental Theorem of Calculus
to write Δ(f,Q) as the integral of
u'(t) dt on
[a, a+h],
and then write u'(t) as an integral of
D_{21}
f(t,s) ds
on [b,b+k]. Conclude that
u'(t) = o(k)
and thus that
Δ(f,Q) / hk approaches zero.
Now apply this to the function
f − xy D_{21}
f(a,b)
to see that in general Δ(f,Q) / hk approaches
D_{21} f(a,b).
Do the same in reverse order to conclude that
D_{21}
f(a,b) = D_{12}
f(a,b).
Can you prove
D_{12}(f)
= D_{21}(f)
for a function f to an arbitrary inner product space
under the hypotheses of Theorem 9.41?
We omit the “rank theorem” (whose lesser importance
is noted by Rudin himself), as well as the section on determinants
(which we treated at much greater length in Math 55a).
An important application of iterated partial derivatives is the
Taylor expansion of an m-times differentiable
function of several variables; see Exercise 30 (Rudin, 243-244).
As promised at the start of Math 55a and/or Math 55b, this also
applies to maxima and minima of real-valued functions f
of several variables, as follows. If f
is differentiable at a local maximum or minimum
then its derivative there vanishes, as was the case
for a function of one variable. Again we say that
a zero of the derivative is a “critical point”
of f. Suppose now that f is
C^2 near a critical point.
The second derivative can be regarded as a quadratic form.
It must be positive semidefinite at a local minimum,
and negative semidefinite at a local maximum. Conversely,
if it is strictly positive (negative) definite at a critical point
then that point is a strict local minimum (resp. maximum) of f.
Compare with Rudin’s exercise 31 on page 244 (which however assumes that
f is C^3,
which I don’t think is needed, though it makes some of the estimates
easier to obtain).
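In two variables the definiteness conditions are easy to test by hand. The following sketch classifies the quadratic form a x^2 + 2b xy + c y^2, i.e. the symmetric matrix with rows (a, b) and (b, c), by the signs of its determinant and trace. The function name and the returned labels are hypothetical, and the approach is specific to the 2×2 case.

```python
def classify_2x2(a, b, c):
    """Classify the symmetric matrix [[a, b], [b, c]] as a quadratic form."""
    det = a * c - b * b      # product of the two eigenvalues
    tr = a + c               # sum of the two eigenvalues
    if det > 0:
        # Eigenvalues share a sign, which is the sign of the trace.
        return "positive definite" if tr > 0 else "negative definite"
    if det < 0:
        return "indefinite"
    return "degenerate"      # some eigenvalue is 0: the test is inconclusive


min_case = classify_2x2(2, 1, 2)       # second derivative of x^2 + xy + y^2 at 0
saddle_case = classify_2x2(2, 0, -2)   # second derivative of x^2 - y^2 at 0
flat_case = classify_2x2(2, 0, 0)      # second derivative of x^2 + y^4 at 0
```

The last case shows why the converse needs strict definiteness: x^2 + y^4 and x^2 − y^4 have the same semidefinite second derivative at the origin, but only the first has a local minimum there.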
Next topic, and last one from Rudin, is
multivariate integral calculus (Chapter 10).
Most of the chapter is concerned with setting up a higher-dimensional
generalization of the Fundamental Theorem of Calculus that comprises
the divergence, Stokes, and Green theorems and much else besides.
With varying degrees of regret we’ll omit this material, as well as
the Lebesgue theory of Chapter 11. We will, however, get
some sense of multivariate calculus by giving a definition of
integrals over R^n
and proving the formula for change of variables (Theorem 10.9).
This will already hint why in general an integral over an
n-dimensional space is often best viewed
as an integral not of a function but a “differential
n-form”. For instance, in two dimensions
an integral of
“f (x, y)
dx dy”
can be thought of as
“f (x, y)
dx ∧ dy”,
and then we recover the formula involving the Jacobian from
the rules of exterior algebra. You’ll have to read the
rest of this chapter of Rudin, and/or take a course on
differential geometry or “calculus on manifolds”,
to see these ideas developed more fully.
After deriving (at least part of) the change of variables formula,
we can return to the section of Chapter 8 concerning
Euler’s Beta and Gamma integrals and give a more natural
treatment of the formula relating them (Theorem 8.20).
The rather obscure integration by parts in Rudin, p.194 is not necessary.
A straightforward choice of “parts” yields
x B(x, y+1) =
y B(x+1, y).
This may seem to go in a useless direction,
but the elementary observation that
B(x, y) =
B(x, y+1) + B(x+1, y)
recovers the recursion (97).
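The two Beta identities above are easy to test numerically. The following is only a sketch: the helper `beta` and the sample point (x, y) = (2.5, 1.75) are illustrative choices, and the midpoint rule is used under the assumption x, y ≥ 1 so that the integrand stays bounded. The last quantity checks the relation B(x+1, y) = x/(x+y) · B(x, y) that the two identities combine to give.

```python
def beta(x, y, n=100_000):
    """Midpoint-rule approximation of B(x, y) = ∫_0^1 t^(x-1) (1-t)^(y-1) dt.

    Assumes x, y >= 1 so that the integrand is bounded on [0, 1].
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += t ** (x - 1) * (1 - t) ** (y - 1)
    return total * h

x, y = 2.5, 1.75
b_xy = beta(x, y)        # B(x, y)
b_up_y = beta(x, y + 1)  # B(x, y+1)
b_up_x = beta(x + 1, y)  # B(x+1, y)

parts_gap = x * b_up_y - y * b_up_x           # integration by parts
split_gap = b_xy - (b_up_y + b_up_x)          # uses t + (1 - t) = 1
recursion_gap = b_up_x - x / (x + y) * b_xy   # combination of the two
print(parts_gap, split_gap, recursion_gap)    # all three should be near 0
```

All three gaps should vanish up to quadrature and rounding error.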
In addition to the trigonometric definite integrals noted by Rudin
(formula 98), Beta functions also turn up in the evaluation of
the definite integral of
u^a du /
(1+u^b)^c
over (0,∞): let
t = u^b / (1+u^b).
What is the value of that integral?
Can you obtain in particular the formula
π / (b sin((a+1) π/b))
for the special case c=1?
The limit formula for Γ(x) readily yields
the product formula:
Γ(x) =
x^{−1} e^{−Cx}
∏_{k=1}^{∞} exp(x/k) / (1+x/k)
where C=0.57721566490... is Euler’s constant
(a.k.a. the Euler-Mascheroni constant),
which is the limit as N→∞ of
1 + (1/2) + (1/3) + ... + (1/N) − log(N).
This lets us easily show that Γ is infinitely differentiable
(in fact analytic) and obtain nice formulas for the derivatives
of log(Γ(x)); for instance,
Γ '(1) = −C, and more generally
the logarithmic derivative of Γ(x) at
x = N+1 is
1 + (1/2) + (1/3) + ... + (1/N) − C.
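As a sanity check on the product formula and on the logarithmic derivative, here is a rough numerical sketch. The function names, the truncation sizes, and the finite-difference step are all illustrative assumptions; the partial product converges slowly (relative error of order x^2/(2N) after N factors), so only a few digits of agreement should be expected.

```python
import math

C = 0.5772156649015329  # Euler's constant, to double precision

def euler_constant(n=10**6):
    """1 + 1/2 + ... + 1/n - log(n), which tends to C as n grows."""
    return sum(1.0 / k for k in range(1, n + 1)) - math.log(n)

def gamma_product(x, terms=100_000):
    """Partial product x^(-1) e^(-Cx) prod_{k <= terms} exp(x/k) / (1 + x/k), for x > 0."""
    log_value = -math.log(x) - C * x
    for k in range(1, terms + 1):
        log_value += x / k - math.log1p(x / k)
    return math.exp(log_value)

def log_gamma_derivative(x, h=1e-5):
    """Central-difference estimate of (log Γ)'(x), for x > h."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

N = 4
harmonic = sum(1.0 / k for k in range(1, N + 1))
print(euler_constant(), C)
print(gamma_product(2.5), math.gamma(2.5))
print(log_gamma_derivative(N + 1), harmonic - C)
```

Each printed pair should agree to within the accuracy of the truncation used.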
Let
I(w) = ∫_{−∞}^{∞}
exp(−x^2 + wx) dx
for any complex w. The integral is “improper”
but converges absolutely for all w. We know
I(0) = π^{1/2}.
If w is real then
I(w) = exp(w^2/4) I(0)
by “completing the square”. We showed in effect that
the same formula holds for purely imaginary w:
if w = it then the real part of the integral
is
∫_{−∞}^{∞}
exp(−x^2) cos(tx) dx,
which we evaluated as
exp(−t^2/4) I(0),
and the imaginary part is
∫_{−∞}^{∞}
exp(−x^2) sin(tx) dx,
which vanishes by antisymmetry. Combining these two tools we can show
I(w) = exp(w^2/4) I(0)
for all complex w. We shall give a better explanation
for this when we can think of I as an analytic function of
a complex variable.
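The identity I(w) = exp(w^2/4) I(0) can be checked numerically for a complex w that is neither real nor purely imaginary. This is a minimal sketch: the name `gaussian_integral`, the cutoff at |x| = 12, the step size, and the sample w = 1 + 2i are assumptions made for illustration. The trapezoid rule is used because it is extremely accurate for rapidly decaying smooth integrands.

```python
import cmath
import math

def gaussian_integral(w, half_width=12.0, step=0.01):
    """Trapezoid-rule approximation of I(w) = ∫ exp(-x^2 + w x) dx over the real line.

    The integral is truncated to [-half_width, half_width]; for moderate |Re w|
    the neglected tails are far below double precision.
    """
    n = int(round(2 * half_width / step))
    total = 0j
    for i in range(n + 1):
        x = -half_width + i * step
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * cmath.exp(-x * x + w * x)
    return total * step

w = 1.0 + 2.0j
numeric = gaussian_integral(w)
closed_form = cmath.exp(w * w / 4) * math.sqrt(math.pi)
print(numeric, closed_form)
```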
Context for Bohr-Mollerup etc.:
Some generalities about
convexity.
A subset E of a real vector space V
is said to be convex if E contains the line segment
{(1−t)x + ty :
0 ≤ t ≤ 1} joining any two points
x, y ∈ E.
Examples are V, ∅, a vector subspace, or a closed or open
ball with respect to any norm on V (why?);
also the intersection of any convex sets
E and E'
(and indeed the intersection of any family of convex sets),
the sum
{x + x' :
x ∈ E,
x' ∈ E'}
of any convex sets E and E'
(so in particular the translate of a convex set by a fixed vector),
and the image or preimage of a convex set under any linear transformation
(so in particular a convex subset of a vector subspace).
By induction, a convex set is closed under “convex combinations”:
if x_1, …, x_n
∈ E
then E also contains
∑_i t_i x_i
for all real t_i such that
∑_i t_i = 1
and each t_i ≥ 0
(whence also each t_i ≤ 1).
A subset E of V is “midpoint-convex”
if it contains the midpoint (x + y) / 2
of the line segment joining any two points
x, y ∈ E.
That’s the special case t=½
of (1−t)x + ty.
By induction E then contains
(1−t)x + ty
for all “dyadic rationals” (rational numbers with
power-of-2 denominator) t in [0,1].
This does not imply that E is convex (for instance,
the rational numbers constitute a subset of R that is
midpoint-convex but not convex); however, in a normed vector space,
an open or closed midpoint-convex subset is automatically convex
(basically because the dyadic rationals are dense in R).
If E is convex, a function
g : E → R
is said to be “convex” if its
“epigraph”
{(v, y)
∈ V × R :
y ≥ g(v)}
is a convex subset of V ⊕ R.
[One can also use the “strict epigraph” with the condition
y ≥ g(v) replaced by
y > g(v).]
Equivalently,
g((1−t)x + ty)
≤ (1−t) g(x)
+ t g(y)
for all
x, y ∈ E
and t ∈ [0,1].
Examples are constants, linear functions, norms (check this!),
quadratic forms iff they’re positive-semidefinite (why?),
the pointwise maximum of any two convex functions (or even the pointwise
supremum of any nonempty family of convex functions, provided it is
everywhere finite, because this corresponds to intersecting
the epigraphs), and any positive linear combination of
convex functions. By induction, the criterion
g((1−t)x + ty)
≤ (1−t) g(x)
+ t g(y)
generalizes to convex linear combinations:
g(∑_i
t_i x_i) ≤
∑_i t_i
g(x_i)
with t_i as above
(i.e. ∑_i t_i = 1
and each t_i ≥ 0); in other words,
a weighted average of function values is ≥ the value of the function
at the corresponding weighted average. This is
Jensen’s inequality.
We say g is “midpoint-convex” if
g((1−t)x + ty)
≤ (1−t) g(x)
+ t g(y)
holds for t=½ (i.e. for unweighted averages),
and thus by induction for all dyadic t in [0,1]
(and by a clever variation of the argument, even for all rational
t in [0,1], whether dyadic or not). If g is continuous
then it is convex iff it is midpoint convex.
For example, the exponential function on R, and the function
g(x) = −log(x) on
(0,∞), are readily seen to be midpoint-convex;
being continuous, these functions are thus both convex.
Applying Jensen then yields the
weighted AM-GM inequality.
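Concretely, Jensen's inequality for g(x) = −log(x) says that a weighted arithmetic mean of positive numbers is at least the corresponding weighted geometric mean. The sketch below tests this on random data; the helper name, the sample size, and the ranges are arbitrary illustrative choices.

```python
import math
import random

def weighted_am_gm_gap(values, weights):
    """Weighted arithmetic mean minus weighted geometric mean.

    Assumes positive values and nonnegative weights that sum to 1.
    """
    arithmetic = sum(t * x for t, x in zip(weights, values))
    geometric = math.exp(sum(t * math.log(x) for t, x in zip(weights, values)))
    return arithmetic - geometric

random.seed(0)
gaps = []
for _ in range(1000):
    values = [random.uniform(0.1, 10.0) for _ in range(5)]
    raw = [random.random() for _ in range(5)]
    weights = [r / sum(raw) for r in raw]   # normalize so the weights sum to 1
    gaps.append(weighted_am_gm_gap(values, weights))

print(min(gaps))                                       # should not be negative
print(weighted_am_gm_gap([3.0] * 4, [0.25] * 4))       # equal values: gap near 0
```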
Rudin uses a fact about convex functions on intervals in R
that is only presented as an exercise earlier in the book (p.100, #23).
Namely: let f be a convex function on some interval I,
and consider the slope
s(x, y) :=
(f (x)−f (y))
/ (x−y)
as a function on the set of (x, y) in
I × I with
x > y;
then s is an increasing function of both variables.
The proof is fortunately not hard. For instance, to prove that if
x > y' > y then
s(x, y' ) ≥
s(x, y),
write y' as px + qy
with p + q = 1,
and calculate that
s(x, y') ≥
s(x, y)
is equivalent to the usual convexity condition. The case
x > x' > y
works in exactly the same way.
If G takes only positive values then G is said to be
logarithmically convex if log G is convex
(equivalently, if G=exp(g) for some
convex function g); that is,
G((1−t)x + ty)
≤ G(x)^{1−t}
G(y)^t
for all x, y, t as above
(or t=½ for “logarithmically midpoint-convex”).
This is a strictly stronger condition than convexity (why?).
It is satisfied by any function of the form
G(x) = ∫_t
A(t)^x B(t)
for any nonnegative functions A, B
for which the integral converges.
(For t=½ this is an application of Cauchy-Schwarz;
to prove it directly for other t in [0,1], use
Hölder’s inequality.)
In particular, Γ is logarithmically convex on (0,∞),
as is B(x,y) as a function of two variables
with x,y > 0 (why?).
This lets Rudin prove the formula
B(x,y) =
Γ(x) Γ(y) / Γ(x+y)
via Bohr-Mollerup
without the usual double integral, and also obtain the product formula
for Γ(x). Further consequences of this
product formula are the identity
[B(x, 1−x) = ]
Γ(x) Γ(1−x)
= π / sin(πx)
for x in (0,1) (via the product formula for the sine),
and also the duplication formula, and similar identities such as
the “triplication formula” expressing
Γ(x/3) Γ((x+1)/3) Γ((x+2)/3)
as a multiple of Γ(x).
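Several of these Gamma identities can be spot-checked with the standard library. This is only a sketch with illustrative function names and sample points. The constant 2π · 3^(1/2 − x) used for the triplication check is the n = 3 case of Gauss's multiplication formula, quoted here as a known fact rather than derived.

```python
import math

def reflection_gap(x):
    """Γ(x) Γ(1-x) - π / sin(πx), for 0 < x < 1."""
    return math.gamma(x) * math.gamma(1 - x) - math.pi / math.sin(math.pi * x)

def log_midpoint_gap(x, y):
    """[log Γ(x) + log Γ(y)] / 2 - log Γ((x+y)/2); nonnegative if Γ is log-convex."""
    return 0.5 * (math.lgamma(x) + math.lgamma(y)) - math.lgamma(0.5 * (x + y))

def triplication_ratio(x):
    """Γ(x/3) Γ((x+1)/3) Γ((x+2)/3) / Γ(x), for x > 0."""
    product = math.gamma(x / 3) * math.gamma((x + 1) / 3) * math.gamma((x + 2) / 3)
    return product / math.gamma(x)

x = 2.2
expected_ratio = 2 * math.pi * 3 ** (0.5 - x)   # Gauss multiplication formula, n = 3
print(reflection_gap(0.3))
print(log_midpoint_gap(0.5, 4.0))
print(triplication_ratio(x), expected_ratio)
```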
“Partitions of unity”
can even be made smooth (a.k.a. C^∞).
This requires a smooth function φ : R → [0,1]
such that
φ(t) = 0 for t ≤ 0
and φ(t) = 1 for t ≥ 1.
Here is one construction. Recall that the function g defined by
g(t) = exp(−1/t)
for t > 0 and
g(t) = 0 for t ≤ 0
is smooth and already satisfies
g(t) = 0 for t ≤ 0,
as well as
0 < g(t) < 1 for
t > 0. So one construction of φ is
φ(t) = g(t) /
(g(t) + g(1−t)).
Alternatively, we can use
φ(t) =
c ∫_{−∞}^{t}
f (u) du,
where f is a smooth nonnegative function with support (0,1)
and the normalizing constant c is
1 / ∫_0^1
f (u) du;
for example, take
f (u)
= g(u) g(1−u),
which is
exp(−1/(u−u^2))
if 0 < u < 1 and zero otherwise.
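Both constructions of the smooth step are short enough to write out. In this sketch, `phi` is the quotient construction and `phi_integral` is the integral construction with the bump exp(−1/(u−u^2)); the midpoint rule, the grid size, and the rounding of t down to a whole number of grid cells are simplifying assumptions for illustration, not part of the construction itself.

```python
import math

def g(t):
    """exp(-1/t) for t > 0, and 0 for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def phi(t):
    """Smooth step g(t) / (g(t) + g(1 - t)): equals 0 for t <= 0 and 1 for t >= 1."""
    return g(t) / (g(t) + g(1.0 - t))   # the denominator is never zero

def f(u):
    """Bump g(u) g(1 - u), i.e. exp(-1 / (u - u^2)) on (0, 1) and zero elsewhere."""
    return g(u) * g(1.0 - u)

def phi_integral(t, n=2000):
    """c * ∫_{-∞}^{t} f(u) du with c = 1 / ∫_0^1 f(u) du, via the midpoint rule.

    For simplicity t is rounded down to a whole number of grid cells.
    """
    h = 1.0 / n
    cells = [f((k + 0.5) * h) for k in range(n)]   # f at the cell midpoints of [0, 1]
    count = int(min(max(t, 0.0), 1.0) * n)         # number of whole cells left of t
    return sum(cells[:count]) / sum(cells)

print([round(phi(t), 4) for t in (-1.0, 0.0, 0.25, 0.5, 0.75, 1.0, 2.0)])
print([round(phi_integral(t), 4) for t in (0.0, 0.25, 0.5, 0.75, 1.0)])
```

The two step functions agree at 0, ½, and 1 but are different functions in between.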
Once we’ve obtained
Green’s theorem
∫_{∂B} ω
= ∫_B dω
for a C^1 1-form on a neighborhood
E of a rectangle B, it follows by change of variable
for the image of B in any invertibly C^1
image of E. This, together with patching together such
images and taking limits, gives us Green’s theorem
for all B that we shall need. (You might think of
the “oriented boundary” ∂ as a group homomorphism from
combinations of oriented d-dimensional figures to
combinations of oriented (d−1)-dimensional figures;
this makes
(γ, ω)
↦ ∫_γ ω
a pairing. The identity
∫_{∂B} ω
= ∫_B dω
is the Fundamental Theorem of Calculus for d=1,
and Green’s theorem for d=2.
You will meet the vast generalizations of this to Stokes’ theorem,
and of identities such as ∂^2 = 0 and
d^2 = 0, exact vs. closed
differentials, etc., if/when you take courses in differential geometry
and algebraic topology.)
A very special, but still very important, case of Green’s theorem
is obtained by identifying R^2 with C
in the usual way and considering (the real and imaginary parts of)
the contour integral
∫_γ w(z) dz
where
w = u + iv
is differentiable as a function of a complex variable
z = x + iy
on a neighborhood of B, and
dz = dx + i dy.
Using the Cauchy-Riemann equations we find that if
ω = w(z) dz
then dω = 0, so
∫_{∂B}
w(z) dz = 0.
It follows that w has an indefinite integral on any convex
open region (or its image under an invertibly 1:1 map).
The example of dz/z on C*
shows that this result can fail for other regions (again,
you can explore this more thoroughly in 100- and 200-level classes).
With a bit more work we obtain
Cauchy’s integral formula,
and thus the analyticity of w.
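The vanishing of ∫ w(z) dz around a rectangle, and its failure for dz/z around the origin, can be seen numerically. This sketch approximates a contour integral by a midpoint sum over short chords; the function names, the particular square, and the number of chords are illustrative assumptions.

```python
import cmath
import math

def contour_integral(w, path, n=4000):
    """Midpoint-chord approximation of ∫ w(z) dz along path(s), 0 <= s <= 1."""
    total = 0j
    for k in range(n):
        z0 = path(k / n)
        z1 = path((k + 1) / n)
        total += w(0.5 * (z0 + z1)) * (z1 - z0)
    return total

def unit_circle(s):
    """Counterclockwise unit circle."""
    return cmath.exp(2j * math.pi * s)

def square(s):
    """Boundary of [-1, 1] x [-1, 1], counterclockwise, starting at 1 - i."""
    corners = [1 - 1j, 1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    u = 4.0 * s
    side = min(int(u), 3)
    frac = u - side
    return corners[side] + frac * (corners[side + 1] - corners[side])

zero_on_square = contour_integral(lambda z: z * z, square)      # analytic inside
zero_on_circle = contour_integral(cmath.exp, unit_circle)       # analytic inside
winding = contour_integral(lambda z: 1 / z, unit_circle)        # dz/z on C*
print(zero_on_square, zero_on_circle, winding)
```

The first two values should be near 0, while the third should be near 2πi, which is why dz/z has no indefinite integral on all of C*.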
Math 55b concludes with an introduction to
complex analysis
(a.k.a. “functions of one complex variable”).
We'll start with contour integrals and the fundamental
theorems of Cauchy, roughly following the exposition in Ahlfors,
chapter III (p.82 ff.). We'll prove:
- Cauchy’s theorem for a rectangle: if f is
continuously differentiable on a neighborhood of the rectangle,
then the contour integral of
f (z) dz
on the boundary of the rectangle vanishes; curiously this can be proved
even without the assumption that f' is continuous,
using a neat subdivision trick that Ahlfors attributes to Goursat,
but I don’t know when one would ever need this refinement.
- The variant when f is defined on the complement
in the rectangle of finitely many points ζ, as long as
(z−ζ) f (z) → 0
on each such point.
- Same theorems for a circle.
- Likewise for the annulus between two concentric circles: the
integrals over the two circles are equal. (One proof: change
variables to a rectangle of height 2π using complex exponential.)
- Cauchy’s integral formula for a rectangle or circle:
under the same hypotheses,
if a is any interior point then
f (a)
is (1/2πi) times the contour integral of
f (z) dz
/ (z−a).
- Consequences:
- If f is differentiable in a circle of radius
R > r > 0 about a
then f (a) is the average of
f (a
+ r e^{iθ})
over θ in [0, 2π]; corollary(!):
Fundamental Theorem of Algebra; also:
|f (·)| has no local maximum
unless it is constant
(“maximum principle” for analytic functions).
- [via power series expansion of
1 / (z−a)]:
f is analytic, and its power-series expansion about
any z_0 converges in
|z−z_0| < r
if that open circle is contained in the interior of our rectangular
or circular contour. This again yields the maximum principle above,
and also shows that
|f (·)| has no nonzero
local minimum unless it is constant.
- Same argument (or contour integration by parts) also yields
an integral formula for the derivative
f '(z_0), which
in turn gives Liouville's theorem: a bounded analytic function on
all of C is constant. [An analytic function
on all of C is also known as an
entire function.]
- Using also the integral formula for higher derivatives of
an analytic function: The uniform limit f
of a sequence {f_n}
of analytic functions is analytic,
and each term in a power series expansion of f
is the limit of the corresponding terms for the
f_n. Thus the same is true
for a uniformly convergent sum of analytic functions.
This is very useful for constructing/defining analytic functions;
e.g. the Riemann zeta function ζ(s)
is defined for Re(s)>1 as
∑_n n^{−s}
(summed over n=1,2,3,…).
- (“analytic continuation”)
An analytic function f
on a neighborhood of z that vanishes on distinct points
z_n → z
is identically zero. (Proof: by induction each coefficient in the
power series expansion of f about z
vanishes.) Hence if two analytic functions
f, g agree at each
z_n then they are equal.
- Cauchy's integral formula also works under the weaker hypothesis
we’ve seen above, allowing a finite number of exceptional ζ;
the formula shows that in this case f extends to
an analytic function on all of the interior of our rectangular or
circular contour.
Thus such ζ is called a removable singularity.
- More generally, if f is analytic in a
punctured neighborhood of ζ, and
(z−ζ)^n
f (z) → 0
for some positive integer n, then
(z−ζ)^{n−1} f (·)
has a removable singularity at ζ. Then
f is said to have a pole at ζ:
we can write f (z)
as a polynomial of degree <n in
1 / (z−ζ)
plus an analytic function on a neighborhood of ζ.
We say ζ is a simple, double, triple, etc. pole
if that polynomial in
1 / (z−ζ)
has degree 1, 2, 3, …, or a pole of order d
if the polynomial has degree d.
(A removable singularity then has d=0
but is rarely called a “pole of order zero”.)
If there's no such n then f
is said to have an essential singularity at ζ.
The standard example of such a function is
f (z) = exp(1/z)
with ζ = 0.
- If f is analytic on an open set E,
and has a set P of poles, then f is said to be
meromorphic on the union of E and P.
In the special case of an analytic function (i.e. with P
empty) we also say f is holomorphic.
As the holomorphic functions form a ring, the meromorphic functions
on a subset of C form a field; it turns out
to be the fraction field of the holomorphic functions: for any
meromorphic function f we can find
a holomorphic function h that vanishes at the poles
of f to the necessary multiplicities for
g=hf to be holomorphic,
and then f =g/h.
[NB: Analytic continuation also works for meromorphic functions;
indeed it's easy to see that a function continuous at z
cannot have a sequence of poles approaching z,
so by restricting to a small enough neighborhood we get
a holomorphic function.]
- Example: if f is analytic on the unit disc
and vanishes at the origin then f/z
is analytic too; this yields the
Schwarz Lemma.
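Cauchy's integral formula and the mean-value property from the list above lend themselves to a quick numerical test. This is a sketch under stated assumptions: the test function is the complex exponential, the contour is the unit circle, and the integral is discretized with equally spaced points (which converges very quickly for periodic analytic integrands). The function names are illustrative.

```python
import cmath
import math

def circle_average(f, a, r, n=256):
    """Average of f(a + r e^{iθ}) over n equally spaced θ in [0, 2π)."""
    return sum(f(a + r * cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n

def cauchy_formula(f, a, center, r, n=256):
    """(1/2πi) ∮ f(z) dz / (z - a) over the circle |z - center| = r."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        z = center + r * e
        dz = 2j * math.pi * r * e / n   # z'(θ) dθ with dθ = 2π / n
        total += f(z) * dz / (z - a)
    return total / (2j * math.pi)

a = 0.3 + 0.2j                                       # an interior point of the unit disc
from_contour = cauchy_formula(cmath.exp, a, 0, 1.0)  # should recover exp(a)
from_average = circle_average(cmath.exp, a, 0.5)     # mean value over a circle about a
print(from_contour, from_average, cmath.exp(a))
```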
A key concept in the theory and application of complex analysis
is the residue of a function — more properly, a differential
— on a punctured neighborhood of some complex number.
-
For an analytic function f on a punctured
neighborhood of ζ, the residue at ζ of the
differential
f (z) dz
is invariant under locally invertible analytic changes of variable;
i.e. is the same as the residue of
f (g(w))
g'(w) dw
at the preimage of ζ under g,
assuming g' does not vanish there.
The residue is defined as the integral of
(1/2πi) f (z) dz
on (say) a circle around ζ. If
f (z)
has a power series expansion
∑_n c_n
(z−ζ)^n
in a punctured neighborhood of ζ
then the residue is the coefficient
c_{−1}.
-
The integral of a differential on a closed contour (at least one of our
standard contours: circles, rectangles, and their images under
invertible conformal maps) is 2πi times the sum of
the residues at the poles or essential singularities the contour
encloses, assuming the differential is analytic except at those
finitely many points.
- This has numerous applications to the evaluation of (often
non-elementary) definite integrals of elementary functions.
Some paradigmatic examples:
-
∫_0^∞
dx / (x^2 + 1) = π / 2
-
∫_0^∞
dx / (x^2 + 1)^2
= π / 4
-
∫_0^∞
cos(cx)
dx / (x^2 + 1)
= (π / 2) e^{−|c|}
for all real c
-
∫_0^∞
sin(cx)
dx / x
= π / 2
for all c > 0
[via the “principal value” of the integral of
exp(icx) dx / x
over (−∞,∞)]
-
For α in (0,1), the special value
B(α, 1−α) of the Beta function is given by
∫_0^∞
dx / (x^α (x + 1))
= π / sin(απ)
- An important special case: the logarithmic differential
df / f of a meromorphic
function f has only simple poles, at the
zeros and poles of f , with residue n
at a zero of order n and −n
at a pole of order n. Hence the integral of
df / f =
(f '(z) / f (z))
dz
on a closed contour is 2πi times the difference between
the number of zeros and poles enclosed by the contour, counted with
multiplicity. This assumes that f is meromorphic on
the interior of the contour and has neither zero nor pole on
the contour itself. This formula is often called the
“argument principle”, because the integral can be
interpreted as i times the change of the “argument”
(= imaginary part of the logarithm) of f
around the contour.
- The Gamma function extends to a meromorphic function
on C, satisfying the same functional equation
Γ(z+1) = z Γ(z),
and holomorphic except for simple poles at
z = 0, −1, −2, −3, … .
Given that there exists a meromorphic function
Γ on C that agrees with the usual
Gamma function on the positive real axis, this extension is unique
(analytic continuation again), but it is not immediate that
such an extension is possible. We give two approaches:
- Use the Euler integral to define Γ(z)
for z = x + iy
of positive real part x; prove the functional equation
for such x (either by analytic continuation or by the
usual integration by parts); and then use the functional equation
to inductively extend Γ to
x > −1,
x > −2,
x > −3, etc.,
at each stage finding a simple pole at
0, −1, −2, etc.
- Use the formula
Γ(z) =
lim_{n→∞}
n! n^z /
(z (z+1) (z+2) …
(z+n)) .
This sequence of analytic functions converges uniformly
on bounded subsets of
C −
{0, −1, −2, −3, …},
whence its limit is analytic.
Each of these can be used to prove the functional equation; the second
most easily yields the following additional properties:
- For each x > 0, the function
| Γ(x+iy) |
of a real variable y is maximized at
y = 0,
and is decreasing for
y ≥ 0
and increasing for
y ≤ 0;
- In particular, Γ(z) ≠ 0
for all complex z;
- The Stirling approximation holds for the complex Gamma function,
in the following sense:
given ε > 0,
we have Γ(z) asymptotic to
(2π)^{1/2} z^{z−(1/2)}
e^{−z} as
|z| →∞
as long as z has argument in
[ε−π, π−ε].
[NB We did not prove this last part in class, and yes, it does
reduce to the usual Stirling approximation to n! for
z = n+1
even though it looks different.]
- The Beta identity
B(x,y)
=
Γ(x) Γ(y)
/ Γ(x+y)
still holds for x and y of positive real part,
and can be used to define B(x,y)
for all complex x and y not among the
poles of Γ. In particular, the identity
B(α, 1−α)
= π / sin(απ)
holds for all non-integer complex numbers α.
We can use this, together with the formula
Γ(z) =
lim_{n→∞}
n! n^z /
(z (z+1) (z+2) …
(z+n)) ,
to give a proof (admittedly more convoluted than necessary or usual)
of the product expansion of the sine, and thus (via logarithmic
differentiation) of the partial-fraction decomposition of
cot(x) and (differentiating once more)
csc^2(x), etc. Evaluation at
special points like π/2 yields classical formulas: Wallis's
product, the values of ζ(k) for
k = 2, 4, 6, … (again), and others.
- (not covered in the final exam) Introduction to conformal mapping
and the Riemann mapping theorem;
Hilbert space basics; and 1.5 proofs of
Müntz’s generalization of the
Weierstrass approximation theorem.
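Returning to the residue calculus outlined above, the first three paradigmatic integrals can be recovered numerically from residues at z = i. This is only a sketch: `residue` integrates over a small circle with equally spaced points, and the passage from the residue to the integral over (0,∞) assumes the standard argument (close the contour in the upper half-plane, then halve the integral over the whole real line by symmetry). The names and the sample value c = 1.5 are illustrative.

```python
import cmath
import math

def residue(f, zeta, r=0.5, n=512):
    """(1/2πi) ∮ f(z) dz over the circle |z - zeta| = r, using n equally spaced points."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        total += f(zeta + r * e) * r * e   # f(z) * z'(θ) / i
    return total / n

def half_line_integral(f):
    """Half of Re(2πi * residue of f at z = i), under the assumptions stated above."""
    return (2j * math.pi * residue(f, 1j)).real / 2

c = 1.5
first = half_line_integral(lambda z: 1 / (z * z + 1))
second = half_line_integral(lambda z: 1 / (z * z + 1) ** 2)
third = half_line_integral(lambda z: cmath.exp(1j * c * z) / (z * z + 1))
print(first, math.pi / 2)
print(second, math.pi / 4)
print(third, math.pi / 2 * math.exp(-c))
```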
Problem sets 1 and 2: Metric topology basics
due dates corrected 30.i.17 (Wednesdays, 1 and 8 Feb.)
Problem sets 3 and 4: Metric topology cont’d
Yes, in problem 10i all vector spaces are over
F=R or C.
(We already know from last term it's not true
over Q…)
Problem sets 5 and 6:
Topology finale; differential-calculus prelude
Problem set 7: Univariate integral calculus
Problem set 8:
Manipulating and estimating definite integrals to prove some classical
identities; first few problems in multivariate differential calculus
due date corrected: Wednesday April 5
(the 6th will be Thursday)
Problem set 9:
Multivariate differentiation cont’d;
a bit of multivariate integration
corrected April 7 (thanks to Peter Chen)
to say explicitly that Problem 3 continues Problem 2
Problem set 10:
Integration in R^k and C, etc.
Problem set 11:
Complex analysis cont’d: definite integrals and other uses of
residues; rational functions; variation on a theme of Jensen