Lecture notes for Math 229: Introduction to Analytic Number Theory (Spring 20[20-]21)

If you find a mistake, omission, etc., please let me know by e-mail.

The orange ball marks our current location in the course.

For an explanation of the background pattern, skip ahead to the end of the page.

January 26 and 28: plan.pdf and intro.pdf: administrivia and “philosophy”/examples;
elem.pdf: Elementary methods I: Variations on Euclid;
euler.pdf: Elementary methods II: The Euler product for $s > 1$ and consequences

[also: which if any of 8675309, 6060842, 6654321, and 7184981043 is prime, and how surprised might you be if one of them is prime, or half of a prime pair? Thanks to Jordan Ellenberg and David Farmer for noting the prime and prime pair respectively. (Turns out that I had already seen the twin-prime observation in the mouse-over text for xkcd comic #1047: Approximations.)]

The CA for Math 229 is Alec Sun. If you’re on my mailing list then you already have his e-mail address.

“faculty legislation requires all instructors to include a statement outlining their policies regarding collaboration on their syllabi” [but apparently does not require us to be all that careful about singular/plural grammar …] — as stated in plan.pdf: for homework, “as usual in our department, you are allowed — indeed encouraged — to collaborate on solving homework problems, but must write up your own solutions.” For the final project or presentation, work on your own even if another student has chosen the same topic. (As with theses etc. it is still OK to ask peers to read drafts of your paper, or see dry runs of your presentation, and make comments.) In all cases, acknowledge sources as usual, including peers in your homework group.

For many more examples of and references for elementary approaches to the distribution of primes and related topics, see Paul Pollack’s book Not Always Buried Deep: A Second Course in Elementary Number Theory.
About $4, 25, 168, \ldots$: OEIS Sequence A006880 gives $\pi(10^n)$ for $n \leq 28$. Computing the last few terms required both nontrivial algorithms and significant computational resources.
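For small $n$ the first few values of $\pi(10^n)$ can be reproduced with a plain sieve of Eratosthenes. The sketch below is only an illustration: the function name `prime_counts` is made up here, and this naive method is nowhere near the algorithms needed for the last terms of the OEIS sequence.

```python
from math import isqrt

def prime_counts(n_max):
    """Return [pi(10), pi(100), ..., pi(10**n_max)] via a sieve of Eratosthenes."""
    limit = 10 ** n_max
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out p*p, p*p + p, ...; smaller multiples of p were
            # already removed when sieving by smaller primes.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    # Count the surviving entries up to each power of 10.
    return [sum(is_prime[: 10 ** n + 1]) for n in range(1, n_max + 1)]

print(prime_counts(6))  # [4, 25, 168, 1229, 9592, 78498]
```

Memory grows linearly with $10^n$, so this approach stops being practical well before $n = 10$.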
Here is XKCD’s haiku rendition of Euclid’s proof. (CW: b-word.) It’s not quite right — can you remove the last word and use those two syllables to fix the proof? — “But, hey, it’s a hallucination.”
To make $\ll$ and $\gg$ in TeX, write \ll and \gg respectively. Please do not write << and >> (which will produce $<<$ and $>>$ )!

February 2 and 4 [and much of Feb.9]: dirichlet.pdf: Dirichlet characters and L-series; Dirichlet’s theorem under the hypothesis that L-series do not vanish at $s=1$
Homework = elem Exercise 6; euler Exercises 3 and 4; and dirichlet Exercises 1, 3, and 7. (In the past I also assigned some of elem:2,5, euler:2,7, and dirichlet:5,9,11. If you do look at dirichlet:11, note that you’ll need quadratic reciprocity (QR) for the Kronecker or Jacobi symbol — or classical QR plus Dirichlet’s theorem [just infinitude, not logarithmic density]…)

The introduction of powers of $i$ for $r=5$ (starting at the bottom of page 3) is similar to the trick of summing every fourth entry in the $n$th row of Pascal’s triangle by averaging $(1+1)^n$, $(1-1)^n$, $(1+i)^n$, and $(1-i)^n$. Generalizing this to summing every $k$th entry again leads naturally to roots of unity and Pontrjagin duality for finite groups.
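The Pascal's-triangle trick can be checked directly. The following is a small sketch (the function names are illustrative, not taken from the handout) comparing the direct sum of every fourth entry of the $n$th row with the average of $(1+w)^n$ over the four fourth roots of unity $w = 1, -1, i, -i$.

```python
from math import comb

def every_kth_sum(n, k, r=0):
    """Direct sum of C(n, j) over all j congruent to r mod k."""
    return sum(comb(n, j) for j in range(r, n + 1, k))

def every_fourth_by_averaging(n):
    """Average (1 + w)^n over w = 1, -1, i, -i; only terms with 4 | j survive."""
    total = sum((1 + w) ** n for w in (1, -1, 1j, -1j))
    # The imaginary parts cancel; floating point is exact enough for small n.
    return round(total.real / 4)

for n in range(12):
    print(n, every_kth_sum(n, 4), every_fourth_by_averaging(n))
```

For example, with $n = 8$ both computations give $1 + 70 + 1 = 72 = (256 + 0 + 16 + 16)/4$. The averaging version uses floating-point complex numbers, so it should only be trusted for moderate $n$.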

What we call the “inverse [discrete] Fourier transform” in the middle of p.5 is sometimes called the plain “[discrete] Fourier transform”, possibly with the factor of $|G|^{-1}$ omitted or replaced by $|G|^{-1/2}$. (There is a similar diversity of terminology for Fourier series and Fourier transforms.) I don’t think that any choice is “correct”, or even so dominant in the literature as to make all others look “wrong”; I regard any such choice as acceptable as long as it is introduced explicitly and used consistently.

One (overly) fancy way to identify the dual of ${\bf Z}/m{\bf Z}$ with $\mu_m$ is to extend the quotient map ${\bf Z} \to {\bf Z}/m{\bf Z}$ to a short exact sequence $0 \to {\bf Z} \stackrel{m}{\to} {\bf Z} \to {\bf Z}/m{\bf Z} \to 0$, with the second arrow being multiplication by $m$, and then dualize to identify the dual of ${\bf Z}/m{\bf Z}$ with the kernel of multiplication by $m$ on the dual of $\bf Z$; but the dual of $\bf Z$ is just the circle group, so the kernel in question is its $m$-torsion subgroup, which is indeed $\mu_m$. But to justify that we would have to extend Pontrjagin duality beyond finite groups to infinite groups like $\bf Z$ (discrete but not compact) and the circle (compact but not discrete).

February $\not\!9$ 11: chebi.pdf: Cebysev’s method; introduction of Stirling’s approximation, and of the von Mangoldt function $\Lambda(n)$ and its sum $\psi(x)$

Keith Conrad (who unlike me can read Russian, not just sound it out) writes that the trick of factoring $2n \choose n$ “was Erdos’s simplification to the proof of bounds on $\psi(x)$ […] Chebyshev never used binomial coefficients.” It still seems plausible that Chebyshev started with $2n \choose n$ but didn’t mention it in his 1852 paper (in French) because it not only gives worse bounds than he obtained using another combination of values of $\log (c_i n)!$ but also does not quite prove Bertrand’s postulate directly (though 80 years later Erdös did give a proof starting from the factorization of $2n \choose n$; this Wikipedia page gives a proof along these lines).
Here is a scan of Ralph B. Boas’ “spelling lesson”, from the College Math J. 15 #3 (June 1984), page 217.

February 11, 16, & 18:
psi.pdf: Complex analysis enters the picture via the contour integral formula for $\psi(x)$ and similar sums (a.k.a. Perron’s formula);
zeta1.pdf: The functional equation for the Riemann zeta function using Poisson summation on theta series; basic facts about Γ(s) as a function of a complex variable s

The contour-integral formula can be regarded as an instance of the formula for inverting the Laplace transform, in the Laplace transform’s guise as the Mellin transform. Indeed partial summation of the Dirichlet series for $-\zeta' / \zeta$ writes $-\frac{\zeta'}{\zeta}(s) / s$ as the Mellin transform of $\psi(x)$ (evaluated at $-s$ if we go by Wikipedia’s normalization), and the inversion formula for recovering $\psi(x)$ would be precisely our contour integral if we were allowed to integrate all the way from $c - i\infty$ to $c + i\infty$ without worrying about convergence and error estimates. The Laplace inversion formula is in turn closely related with the more familiar formula for inverting the Fourier transform.
Euler already guessed the functional equation in some sense: you’ve probably run across the “identity” 1 + 2 + 3 + 4 + … = −1/12 (that he may have derived from 1 − 2 + 3 − 4 + … = 1/4, though according to the 1 + 2 + 3 + 4 + … page it is not clear whether Euler ever stated the −1/12 version), and −1/12 agrees with the value of ζ(−1); he likewise “computed” alternating sums that correspond to $(1-2^{1+n})\,\zeta(-n)$ for other small integers n≥0, finding numbers equivalent to ζ(−n) = −1/2, −1/12, 0, 1/120, 0, −1/252, 0, 1/240, 0, −1/132, 0, 691/32760, 0 for n = 0, 1, 2, …, 12 — and the appearance of 691 surely suggested (if he didn’t notice this earlier) a connection with the values of ζ(n+1) that Euler had already obtained. Euler then proved this connection; he even chose cos(πn/2) for the 4-periodic fudge factor, which turns out to be right for all complex n ! But Riemann was still the first to give a formula for ζ(s) for complex s and to prove the functional equation in this setting.
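The listed values of ζ(−n) can be reproduced from Bernoulli numbers. The sketch below assumes the standard identity $\zeta(-n) = (-1)^n B_{n+1}/(n+1)$ with the convention $B_1 = -1/2$; neither this identity nor the helper names appear in the notes above, so treat them as assumptions of the example.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(m_max):
    """Return [B_0, ..., B_m_max] with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, m_max + 1):
        # Recurrence: sum_{j=0}^{m} C(m+1, j) * B_j = 0, solved for B_m.
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def zeta_at_nonpositive_integers(n_max):
    """Return [zeta(0), zeta(-1), ..., zeta(-n_max)] as exact fractions."""
    B = bernoulli_numbers(n_max + 1)
    return [(-1) ** n * B[n + 1] / (n + 1) for n in range(n_max + 1)]

for n, value in enumerate(zeta_at_nonpositive_integers(12)):
    print(n, value)
```

Exact rational arithmetic keeps the numerator 691 visible at $n = 11$, where it arises from $B_{12} = -691/2730$.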

To spell out the argument suggested on p.2 that $\Gamma$ has no zeros: since $s \Gamma(s) = \Gamma(s+1)$ for all $s$, it is enough to show $\Gamma(s) \neq 0$ for ${\rm Re}(s) \gt 0;$ if there were such a zero, we could conclude from the formula for $B(s_1,s_2)$ that if $s_1 + s_2 = s$ (with both $s_1, s_2$ in the right half-plane) then either $\Gamma(s_1) = 0$ or $\Gamma(s_2) = 0,$ and that would make $\Gamma$ identically zero by analytic continuation — contradiction. (Come to think of it, we already get $\Gamma(s) \neq 0$ from Euler’s reflection formula $\left[B(s,1-s) = \right] \Gamma(s) \Gamma(1-s) = \pi / \sin (\pi s)$, whose right-hand side surely has no zeros. Alternatively, use Bohr-Mollerup to establish the product formula for $\Gamma(s)$ on the positive real axis, and then invoke analytic continuation.)

You can find David Wilkins’ transcription and English translation of Riemann’s fundamental 1859 paper here. A conjecture equivalent to what we now call the Riemann Hypothesis appears in the middle of page 4 of the translation (numbered 5 in the PDF file because the title appears on page 0); note that in the previous page Riemann set $s=(1/2)+ti.$

Riemann’s proof of $\xi(s) = \xi(1-s)$ (and generalizations such as those we shall develop for Dirichlet L-functions) illustrates the versatile technique of studying an infinite sum by converting it to a definite integral. A few more exotic examples from my own contributions to Mathoverflow and Math Stack Exchange include the evaluation of $\sum_{n=0}^\infty \frac{n! \, (2n)!}{(3n+2)!}$ (MSE 516263) and the asymptotic analysis of $\sum_{k=0}^n {n \choose k} (-1)^k \sqrt{k}$ as $n \to \infty$ (MSE 368212) and $\sum_{n=0}^\infty (-x)^n / n^n$ as $x \to \infty$ (MO 109160).

If you like “distributions”, you can express Poisson summation as “a row of deltas $\sum_{m \in {\bf Z}} \delta_m$ is its own Fourier transform”: the inner product of any function $\,f$ with the “row of deltas” is the sum of the values of $\,f$ over the integers; now use Parseval for Fourier transforms.

Our special case of the Gaussian works also for complex $u$, giving rise to the value at $iu$ of a modular form of weight $1/2$. The “sanity check” for the resulting transformation formulas turns out to be basically equivalent with Quadratic Reciprocity!

Even for real $u$ there’s more to be said. The functional equation for $\theta$ says nothing about $\theta(1)$ [which remarkably turns out to be $\pi^{1/4} / \Gamma(3/4)$] but does yield the ratio $\theta'(1) / \theta(1) = -1/4,$ which in turn gives the approximation $e^\pi \approx 8 \pi - 2$ by ignoring all terms of size $\exp(-3\pi)$ and smaller. This, together with the Archimedean approximation $\pi \approx 22/7$, explains the first run of 9s in $e^\pi - \pi = 19.999099979\ldots$ (see XKCD 217).
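These claims about $\theta$ are easy to test numerically. The sketch below assumes the normalization $\theta(u) = \sum_{n \in {\bf Z}} \exp(-\pi n^2 u)$, for which the functional equation reads $\theta(1/u) = u^{1/2}\,\theta(u)$; if zeta1 normalizes differently, the constants change accordingly. With this convention, differentiating the functional equation at $u = 1$ gives $\theta'(1)/\theta(1) = -1/4$.

```python
import math

def theta(u, terms=20):
    """theta(u) = sum over all integers n of exp(-pi n^2 u), truncated at |n| < terms."""
    return 1 + 2 * sum(math.exp(-math.pi * n * n * u) for n in range(1, terms))

def theta_prime(u, terms=20):
    """Term-by-term derivative of theta with respect to u."""
    return -2 * math.pi * sum(n * n * math.exp(-math.pi * n * n * u)
                              for n in range(1, terms))

print(theta_prime(1) / theta(1))                         # close to -0.25
print(theta(1), math.pi ** 0.25 / math.gamma(0.75))      # the two values agree
print(math.exp(math.pi), 8 * math.pi - 2)                # differ by less than 0.01
print(math.exp(math.pi) - math.pi)                       # 19.9990999...
```

Keeping only the $n = 0, \pm 1$ terms in the ratio gives $-2\pi e^{-\pi} \approx -(1 + 2e^{-\pi})/4$, which rearranges to $e^\pi \approx 8\pi - 2$.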

Another neat example of Poisson summation: if $f(x) = 1 / (x^2+c^2)$ (some constant $c \gt 0$), then the Fourier transform of $\,f$ is $(\pi/c) \exp(-2\pi c |y|)$ (a standard exercise in contour integration, which can also be done by Fourier inversion since the Fourier transform of $\exp(-2\pi c|y|)$ is an elementary integral); hence $\sum_{n\in\bf Z} 1/(n^2+c^2)$ is readily evaluated as $(\pi/c) (e^{2\pi c} + 1) \, / \, (e^{2\pi c} - 1)$ using the formula for summing a convergent geometric series. Subtracting the $n=0$ term $f(0) = 1/c^2$ and letting $c \to 0$, we recover Euler’s formula $\zeta(2) = \pi^2/6$. [This is Exercise 11 in zeta1.] The $c^2, c^4, c^6$, etc. terms in the Laurent expansion of $(\pi/c) (e^{2\pi c} + 1) \, / \, (e^{2\pi c} - 1)$ about $c=0$ then yield the values of $\zeta(s)$ for $s=4, 6, 8,$ etc. as rational multiples of $\pi^s$.
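Here is a quick numerical sanity check of this evaluation, as a sketch rather than a proof. The closed form is written as $(\pi/c)\coth(\pi c)$, which equals $(\pi/c)(e^{2\pi c}+1)/(e^{2\pi c}-1)$; the truncation point `N` and the tail correction are choices made for this example only.

```python
import math

def lorentzian_sum(c, N=200_000):
    """Approximate the sum over all integers n of 1/(n^2 + c^2)."""
    partial = 1 / c**2 + 2 * sum(1 / (n * n + c * c) for n in range(1, N + 1))
    # Tail estimate: sum_{n > N} 1/n^2 is approximately 1/(N + 1/2).
    return partial + 2 / (N + 0.5)

def closed_form(c):
    """(pi/c) * coth(pi c), the value predicted by Poisson summation."""
    return (math.pi / c) / math.tanh(math.pi * c)

for c in (0.5, 1.0, 2.0):
    print(c, lorentzian_sum(c), closed_form(c))

# Removing the n = 0 term and letting c -> 0 should approach 2*zeta(2) = pi^2/3.
c = 1e-3
print((closed_form(c) - 1 / c**2) / 2, math.pi ** 2 / 6)
```

The last line differs from $\pi^2/6$ by about $\zeta(4)\,c^2 \approx 10^{-6}$, which is the $c^2$ term of the Laurent expansion showing up.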

Poisson summation works in the general setting of locally-compact abelian groups: if $\,f$ is “any” function on such a group $G$, and $H \subseteq G$ is a closed subgroup, then $\int_H f$ is (appropriately scaled) the integral of the Fourier transform $\hat f$ of $\,f$ over the annihilator of $H$. In standard Poisson, $G = \bf R$, $H = \bf Z$ (which is its own annihilator), and the “integrals” over $\bf Z$ are just sums. Tate’s thesis gives a proof of the functional equation of the zeta function of any number field $K$ by using $K$ as the subgroup of the adèles of $K$. We won’t go there in Math 229x, but will note the generalization from $(G,H) = ({\bf R}, {\bf Z})$ to $(G,H) = ({\bf R}^n, L)$ where $L$ is some lattice in ${\bf R}^n$; the annihilator of L is then its dual lattice (a.k.a. the “reciprocal lattice” in crystallography).

February 23 and 25: gamma.pdf and prod.pdf: review of the complex Gamma function, and of functions of finite order (Hadamard’s product formula and its logarithmic derivative)

March 2 and 4: zeta2.pdf: the Hadamard products for $\xi(s)$ and $\zeta(s)$, vertical distribution of the zeros of $\zeta(s)$ (and an interlude on the Lindelöf hypothesis, on which see also the “further remarks” on page 4 of the next handout); free.pdf: the nonvanishing of $\zeta(s)$ on the edge $\sigma=1$ of the critical strip, and the classical zero-free region.
Exercise 3 of zeta2 finally corrected 3/3/2021.

The notes use (but possibly could state more explicitly) the following facts about the real part, call it $r = x / (x^2+y^2),$ of the (multiplicative) inverse of a complex number $x + iy$ with $x \gt 0:$
• $r$ is always positive;
• $r$ is bounded away from zero if $y$ is bounded and $x$ is in a fixed interval such as $[1,2]$ with both endpoints positive; and
• for large $y$ (say $|y| \geq 1$), $r$ decays as $1/y^2$ as $|y| \to\infty$ (again assuming that $x$ is in say $[1,2]$).

This picture used to appear without explanation on the web page for John Derbyshire’s Prime Obsession. It is a plot of the Riemann zeta function on the boundary of the rectangle $[0.4,0.6]+[0,14.5]i$ in the complex plane. Since the contour winds around the origin once (and does not contain the point $s=1,$ which is the unique pole of $\zeta(s)$), the zeta function has a unique zero inside this rectangle. Since the complex zeros are known to be symmetric about the line ${\rm Re}(s) = 1/2,$ this zero must have real part exactly equal to $1/2$, in accordance with the Riemann hypothesis.

It is known that this first “nontrivial zero” of $\zeta(s)$ occurs at $s = 1/2 + it$ for $t = 14.13472514\ldots.$ The pole at $s=1$ accounts for the wide swath in the third quadrant, which corresponds to $s$ of imaginary part less than 1.

Here’s a similar picture for $L(s,\chi_4)$ on $[0.4,0.6]+[0,11]i.$ With no pole in the vicinity, this picture is not as visually interesting. We see the first two nontrivial zeros, with imaginary parts 6.0209489... and 10.2437703...

The coefficients $3, 4, 1$ of the inequality $3 + 4 \cos(\theta) + \cos(2\theta) \geq 0$ may seem “random”/mysterious, but they’re basically just the fourth row $1, 4, 6, 4, 1$ of Pascal’s triangle, as you can see by writing $3 + 4 \cos(\theta) + \cos(2\theta)$ as a linear combination of $\exp(in\theta)$ with $n = -2, -1, 0, 1, 2.$
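Spelled out, using $2\cos\theta = e^{i\theta} + e^{-i\theta}$ and $2\cos 2\theta = e^{2i\theta} + e^{-2i\theta}$, the computation is as follows (a routine verification, included for convenience):

$$3 + 4\cos\theta + \cos 2\theta = \tfrac12\left(e^{2i\theta} + 4e^{i\theta} + 6 + 4e^{-i\theta} + e^{-2i\theta}\right) = \tfrac12\left(e^{i\theta/2} + e^{-i\theta/2}\right)^4 = 8\cos^4(\theta/2) = 2\,(1+\cos\theta)^2 \geq 0.$$

The last form also shows that equality holds only when $\cos\theta = -1$.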

The coefficient $\sqrt{2\pi}$ in Stirling’s approximation can also be obtained from Euler’s reflection formula $\Gamma(s) \Gamma(1-s) = B(s,1-s) = \pi / \sin(\pi s)$. [I was worried about $s$ and $1-s$ both being in $R_\epsilon$, but that’s easily arranged by fixing $\sigma \in \bf R$ and considering $s = \sigma + it$ with $|t| \to \infty$.]

March 9: pnt.pdf: Conclusion of the proof of the Prime Number Theorem with error bound; the Riemann Hypothesis, and some of its consequences and equivalent statements.
Here’s an expository paper by B. Conrey on the Riemann Hypothesis, which includes a number of further suggestive pictures involving the Riemann zeta function, its zeros, and the distribution of primes.

Here’s the Rubinstein-Sarnak paper “Chebyshev’s Bias” (Experimental Mathematics, 1994).

Here’s a bibliography of fast computations of $\pi(x)$.

March [9, ] 11 and 18: lsx.pdf: $L(s,\chi)$ as an entire function (where $\chi$ is a nontrivial primitive character mod $q$); Gauss sums, and the functional equation relating $L(s,\chi)$ with $L(1-s,\bar\chi)$

It is not actually wrong to state the Proposition on page 1 with “positive” rather than “nonnegative” coefficients $a_n$, because if any $a_n=0$ then we can simply remove the term $a_n e^{-\lambda_n s}$ from the sum (and if doing that for all such $n$ results in a finite or empty sum then it is automatically convergent on all of $\bf C$).

“Recall” that Exercises 8 and 9 in zeta1 introduced the even and odd cases of the functional equation for $L(s,\chi)$ by the examples of $\chi = \chi_8^{\phantom0}$ and $\chi = \chi_4^{\phantom0}$ respectively. Since in each case $\chi$ is real (i.e. $\chi = \bar\chi)$ we didn’t see at that point that in general the functional equation must relate $L(s,\chi)$ with $L(1-s,\bar\chi)$.

March 23: pnt_q.pdf: Product formula for $L(s,\chi),$ and ensuing partial-fraction decomposition of its logarithmic derivative; a (bad!) zero-free region for $L(s,\chi),$ and resulting estimates on $\psi(x,\chi)$ and thus on $\psi(x, a \bmod q)$ and $\pi(x, a \bmod q).$ The Extended Riemann Hypothesis and consequences.
Thanks to William Chang ‘19 for alerting me to a more elementary proof of the step that shows that for any $t \geq 2$ there are $O(\log|qt|)$ zeros with imaginary part in $[t-1,t+1]$ (see page 3 of zeta2 for the proof we gave in the case $q=1$): apply Jensen’s inequality (which we already used in prod, see (3)) to $L(s,\chi)$ on a disc such as $|s-(2+it)| \leq 3$. The central absolute value $|L(2+it,\chi)|$ is bounded away from zero (recall dirichlet Exercise 11), while the maximum is bounded by some power of $q|t|$ (this doesn’t even require the functional equation, only analytic continuation via Euler-Maclaurin). But if ${\rm Im}\,\rho \in [t-1,t+1]$ then $|\rho-(2+it)| \leq \sqrt 5 < 3$, so if there are $N$ such $\rho$ then $(3/\sqrt 5)^N \ll (q|t|)^{O(1)}$, whence $N \ll \log (q|t|)$ as claimed.

March 25 and 30: free_q.pdf: The classical region $1 - \sigma \ll 1 / \log(q|t|+2)$ free of zeros of $L(s,\chi)$ with at most one exception $\beta$; the resulting asymptotics for $\psi(x, a \bmod q)$ etc.; lower bounds on $1 - \beta$ and $L(1,\chi),$ culminating with Siegel’s theorem. [The fancy script L “ℒ ” is {\mathscr L}, with \mathscr defined in the mathrsfs package.]

If you were expecting the complex conjugate of $\chi(a)$ in formulas 7, 8 (top of page 5 of free_q), rather than $\chi(a)$ itself, remember that we just showed that if $\beta$ exists at all then $\chi$ is real.
It turns out that Exercise 4 is far from the strongest result of the form “small $L(1,\chi)$ implies close Siegel zero”. In fact if $L(1,\chi)$ is small enough (for a primitive real character $\chi \bmod q$) then one can show that there exists a Siegel zero $\beta$ with $1 - \beta \ll L(1,\chi)$; here we can show that $L(1,\chi) < c \, / \log^3 q$ is “small enough” using just elementary estimates (no need for positivity of the convolution $1 * \chi$, nor the Hadamard product formula), and Terry Tao improves this to $L(1,\chi) < c \, / \log q$ in Lemma 59 of these online lecture notes on “Complex-analytic multiplicative number theory”.

April 1: l1x.pdf: Closed formulas for $L(1, \chi)$ and their relationship with cyclotomic units, class numbers, and the distribution of quadratic residues.

In Exercise 2, the character $\chi$ must be assumed real, else both the Gauss-sum factor and the irrational character values usually (always?) put $q^{1/2} \pi^{-n} L(n,\chi)$ in some proper cyclotomic extension of $\bf Q$.
To prove the formula for $\sum_{n=1}^\infty \frac1n e^{2\pi i an/q}$: The Taylor series $\sum_{n=1}^\infty z^n / n = -\log(1-z)$ converges absolutely for $|z|<1$; if $|\zeta|=1$ and $\zeta \neq 1$ then $-\log(1-z)$ extends to a continuous function on the closed segment $\{t\zeta : 0 \leq t \leq 1\}$, and $\sum_{n=1}^\infty \zeta^n / n$ still converges conditionally. It follows from Abel’s theorem that $\sum_{n=1}^\infty \zeta^n / n = -\log(1-\zeta).$ Now take $\zeta = e^{2\pi i a / q}.$ (The Wikipedia page currently states the theorem for real power series $\sum_{k=0}^\infty a_k x^k,$ but the coefficients $a_k$ can just as well be complex.)
The formula for the real and imaginary parts of $\log (1 - e^{2\pi i a/q})$ can be seen geometrically in the isosceles triangle with vertices $0,1,e^{2\pi i a/q}$. The real part is $\log |1 - e^{2\pi i a/q}|$, and $|1 - e^{2\pi i a/q}|$ is the length of the side opposite $0$, which is $2 \sin a \pi/q$ (note that no absolute value sign is needed as long as $0 < a < q$, because then $0 < a \pi/q < \pi$). The imaginary part $(\pi/2) (\frac{2a}{q}-1)$ is the angle between $1 - e^{2\pi i a/q}$ and the horizontal, which can be computed starting from the vertex angle $2\pi a/q$ of the triangle (or $2\pi(1-\frac{a}{q})$ if $a > q/2$).
The formulas for $L(1,\chi)$ include the familiar Leibniz-Madhava formula $1 - \frac13 + \frac15 - \frac17 + - \cdots = \frac\pi 4$ (with $\chi = \chi_4$), and nice but less familiar examples such as $L(1,\chi_5) = \frac2{\sqrt{5}} \log \varphi$ where $\varphi = \frac12(1+\sqrt5)$ (the “golden ratio”) is the fundamental unit of ${\bf Q}(\sqrt 5)$ (with cyclotomic representations $2 \cos \pi/5 = (2\cos 2\pi/5)^{-1}$ — you might have seen $\sin 18^\circ = \frac14(\sqrt{5}-1)$ in connection with the regular pentagon or high-school contest math, and it also follows from the evaluation of the Gauss sum $\tau(\,\chi_5) = \sqrt{5}$).
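Both closed forms can be checked numerically by summing $\chi(n)/n$ over whole periods of $\chi$, which tames the conditional convergence. This is only a sketch: the function `L1` and the encoding of a character as a list of its values at $0, 1, \ldots, q-1$ are conventions of this example, not of l1x.pdf.

```python
import math

def L1(chi, blocks=200_000):
    """Partial sum of chi(n)/n over n = 1 .. blocks*q - 1, grouped by full periods.

    chi is the list [chi(0), chi(1), ..., chi(q-1)] of a real character mod q.
    """
    q = len(chi)
    total = 0.0
    for k in range(blocks):
        # One full period n = k*q + r, r = 1 .. q-1; chi(0) = 0 is skipped.
        total += sum(chi[r] / (k * q + r) for r in range(1, q) if chi[r])
    return total

chi4 = [0, 1, 0, -1]         # the nontrivial character mod 4
chi5 = [0, 1, -1, -1, 1]     # the quadratic character mod 5

golden = (1 + math.sqrt(5)) / 2
print(L1(chi4), math.pi / 4)
print(L1(chi5), 2 / math.sqrt(5) * math.log(golden))
print(2 * math.cos(math.pi / 5), golden, 1 / (2 * math.cos(2 * math.pi / 5)))
```

For the even character $\chi_5$ each period contributes $O(k^{-3})$, so the partial sums converge much faster than for the odd character $\chi_4$, whose periods contribute $O(k^{-2})$.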
Once you know the Fourier series for the “sawtooth wave” (imaginary part of $\log (1 - e^{2\pi i x})$), you can recover the series for the “square wave” (top of page 3) by subtracting a sawtooth of double the frequency and half the amplitude. [In effect this also accounts for the factor $2 - \chi(2)$ in the formula for $S_\chi((q-1)/2)$.] Multiplicatively, this leads us to the identity $(1-z)^2 / (1-z^2) = (1-z)/(1+z)$ (where $z = e^{2\pi i x}$), which is pure imaginary for $|z|=1$, with sign opposite to the sign of ${\rm Im}(z)$; the fact that $(1-z)/(1+z) \in {\bf R}i$ for $|z| = 1$ corresponds to one of the earliest theorems of Greek geometry, “generally attributed to Thales of Miletus”.

April 6 and 8: sieve.pdf: The Selberg (a.k.a. quadratic) sieve and some applications
Concerning the note to Exercise 4: the $k=1$ case of Schinzel’s conjecture was stated explicitly by Bunyakovsky as early as 1857.
For more on sieve techniques and inequalities, especially since Selberg’s Lectures on Sieves (1969), see the more recent and very extensive treatment in Opera de Cribro by Friedlander and Iwaniec (Providence: American Math. Society, 2010).

April 13: weyl.pdf: Introduction to exponential sums; Weyl’s equidistribution theorem and the application to equidistribution $\bmod 1$ of arithmetic progressions with irrational step. [interlude on “little $o$” notation]

April 13, 20, and 22 [sic: Apr.15 was a Wellness Day]: more analytic estimates on exponential sums (kmv.pdf and vdc.pdf): Mean-square estimates via Beurling’s function (we won’t cover the Montgomery-Vaughan generalization of Hilbert’s inequality); the Kuzmin and van der Corput bounds

Functions $\phi : {\bf R} \to {\bf C}$ whose Fourier transform $\hat\phi$ is supported on some interval $[-\delta,\delta]$ (i.e. satisfies the condition $|r| > \delta \Rightarrow \hat\phi(r) = 0$) are often said to be “band-limited”, as in the title of [Logan 1977], because they are (limits of) linear combinations of exponentials whose frequencies are supported on the “band” $|r| \leq \delta$.
In dimension $\gt1$, good analogs of $\beta_\pm$ that minorize or majorize the characteristic function of a box and have Fourier supports in another box are still quite mysterious. Another application, to upper bounds on sphere-packing densities, requires only inequalities on both $\beta$ and $\hat\beta$; in that setting a new construction of an optimal test function (but still reminiscent of Beurling’s use of the factor $(\sin \pi z)^2$) was found very recently by Viazovska to solve the sphere-packing problem in dimension 8, and was adapted shortly thereafter to solve it in dimension 24.
Here’s Terry Tao’s take on the van der Corput inequalities, and on the Vinogradov estimates which improve on van der Corput for large $k$ (with an exponent of $c/k^2$ instead of $1/(2^k-2)$).

April 27 and 29: short.pdf: The Davenport-Erdös bound and the distribution of short character sums, with some applications
Corrected April 28: on page 3, the function $F$ of equation (2) has domain $\bf C$, not $\bf R$, and the integral in (6) ends with $dx\,dy$, not just $dx$. (The domain of $F$ must still be $\bf C$ since we allow $F(z) = z^r \, {\bar z}^{\,r'}$ even with $r \neq r'$; I also made $f$ take values in $\bf C$, though this change is cosmetic.)
Corrected April 29: on page 5, start of the proof of the Corollary to Lemma 1: There exists some $x_0$, not some $n_0$ [Alec Sun].
Corrected May 2: on page 8, equation (15) ends with $O(r p^{-1/2} N^{r/2})$, not $O(r p^{-1/2} N^r)$, because we normalized the $S_\chi(n_0, n_0+N)$ by dividing by $N^{1/2}$ [Franklyn Wang].

There are several Davenport-Erdös results, so one must specify which one is intended. According to this list at the Erdös Number Project, Davenport published 7 papers with Erdös, which only ties him for 41st among the 202 co-authors of more than one joint paper with Erdös. (The maximum is 62 !)
Re Proposition 1: the formula $\sum_{X \lt l \leq N} 1/l = \log (\log(N) / \log(X)) + o(1)$ in the proof follows from $\sum_{l \leq y} 1/l = \log(\log(y)) + M + o(1)$ where $M$ is the Meissel-Mertens constant. This constant drops out from the formula for $\sum_{X \lt l \leq N} 1/l.$
The size of the smallest quadratic non-residue is also of interest in theoretical computer science: finding a “quadratic non-residue” of a given large prime $p$ is a standard example of a problem that is easily seen to be in “RP” (randomized polynomial time) but is not known to have a deterministic polynomial-time algorithm. [Since it takes only $\log p$ bits to specify $p$, “polynomial” here means polynomial in $\log p$. It is easy to check whether $(a/p) = -1$ given $a,p$: Euler’s criterion in terms of $a^{(p-1)/2} \bmod p$ is polynomial-time thanks to the repeated-squaring trick. So choosing $a$ at random works with probability $1/2$. Under ERH, trying the first $O(\log^2 p)$ residues is a polynomial-time algorithm, but it is not known unconditionally to succeed for all $p$.] This problem is equivalent to extracting square roots $\bmod p$ when they do exist (and thus solving arbitrary quadratic equations $\bmod p$) in polynomial time: if we could do that, we would start from $-1$ and repeatedly take square roots until we reach some $a \bmod p$ that does not have a square root; conversely, given $a \bmod p$ with $(a/p) = -1$ it is easy to extract a square root of any $c \bmod p$ with $(c/p) = +1$, by using Euler’s criterion $O(\log p)$ times to write a square root as $c^{(n+1)/2} a^m$ for some $m$, where $(p-1) = 2^e n$ with $n$ odd (and $e=v_2(p-1)$ is the number of iterations). NB finding a square root of $-1$ (when $p \equiv +1 \bmod 4$) is polynomial time, thanks to Schoof’s algorithm; and indeed for each $k$ it is now known that a $2^k$-th root of $-1$ can be found in polynomial time if it exists (i.e., when $p \equiv 1 \bmod 2^{k+1}$). But the degree of this polynomial grows too rapidly with $k$ for this to give an algorithm that is polynomial time for all $p$.
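The reduction from square roots to a known non-residue is usually called the Tonelli-Shanks algorithm (a name not used in the notes above). Here is a minimal sketch of it together with the randomized search for a non-residue. All function names are illustrative, and $p$ is assumed to be an odd prime, which the code does not verify.

```python
import random

def is_nonresidue(a, p):
    """Euler's criterion: a^((p-1)/2) is -1 mod p exactly when (a/p) = -1."""
    return pow(a, (p - 1) // 2, p) == p - 1

def find_nonresidue(p, rng=random):
    """Randomized search; each trial succeeds with probability about 1/2."""
    while True:
        a = rng.randrange(2, p)
        if is_nonresidue(a, p):
            return a

def sqrt_mod(c, p, a):
    """Return x with x*x = c mod p, given a non-residue a (Tonelli-Shanks)."""
    c %= p
    if c == 0:
        return 0
    if pow(c, (p - 1) // 2, p) != 1:
        raise ValueError("c is not a quadratic residue mod p")
    e, n = 0, p - 1
    while n % 2 == 0:
        e, n = e + 1, n // 2      # p - 1 = 2^e * n with n odd
    z = pow(a, n, p)              # generates the subgroup of order 2^e
    x = pow(c, (n + 1) // 2, p)   # first guess: x^2 = c * t with t = c^n
    t = pow(c, n, p)
    m = e
    while t != 1:
        # Find the least i with t^(2^i) = 1; the invariant guarantees i < m.
        i, power = 0, t
        while power != 1:
            power = power * power % p
            i += 1
        b = pow(z, 1 << (m - i - 1), p)
        x = x * b % p             # multiply the guess by a power of a^n
        z = b * b % p
        t = t * z % p             # keeps x^2 = c * t, and lowers the order of t
        m = i
    return x

p = 65537                          # p - 1 = 2^16, so e = 16
a = find_nonresidue(p)
x = sqrt_mod(2, p, a)
print(a, x, x * x % p)             # the last number printed is 2
```

The returned root has the form $c^{(n+1)/2} a^m$ because every correction factor `b` is a power of $a^n$. The outer loop runs at most $e$ times, and each pass costs $O(\log p)$ modular multiplications.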
For the characterization of the Gaussian distribution by its moments: suppose more generally that $\mu$ is a probability measure on ${\bf R}^n$. We say that $\mu$ has “exponential decay” if there exists $c \gt 0$ such that for all $R \gt 0$ the complement of the $R$-ball about the origin has measure $O(\exp(-cR))$. Then the Fourier transform of $\mu$ [which takes any vector $y$ to $\int_{{\bf R}^n} \exp(2\pi i \langle x,y \rangle) \, d\mu(x) = \int_{{\bf R}^n} e(\langle x,y \rangle) \, d\mu(x)$] extends to an analytic function on a neighborhood of ${\bf R}^n$ in ${\bf C}^n$. Hence if $\mu$ and $\mu'$ are two probability measures with exponential decay, all of whose moments agree, then the Fourier transform of $\mu - \mu'$ is an analytic function that vanishes together with all its derivatives at the origin, and is thus identically zero — whence $\mu - \mu' = 0$ so $\mu = \mu'.$

In fact it is sufficient for just $\mu$ to have exponential decay, because exponential decay can be detected from the moments. Assume for simplicity $n = 1.$ Then for even $r$ the $r$th moment is $\ll \int_{-\infty}^\infty x^r \exp(-c|x|) \, dx,$ and thus $\ll r! / c^r.$ Conversely, if this bound holds for all even $r$ then the complement of the $R$-ball about the origin has measure $\ll r! / (cR)^r$ for all even $r$; taking $r = cR + O(1)$ we find that this measure is $\ll R^{1/2} \exp(-cR),$ which yields the desired exponential decay with any coefficient less than $c.$ The same argument works in dimension $n$ (with $R^{n/2} \exp(-cR)$ in place of $R^{1/2} \exp(-cR)$).

In particular, Gaussian distributions on $\bf R$ and $\bf C$ are characterized by their moments, as claimed.

WARNING Without the condition of exponential decay, it is not true that every probability distribution with finite moments is characterized by those moments. Indeed there are real-valued smooth functions $f,$ not identically zero, such that $\int_{-\infty}^\infty x^r f(x) \, dx = 0$ for each $r = 0, 1, 2, \ldots$; we can thus write some multiple of $f$ as $f_+ - f_-$ with $f_\pm \geq 0$ and $\int_{-\infty}^\infty f_\pm (x) \, dx = 1,$ and then $f_+ \, dx$ and $f_- \, dx$ are two different distributions on $\bf R$ with the same moments.