Lecture notes for Math 229: Introduction to Analytic Number Theory (Fall 2015)

If you find a mistake, omission, etc., please let me know by e-mail.

The orange ball marks our current location in the course.

For an explanation of the background pattern, skip ahead to the end of the page.

September 2: plan.pdf and intro.pdf: administrivia and “philosophy”/examples
[also: which if any of 8675309, 6060842, 6654321, and 7184981043 is prime, and how surprised might you be if one of them is prime, or half of a prime pair? Thanks to Jordan Ellenberg and David Farmer for noting the prime and prime pair respectively. Added later: I forgot that I had already seen the twin-prime observation in the mouse-over text for xkcd comic #1047: Approximations.]
The CA for Math 229 is Tom Lovering. His e-mail address is what you’d guess given that there are eight letters before the @.

“faculty legislation requires all instructors to include a statement outlining their policies regarding collaboration on their syllabi” [but apparently does not require us to be all that careful about singular/plural grammar …] — as stated in plan.pdf: for homework, “as usual in our department, you are allowed — indeed encouraged — to collaborate on solving homework problems, but must write up your own solutions.” For the final project or presentation, work on your own even if another student has chosen the same topic. (As with theses etc. it is still OK to ask peers to read drafts of your paper, or see dry runs of your presentation, and make comments.) In all cases, acknowledge sources as usual, including peers in your homework group.

September 4: elem.pdf: Elementary methods I: Variations on Euclid
Homework = Exercises 2, 5, 6, due September 11 at 5PM.
For many more examples of and references for elementary approaches to the distribution of primes and related topics, see Paul Pollack’s book A Second Course in Elementary Number Theory.


September 9: euler.pdf: Elementary methods II: The Euler product for s>1 and consequences

To make ≪ and ≫ in TeX, write \ll and \gg respectively. Please do not write << and >> (which will produce < < and > >)!
Homework = Exercises 2, 3, 4, 7, due September 21 at 5PM.

September 11: chebi.pdf: Cebysev’s method; introduction of Stirling’s approximation, and of the von Mangoldt function Λ(n) and its sum ψ(x)

September 14 and 16: dirichlet.pdf: Dirichlet characters and L-series; Dirichlet’s theorem under the hypothesis that L-series do not vanish at s=1
Homework = Exercises 1, 3, 5, 7, 9, 11 (as it happens), due September 21 at 5PM (may postpone one till September 25).

The introduction of powers of i for q=5 (starting at the bottom of page 3) is similar to the trick of summing every fourth entry in the nth row of Pascal’s triangle by averaging the nth powers of 1+1, 1−1, 1+i, and 1−i. Generalizing this to summing every kth entry again leads naturally to roots of unity and Pontrjagin duality for finite groups.
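The Pascal’s-triangle trick is easy to check by machine. Here is a minimal Python sketch (not from the handout; the name every_kth_sum and its arguments are made up for illustration) that averages (1+ω)^n over the kth roots of unity ω. For k=4 these are exactly the four numbers 1+1, 1+i, 1−1, 1−i mentioned above.

```python
import cmath
from math import comb


def every_kth_sum(n, k, r=0):
    """Sum of C(n, j) over 0 <= j <= n with j = r (mod k), via roots of unity.

    Uses (1/k) * sum over k-th roots of unity w of w^(-r) * (1 + w)^n.
    Floating-point arithmetic, so only trustworthy for moderate n.
    """
    total = 0j
    for a in range(k):
        w = cmath.exp(2j * cmath.pi * a / k)  # a-th power of a primitive k-th root of unity
        total += w ** (-r) * (1 + w) ** n
    return round((total / k).real)


if __name__ == "__main__":
    n = 10
    direct = sum(comb(n, j) for j in range(0, n + 1, 4))
    print(direct, every_kth_sum(n, 4))
```

The same function with k=3 or k=5 sums every third or fifth entry, which is where the characters of Z/kZ enter.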

The (overly) fancy way I suggested for identifying the dual of Z/mZ with μ_m is to extend the quotient map Z → Z/mZ to a short exact sequence 0 → Z → Z → Z/mZ → 0, with the second arrow being multiplication by m, and then dualize to identify the dual of Z/mZ with the kernel of multiplication by m on the dual of Z; but the dual of Z is just the circle group, so the kernel in question is its m-torsion subgroup, which is indeed μ_m. But to justify that we would have to extend Pontrjagin duality beyond finite groups to infinite groups like Z (discrete but not compact) and the circle (compact but not discrete).

September 18: psi.pdf: Complex analysis enters the picture via the contour integral formula for ψ(x) and similar sums
The contour-integral formula can be regarded as an instance of the formula for inverting the Laplace transform, in the Laplace transform’s guise as the Mellin transform. Indeed partial summation of the Dirichlet series for −ζ'/ζ writes (−ζ'/ζ(s)) / s as the Mellin transform of ψ(x) (evaluated at −s if we go by Wikipedia’s normalization), and the inversion formula for recovering ψ(x) would be precisely our contour integral if we were allowed to integrate all the way from c − i∞ to c + i∞ without worrying about convergence and error estimates. The Laplace inversion formula is in turn closely related with the more familiar formula for inverting the Fourier transform.
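The partial-summation step can be sanity-checked on a finite range. The Python sketch below (function names are invented for illustration) compares the truncated Dirichlet series ∑_{n≤N} Λ(n) n^(−s) with s ∫_1^{N+1} ψ(x) x^(−s−1) dx plus the boundary term ψ(N) (N+1)^(−s). Since ψ is constant on each interval [n, n+1), the integral is the finite sum ∑_{n≤N} ψ(n) (n^(−s) − (n+1)^(−s)). It only illustrates the identity for real s>1 and says nothing about the contour integral itself.

```python
import math


def von_mangoldt_table(N):
    """Return lam with lam[n] = log p if n is a power of a prime p, else 0."""
    lam = [0.0] * (N + 1)
    is_prime = [True] * (N + 1)
    for p in range(2, N + 1):
        if is_prime[p]:
            for m in range(2 * p, N + 1, p):
                is_prime[m] = False
            pk = p
            while pk <= N:  # mark p, p^2, p^3, ...
                lam[pk] = math.log(p)
                pk *= p
    return lam


def check_partial_summation(N, s):
    """Return (direct sum, partial-summation form) of sum_{n<=N} Lambda(n)/n^s."""
    lam = von_mangoldt_table(N)
    direct = sum(lam[n] / n**s for n in range(1, N + 1))
    psi = 0.0
    integral = 0.0
    for n in range(1, N + 1):
        psi += lam[n]  # psi(x) for n <= x < n+1
        # s * integral of psi(x) x^(-s-1) over [n, n+1]
        integral += psi * (n**-s - (n + 1) ** -s)
    boundary = psi * (N + 1) ** -s
    return direct, integral + boundary


if __name__ == "__main__":
    print(check_partial_summation(2000, 2.0))
```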

September 21: zeta1.pdf: The functional equation for the Riemann zeta function using Poisson inversion on theta series; basic facts about Γ(s) as a function of a complex variable s
Corrected Oct.4 to fix a typo noted by Nick Jameson: Re(u) > 0 is the right half-plane, not the upper one [which is the domain of definition for the associated modular form θ(u/i)]; Exercise 3 Corrected Sep.26 to fix a typo noted by Mark Sellke [denominator (kk'(k+k'))² was (kk'(k+k')²)].

Euler already guessed the functional equation in some sense: you’ve probably run across the “identity” 1 + 2 + 3 + 4 + … = −1/12 (that he may have derived from 1 − 2 + 3 − 4 + … = 1/4, though according to the 1 + 2 + 3 + 4 + … page it is not clear whether Euler ever stated the −1/12 version), and −1/12 agrees with the value of ζ(−1); he likewise “computed” alternating sums that correspond to (1 − 2^{1+n}) ζ(−n) for other small integers n≥0, finding numbers equivalent to ζ(−n) = −1/2, −1/12, 0, 1/120, 0, −1/252, 0, 1/240, 0, −1/132, 0, 691/32760, 0 for n = 0, 1, 2, …, 12 — and the appearance of 691 surely suggested (if he didn’t notice this earlier) a connection with the values of ζ(n+1) that Euler had already obtained. Euler then proved this connection; he even chose cos(πn/2) for the 4-periodic fudge factor, which turns out to be right for all complex n ! But Riemann was still the first to give a formula for ζ(s) for complex s and to prove the functional equation in this setting.
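The list of values of ζ(−n) can be reproduced exactly from Bernoulli numbers. The Python sketch below relies on the standard formula ζ(−n) = (−1)^n B_{n+1}/(n+1), with the convention B_1 = −1/2; that formula is not stated in the paragraph above and is brought in here as an assumption. The function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(m):
    """Return [B_0, ..., B_m] as exact fractions, with B_1 = -1/2."""
    B = [Fraction(0)] * (m + 1)
    B[0] = Fraction(1)
    for n in range(1, m + 1):
        # sum_{k=0}^{n} C(n+1, k) B_k = 0 for n >= 1
        B[n] = -sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1)
    return B


def zeta_at_nonpositive_integers(N):
    """Return [zeta(0), zeta(-1), ..., zeta(-N)] via Bernoulli numbers."""
    B = bernoulli_numbers(N + 1)
    return [(-1) ** n * B[n + 1] / (n + 1) for n in range(N + 1)]


if __name__ == "__main__":
    for n, value in enumerate(zeta_at_nonpositive_integers(12)):
        print(n, value)
```

The numerator 691 shows up at n=11 through B_12 = −691/2730.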

You can find David Wilkins’ transcription and English translation of Riemann’s fundamental 1859 paper here. A conjecture equivalent to what we now call the Riemann Hypothesis appears in the middle of page 4 of the translation (numbered 5 in the PDF file because the title appears on page 0); note that in the previous page Riemann set s=(1/2)+ti.

September 23: More about Poisson summation
If you like “distributions”, you can express Poisson summation as “a row of deltas ∑_{m∈Z} δ_m is its own Fourier transform” (inner product of any function f with the “row of deltas” is the sum of the values of f over the integers; now use Parseval for Fourier transforms).

Our special case of the Gaussian works also for complex u, giving rise to the value at iu of a modular form of weight ½. The “sanity check” for the resulting transformation formulas turns out to be basically equivalent with Quadratic Reciprocity!

Even for real u there’s more to be said. The functional equation for θ says nothing about θ(1) [which remarkably turns out to be π^{1/4}/Γ(3/4)] but does yield the ratio θ'(1) = −θ(1)/4, which in turn gives the approximation exp π ≈ 8π − 2 by ignoring all terms of size exp(−3π) and smaller. This, together with the Archimedean approximation π ≈ 22/7, explains the first run of 9s in exp(π) − π = 19.999099979….
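These claims about θ(1) are pleasant to confirm numerically. The sketch below assumes the normalization θ(u) = ∑_{n∈Z} exp(−πn²u), so that the functional equation reads θ(u) = u^(−1/2) θ(1/u); if the handout normalizes differently the constants would change. Function names and the truncation parameter are illustrative.

```python
import math


def theta(u, terms=20):
    """theta(u) = sum over integers n of exp(-pi n^2 u), truncated at |n| < terms."""
    return 1 + 2 * sum(math.exp(-math.pi * n * n * u) for n in range(1, terms))


def theta_prime(u, terms=20):
    """Termwise derivative of theta with respect to u."""
    return -2 * math.pi * sum(n * n * math.exp(-math.pi * n * n * u) for n in range(1, terms))


if __name__ == "__main__":
    print(theta(1.0), math.pi ** 0.25 / math.gamma(0.75))  # should agree
    print(theta_prime(1.0), -theta(1.0) / 4)                # should agree
    # Keeping only the n = 0, +-1 terms in theta'(1) = -theta(1)/4 gives e^pi ~ 8 pi - 2.
    print(math.exp(math.pi), 8 * math.pi - 2)
    print(math.exp(math.pi) - math.pi)
```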

Another neat example of Poisson summation: if f(x) = 1 / (x^2 + c^2) (some constant c>0), then the Fourier transform of f is (π/c) exp(−2πc|y|) (a standard exercise in contour integration, which can also be done by Fourier inversion since the Fourier transform of exp(−2πc|y|) is an elementary integral); hence the sum of 1 / (n^2 + c^2) over integers n is readily evaluated as (π/c) (e^{2πc}+1) / (e^{2πc}−1) using the formula for summing a convergent geometric series. Subtracting the n=0 term f(0) = 1/c^2 and letting c approach zero, we recover Euler’s formula ζ(2) = π^2/6. The c^2, c^4, c^6, etc. terms in the Laurent expansion of (π/c) (e^{2πc}+1) / (e^{2πc}−1) about c=0 then yield the values of ζ(s) for s=4, 6, 8, etc. as rational multiples of π^s.
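A quick numerical check of this evaluation, as a Python sketch with made-up function names: the brute-force sum of 1/(n²+c²) over |n| ≤ N is compared with the closed form, and the small-c limit is compared with 2ζ(2) = π²/3. The direct sum converges slowly (the neglected tail is about 2/N), so only a few digits of agreement should be expected.

```python
import math


def lattice_sum(c, N=10**5):
    """Partial sum of 1/(n^2 + c^2) over integers n with |n| <= N."""
    return 1 / c**2 + 2 * sum(1 / (n * n + c * c) for n in range(1, N + 1))


def closed_form(c):
    """(pi/c) (e^{2 pi c} + 1) / (e^{2 pi c} - 1), the value given by Poisson summation."""
    e = math.exp(2 * math.pi * c)
    return (math.pi / c) * (e + 1) / (e - 1)


if __name__ == "__main__":
    print(lattice_sum(1.0), closed_form(1.0))
    c = 1e-3
    # Remove the n = 0 term 1/c^2; what is left tends to 2 zeta(2) as c -> 0.
    print((closed_form(c) - 1 / c**2) / 2, math.pi**2 / 6)
```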

Poisson summation works in the general setting of locally-compact abelian groups: if f is “any” function on such a group G, and H is a closed subgroup, then the integral of f over H is (appropriately scaled) the integral of the Fourier transform of f over the annihilator of H. In standard Poisson, G=R, H=Z (which is its own annihilator), and the “integrals” over Z are just sums. Tate’s thesis gives a proof of the functional equation of the zeta function of any number field K by using K as the subgroup of the adèles of K. We won’t go there in Math 229x, but will note the generalization from (G, H) = (R, Z) to (G, H) = (R^n, L) where L is some lattice in R^n; the annihilator of L is then its dual lattice (a.k.a. the “reciprocal lattice” in crystallography).

September 25: Functional equation for ζ(s) [actually ξ(s)] cont’d;
review of more about the Gamma function as a function of a complex variable (product formula, Stirling approximation).

Homework = zeta1 Exercises 1, 2, 8; gamma Exercise 5. (Due 5PM next Friday, October 2. In zeta1 #1, of course “analytic continuation” = with the simple pole at s=1.)

To spell out the argument suggested in the previous handout (p.2) that Γ has no zeros: it is enough to show Γ(s) ≠ 0 for Re(s) > 0; if there were such a zero, we could conclude from the formula for B(s_1, s_2) that if s_1 + s_2 = s (with both s_1, s_2 in the right half-plane) then either Γ(s_1) = 0 or Γ(s_2) = 0, and that would make Γ identically zero by analytic continuation — contradiction. (Alternatively, use Bohr-Mollerup to establish the product formula for Γ(s) on the positive real axis, and then invoke analytic continuation.)

September 28: Functions of finite order (prod.pdf): Hadamard’s product formula and its logarithmic derivative (and a prelude/interlude on the Lindelöf hypothesis, on which see also the “further remarks” on page 4 of the next handout)

Apropos Exercise 5 for the Γ(s) handout: here’s a graph of
S(x) = x − x^2 + x^4 − x^8 + x^16 − x^32 + − …
for x in [0,0.9995]. (Apply the “magnifying glass” to the top right corner to see the first few oscillations.) The fact that S(0.995) = 0.50088… > 1/2, together with the functional equation
S(x) = x − S(x^2)
(from which S(x) = x − x^2 + S(x^4) > S(x^4)), suffice to refute the guess that S(x) approaches 1/2 as x approaches 1.
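For readers who want to reproduce the number, here is a minimal Python sketch that sums the alternating series x − x² + x⁴ − x⁸ + … by repeated squaring and checks the functional equation S(x) = x − S(x²). The function name S matches the text; the cutoff tol is an arbitrary choice.

```python
def S(x, tol=1e-18):
    """Alternating sum x - x^2 + x^4 - x^8 + ... for 0 <= x < 1."""
    total, sign, term = 0.0, 1.0, x
    while term > tol:
        total += sign * term
        sign = -sign
        term = term * term  # next exponent is double the previous one
    return total


if __name__ == "__main__":
    x = 0.995
    print(S(x))               # slightly above 1/2
    print(S(x), x - S(x * x))  # functional equation
    # Since S(x) > S(x^4), the values at x^(1/4), x^(1/16), ... stay above S(0.995).
    print(S(x ** 0.25), S(x ** 0.0625))
```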

September 30: zeta2.pdf: The Hadamard products for ξ(s) and ζ(s); vertical distribution of the zeros of ζ(s).

The notes use (but possibly could state more explicitly) the following facts about the real part, call it r = x / (x^2+y^2), of the (multiplicative) inverse of a complex number x+iy with x>0: r is always positive; r is bounded away from zero if y is bounded and x is in a fixed interval such as [1,2] with both endpoints positive; and for large y (say |y| ≥ 1), r decays as 1 / y^2 as |y| → ∞ (again assuming that x is in say [1,2]).
This picture appears without explanation on the web page for John Derbyshire’s Prime Obsession. It is a plot of the Riemann zeta function on the boundary of the rectangle [0.4,0.6]+[0,14.5]i in the complex plane. Since the contour winds around the origin once (and does not contain the point s=1, which is the unique pole of ζ(s)), the zeta function has a unique zero inside this rectangle. Since the complex zeros are known to be symmetric about the line Re(s)=1/2, this zero must have real part exactly equal to 1/2, in accordance with the Riemann hypothesis.

It is known that this first “nontrivial zero” of ζ(s) occurs at s=1/2+it for t=14.13472514... The pole at s=1 accounts for the wide swath in the third quadrant, which corresponds to s of imaginary part less than 1.

Here’s a similar picture for L(s, χ_4) on [0.4,0.6]+[0,11]i. Without a pole in the neighborhood, this picture is less interesting visually. We see the first two nontrivial zeros, with imaginary parts 6.0209489... and 10.2437703...

October 2: free.pdf: The nonvanishing of ζ(s) on the edge σ=1 of the critical strip, and the classical zero-free region
Homework = prod Exercises 1, 7; zeta2 Exercises 1, 2; and free Exercise 1. All due 5PM next Friday, October 9.

The coefficients 3, 4, 1 of the inequality 3 + 4 cos(θ) + cos(2θ) ≥ 0 may seem “random”/mysterious, but they’re basically just the fourth row 1, 4, 6, 4, 1 of Pascal’s triangle, as you can see by writing 2 (3 + 4 cos(θ) + cos(2θ)) as a linear combination of exp(inθ) with n = −2, −1, 0, 1, 2.
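Spelled out, 2 (3 + 4 cos θ + cos 2θ) has coefficients 1, 4, 6, 4, 1 on exp(inθ) for n = −2, …, 2, so it equals (exp(iθ/2) + exp(−iθ/2))^4 = (2 cos(θ/2))^4, which is visibly nonnegative. A small Python check of that identity (function names are illustrative):

```python
import math


def lhs(theta):
    """2 (3 + 4 cos(theta) + cos(2 theta))."""
    return 2 * (3 + 4 * math.cos(theta) + math.cos(2 * theta))


def rhs(theta):
    """(e^{i theta/2} + e^{-i theta/2})^4 = (2 cos(theta/2))^4."""
    return (2 * math.cos(theta / 2)) ** 4


if __name__ == "__main__":
    for k in range(8):
        t = k * math.pi / 4
        print(round(t, 4), lhs(t), rhs(t))
```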
October 5: pnt.pdf: Conclusion of the proof of the Prime Number Theorem with error bound; the Riemann Hypothesis, and some of its consequences and equivalent statements.
Homework = Exercise 1
Here’s an expository paper by B. Conrey on the Riemann Hypothesis, which includes a number of further suggestive pictures involving the Riemann zeta function, its zeros, and the distribution of primes.

Here’s the Rubinstein-Sarnak paper “Chebyshev’s Bias” (Experimental Mathematics, 1994).

Here’s a bibliography of fast computations of π(x).

October 7 and 9: lsx.pdf: L(s, χ) as an entire function (where χ is a nontrivial primitive character mod q); Gauss sums, and the functional equation relating L(s, χ) with L(1−s, \bar{χ})
[I tried to do the proper HTML thing to get \bar{χ}, but χ sets too high a bar…]
Homework = Exercises 1, 2, 3, 6, 11 (this and pnt Exercise 1 due Oct.16 at 5PM)

October 12: NO CLASS: UNIVERSITY HOLIDAY (Columbus Day)

October 14: lsx, cont’d; pnt_q.pdf: Product formula for L(s,χ), and ensuing partial-fraction decomposition of its logarithmic derivative; a (bad!) zero-free region for L(s,χ), and resulting estimates on ψ(x,χ) and thus on ψ(x, a mod q) and π(x, a mod q). The Extended Riemann Hypothesis and consequences.

October 16 and 19: free_q.pdf: The classical region 1−σ ≪ 1/log(q|t|+2) free of zeros of L(s,χ) with at most one exception β; the resulting asymptotics for ψ(x, a mod q) etc.; lower bounds on 1−β and L(1,χ), culminating with Siegel’s theorem. [The fancy script L is {\mathscr L}, with \mathscr defined in the mathrsfs package.]
If you were expecting the complex conjugate of χ(a) in formulas 7, 8 (top of page 5), rather than χ(a) itself, remember that we just showed that if β exists at all then χ is real.
Homework = Exercises 1, 2 of pnt_q, 1, 4 of free_q; due Oct.23 at 5PM

October 23: l1x.pdf: Closed formulas for L(1, χ) and their relationship with cyclotomic units, class numbers, and the distribution of quadratic residues.
Homework = Exercises 1, 2, 3, and the cases m=2,3 of Exercise 4; due Oct.30 at 5PM

[October 26: Outline of possible final projects]

October 28 and 30: sieve.pdf: The Selberg (a.k.a. quadratic) sieve and some applications
Concerning the note to Exercise 4: the k=1 case of Schinzel’s conjecture was stated explicitly by Bunyakovsky as early as 1857.
Homework = Exercises 1, 2 (except the last sentence), 3 due November 6.

November 2: weyl.pdf: Introduction to exponential sums; Weyl’s equidistribution theorem; interlude on “little o” notation; Kuzmin’s bound (from the start of kmv.pdf, the rest of which we’ll cover next week)

November 4: vdc.pdf: The van der Corput estimates and some applications
Homework = weyl Exercises 1, 3, 4, and vdc Exercises 2 and 3, due November 13.

November 9: kmv.pdf: estimates on the mean square of an exponential sum, culminating with the Montgomery-Vaughan inequality (which this year’s edition of Math 229 will state but not prove)

Apropos Beurling's function: here's MathWorld's take on B(x), including a graph on [−3,3]; this PDF version of the graph also shows the comparison with sgn(x).

In dimension >1, good analogs of β± that minorize or majorize the characteristic function of a box and have Fourier supports in another box (and likewise for balls) are still quite mysterious.

November 11: Beurling etc., cont’d; short.pdf: Start on the Davenport-Erdös bound and the distribution of short character sums, with some applications
Correction (19 November) Nicholas Triantafillou notes a typo in Exercise 4: the hypothesis should have T_2 − T_1 in Zδ^{−1}, not Zδ as written.

For the characterization of the Gaussian distribution by its moments: suppose more generally that μ is a probability measure on R^n. We say that μ has “exponential decay” if there exists c>0 such that for all R>0 the complement of the R-ball about the origin has measure O(exp(−cR)). Then the Fourier transform of μ [which takes any vector y to ∫_{R^n} e(⟨x,y⟩) dμ(x)] extends to an analytic function on a neighborhood of R^n in C^n. Hence if μ and μ' are two probability measures with exponential decay, all of whose moments agree, then the Fourier transform of μ−μ' is an analytic function that vanishes together with all its derivatives at the origin, and is thus identically zero — whence μ−μ' = 0 so μ = μ'.

In fact it is sufficient for just μ to have exponential decay, because exponential decay can be detected from the moments. Assume for simplicity n=1. Then for even r the rth moment is ≪ ∫_R x^r exp(−c|x|) dx, and thus ≪ r! / c^r. Conversely, if this bound holds for all even r then the complement of the R-ball about the origin has measure ≪ r! / (cR)^r for all even r; taking r = cR + O(1) we find that this measure is ≪ R^{1/2} exp(−cR), which yields the desired exponential decay with any coefficient less than c. The same argument works in dimension n (with the exponent of R being n/2 in that generality).
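The last step is Stirling’s approximation in disguise: for r within O(1) of cR, the quantity r!/(cR)^r is comparable to R^(1/2) exp(−cR). A hedged numerical sketch in Python, where the helper name and the choice of r as the even integer nearest cR are illustrative assumptions:

```python
import math


def log_ratio(c, R):
    """log of [ r! / (cR)^r ] / [ R^(1/2) exp(-cR) ], with r the even integer nearest cR."""
    r = 2 * round(c * R / 2)
    return math.lgamma(r + 1) - r * math.log(c * R) - 0.5 * math.log(R) + c * R


if __name__ == "__main__":
    for R in (10, 50, 100, 400):
        # The ratio stays bounded; it hovers near sqrt(2 pi c).
        print(R, math.exp(log_ratio(1.0, R)))
```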

In particular, Gaussian distributions on R and C are characterized by their moments, as claimed.

WARNING Without the condition of exponential decay, it is not true that every probability distribution with finite moments is characterized by those moments. Indeed there are real-valued functions f, not identically zero, such that ∫_R x^r f(x) dx = 0 for each r = 0, 1, 2, …; we can thus write some multiple of f as f_+ − f_− with f_± > 0 and ∫_R f_±(x) dx = 1, and then f_+ dx and f_− dx are two different distributions on R with the same moments.

November 13: Davenport-Erdös bound and applications, cont’d
Homework = kmv Exercises 1, 3, 4, and short Exercise 1, due November 20. (Note the correction above for Exercise 4.)

November 16 and 18: burgess.pdf: The Burgess bound on short exponential sums
Error near the bottom of page 3: restricting d to primes is not quite enough even when n0 = 0, because h/d = h'/d' is still possible with distinct primes d and d', so one must still make some additional argument. In any case the alternative approach on page 4 works for any n0.

Interlude on one reason that we care about small “nonquadratic residues”: given a large prime p, finding a single n with (n/p) = −1 is equivalent (modulo O(log^C p) factors) with extracting arbitrary square roots mod p. Hence square roots mod p can be extracted in “random polynomial time” (so “RP”), but aren’t yet known to be doable in deterministic polynomial time. Under ERH for the Dirichlet L-series L(s,(·/p)), the minimal n is O(log^2 p), but unconditionally the best we have is O(p^θ) for some θ > 0. Currently the smallest such θ is 1 / (4e^{1/2}) + ε, and this uses the Burgess bounds via an argument given in short.pdf (Proposition 1).

Write p − 1 = 2^e q with q odd. To get the essential equivalence between solving (n/p) = −1 and extracting square roots mod p: given the latter, extract square roots e−1 times starting from −1 to obtain n, which is a primitive root of unity of order 2^e. Conversely, given n, we may assume n is a root of unity of order 2^e by replacing it by n^q (note that exponentiation mod p can be done in polynomial time by repeated squaring); then, to find D^{1/2} when it exists, write D = D_1 D_2 with D_1 and D_2 roots of unity of order q and 2^e respectively (not necessarily primitive; this uses the Euclidean algorithm, which is again polynomial time), and then D_1 is the square of its (q+1)/2 power so we need only deal with D_2. By raising suitable residues n^ν D_2 to powers (p−1)/2, (p−1)/4, (p−1)/8, …, (p−1)/2^e = q (again this is polynomial-time), we can identify D_2 with a power of n, say n^α; the exponent α is even by assumption, and then n^{α/2} is a square root of D_2 so we’re finally done.
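The converse direction can be turned into a short program. The following Python sketch follows the steps of the paragraph above under stated assumptions: p is an odd prime, n is a known nonresidue, and D is a nonzero residue mod p. The function names ext_gcd and sqrt_mod_p are invented here, and this is one possible arrangement of the argument (close to what is usually called Tonelli-Shanks), not code from the course.

```python
def ext_gcd(a, b):
    """Return (g, u, v) with u*a + v*b == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, u, v = ext_gcd(b, a % b)
    return g, v, u - (a // b) * v


def sqrt_mod_p(D, p, n):
    """Square root of D mod the odd prime p, given n with (n/p) = -1.

    Raises ValueError if D (assumed nonzero mod p) is not a square.
    """
    # Write p - 1 = 2^e * q with q odd.
    q, e = p - 1, 0
    while q % 2 == 0:
        q //= 2
        e += 1
    g = pow(n, q, p)  # n^q is a root of unity of order exactly 2^e
    # Euclid: u * 2^e + v * q = 1, so D = D1 * D2 with D1 = D^(u 2^e), D2 = D^(v q).
    _, u, v = ext_gcd(2**e, q)
    D1 = pow(D, (u * 2**e) % (p - 1), p)  # order divides q
    D2 = pow(D, (v * q) % (p - 1), p)     # order divides 2^e
    root1 = pow(D1, (q + 1) // 2, p)      # D1 is the square of its (q+1)/2 power
    # Find alpha with D2 = g^alpha, one binary digit at a time.
    g_inv = pow(g, p - 2, p)
    alpha, t = 0, D2                      # invariant: t = D2 * g^(-alpha)
    for k in range(e):
        # t raised to (p-1)/2^(k+1) is +1 or -1; -1 means bit k of alpha is set.
        if pow(t, (p - 1) // 2 ** (k + 1), p) != 1:
            alpha += 2**k
            t = t * pow(g_inv, 2**k, p) % p
    if alpha % 2:
        raise ValueError("D is not a square mod p")
    return root1 * pow(g, alpha // 2, p) % p


if __name__ == "__main__":
    p, n = 41, 3  # 3 is a nonresidue mod 41
    for D in (2, 5, 10, 40):
        r = sqrt_mod_p(D, p, n)
        print(D, r, r * r % p)
```

Every step is a modular exponentiation or a Euclidean computation, so the running time is polynomial in log p once n is known.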

Here’s a PDF plot of the Burgess exponents (1 − 1/r) θ + (r+1)/(4r^2) for r = 2, 3, 4, 5, 6, 8, 12, 24, and also the enveloping parabola −1/16 + 3θ/2 − θ^2 = θ − (θ − 1/4)^2 = 1/2 − (θ − 3/4)^2 (highlighted on θ ∈ [1/4, 3/4]) and the exponents θ and 1/2 corresponding respectively to the trivial and Pólya-Vinogradov bounds (and to Burgess for r = 1 and r → ∞). [For the PostScript source, change the suffix .pdf to .ps in the URL.]
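The envelope claim can be verified in exact arithmetic: for each r, the Burgess line minus the parabola is the perfect square (θ − 1/4 − 1/(2r))^2, so the line touches the parabola at θ = 1/4 + 1/(2r) and lies above it elsewhere. A small Python sketch using rational numbers (function names are illustrative, and the tangency point is derived here rather than quoted from the plot):

```python
from fractions import Fraction


def burgess_exponent(r, theta):
    """(1 - 1/r) theta + (r + 1) / (4 r^2)."""
    return (1 - Fraction(1, r)) * theta + Fraction(r + 1, 4 * r * r)


def envelope(theta):
    """theta - (theta - 1/4)^2 = -1/16 + 3 theta / 2 - theta^2."""
    return theta - (theta - Fraction(1, 4)) ** 2


if __name__ == "__main__":
    for r in (1, 2, 3, 4, 5, 6, 8, 12, 24):
        touch = Fraction(1, 4) + Fraction(1, 2 * r)
        print(r, touch, burgess_exponent(r, touch), envelope(touch))
```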

Here’s another recent take on the Burgess bounds (by Liangyi Zhao, as part of these notes from a seminar on various classical and modern exponential sum estimates). There are some minor errors here (e.g. the final expression in the third display on page 3 cannot be right since it is identically zero), but this writeup does recite the proof of the key estimate (3.5) on complete character sums starting from Weil’s theorem (RH for curves over finite fields).

November 20: kloos.pdf: An application of Weil’s bound on Kloosterman sums.
Correction (29 November) Hannah Larson notes two typos in Exercise 5: In the displayed formula for S_n, the coefficient of N_n(p) should be p^2 / (p−1), not p^2 (I was thinking of the number of solutions in projective space with no coordinate zero); and in the formula for S_4, the first term should be 2p^3 (as needed for the Catalan-number rule), not p^3.

Here are the plots of x y == 256 mod 691 and x y == 691 mod 5077, illustrating the asymptotically uniform distributions studied in this chapter (and also illustrating a bit of mathematical PostScript trickery, as you can see from the source files for the plots for p=691 and p=5077).
November 23: many_pts.pdf: How many points can a curve of genus g have over the finite field of q elements? The zeta function of a curve over a finite field; the Weil and Drinfeld-Vladut bounds, and related matters.
Final homework = burgess Exercise 3; kloos Exercises 1, 2, 5; and many_pts Exercise 1 (omitting the last sentence) and 2. All due Friday, December 4 at 5PM. (Note the correction above for kloos Exercise 5.)
many_pts_0.pdf: Some explanation (following and somewhat extending the lecture) for the possibly mysterious argument on the bottom of p.3 of the many_pts handout

Here are some tables of curves of given genus over finite fields with many rational points.

Let M_k(g) be the maximal number of rational points of a genus-g curve over the finite field k. The many_pts handout cites [Serre 1982-1984] for the theorem that for each k the limsup over g of M_k(g)/g is positive. The result that for each k the liminf of M_k(g)/g is also positive is contained in the paper:

N.D.Elkies, E.W.Howe, A.Kresch, B.Poonen, J.L.Wetherell, and M.E.Zieve: Curves of every genus with many points, II: Asymptotically good families, Duke Math. J. 122 #2 (2004), 399-422 (math.NT/0208060 on the arXiv).

(This is actually easier than the limsup proof, though it assumes the limsup result as almost a black box — “almost” because we need a somewhat stronger result that the limsup can be attained by a sequence of curves C_n whose genera are spaced closely enough that there exists some finite R with g(C_{n+1}) / g(C_n) < R for each n.)

November 30 and December 2: disc.pdf: Stark’s analytic lower bound on the absolute value of the discriminant of a number field (assuming GRH).
Somewhat expanded, with several typos corrected, December 1 and 2.

Here are some tables of number fields, compiled by Henri Cohen.

The LMFDB (“L-functions, Modular Forms, and related objects DataBase”) has much more extensive tables of number fields which contain the PARI database and are complete in some directions; if you find a number field that “should” be contained in the LMFDB (e.g. because the database contains a field of the same degree, signature (r_1, r_2), and ramified primes, but with |D| larger than yours), send it in for the next revision.

According to Wikipedia’s article on number-field discriminants, the lowest known root-discriminants of an infinite family of number fields are still the records of about 92.4 for totally imaginary fields (r_1 = 0), and 1059 for totally real ones (r_2 = 0), both by Martinet (Invent. Math. 1978); but a generation later Hajir and Maire lowered the first bound to under 83.9 (Compositio Math. 128 (2001) #1, 35-53), and wrote that “it has long been conjectured” that Q(√−1365) has an infinite class field tower, which would yield an upper bound of √5460 < 73.9 (see the end of this paper, published at about the same time in the Proceedings of the 2000 European Congress of Mathematics).

December 4: muff.pdf: Illustration of the “parity problem” for Selberg’s sieve, using the function field case for which the Möbius-function sums greatly simplify thanks to the zeta function having no zeros. (“muff” = “μ and function field”.)


So what’s with the whorls in the background pattern? They’re a visual illustration of an exponential sum, that is, sum(exp i f(n), n = 1…N). Even simple functions f can give rise to interesting behavior and/or important open problems as we vary N. What function f produced the background for this page? See here for more information.
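To experiment with such pictures yourself, here is a minimal Python sketch that computes the partial sums of exp(i f(n)) as points in the complex plane; joining consecutive points by line segments produces whorl-like patterns for suitable f. The sample function f(n) = 2π√2·n² is purely a hypothetical choice for illustration and is not claimed to be the f behind this page’s background.

```python
import cmath
import math


def partial_sums(f, N):
    """Return the list of S_1, ..., S_N, where S_M = sum of exp(i f(n)) for n = 1..M."""
    sums, total = [], 0j
    for n in range(1, N + 1):
        total += cmath.exp(1j * f(n))
        sums.append(total)
    return sums


def sample_f(n):
    """Hypothetical example: a quadratic phase with irrational coefficient."""
    return 2 * math.pi * math.sqrt(2) * n * n


if __name__ == "__main__":
    points = partial_sums(sample_f, 2000)
    # The trivial bound is |S_N| <= N; cancellation usually makes |S_N| much smaller.
    print(abs(points[-1]), len(points))
```

The real and imaginary parts of the returned points can be fed to any plotting tool to draw the path.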