Course website for Math 223: Introduction to the theory and computational practice of the arithmetic of elliptic curves

If you find a mistake, omission, etc., please let me know by e-mail.

The orange ball marks our current location in the course, and the current problem set.



Introductory lecture, September 4

[...] I started with the example of the Diophantine equation $X^2 + Y^2 = 1$ (projectively: $x^2 + y^2 = z^2$) arising from the Pythagorean theorem. [NB this is not an elliptic curve, though ironically its graph is an ellipse if we use different units for the $x$ and $y$ axes…] Most of what I said is contained in the first few pages of Taussky’s article “Sums of Squares” (American Math. Monthly 77 #8, Oct.1970, pages 805–830; posted to the course Canvas website). This article may also be the earliest observation that the familiar parametrization $(x:y:z) = (m^2-n^2 : 2mn : m^2+n^2)$ can be obtained by applying Hilbert’s Theorem 90 to the quadratic extension ${\bf Q}(i)/{\bf Q}$. My take on it can also be found in the first few pages of my article “The ABC’s of Number Theory” (the title plays on the article’s main topic, the abc conjecture). In that article I used the slope of the line from $(X,Y)$ to $(1,0)$, not $(-1,0)$ as I did in class; the two slopes are related by $m \leftrightarrow -1/m.$ Here’s the corresponding picture for $(-1,0)$, showing the lines of slope $1/3, 1/2, \infty$ connecting $(-1,0)$ to $(4/5,3/5)$, $(3/5,4/5)$, and $(-1,0)$ itself, and the half-angle relation for $(4/5,3/5)$. [...]
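The slope parametrization from $(-1,0)$ is easy to check by machine. The following is only a sketch in Python with exact rational arithmetic; the function name `circle_point` is an illustrative choice, not anything from the lecture or the references.

```python
from fractions import Fraction

def circle_point(m):
    """Second intersection of X^2 + Y^2 = 1 with the line of slope m through (-1, 0)."""
    m = Fraction(m)
    # Substituting Y = m (X + 1) into X^2 + Y^2 = 1 and discarding the root X = -1:
    x = (1 - m * m) / (1 + m * m)
    y = 2 * m / (1 + m * m)
    return x, y

for m in (Fraction(1, 3), Fraction(1, 2)):
    x, y = circle_point(m)
    assert x * x + y * y == 1          # the point really lies on the circle
    print(m, (x, y))                   # slopes 1/3 and 1/2 give (4/5, 3/5) and (3/5, 4/5)
```

Writing $m = n/m'$ in lowest terms and clearing denominators recovers the familiar shape $(m'^2 - n^2 : 2m'n : m'^2 + n^2)$.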

Much of the first two chapters of the Silverman text is a review of the basics of algebraic geometry in affine and projective space (Chapter I) and algebraic curves (Chapter II), giving the necessary definitions and theorem statements but relegating most proofs to references. Accordingly we shall go through this material much more quickly than each of the other chapters and chapter sections that we shall cover.

Chapter I: Algebraic Varieties (September 9)

Algebraic geometry is often introduced over an algebraically closed field. We cannot limit ourselves to algebraically closed ground fields when our topic is elliptic curves over fields such as $\bf Q$ and ${\bf F}_q$ that are not algebraically closed. On the other hand, even the basic dictionary between prime ideals $I$ and varieties $V_I$ can fail if the ground field is not algebraically closed. Over ${\bf F}_q$ there are even prime ideals $I \neq \{0\}$ such that $V_I({\bf F}_q)$ is all of ${\bf F}_q^n$, and likewise in projective space. This cannot happen over an infinite field but we can still have the opposite problem: a prime ideal $I$ for which $V_I(K) = \emptyset$. For example, consider the ideal $(x^2 + y^2 + 1)$ over $\bf Q$, or even $\bf R$. (For projective varieties the ideal $(x^2+y^2+z^2)$ works the same way.)

[See also Example 2.5 on page 8 (continued as Example 3.8 on page 14) and Exercise 1.10 on page 16. Example 2.5 proves that $X^2 + Y^2 = 3Z^2$ has no nonzero rational solutions over ${\bf Q}_3$ (and thus a fortiori over ${\bf Q}$); you can check that there are no nonzero rational solutions over ${\bf Q}_2$ either.]
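The two local obstructions can be confirmed by brute force. A nonzero solution over ${\bf Q}_p$ can be scaled to one in ${\bf Z}_p$ with some coordinate a unit, so it suffices to find no such "primitive" solutions modulo a suitable power of $p$. The sketch below (Python; the helper name `primitive_solutions` and the choice of moduli 9 and 4 are mine, not Silverman's) does this for $p = 3$ and $p = 2$.

```python
from itertools import product

def primitive_solutions(modulus, prime):
    """Residue triples (X, Y, Z) mod `modulus` with X^2 + Y^2 = 3 Z^2
    in which at least one coordinate is NOT divisible by `prime`."""
    return [
        (X, Y, Z)
        for X, Y, Z in product(range(modulus), repeat=3)
        if (X * X + Y * Y - 3 * Z * Z) % modulus == 0
        and any(c % prime for c in (X, Y, Z))
    ]

# An empty list means every solution mod p^k has all coordinates divisible by p,
# so no primitive p-adic solution can exist.
print(primitive_solutions(9, 3))
print(primitive_solutions(4, 2))
```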

Thankfully Galois theory lets us eat our cake and have it too. To do algebraic geometry over a field $K$, we work over an algebraic closure $\bar K$, but use the action of ${\rm Gal}(\bar K/K)$ to keep track of points defined over $K$. [This is why Silverman requires $K$ to be perfect; fortunately most of the fields that concern us are perfect: number fields ($\bf Q$ and its finite extensions) and $p$-adic fields (${\bf Q}_p$ and its finite extensions, including ${\bf R} = {\bf Q}_\infty$ and $\bf C$) are of characteristic zero and thus automatically perfect, and a finite field ${\bf F}_q$ is perfect because it is permuted by $x \mapsto x^p.$ The first important exception is ${\bf F}_q(t)$ and its finite extensions, but we probably won’t encounter elliptic curves over such fields in this class.] It is also true that the $K$-rational functions on an affine or projective variety $V$ are the $\bar K$-rational functions on $V$ that are fixed by ${\rm Gal}(\bar K/K)$, but this is subtler than one might expect; see the last Exercise for this chapter (p.16, Ex. 1.12). We shall postpone this until we need to develop the relevant Galois cohomology to study “twists”.

Note that, following the textbook, we write an algebraic closure, not the algebraic closure. (Silverman was a student of John Tate, who insisted on this point.) For general $K$, the construction of an algebraic closure $\bar K / K$ and its Galois group ${\rm Gal}(\bar K/K)$ requires the Axiom of Choice; any two algebraic closures are isomorphic, but in general the isomorphism is not canonical. One case when we do have a canonical choice of $\bar K$ is when $K$ comes to us already embedded in some algebraically closed field; for example we can speak of the algebraic closure of $\bf Q$ in $\bf C$.

Example 1.3.1 on page 3 (parametrization of the hyperbola) is basically the same as the parametrization of the circle that we gave in the first class; see Example 3.5 on page 13, where $\phi$ is tantamount to the slope of the line from $(x,y)$ to $(-1,0)$ (using $(x,y)$ for the affine coordinates with $(x:y:1) = (X:Y:Z)$ [see below], so $V$ is $x^2 + y^2 = 1$). We must assume ${\rm char}(K) \neq 2$ because in characteristic 2 the generator $X^2-Y^2-1$ factors as $(X-Y-1)^2$.

Example 1.3.2 (page 3): note that [291] is by both Wiles and Taylor; the proof is usually credited to “Wiles and Taylor–Wiles” or an equivalent formulation, recognizing that [291] was necessary to complete the proof of [311]. [This proof supersedes most of the long history of proofs for particular exponents $n$ or classes of exponents, but the proof in [291,311] still does not apply to some small $n$, notably including $n=4$ which is the one case that we know Fermat proved: if $X^4 + Y^4 = 1$ then $(1/X)^6 - (1/X)^2 = (1-X^4)/X^6 = Y^4 / X^6$ is a square, but Fermat proved that $x^3 - x$ ($x \in {\bf Q}$) cannot be a nonzero rational square (here $x = 1/X^2$), so either $X = 0$ or $X^2 = 1$.]

Example 1.3.3 (page 3) is an elliptic curve. As is often the case the curve is simple enough (in a way we shall make precise later this term) to be in the LMFDB = L-Functions and Modular Forms DataBase. To find it, you can enter the equation Y^2 = X^3 + 17 into the “Find” window at the bottom of www.lmfdb.org/EllipticCurve/Q; this will take you to the LMFDB page for this curve, where you will find a lot of information about it. Most of the information may be mysterious at this point; as the term progresses we’ll understand much more of the meaning and context of various entries, and also how some of them could be computed. (For a preview you might click on some of the gray-underlined terms like “Minimal Weierstrass equation” and “Mordell–Weil group”.) But you can already see $(-2,3)$ and $(5234,378661)$ in the list of integral points near the top of the page. [NB the Find window will accept several formats, but if you type in an equation it has to be equivalent to $y^2 + a xy + c y = P(x)$ for some rational $a,c$ and monic cubic $P$; if you want to look up $6y^2 = x^3 - x$ you can first scale $x,y$ to $6X,6Y$ and divide through by $6^3$ to get Y^2 = X^3 - X/36 which takes you to curve 576c3, which is the curve $y^2 = x^3 - 36x$. Can you see how this is isomorphic with $6y^2 = x^3 - x$? Hint: the largest integral points $(294,\pm 5040)$ correspond to the ones we found with $6 \cdot 140^2 = 49^3 - 49$.]
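The integral points quoted here can be double-checked without the LMFDB. The following is a naive search, sketched in Python; the helper `integral_points` and the search bounds are illustrative assumptions, and a bounded search of course proves nothing about completeness of the list.

```python
from math import isqrt

def integral_points(a4, a6, x_min, x_max):
    """Integral points (x, y) with y >= 0 on y^2 = x^3 + a4*x + a6, for x_min <= x <= x_max."""
    points = []
    for x in range(x_min, x_max + 1):
        rhs = x ** 3 + a4 * x + a6
        if rhs >= 0:
            y = isqrt(rhs)
            if y * y == rhs:          # rhs is a perfect square
                points.append((x, y))
    return points

pts17 = integral_points(0, 17, -3, 6000)       # Y^2 = X^3 + 17
assert (-2, 3) in pts17 and (5234, 378661) in pts17

pts36 = integral_points(-36, 0, -6, 300)       # y^2 = x^3 - 36x
assert (294, 5040) in pts36
assert 6 * 140 ** 2 == 49 ** 3 - 49            # the point found on 6y^2 = x^3 - x
print(pts17)
print(pts36)
```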

[Why does the LMFDB contain elliptic curves, and for that matter number fields and various other data, that are not part of the acronym? Well, number fields have L-functions (e.g. $\bf Q$ gives the Riemann zeta function), as do elliptic curves by the modularity theorem. See also the “map of the LMFDB universe”, and try clicking on some of the nodes and arrow labels in the diagram.]

We shall sometimes write the projective coordinates of a point $[x_0, x_1, \ldots, x_n] \in {\bf P}^n$ as $(x_0 : x_1 : \cdots : x_n)$, using colons rather than commas to suggest a ratio.

On page 6, you should check that the “minimal field of definition” of $P \in {\bf P}^n(\bar K)$ does not depend on the choice of $i$ (the index of a nonzero coordinate) when there is more than one choice.

Page 8, Remark 2.4: Here “relatively prime” means that there is no integer $c > 1$ that divides each of $x_0, \ldots, x_n;$ once $n > 1$ the $x_i$ might not be “relatively prime in pairs” — there are even examples like $(6:10:15)$ where no pair is relatively prime. Note too that the existence of a representative $(x_0, \ldots, x_n)$ with relatively prime integer coordinates uses unique factorization in $\bf Z$. When we work over a general number field $K$ with a nontrivial class group, we can bring the coordinates to lowest form only when the ideal they generate, call it $I$, is principal; in general the ideal class of $I$ determines how close we can come to a representative with relatively prime coordinates. Fortunately the ideal class group, even if nontrivial, is always finite, which will let us put the coordinates in “almost lowest terms”.
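Over $\bf Q$ the passage to a representative with relatively prime integer coordinates is mechanical: clear denominators, then divide by the gcd. A minimal sketch in Python (the function name `primitive_representative` is hypothetical, and `math.lcm` assumes Python 3.9 or later):

```python
from fractions import Fraction
from functools import reduce
from math import gcd, lcm

def primitive_representative(coords):
    """Integer coordinates with gcd 1 representing the same point of P^n(Q)."""
    fracs = [Fraction(c) for c in coords]
    if not any(fracs):
        raise ValueError("(0 : ... : 0) is not a projective point")
    common_denominator = reduce(lcm, (f.denominator for f in fracs), 1)
    ints = [int(f * common_denominator) for f in fracs]   # clear denominators
    g = reduce(gcd, ints)                                 # then remove the common factor
    return [v // g for v in ints]

p = primitive_representative([Fraction(1, 10), Fraction(1, 6), Fraction(1, 4)])
print(p)                                   # the point (6 : 10 : 15) from the remark above
print(gcd(p[0], p[1]), gcd(p[0], p[2]), gcd(p[1], p[2]))   # no pair is relatively prime
```

The representative is unique only up to sign, since $\pm 1$ are the units of $\bf Z$; this sketch does not normalize the sign.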

Page 13, Definition between 3.3 and 3.4: if $\phi: V_1 \to V_2$ and $\psi: V_2 \to V_1$ are rational maps (but not necessarily morphisms) such that $\psi \circ \phi = {\rm id}_{V_1}$ and $\phi \circ \psi = {\rm id}_{V_2}$ then $\phi$ and $\psi$ are “birational isomorphisms” and $V_1,V_2$ are “birational” or “birationally isomorphic”. For smooth projective curves, all birational isomorphisms are isomorphisms, thanks to the bijection between such curves and function fields. Beyond that things can get complicated. Already for $V_1 = V_2 = {\bf P}^2$ there is a birational isomorphism $(X:Y:Z) \mapsto (YZ:ZX:XY)$ that is its own inverse [think of $(YZ:ZX:XY)$ as $(1/X : 1/Y : 1/Z)$] but not an isomorphism [not defined on $(1:0:0)$, $(0:1:0)$, and $(0:0:1)$].

Chapter II: Algebraic Curves (September 11)

If $K = {\bf C}$ then an algebraic curve over $\bf C$ is also a compact Riemann surface. Conversely, every compact Riemann surface is an algebraic curve over $\bf C$; this is the Riemann existence theorem and requires some nontrivial analytic tools. Many of the constructions and results in this chapter were first obtained for Riemann surfaces and then generalized (sometimes using different tools) to algebraic curves over arbitrary fields.

At the start of II.1 (page 17), note that “(I.2.3) and (I.2.8)” mean Chapter I, Examples 2.3 and 2.8; before realizing this I was fruitlessly searching for Examples 1.2.3 and 1.2.8 in that Chapter (which does have Examples 1.3.1, 1.3.2, and 1.3.3 on page 3…). Likewise for other references to formulas, examples, etc. in previous or later chapters. In the present case, Example I.2.3 is the circle $X^2 + Y^2 = Z^2$, and I.2.8 is the projective curve with affine equation $Y^2 = X^3 + 17.$

Remark 1.1.1 (page 18): I shall not assign Exercise 2.16 as homework, and for once Silverman gives no other reference here. A uniformizer at $P$ can be constructed as the quotient $l/l'$ of linear forms where $l'$ does not vanish at $P$ while $l$ vanishes but only to order 1 (which is possible when $P$ is a smooth point, and we assume that all points on $C$ are smooth).

Still on page 18, the “order” ${\rm ord}_P(f)$ (Definition following Remark 1.1.1) is also called the valuation $v_P(f)$.

Page 20, Theorem 2.3: for Riemann surfaces, this also follows analytically from the fact that the image of a nonconstant map is open; since $C_1$ is also compact, its image $\phi(C_1)$ is thus a nonempty compact open subset of $C_2$, and thus all of $C_2$ because $C_2$ is connected.

First line of page 21: you might know “subfield of finite index” as “subfield of finite degree” (either term means that $[K(C) : {\mathbb K}] < \infty$).

Page 22, Example 2.5.1: some authors require that a “hyperelliptic curve” have genus at least 2, meaning $d > 4$. I prefer Silverman’s definition, which allows $C$ of genus 1 and even 0. (Note, though, that the degree $d$ must be positive; if $f(x)$ is a nonzero constant then $y^2 = f(x)$ is not a curve but two lines meeting at infinity.) [...]

Page 24, Corollary 2.7: note that $\# \phi^{-1}(Q)$ is taken without multiplicity.

Page 27: ${\rm Div}^0_C$ is the kernel of the homomorphism $\deg: {\rm Div}_C \to {\bf Z};$ over an algebraically closed field, this map is surjective, and extends to an exact sequence $0 \to {\rm Div}^0(C) \to {\rm Div}(C) \to {\bf Z} \to 0.$ For a curve $C$ over a field $K$ that may not be algebraically closed, ${\rm Div}_K(C)$ consists of $\bf Z$-linear combinations of Galois orbits of $C(\bar K)$; in other contexts those are called “closed points” of $C$.

Page 28 (just before Proposition 3.1): “But see Exercise 2.13 for a case in which this is true”; we shall also soon see an example where it is false.

Page 28, Proposition 3.1: on a Riemann surface this can also be proved by triangulating $C$ and integrating $df/f$ on each triangle, or (again by contour integration) showing that the number of preimages of $x \in {\bf P}^1({\bf C})$ counted with multiplicity is locally constant.

The group ${\rm Pic}^0(C)$ introduced at the top of page 29 is called the Jacobian of $C$, usually denoted ${\rm Jac}(C)$ or $J(C)$. The Jacobian is itself a projective variety of dimension $g$, the genus of $C$. This is a hard theorem except for $g=0,$ when ${\rm Jac}(C)$ is a point, and $g=1,$ when $J(C)$ is $C$ itself as soon as $C$ has a rational point, which is to say that $C$ is an elliptic curve, the topic of the ensuing chapters of Silverman and of our course. Over $\bf C$ the Jacobian can be identified with ${\bf C}^g / L$ where $L$ is the period lattice of $C$, but even in this case it takes work to show that such ${\bf C}^g / L$ is an algebraic variety. (We shall do this for $g=1;$ for $g=2$ see Cassels and Flynn’s Prolegomena to a Middlebrow Arithmetic of Curves of Genus 2 [sic] (1996).) Jacobians, and more generally abelian varieties, of dimension $\geq 2$ are beyond the scope of our course.

Page 33, first Definition of II.5 (The Riemann-Roch theorem): Yes, that’s a partial order: if $D_1 \geq D_2$ and $D_2 \geq D_3$ then $D_1 \geq D_3;$ also if $D_1 \geq D_2$ then $D + D_1 \geq D + D_2$ for any divisor $D$.

Page 35, Example 5.6: the fact that $\ell(K_C) = 0$ when $C = {\bf P}^1$ also follows from the fact that $K_C$ has degree $-2 < 0$ (see the computation of ${\rm div}(dt)$ in Example 4.5 on pages 32–33).

Page 37, Theorem 5.9: The sum over $P \in C_1$ is really finite, even though $C_1(\bar K)$ is infinite (perhaps uncountably infinite), because $e_\phi(P) = 1$ for all but finitely many $P$. Theorem 5.9 is usually called the Riemann–Hurwitz theorem (occasionally the name of Zeuthen is added); I don’t know the history, but I guess that the original proof(s) was only for Riemann surfaces, and in any case not over a field which may have positive characteristic. For a Riemann surface there is also a topological proof: triangulate $C_2$, making sure that each of the finitely many branch points is a vertex of the triangulation, and compare Euler characteristics. Going from $C_2$ to $C_1$ multiplies the face, edge, and vertex counts by $\deg\phi,$ except that the vertex count must then be decreased by $\sum_{P \in C_1(\bar K)} \bigl( e_\phi(P) - 1 \bigr);$ thus $$ \chi(C_1) = \deg\phi \cdot \chi(C_2) - \!\! \sum_{P \in C_1(\bar K)} \bigl( e_\phi(P) - 1 \bigr). $$ Riemann–Hurwitz then follows from $\chi(C) = 2 - 2g(C)$.
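As a sanity check on the formula, one can recover the genus of a hyperelliptic curve $y^2 = f(x)$ from its degree-2 map to ${\bf P}^1$. The sketch below is in Python and makes two assumptions that should be stated loudly: the ramification is tame (so equality holds in Riemann–Hurwitz), and $f$ is squarefree of degree $d \geq 1$ in characteristic $\neq 2$. The function names are illustrative only.

```python
def genus_from_riemann_hurwitz(degree, base_genus, ramification_indices):
    """Solve 2*g1 - 2 = degree*(2*g2 - 2) + sum(e - 1) for g1 (tame ramification assumed)."""
    total = degree * (2 * base_genus - 2) + sum(e - 1 for e in ramification_indices)
    if total % 2:
        raise ValueError("ramification data inconsistent with Riemann-Hurwitz")
    return total // 2 + 1

def hyperelliptic_genus(d):
    """Genus of y^2 = f(x) with f squarefree of degree d >= 1, char != 2."""
    # The x-coordinate map has degree 2 and is ramified (e = 2) over the d roots of f,
    # and also over infinity exactly when d is odd.
    branch_points = d if d % 2 == 0 else d + 1
    return genus_from_riemann_hurwitz(2, 0, [2] * branch_points)

print([(d, hyperelliptic_genus(d)) for d in range(1, 8)])
```

In particular the genus is at least 2 exactly when $d > 4$, and $d = 3, 4$ give genus 1.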

By the way, “genera”, which appears twice on page 37 and apparently nowhere else in the book, is the Latin plural (third declension, I see) of “genus”; this is a rare kind of irregular plural in English, but there is at least one further example: “opera” is also the plural of “opus” (as in Euler’s Opera Omnia = complete works), which is ultimately the source of the more familiar sense of “opera”.

(Riemann-Roch and) start of Chapter III (September 16)

The Riemann-Roch theorem for algebraic curves (originally proved only for compact Riemann surfaces, i.e. curves over $\bf C$) states $$ \ell(D) - \ell(K-D) = \deg(D) - g + 1 $$ where $g$ is the genus of the curve, $K$ is a canonical divisor, and $D$ is any divisor. Some examples / sanity checks / consequences:

This last observation includes Lemma III.3.3 (page 61), but Riemann-Roch is overkill here: on a curve $C$ of any positive genus, if $(P) \sim (Q)$ then $P = Q$, else $K(C)^*$ would contain a function $f$ with divisor $(P) - (Q),$ and thus of degree $1$, identifying $C$ with the curve ${\bf P}^1$ of genus zero.

The formulas for $E, a_k, b_k, c_k, \Delta, j$ at the start of III.1 (page 42) are from Tate’s “Formulaire” in Modular Forms of One Variable IV (LNM 476, 1972). They let us test in any characteristic whether $E$ is nonsingular and, if so, compute $j(E)$. It’s not the most welcoming introduction to elliptic curves, so we shall start with III.3; but we do have to explain these formulas before too long.

To begin with, the omission of $a_5$, and the apparent disorder of $a_1,a_3,a_2,a_4,a_6$, are for good reason: they make the formula $$ E: y^2 + a_1 x y + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6 $$ homogeneous of weight 6 when $x,y$ are taken to have weight 2,3 (which are the orders of the poles of $x,y$ at the point at infinity). We get an isomorphic curve by replacing $(x,y)$ by $(x/\lambda^2, y/\lambda^3)$ and multiplying through by $\lambda^6$; this multiplies each $a_k$ by $\lambda^k$. There is no $a_5$ because there is no weight-$1$ monomial in $x,y$. Each $b_k$ and $c_k$ is then homogeneous of weight $k$, while the discriminant $\Delta$ has weight 12, and $j$ has weight zero (as it must if $j$ is to be an invariant of the curve).

Now $b_2,b_4,b_6$ arise naturally when we complete the square (in characteristic $\neq 2$) to eliminate $a_1,a_3$, as we see on page 42: translating $y$ by $(a_1 x + a_3)/2$ gives $$ y^2 = x^3 + (b_2/4) x^2 + (b_4/2) x + (b_6/4). $$ Then in characteristic $\neq 3$ we can translate $x$ by $b_2/12$ to remove the $x^2$ term, getting $$ y^2 = x^3 + \Bigl( \frac{b_4}{2} - \frac{b_2^2}{48} \Bigr) x + \Bigl( \frac{b_6}{4} - \frac{b_2 b_4}{24} + \frac{b_2^3}{864} \Bigr) = x^3 - (c_4/48) x - (c_6/864). $$ which is why $c_4$ and $c_6$ are defined as they are. Now ${\rm disc}(x^3 + px + q) = -(4p^3 + 27q^2),$ so $$ {\rm disc}(x^3 + (b_2/4) x^2 + (b_4/2) x + (b_6/4)) = {\rm disc}(x^3 - (c_4/48) x - (c_6/864)) = 2^{-10} 3^{-3} (c_4^3 - c_6^2) $$ so that, in characteristic other than $2$ or $3$, our curve $E$ is nonsingular iff $c_4^3 \neq c_6^2$. Thus $\Delta$ had better be some multiple of $c_4^3 - c_6^2$. It turns out that the right multiple is $(c_4^3 - c_6^2) / 1728 = 2^{-6} 3^{-3} (c_4^3 - c_6^2)$. [That is the origin of the factor of $1728 = 12^3$ in the formula $j = 1728 c_4^3 / (c_4^3 - c_6^2)$.] In characteristic 3 we can avoid division by zero using the general formula $$ {\rm disc}(x^3 + A x^2 + B x + C) = -4A^3 C + A^2 B^2 + 18 ABC - (4 B^3 + 27 C^2) $$ for the discriminant of a cubic (without assuming $A=0$). Tate introduced $b_8$ to get a formula that works also in characteristic 2. Of course we must still verify that the resulting $\Delta$ vanishes iff $E$ is singular, even in characteristic 2 or 3. Silverman does this later in the chapter.
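These identities are easy to test numerically. The following Python sketch computes the Formulaire quantities from the standard definitions (as on page 42 of Silverman) for the sample curve $y^2 + y = x^3 - x^2 - 10x - 20$, and checks $1728\Delta = c_4^3 - c_6^2$, the relation $4b_8 = b_2 b_6 - b_4^2$, and the weights under $a_k \mapsto \lambda^k a_k$. The function name `formulaire` and the choice of sample curve and of $\lambda = 3$ are mine.

```python
from fractions import Fraction

def formulaire(a1, a2, a3, a4, a6):
    """b_k, c_k and the discriminant of y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    c4 = b2 * b2 - 24 * b4
    c6 = -b2 ** 3 + 36 * b2 * b4 - 216 * b6
    disc = -b2 * b2 * b8 - 8 * b4 ** 3 - 27 * b6 ** 2 + 9 * b2 * b4 * b6
    return dict(b2=b2, b4=b4, b6=b6, b8=b8, c4=c4, c6=c6, disc=disc)

a = (0, -1, 1, -10, -20)                       # sample curve (an illustrative choice)
q = formulaire(*a)
assert 1728 * q["disc"] == q["c4"] ** 3 - q["c6"] ** 2
assert 4 * q["b8"] == q["b2"] * q["b6"] - q["b4"] ** 2
j = Fraction(q["c4"] ** 3, q["disc"])
print(q["b8"], q["disc"], q["c4"], j)          # -21 -161051 496 -122023936/161051

# Weights: scaling a_k by lam^k scales Delta by lam^12 and leaves j unchanged.
lam = 3
weights = (1, 2, 3, 4, 6)
q2 = formulaire(*(lam ** k * ak for k, ak in zip(weights, a)))
assert q2["disc"] == lam ** 12 * q["disc"]
assert Fraction(q2["c4"] ** 3, q2["disc"]) == j
```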

The formula for $\omega$ is not so mysterious once the denominators are recognized as the partial derivatives of $\pm f(x,y)$ where $f(x,y) = y^2 + a_1 x y + a_3 y - (x^3 + a_2 x^2 + a_4 x + a_6)$ is the defining equation of $E$; see pages 43–44. That is a common way to produce differentials on a plane curve.

Page 47, start of (b): for “${\rm char}(K) \geq 5$” read “${\rm char}(K) \neq 2,3$” (because $0 < 5$ but characteristic zero is OK).

Proposition III.1.6 (pages 48–49): when a cubic curve $E$ has a double point $P$, the slope of the line connecting $(x,y) \in E$ to $P$ is a rational function of degree 1 on the curve; this is analogous to our parametrization of the unit circle by the slope $y/(x+1)$ (and is also a common Quals problem). [While we’re at it: if a cubic curve $C$ has (at least) two singularities, say $P$ and $Q$, then the line $L$ joining $P$ and $Q$ intersects $C$ with multiplicity at least 4, which implies that $C$ is reducible because $L$ is a component.]

The group law; interlude on elliptic curves over C (and R) (September 18)

There are several computer packages that implement the group law (Algorithm 2.3, starting on page 53) and many other operations on an elliptic curve. In PARI/GP, the function ellinit creates an elliptic curve with given $a_1,a_2,a_3,a_4,a_6;$ for example E = ellinit([0,-1,1,-10,-20]) creates the curve $y^2 + y = x^3 - x^2 - 10x - 20$ (LMFDB label 11.a2). For starters this accesses the other Formulaire quantities, e.g. [E.b8, E.disc, E.c4, E.j] returns [-21, -161051, 496, -122023936/161051] (you can check that the numerator of $j$ is indeed $496^3$). Yes, E.a1 and the like work too. Points on the curve are then ordered pairs [x,y], except that the origin $O$ is just [0]. The function ellisoncurve checks whether a given point is in fact on the curve, returning 1 or 0 for true or false respectively. For example, ellisoncurve(E,[5,5]) returns 1; I chose [5,5] because that is the torsion generator listed on the LMFDB page. The group law operations are then elladd, ellsub, ellpow (the last is called ellmul in recent versions of PARI/GP); for example the negative of a point P is ellsub(E,[0],P), doubling P is achieved by either ellpow(E,P,2) or elladd(E,P,P), and ellpow(E,[5,5],5) returns [0], as it should because the torsion group is ${\bf Z} / 5 {\bf Z}$.
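For readers without PARI/GP at hand, the addition formulas of Algorithm 2.3 are short enough to transcribe directly. What follows is a minimal Python sketch over $\bf Q$, not PARI's implementation: the names `ec_neg`, `ec_add`, `ec_mul` are hypothetical, the point $O$ is represented by `None`, and the inputs are assumed (not checked) to lie on a nonsingular curve.

```python
from fractions import Fraction

O = None  # the point at infinity

def ec_neg(curve, P):
    a1, a2, a3, a4, a6 = curve
    if P is O:
        return O
    x, y = P
    return (x, -y - a1 * x - a3)

def ec_add(curve, P, Q):
    """Sum of two points on y^2 + a1 xy + a3 y = x^3 + a2 x^2 + a4 x + a6."""
    a1, a2, a3, a4, a6 = curve
    if P is O:
        return Q
    if Q is O:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 + y2 + a1 * x2 + a3 == 0:
        return O                                   # Q = -P
    if x1 == x2:                                   # P = Q: slope of the tangent line
        lam = Fraction(3 * x1 * x1 + 2 * a2 * x1 + a4 - a1 * y1) / Fraction(2 * y1 + a1 * x1 + a3)
    else:                                          # slope of the chord through P and Q
        lam = Fraction(y2 - y1) / Fraction(x2 - x1)
    nu = y1 - lam * x1                             # the line is y = lam*x + nu
    x3 = lam * lam + a1 * lam - a2 - x1 - x2
    y3 = -(lam + a1) * x3 - nu - a3
    return (x3, y3)

def ec_mul(curve, n, P):
    """n*P by double-and-add."""
    if n < 0:
        return ec_mul(curve, -n, ec_neg(curve, P))
    result, addend = O, P
    while n:
        if n & 1:
            result = ec_add(curve, result, addend)
        addend = ec_add(curve, addend, addend)
        n >>= 1
    return result

E = (0, -1, 1, -10, -20)
P = (5, 5)
print([ec_mul(E, k, P) for k in range(1, 6)])
assert ec_mul(E, 5, P) is O                        # [5,5] has order 5
```

The multiples of $(5,5)$ come out as $(16,-61)$, $(16,60)$, $(5,-6)$ and then $O$, consistent with a torsion group of order 5.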

Why do we care about the group law on a singular curve (Proposition 2.5 on page 56)? For one thing, because one way we understand curves $E$ over number fields such as $\bf Q$ is by reducing (an equation for) $E$ modulo different primes $p$, and thus can produce singular curves even when $E$ is nonsingular; in fact $E \bmod p$ is singular iff $p|\Delta,$ because that’s the condition for the reduced curve to have discriminant 0. In that case, we say $E$ has “multiplicative reduction” at $p$ if the reduced curve $E \bmod p$ has a node, and “additive reduction” when $E \bmod p$ has a cusp, because that is the structure of $E_{ns} \bmod p$ according to Proposition 2.5.

In the additive case, we can also understand the structure of the group law as follows (if you like playing with Vandermonde determinants and the like). Change coordinates as in III.3.1b to put the cusp at $(0,0)$ with a horizontal tangent, so the curve has the Weierstrass equation $y^2 = x^3$, which is to say $(x,y) = (t^2,t^3)$. Distinct points $(t_i^2,t_i^3) \in E_{ns}$ ($i=1,2,3$) are collinear iff the determinant of the three vectors $(1,t_i^2,t_i^3)$ vanishes; but that determinant is $$ \pm (t_1-t_2) (t_1-t_3) (t_2-t_3) (t_1 t_2 + t_1 t_3 + t_2 t_3). $$ So the collinearity condition is $t_1 t_2 + t_1 t_3 + t_2 t_3 = 0$; dividing by $t_1 t_2 t_3$ yields the equivalent condition $t_1^{-1} + t_2^{-1} + t_3^{-1} = 0$. So, mapping $(t^2, t^3)$ to $t^{-1}$ gives an isomorphism between $E_{ns}$ and the additive group.
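The determinant identity and the resulting additive law can be spot-checked with exact arithmetic. This is only a sketch in Python; `det3`, `cusp_det` and `third_point` are hypothetical helper names, and the sign convention in the assertion is the one for which the Vandermonde factor is $\prod_{i<j}(t_j - t_i)$.

```python
from fractions import Fraction

def det3(r):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = r
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cusp_det(t1, t2, t3):
    """Determinant of the vectors (1, t^2, t^3) for three points of y^2 = x^3."""
    return det3([(1, t * t, t ** 3) for t in (t1, t2, t3)])

def third_point(t1, t2):
    """Parameter t3 of the third point on the line through the points t1, t2,
    using 1/t1 + 1/t2 + 1/t3 = 0 (requires t1 != -t2)."""
    return -1 / (1 / Fraction(t1) + 1 / Fraction(t2))

# The factorization, checked on a few sample triples:
for t1, t2, t3 in [(1, 2, 3), (2, -5, 7), (Fraction(1, 2), 3, -4)]:
    e2 = t1 * t2 + t1 * t3 + t2 * t3
    assert cusp_det(t1, t2, t3) == (t2 - t1) * (t3 - t1) * (t3 - t2) * e2

# Collinearity of the point predicted by the additive law:
t3 = third_point(1, 2)
print(t3)                       # -2/3
assert cusp_det(1, 2, t3) == 0
```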

We shall next take a detour through Chapter VI.2 to apply this theory to Riemann surfaces $E = {\bf C} / \Lambda$, doing just enough (through Theorem VI.2.2, pages 162–163) to identify the elliptic-curve group structure of $E$ with the additive group structure of ${\bf C} / \Lambda$. This will let us quickly deduce the structure of the torsion group $E[n]$ and some other properties of isogenies involving elliptic curves ${\bf C} / \Lambda$ over $\bf C$, so we have some intuition for the general case (III.4). Note that we do not yet prove that every elliptic curve over $\bf C$ can be realized as ${\bf C} / \Lambda$ for some lattice $\Lambda \subset \bf C$; we shall do this later when covering Chapter VI more systematically.

For a lattice $\Lambda$ in $\bf C$ let $E$ be the elliptic curve $({\bf C}/\Lambda, 0)$. (Recall that by Riemann existence we know that ${\bf C}/\Lambda$, being a compact Riemann surface, is algebraic; to show it has genus $1$ we can observe that the differential $dz$ is holomorphic and nowhere vanishing.) If $P,Q \in E$ then the construction from III.3 gives a point, call it $P \oplus Q$, such that $(P) + (Q) \sim (P \oplus Q) + (0)$. We claim that this is the same as the $P+Q$ computed using the additive structure on $\bf C$ and its subgroup $\Lambda$. Indeed $(P) + (Q) \sim (P \oplus Q) + (0)$ means that there is a rational function $\,f$ on ${\bf C} / \Lambda$ (that is, a meromorphic function on $\bf C$ invariant under translation by $\Lambda$) with divisor $(\,f) = (P) + (Q) - ((P \oplus Q) + (0))$. Theorem VI.2.2(c) then identifies $P \oplus Q$ with $P+Q \bmod \Lambda.$

It follows that for integer $n \neq 0$ the $n$-torsion group $E[n]$ is $n^{-1}\!\Lambda \, / \, \Lambda \cong ({\bf Z} / n {\bf Z})^2$. In particular $\#E[n] = n^2$, which is also the degree of the multiplication-by-$n$ map $E \to E, \, z \mapsto nz.$ More generally, if $E' = {\bf C} / \Lambda'$ for some (possibly) other lattice $\Lambda'$ then multiplication by $a \in {\bf C}$ gives a well-defined map $E \to E'$ iff $a\Lambda \subseteq \Lambda'$, in which case it gives a group homomorphism (and thus an isogeny) whose kernel is $\{z \in {\bf C} / \Lambda : az \in \Lambda'\}$. If $a \neq 0$ then we can write this kernel as $(a^{-1} \Lambda') / \Lambda$ and deduce that it has size $[a^{-1} \Lambda' : \Lambda] = [\Lambda' : a\Lambda] = |a\Lambda| / |\Lambda'|$ where we use $|\cdot|$ for the covolume of a lattice. Since $|a\Lambda| = |a|^2 |\Lambda|$ this gives $\left|\ker a\right| = (|\Lambda| / |\Lambda'|)|a|^2$. Now the pull-back of $dz$ under multiplication by $a$ is nonzero iff $a \neq 0$, so we deduce $\deg a = (|\Lambda| / |\Lambda'|)|a|^2$ (which we then see holds also for $a=0$), and this is indeed a positive-definite quadratic form in $a$. Note that the special case $\Lambda=\Lambda', \, n \in {\bf Z}$ agrees with $|E[n]| = n^2$. See Exercise 3.8 on page 106.
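For a concrete instance of the kernel count, take $\Lambda = \Lambda' = {\bf Z}[i]$ and $a = m + ni$, so the formula predicts $|\ker a| = m^2 + n^2$. The Python sketch below counts the kernel by brute force; the function name `kernel_size` is illustrative, and the search uses the observation that $a^{-1}{\bf Z}[i] \subseteq N^{-1}{\bf Z}[i]$ with $N = m^2 + n^2$.

```python
def kernel_size(m, n):
    """Number of z in C/Z[i] with (m + n i) z in Z[i], counted by brute force."""
    N = m * m + n * n
    if N == 0:
        raise ValueError("a = 0 has infinite kernel")
    count = 0
    # Every kernel element has the form z = (u + v i)/N with 0 <= u, v < N.
    for u in range(N):
        for v in range(N):
            real_part = m * u - n * v      # N * Re(a z)
            imag_part = m * v + n * u      # N * Im(a z)
            if real_part % N == 0 and imag_part % N == 0:
                count += 1
    return count

for m, n in [(1, 1), (2, 0), (2, 1), (3, 0), (3, 2)]:
    print((m, n), kernel_size(m, n), m * m + n * n)
```

For $n = 0$ this recovers $\#E[m] = m^2$.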

We did not cover in class Theorem III.3.6 (pages 64–65): the addition map $E \times E \to E$ is a morphism of algebraic varieties. This is an important technical tool, used several times in the sequel. (See for instance page 77 in III.5; in the comments on III.5 below, we discuss what “$E \times E$” means as an algebraic variety.) The explicit addition formulas of Algorithm III.2.3 do not quite suffice, because we must check that the various special cases (namely $O+P, P+O, P+P, P+(-P)$) patch together properly. Silverman uses the additive structure of $E$ to make the task somewhat less daunting. Once this is done, one might also ask how few charts are needed; this was answered surprisingly recently, by W. Bosma and H. W. Lenstra in 1995 (Complete Systems of Two Addition Laws for Elliptic Curves, J. Number Theory 53, 229–240 (1995)): only two charts are needed. The formulas are still quite complicated, as you can see on the last few pages of the paper.


Chapter III.4: Isogenies (September 23)

A recurring theme in modern mathematics is that we study a class of mathematical structures via appropriate maps between them: continuous maps between topological spaces, homomorphisms between groups, morphisms or rational maps between algebraic varieties, etc. (When the allowed maps include each object’s identity map, and the composition of two allowed maps is again allowed, the structures and maps become the “objects” and “morphisms” of a “category”.) For elliptic curves $E$ over a given field $K$, we have two choices: do we require only morphisms between pointed curves — that is, morphisms $\phi: E_1 \to E_2$ satisfying $\phi(O_{E_1}) = O_{E_2}$ — or group homomorphisms, that is, $\phi: E_1 \to E_2$ satisfying the identity $\phi(P+Q) = \phi(P) + \phi(Q)$? [Recall that for curves a rational map is the same as a morphism, so we do not have to make that choice.] Happily the two choices are the same: a morphism taking $O_{E_1}$ to $O_{E_2}$ automatically satisfies $\phi(P+Q) = \phi(P) + \phi(Q)$ ! Silverman introduces isogenies with the “pointed morphisms” definition (page 66), and later proves that any such morphism is automatically a group homomorphism (Theorem 4.8 on page 71).

[In these notes I will not distinguish between \phi=$\phi$ and \varphi=$\varphi$.]

Silverman proves this using the construction of the group law via the Jacobian. The proof seems to omit a key step: why is there a well-defined “push-forward” map $\phi_* : {\rm Pic}^0 E_1 \to {\rm Pic}^0 E_2$? Certainly there’s a map $\phi_* : {\rm Div}^0 E_1 \to {\rm Div}^0 E_2$, but why do principal divisors $(f) = (f)_0 - (f)_\infty$ ($f \in K(E_1)^*$) map to principal divisors? The trick is to prove more generally that $(f)_t = (f)_{t'}$ for any $t,t' \in {\bf P}^1(K)$ (which is really equivalent because we could change $f$ to $f-t$ for any $t \in K$) and then check that the map $t \mapsto (f)_t$ is a rational map from ${\bf P}^1$ to ${\rm Pic}^d(E)$ where $d = \deg f$. [For a general curve $C$ we don’t know that this makes sense, because we haven’t shown that ${\rm Pic}^d(C)$ is an algebraic variety; but in genus 1 we have ${\rm Pic}^d(E) \cong E$.] But by Riemann–Hurwitz a rational map from ${\bf P}^1$ to a curve of positive genus must be constant. Thus the map $\phi_* : {\rm Div}^0 E_1 \to {\rm Div}^0 E_2$ indeed descends to $\phi_* : {\rm Pic}^0 E_1 \to {\rm Pic}^0 E_2$ as desired.

In the context of complex tori ${\bf C}/\Lambda$ there is also a complex-analytic proof, see page 172 (proof of Theorem VI.4.1(a)), which shows that any holomorphic map $\phi: {\bf C}/\Lambda_1 \to {\bf C}/\Lambda_2, \, 0 \mapsto 0$ is multiplication by some $\alpha \in {\bf C}$, necessarily satisfying $\alpha \Lambda_1 \subseteq \Lambda_2$; this of course implies $\phi$ is a group homomorphism, and further shows that ${\rm Hom}({\bf C}/\Lambda_1,{\bf C}/\Lambda_2)$ is a discrete subgroup of $\bf C$, and thus of rank at most 2. Once we see that every elliptic curve over $\bf C$ is isomorphic with some ${\bf C}/\Lambda$ it will follow that ${\rm Hom}(E_1,E_2)$ has rank at most 2 for any elliptic curves $E_1,E_2$ in characteristic zero. In positive characteristic there are rare cases where ${\rm Hom}(E_1,E_2)$ has rank 4.

Don’t rush through the Examples of isogenies (4.1, 4.5, 4.6, 4.7)! They are basic building blocks of the theory.

Example 4.4: Silverman constructs a map ${\bf Z}[i] \to {\rm End}(E)$ for $E : y^2 = x^3 - x$ over a field not of characteristic 2 that contains square roots of $-1$, and asserts (but does not yet prove) that it is an isomorphism in characteristic 0. This is not hard to see over $\bf C$ if we realize $E$ as ${\bf C} / \Lambda$ with $\Lambda$ of the form $\omega {\bf Z}[i]$ for some $\omega \in {\bf C}^*$. In that case the image of $m+in$ in ${\rm End}(E)$ has degree $m^2 + n^2$; we shall soon see that this is true in general, and thus that the map ${\bf Z}[i] \to {\rm End}(E)$ is at least an injection. (It turns out to be an isomorphism iff ${\rm char}(K) \not\equiv 3 \bmod 4$.)

Example 4.5: This was mentioned (without the explicit formulas for $\phi$ and its dual isogeny) in the introductory lectures. The $E/\Phi$ construction (Proposition 4.12) explains why a degree-2 isogeny $\phi: E_1 \to E_2$ implies the existence of a degree-2 isogeny $E_2 \to E_1$: setting $\Phi_1 = \ker\phi,$ we see that $\#\Phi_1 = 2$ and $\Phi_1 \subset E_1[2],$ so $\phi(E_1[2])$ is a 2-element subgroup $\Phi_2 \subset E_2$, and the multiplication-by-2 map $E_1 \to E_1$ factors as $E_1 \to E_1/\Phi_1 \cong E_2 \to E_2 / \Phi_2 \cong E_1 / E_1[2] \cong E_1.$ (As is often the case this is easier to visualize for elliptic curves $E = {\bf C}/\Lambda$ over $\bf C$, with $E[2]$ identified with $\frac12\Lambda / \Lambda$.)

By the way, if elliptic curves $E_1,E_2$ are related by isogenies of degree $d$ whose kernels are cyclic groups of order $d$ then $E_1,E_2$ are said to be $d$-isogenous. When $d$ is squarefree the kernel condition holds automatically. So for example the curves $E_1,E_2$ of Example 4.5 are “2-isogenous”.

Chapter III.5: The invariant differential (September 25)

On page 77 (in the proof of Theorem 5.2) Silverman says “$([x_1,y_1,1], [x_2,y_2,1])$ give coordinates for $E \times E$ sitting inside ${\bf P}^2 \times {\bf P}^2$.” That is fine as far as it goes, but we haven’t explained how to do algebraic geometry in a product of two projective spaces. For affine spaces, this would be easy, because ${\bf A}^n \times {\bf A}^{n'}$ is just an affine space of dimension $n+n'$: if ${\bf A}^n$ has coordinates $(x_1,\ldots,x_n)$ and ${\bf A}^{n'}$ has coordinates $(x'_1,\ldots,x'_{n'})$ then ${\bf A}^n \times {\bf A}^{n'}$ has coordinates $(x_1,\ldots,x_n, x'_1,\ldots,x'_{n'})$. But this does not work for the product of projective spaces: if ${\bf P}^n$ has coordinates $[x_0,\ldots,x_n]$ and ${\bf P}^{n'}$ has coordinates $[x'_0,\ldots,x'_{n'}]$, we can’t just declare that ${\bf P}^n \times {\bf P}^{n'}$ has coordinates $[x_0,\ldots,x_n, x'_0,\ldots,x'_{n'}]$, which is not even well-defined (remember that “$[x_0,\ldots,x_n]$” is an equivalence class) and has one coordinate too many. Instead we map ${\bf P}^n \times {\bf P}^{n'}$ into ${\bf P}^{nn' + n + n'}$ by the Segre embedding, using for projective coordinates the $(n+1)(n'+1)$ products $X_{i,i'} := x_i x'_{i'}$ with $0 \leq i \leq n, \, 0 \leq i' \leq n'$, which is well-defined: multiplying each $x_i$ by some nonzero $\lambda$, and each $x'_{i'}$ by some nonzero $\lambda'$, multiplies all the $x_i x'_{i'}$ by the same nonzero factor $\lambda\lambda'$. [The Segre embedding does appear in the text, but not until Chapter VIII, and there only in the Exercises (see 8.8c on page 263).] You should also check that the only way all the $x_i x'_{i'}$ can vanish is if either every $x_i = 0$ or every $x'_{i'} = 0$, in which case we did not actually have a point in ${\bf P}^n$ or ${\bf P}^{n'}$ to begin with; so we actually get a morphism, not just a rational map.

The image of the Segre embedding is cut out by the homogeneous ideal, call it $I_S$, generated by quadratics of the form $X_{i,i'} X_{j,j'} - X_{i,j'} X_{j,i'}$. [In the special case $n=n'=1$ there is just one generator, and indeed the image is a smooth quadric in ${\bf P}^3$; this connects to some classical and more recent mathematics: the two families of lines $P \times {\bf P}^1$ and ${\bf P}^1 \times P'$ give the two rulings of the quadric surface, and the action of ${\rm PGL}_2 \times {\rm PGL}_2$ on ${\bf P}^1 \times {\bf P}^1$ gives the isomorphism of ${\rm PGL}_2^2$ with ${\rm SO}_4 / \{\pm1\}$.] If we then map a product such as $E_1 \times E_2$ to the Segre embedding of ${\bf P}^2 \times {\bf P}^2$ in ${\bf P}^8$, we get the ideal of the image by starting with $I_S$ and adding further generators of degree 3, obtained by multiplying the defining equation of $E_1$ by all cubic monomials in the $x'_{i'}$ and multiplying the defining equation of $E_2$ by all cubic monomials in the $x_{i}$; there are many choices of writing each of these generators as cubics in the coordinates $X_{i,i'}$ of ${\bf P}^8$, but they are all equivalent modulo $I_S$. More generally, if we have a variety in ${\bf P}^n \times {\bf P}^{n'}$ defined by some bihomogeneous polynomials $P_k$ of bidegrees $(d,d')$, its image under the Segre embedding has an ideal generated by $I_S$ and the products of each $P_k$ by all the monomials that yield a homogeneous polynomial of degree $\max(d,d')$ in the $X_{i,i'}$. Like Silverman I relegate further details to standard references on algebraic geometry.
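For readers who like to experiment, here is a minimal Python sketch (not from Silverman; the function names `segre` and `segre_relations_hold` are made up for illustration) that checks the quadratic relations $X_{i,i'} X_{j,j'} = X_{i,j'} X_{j,i'}$ on one sample point of ${\bf P}^2 \times {\bf P}^2$, confirms that rescaling the two factors rescales all nine coordinates by the same $\lambda\lambda'$, and exhibits a point of ${\bf P}^8$ that violates the relations.

```python
from itertools import product

def segre(x, xp):
    """Segre coordinates X[i, i'] = x_i * x'_{i'} of a pair of coordinate vectors."""
    return {(i, ip): xi * xpi
            for i, xi in enumerate(x)
            for ip, xpi in enumerate(xp)}

def segre_relations_hold(X, n, n_prime):
    """Check X[i,i'] X[j,j'] == X[i,j'] X[j,i'] for all index choices."""
    return all(
        X[i, ip] * X[j, jp] == X[i, jp] * X[j, ip]
        for i, j in product(range(n + 1), repeat=2)
        for ip, jp in product(range(n_prime + 1), repeat=2)
    )

# A sample point of P^2 x P^2 (arbitrary integers chosen for illustration).
x, xp = (1, 2, 3), (5, 7, 11)
X = segre(x, xp)
ok = segre_relations_hold(X, 2, 2)

# Rescaling x by lam and x' by lam_p multiplies every X[i, i'] by lam * lam_p.
lam, lam_p = 4, -3
X_scaled = segre(tuple(lam * c for c in x), tuple(lam_p * c for c in xp))
same_point = all(X_scaled[k] == lam * lam_p * X[k] for k in X)

# The "identity matrix" point of P^8 has rank 3, so it is not in the image.
off_image = {(i, ip): int(i == ip) for i in range(3) for ip in range(3)}
off_ok = segre_relations_hold(off_image, 2, 2)

print(ok, same_point, off_ok)
```

Viewing the nine $X_{i,i'}$ as a $3 \times 3$ matrix, the relations say exactly that all $2 \times 2$ minors vanish, i.e. that the matrix has rank 1.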

Corollary 5.5 singles out the special case of $1 - \phi$ because $E({\bf F}_q) = \ker (1-\phi)$. [This is an example of the general tactic of accessing $K$-rational objects as Galois-invariant subobjects of the corresponding objects over $\bar K$. When $K$ is finite, ${\rm Gal}(\bar K / K)$ is topologically generated by the Frobenius map $\phi$, so $P \in E({\bf F}_q)$ iff $P = \phi P$, and then the group structure lets us write this condition as $0 = P - \phi P = (1-\phi)P$ which is to say $P \in \ker (1 - \phi)$.] Since $1-\phi$ is separable, its kernel’s size is equal to its degree; we show in the next section that the degree is a positive-definite quadratic form on ${\rm End}(E)$ (and indeed on ${\rm Hom}(E_1,E_2)$ for any elliptic curves $E_1,E_2$ over the same field), which will soon give us the Hasse–Weil bound $\left| \#E({\bf F}_q) - (q+1) \right| \leq 2 q^{1/2}$.
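As a concrete sanity check of the bound quoted above, the following sketch counts the Frobenius-fixed points (that is, the points with coordinates in ${\bf F}_p$) by brute force for every short Weierstrass curve over a few small primes. The helper names `count_points` and `hasse_ok` are illustrative only, and the method is hopeless for large $p$.

```python
def count_points(a, b, p):
    """#E(F_p) for E: y^2 = x^3 + a x + b, p an odd prime, by brute force."""
    assert (4 * a**3 + 27 * b**2) % p != 0, "curve must be nonsingular"
    # sqrt_count[v] = number of y in F_p with y^2 = v
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[y * y % p] += 1
    affine = sum(sqrt_count[(x**3 + a * x + b) % p] for x in range(p))
    return affine + 1  # plus the point at infinity

def hasse_ok(a, b, p):
    """Check |#E(F_p) - (p + 1)| <= 2 sqrt(p), written without square roots."""
    t = p + 1 - count_points(a, b, p)
    return t * t <= 4 * p

all_ok = all(
    hasse_ok(a, b, p)
    for p in (5, 7, 11, 13)
    for a in range(p)
    for b in range(p)
    if (4 * a**3 + 27 * b**2) % p != 0
)
print(count_points(1, 1, 5), all_ok)
```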

Typo at the very end of this section (page 80, proof of Corollary 5.6c): ${\rm End}(E)$ injects (as a ring) into $\bar K$, not ${\bar K}^*$ — indeed the zero endomorphism maps to $0 \notin {\bar K}^*$. [Added later: already corrected in Silverman’s errata document, see below.]

Chapter III.6: The dual isogeny; Chapter III.10: The automorphism group (September 30)

Silverman’s definition of a positive definite form (page 85) on an abelian group $A$ is not quite the usual one. For one thing, we usually require that $A$ be free, though this doesn’t matter much because in general $A_{\rm tors}$ is in the kernel of any bilinear pairing $A \times A \to {\bf R}$, so the pairing descends to the free quotient $A / A_{\rm tors}$. (For now we need not worry about this anyway because ${\rm Hom}(E_1,E_2)$ is torsion-free, but we shall later construct pairings on groups such as $E({\bf Q}),$ which can have nontrivial torsion.) More substantively, the usual definition requires that the induced quadratic form on the real vector space $A \otimes_{\bf Z} {\bf R}$ be positive-definite. That certainly implies properties (iii,iv), but the reverse implication need not hold: let $c \in {\bf R}$ be any irrational number, and consider $A = {\bf Z}^2$ and $d(m,n) = (m-cn)^2$. Silverman’s definition does imply that $d$ is at least positive-semidefinite on $A \otimes_{\bf Z} {\bf R}$. Also, in our present setting the form takes only integral values, and at least if $A$ is finitely generated then a semidefinite quadratic form on $A$ that takes only integral values is automatically definite, though this takes some work to prove (and the fact that ${\rm Hom}(E_1,E_2)$ is always finitely generated takes even more work; it is proved in the next section).
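To spell out why the example $d(m,n) = (m-cn)^2$ separates the two definitions (a short verification added as a reading aid, not taken from the text): $d$ is the square of the linear functional $(m,n) \mapsto m - cn$, so it is a quadratic form, and on the lattice $A = {\bf Z}^2$ we have
$$ d(m,n) = (m-cn)^2 \geq 0, \qquad d(m,n) = 0 \iff m = cn \iff (m,n) = (0,0), $$
because $m = cn$ with $n \neq 0$ would make $c = m/n$ rational. On the real vector space $A \otimes_{\bf Z} {\bf R} = {\bf R}^2$, however, the same formula gives
$$ d(c,1) = (c - c \cdot 1)^2 = 0 \quad \text{with} \quad (c,1) \neq (0,0), $$
so the extended form is positive-semidefinite but not positive-definite.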

Another characterization of quadratic forms $d$ (not necessarily definite) on an abelian group is the parallelogram identity $d(\phi + \psi) + d(\phi - \psi) = 2 (d(\phi) + d(\psi))$. For the degree function on ${\rm Hom}(E_1,E_2)$ this can be proved by a direct (albeit mysterious) calculation, which we shall later generalize to the canonical height on $E(K)$ for “global fields” $K$ such as $\bf Q$. In this context ${\rm Hom}(E_1,E_2)$ is a subgroup of $E_2(K(E_1))$, and is also the quotient of $E_2(K(E_1))$ by its subgroup of constant maps. A more conceptual approach, which generalizes to the canonical height on $E(K)$ when $K$ is a function field, uses intersection theory on the surface $E_1 \times E_2$; but that is rather more algebraic geometry than I want to assume or develop in Math 223. If you’ve already learned intersection theory then this could be a topic for your final project.

With about 20 minutes at the end of Monday’s class, I also went over the short §III.10, including an outline of the case of characteristics 2, 3 which Silverman relegates to Appendix A (see Proposition A.1.2 on pages 410ff.). In general ${\rm Aut}(E) = ({\rm End}(E))^*$. In characteristic 3, a curve with $j = 0 = 1728$ can be written as $y^2 = x^3 - x$ over $\bar K$; this has the 4-cycle generated by $(x,y) \mapsto (-x,iy)$ where $i^2 = -1$ (same as for $j=1728$ curves in characteristic 0), and also the 3-cycle $(x,y) \mapsto (x+1,y)$; you can check that these generate a group of order 12. In characteristic 2 we can use the model $y^2 + y = x^3$ and find $8$ automorphisms of the form $(x,y) \mapsto (x+\tau, y + \tau^2 x + \eta)$ where $\tau^4 = \tau$ (so $\tau \in {\bf F}_4$) and $\eta^2 + \eta = \tau^3$. These constitute an 8-element quaternion group, say $\{\pm 1, \pm i, \pm j, \pm k\}$. We still have $(x,y) \mapsto (\rho x, y)$ and $(x,y) \mapsto (\rho x, y+1)$ where $\rho$ is a cube root of 1; together with $\{\pm 1, \pm i, \pm j, \pm k\}$ these generate the full 24-element automorphism group. In ${\rm End}(E)$ the extra $24-8 = 16$ are the elements $\pm \frac12 \pm \frac12 i \pm \frac12 j \pm \frac12 k$ of the Hurwitz quaternions; the unit group ${\rm Aut}(E) = ({\rm End}(E))^*$ is sometimes called the binary tetrahedral group.
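The characteristic-2 count can be checked by machine. The sketch below is an illustration only (the encoding of ${\bf F}_4$ as the integers 0..3 and all function names are assumptions of this example). It lists the pairs $(\tau,\eta) \in {\bf F}_4^2$ with $\eta^2 + \eta = \tau^3$, verifies that each map $(x,y) \mapsto (x+\tau,\, y+\tau^2 x+\eta)$ permutes the affine ${\bf F}_4$-points of $y^2 + y = x^3$, and finds two of the maps that do not commute.

```python
# F_4 = F_2[r]/(r^2 + r + 1), elements encoded as bit patterns 0, 1, 2 (= r), 3 (= r + 1).
def add(a, b):
    return a ^ b  # characteristic 2: addition is XOR

def mul(a, b):
    prod = 0
    if b & 1:
        prod ^= a
    if b & 2:
        prod ^= a << 1
    if prod & 4:          # reduce r^2 -> r + 1
        prod ^= 0b111
    return prod

def cube(a):
    return mul(a, mul(a, a))

F4 = range(4)
points = [(x, y) for x in F4 for y in F4 if add(mul(y, y), y) == cube(x)]
params = [(t, e) for t in F4 for e in F4 if add(mul(e, e), e) == cube(t)]

def apply(param, pt):
    """(x, y) -> (x + tau, y + tau^2 x + eta)."""
    (t, e), (x, y) = param, pt
    return add(x, t), add(add(y, mul(mul(t, t), x)), e)

all_permute = all(sorted(apply(g, pt) for pt in points) == sorted(points)
                  for g in params)

g, h = (1, 2), (2, 2)     # two elements with tau = 1 and tau = r
noncommuting = any(apply(g, apply(h, pt)) != apply(h, apply(g, pt))
                   for pt in points)

print(len(points), len(params), all_permute, noncommuting)
```

Together with the point at infinity the curve has $8 + 1 = 9$ points over ${\bf F}_4$, and a non-abelian group of order 8 containing the element $(x,y) \mapsto (x,y+1)$ of order 2 in its center is consistent with the quaternion group described above.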

Chapter III.7: The Tate module (October 2)

Remark 7.3 (page 88): If you haven’t seen the $T_\ell$ construction before, start by thinking about the Tate module $T_\ell({\bar K}^*)$ (a.k.a. the Tate group of the multiplicative group ${\bf G}_m$), and for that matter about $T_\ell({\bf R} / {\bf Z}) = T_\ell({\bf Q} / {\bf Z}) = T_\ell({\bf Z}[1/\ell] / {\bf Z}) \cong {\bf Z}_\ell$ which is the same ${\bf Z}_\ell$-module but with a different action of ${\rm Gal}({\bar{\bf Q}} / {\bf Q})$. [In particular, $T_\ell({\bf R} / {\bf Z}) \cong {\bf Z}_\ell$ canonically but for general $K$ there is no canonical isomorphism $T_\ell({\bar K}^*) \cong {\bf Z}_\ell$.] The Tate module $T_\ell({\bar K}^*)$ with its Galois action is important to us not just as a simpler example of the construction but also because we shall soon connect it with $T_\ell(E)$ via the Weil pairing (see III.8). The action of ${\rm Gal}(\bar K / K)$ on $T_\ell({\bar K}^*)$ is also the beginning of a major thread of modern number theory, continuing via the Galois representation on $T_\ell(E)$ (see Remark 7.2 on the same page) to the Langlands conjectures.

Theorem 7.4 (page 89, proved on 89–91): this is strictly stronger than injectivity of ${\rm Hom}(E_1,E_2) \to {\rm Hom}(T_\ell(E_1),T_\ell(E_2))$ because when ${\rm Hom}(E_1,E_2)$ has rank at least 2 there are elements of ${\rm Hom}(E_1,E_2) \otimes {\bf Z}_\ell$ that are not ${\bf Z}_\ell$-multiples of any element of ${\rm Hom}(E_1,E_2)$.

In the proof, Silverman’s “$M^{\rm div}$” is often called the “saturation” of $M$ in ${\rm Hom}(E_1,E_2)$; it can also be defined as $(M \otimes {\bf Q}) \cap {\rm Hom}(E_1,E_2)$. The proof (page 90) again uses the positive-definiteness of the degree map on ${\rm Hom}(E_1,E_2)$. For the record, here is the proof I outlined in Monday’s class. Let $\phi_1,\ldots,\phi_r \in {\rm Hom}(E_1,E_2)$ be $\bf Z$-linearly independent, and let $G = (c_{ij})_{i,j=1}^r$ be the symmetric Gram matrix with $2 \deg(\sum_i a_i \phi_i) = \sum_{i=1}^r \sum_{j=1}^r c_{ij} a_i a_j =: Q(a_1,\ldots,a_r)$ for all $a_i \in {\bf Z}$. We claim that $G$ is positive-definite. First we show that $G$ is positive-semidefinite. If not then we can choose $\vec a \in {\bf R}^r$ such that $Q(\vec a) < 0,$ say $Q(\vec a) = -D$ for some positive $D$. Then for large $N$ we have $Q(N\vec a) = -DN^2$. Replacing each coordinate $Na_i$ by the nearest integer $b_i$ changes the value of $Q$ by only $O(N)$, giving an isogeny $\sum_i b_i \phi_i$ of degree $-DN^2 + O(N) < 0$ which is impossible. Now that we know $G$ is positive-semidefinite, if it is not definite then $G$ is singular; since its entries are integers, $\ker G$ has a nonzero vector $v$ with rational entries; some multiple $nv$ is then a nonzero $\bf Z$-linear combination of $\phi_1,\ldots,\phi_r$ of degree zero, which again is a contradiction. □

We did not explicitly cover Corollary 4.11 (pages 73–74); be sure you review this result, which is used in the proof of Theorem 7.4 (towards the end of page 90) and will likely appear again later in the course.

Corollary 7.5 (page 91) predates Tate. One $T_\ell$-free proof is to start with the case $E_1 = E_2 = E$ of ${\rm End}(E)$. Call this endomorphism ring $A$, and observe that $A \otimes {\bf Q}$ is a division algebra with a positive-definite quadratic norm. Thus the same is true of $A \otimes {\bf R}$, and the only real division algebras with positive-definite quadratic norms are $\bf R$, $\bf C$, and the non-commutative algebra $\bf H$ of Hamilton quaternions. Hence $A$ has rank 1, 2, or 4. Now if $E$ and $E'$ are isogenous then ${\rm Hom}(E,E') \otimes {\bf Q}$ is an $A$-vector space of dimension 1, so ${\rm Hom}(E,E')$ too has rank 1, 2, or 4; and of course if $E,E'$ are not isogenous then ${\rm Hom}(E,E') = 0.$

Chapter III.8: The Weil pairing; Chapter III.9: The endomorphism ring (October 7 and 9)

Near the top of page 93: “this determinant pairing on $E[m]$ is not Galois invariant” — except, naturally, in case $m=2$ (when $\det(P,Q) \neq 0$, and thus $e_2(P,Q) = -1$, if and only if $P,Q$ are distinct nonzero $2$-torsion points).

Typo at the start of the next paragraph: We can simultaneously achieve basis independence, not “basis independent”.

A function $f$ on $E$ with divisor $m(T) - m(O)$ is sometimes called a “Weil function” (though it seems this term might be less well known than I had imagined). An example is $x - x(T)$ if $m = 2$ and $T \neq O$; if $m=3$ and $T \neq O$ then $T$ is an inflection point (see Exercise 3.9 on pages 106–107) and a Weil function is a linear combination of $1,x,y$ that defines the tangent line to $E$ at $T$. Such functions, which are locally $m$-th powers but not globally, will appear when we study descent; already in Fermat’s work on $y^2 = x^3 - x$ and $y^2 = x^3 + 4x$ he reduces to the case that $x$ is a square, and we now recognize $x$ as a Weil function for the $2$-torsion point $(x,y) = (0,0)$.

In the definition of the Weil pairing (middle of page 93), another way to show that $X \mapsto g(X+S)\,/\,g(X)$ is constant is to consider its degree, call it $d$, as a rational function on $E$ : the $m$-th power $X \mapsto (g(X+S)\,/\,g(X))^m$ is the constant function 1, which has degree zero; but that degree is $md$, so $d=0$ and $g(X+S) \, / \, g(X)$ is a constant function of $X$. Simpler yet: while $g$ need not be invariant under translation by $S$, its divisor ${\rm div}(g)$ is invariant, so $g(X+S) \, / \, g(X)$ has neither zeros nor poles.

Warning: Some sources define the Weil pairing $e_m(S,T)$ to be what Silverman would call $e_m(T,S) = e_m(S,T)^{-1}$, which has the same properties. For many purposes (including Corollary 8.1.1, Propositions 8.2 and 8.3, and Remark 8.4 = Exercise 3.15), this does not matter as long as one is consistent (or $m=2$…). But if you use some published result for which it matters whether $e_m(S,T)$ means our definition or its inverse, be sure to check which $e_m(S,T)$ is meant — and if you write some mathematical text for which the distinction matters, be sure to give a statement or reference specifying which convention you are using.

Speaking of Remark 8.4 and Exercise 3.15: this isn’t quite “more generally”: while $E[m]$ is the kernel of an isogeny, namely multiplication by $m$, this isogeny has degree $m^2$, not $m$. We can of course regard $e_m$ as taking values in $\mu_{m^2}$, but then we must remember that this map is not surjective.

We shall not attempt to prove the last few results in Chapter III.9 (pages 102–103). The Brauer group is not really necessary to classify quaternion algebras over $\bf Q$: the bijection between such algebras and finite even subsets of $\{\infty, 2, 3, 5, 7, 11, \ldots \}$ can be viewed as a version or application of quadratic reciprocity together with the Hasse principle for conics over $\bf Q$ (and indeed ${\rm Br}_2$ and higher Brauer groups of number fields can be viewed as vast generalizations of that picture over $\bf Q$). Quaternion algebras are also closely related to quadratic Hilbert symbols. There are several sources for quaternion algebras, including Vignéras’ Arithmétique des algèbres de quaternions (in French; LNM 800, 1980) and Voight’s recent Quaternion Algebras (GTM 288, 2021), both available online from Harvard’s library.

NO CLASS OCTOBER 14: UNIVERSITY HOLIDAY

Chapter VI: Elliptic curves over $\bf C$ (October 16, 21)

This chapter is much shorter than Chapter III, so naturally we’ll take much less time here, all the more so since we already covered some of VI.2 for the background to Chapter III. You may also have seen some of this material in a course or book on complex analysis.

Note that the motivating example of $\int dx / \sqrt{1-x^2}$ (top of page 157) differs in another way from $\int \omega$ where $\omega$ is the invariant differential on an elliptic curve: not only does the curve $y^2 = 1 - x^2$ have genus zero (as we saw a month-plus ago), but also $dx / \sqrt{1-x^2}$, unlike $\omega$, has simple poles at the two points at infinity $(x:y:z) = (1:\pm i:0)$. Thus $\int dx / \sqrt{1-x^2}$ must have logarithmic singularities there; indeed $\sin^{-1} z$ can be written as $\frac1i \log(\sqrt{1-z^2} + iz)$. [To be sure this distinction is related to the distinction between genus 0 and genus 1: a curve of genus zero has no nonzero holomorphic differentials.] Using our rational parametrization $(x,\sqrt{1-x^2}) = ((1-t^2)/(1+t^2),\, 2t/(1+t^2))$ of the unit circle, we find that $dx/\sqrt{1-x^2} = -2\,dt/(1+t^2)$ so the integral is $C - 2 \tan^{-1} t$, consistent with the interpretation of $t$ as $\tan(\theta/2)$; using the partial fraction decomposition $2/(1+t^2) = 1 / (1+it) + 1 / (1-it)$ gives the logarithmic form $C + i \log\bigl((1+it)/(1-it)\bigr)$ of the integral. The Jacobi elliptic functions sn, cn, etc. develop further the analogy between such trigonometric functions and elliptic functions; we shall say nothing of that approach other than noting that it exists but is no longer the usual route to the theory of elliptic (a.k.a. doubly periodic) functions.
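The logarithmic identities in this paragraph are easy to test numerically. The sketch below (function names are illustrative; it uses the principal branches that Python’s `cmath` provides, so it is only meaningful for real $z \in (-1,1)$ and real $t$) compares $\frac1i \log(\sqrt{1-z^2}+iz)$ with $\sin^{-1} z$, checks the partial fraction decomposition of $2/(1+t^2)$, and compares $i \log\bigl((1+it)/(1-it)\bigr)$ with $-2\tan^{-1} t$.

```python
import cmath
import math

def asin_via_log(z):
    """(1/i) * log(sqrt(1 - z^2) + i z), principal branch."""
    return cmath.log(cmath.sqrt(1 - z * z) + 1j * z) / 1j

def minus_two_atan_via_log(t):
    """i * log((1 + i t) / (1 - i t)), principal branch."""
    return 1j * cmath.log((1 + 1j * t) / (1 - 1j * t))

def partial_fractions(t):
    """1/(1 + i t) + 1/(1 - i t), which should equal 2/(1 + t^2)."""
    return 1 / (1 + 1j * t) + 1 / (1 - 1j * t)

samples = [-0.9, -0.5, 0.0, 0.3, 0.75]
max_err_asin = max(abs(asin_via_log(z) - math.asin(z)) for z in samples)
max_err_atan = max(abs(minus_two_atan_via_log(t) + 2 * math.atan(t)) for t in samples)
max_err_pf = max(abs(partial_fractions(t) - 2 / (1 + t * t)) for t in samples)
print(max_err_asin, max_err_atan, max_err_pf)
```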

VI.1: we do not really need $E$ to be in Legendre form. As I learned it, we just need to choose a labeling of the three roots of the cubic (a.k.a. the three nonzero 2-torsion points), writing $E$ as $y^2 = (x-e_1) (x-e_2) (x-e_3)$ for any pairwise distinct $e_1, e_2, e_3 \in \bf C$. [To get from this to Legendre form, apply to $x$ an affine-linear transformation that puts $e_1,e_2$ at $0,1$ respectively, and then scale $y$ to get a Legendre model, as Silverman does at the end of III.1 (page 50).]

p.160–161: When we get to VI.5.1 (page 173) Silverman does not actually recite a proof but cites several references. I didn’t yet check the proof in [46, §2.9] cited on page 161 to find out how the $\bf R$-linear independence of $\omega_1, \omega_2$ is derived from Stokes’s theorem. One can prove it as follows: if $\omega_1,\omega_2$ were $\bf R$-linearly dependent then some nonzero multiple of $\int\omega$ would have a well-defined real part, giving a function $Z : E \to \bf R$. Being a continuous function on a compact space, $Z$ would then have a global maximum, which is impossible because even a local maximum would require $\omega$ to have a zero. (We can likewise use the maximum principle instead of Liouville’s theorem in the proof of Proposition 2.1 on page 161.)

This argument generalizes to show that the $2g$ periods of every Riemann surface $X$ of genus $g\geq1$ are linearly independent over $\bf R$. If not, there would be some nonzero holomorphic differential $\omega$ whose real periods would all vanish, so there would be a well-defined real function $Z = {\rm Re}(\int \omega)$ on $X$. This is not possible, even though $\omega$ does have zeros once $g>1$, because $Z$ is a nonconstant harmonic function and thus has no local maximum or minimum even where its gradient vanishes.
Starting on page 165 (definitions of $\wp$ and $G_{2k}$) we see several times the sum over nonzero $\omega \in \Lambda$. Such a sum is often abbreviated $\sum\!'_{\omega \in \Lambda}$ or even just $\sum\!'$. Likewise for the (rarer) product over nonzero $\omega \in \Lambda$, as seen at the bottom of page 167 (construction of the Weierstrass $\sigma$-function). I use the notation $\sum\!'_{\omega \in \Lambda}$ in the problem set. Note that, depending on details of your LaTeX installation, you may need to fiddle with \mathop and \nolimits to get $\sum\!'_{\omega \in \Lambda}$ to appear correctly in displayed equations, because LaTeX implements “'” as “^\prime”, and by default puts superscripts on top of displayed math operators such as $\sum$.

Apropos of the bottom of page 167: Silverman says “theta function” but then defines and studies the $\sigma$-function. These functions are closely related (even if the constructions are not); see the last problem of the third problem set. By the way, if the definition of the $\sigma$-function seems mysterious, it’s an example of Hadamard’s construction of an entire function with given zeros (and multiplicities) as a product over the zeros; the exponential factors make the product converge.

The Weierstrass $\zeta$-function (see Exercise 6.4 on pages 178–179) provides the missing link in the sequence $-\!\log \sigma, -\zeta, \wp, \wp', \wp'', \ldots$ of functions each of which (except the first) is the derivative of the previous one, and each of which starting with $\wp$ is doubly periodic. As we can use products of translates $\sigma(z-z_i)$ to construct doubly periodic functions with given zeros and poles (Proposition 3.4 on page 168), we can use sums $\sum_j r_j \zeta(z-z_j) \, dz$ of translates of $\zeta$ to construct differentials with given residues $r_j$ at $z_j$, which are doubly periodic iff $\sum_j r_j = 0$. We then construct differentials with arbitrary principal parts (again subject to the condition that the sum of the residues vanish) from linear combinations of such $\sum_j r_j \zeta(z-z_j)$ together with the translates of $\wp$ and its derivatives.

In general there is no closed form for $\zeta(z)$ in terms of $\wp(z)$, $\wp'(z)$, and the coefficients $g_4,g_6$ of the Weierstrass equation relating $\wp,\wp'$. However, if $\sum_j z_j = 0$ in $\bf C$ and no $z_j \in \Lambda$ then $\sum_j \zeta(z_j)$ does have an expression as a rational function of the $\wp(z_j)$ and $\wp'(z_j)$. [We need not include the Weierstrass-equation coefficients, because we can solve for them given any two points $(\wp(z_j),\wp'(z_j))$ with distinct $\wp$ values.] Since $\zeta$ is an odd function, the first interesting case of this is $\zeta(z_1) + \zeta(z_2) + \zeta(z_3)$ where $z_1 + z_2 + z_3 = 0$. The answer is $-1/2$ times the slope of the line in the $(\wp,\wp')$ plane joining the three points $(\wp(z_j),\wp'(z_j))$; equivalently, $\zeta(u+v) = \zeta(u) + \zeta(v) + \frac12 (\wp'(u)-\wp'(v)) / (\wp(u)-\wp(v))$. Reference: formula 8.177#1 in Gradshteyn and Ryzhik’s Table of Integrals, Series, and Products, citing “SI 182(53)” where SI = Yu. Sikorskiy’s Elements of the Theory of Elliptic Functions with Applications to Mechanics (in Russian; ONTI, Moscow & Leningrad 1936).

About the endomorphism ring of an elliptic curve over $\bf C$: Once we have identified ${\rm Hom}({\bf C} / \Lambda_1, {\bf C} / \Lambda_2)$ with $\{\alpha \in {\bf C}: \alpha \Lambda_1 \subseteq \Lambda_2\}$ (Theorem 4.1, pages 171–172, see also Theorem 5.3 on page 175), we get a ring homomorphism from ${\rm End}({\bf C}/\Lambda)$ into the matrix ring $M_2({\bf Z})$ by choosing a $\bf Z$-basis $(\omega_1,\omega_2)$ for $\Lambda$. This is consistent with, and simpler than, our algebraic construction of the homomorphism to $M_2({\bf Z}_\ell)$ via the Tate module, which works for any field not of characteristic $\ell$, but is fundamentally transcendental. Now we recognize the quadratic $\alpha^2 - (a+d) \alpha + (ad-bc)$ appearing in the proof of Theorem 5.5 (see the third displayed formula on page 176) as the value at $\alpha$ of the characteristic polynomial of the image $({a \; b \atop c \; d})$ of $\alpha \in {\rm End}({\bf C}/\Lambda)$ under this homomorphism to $M_2({\bf Z})$. Indeed $\alpha$ is an eigenvalue of this $2 \times 2$ matrix corresponding to the eigenvector $(\omega_1,\omega_2)^\perp$ (or equivalently $(1,\tau)^\perp$). In general, for an abelian variety $A = {\bf C}^g / \Lambda$ of dimension $g$ we get a homomorphism ${\rm End}(A) \to M_{2g}({\bf Z}),$ and irrationalities analogous to $\alpha$ can have degree as large as $2g$. For elliptic curves, there is a difficult theorem that if $j$ is algebraic then $\tau$ is either quadratic (giving rise to a CM curve, as described in Theorem 5.3) or transcendental, so there is nothing special about elliptic curves with $\tau$ algebraic of degree $3$ or higher: those just give “random” elliptic curves that cannot be defined over a number field because their $j$-invariants are transcendental.

Note too that as it stands the criterion $[{\bf Q}(\tau) : {\bf Q}] = 2$ of Theorem 5.3 cannot be checked by any finite computation: a high-precision approximation to $\tau$ can strongly suggest that $\tau$ is quadratic or not, but cannot prove it. Fortunately there is an extensive theory of complex multiplication that makes it routine to recognize CM curves. Over $\bf Q$ it is known that there are only $13$ possible rings ${\rm End}_{\bar{\bf Q}}(E)$ other than $\bf Z$, corresponding to $13$ possible $j$-invariants (starting with $0, 12^3, -15^3, 20^3$ which you have seen already, and ending with $-5280^3$ and $-640320^3$); each is represented in the LMFDB: see the “Complex multiplication” menu in the main “Elliptic curves over Q” page.

Chapter V: Elliptic curves over finite fields (October 23, 28)

We shall postpone most of V.2 and V.3 until we've covered the more down-to-earth V.4.

We have already seen the Hasse(–Weil) bound (Theorem V.1.1, p.138), and observed that it means the $q$ terms in Corollary 1.4 (p.139) behave roughly like $q$ independent random flips of a $\pm 1$ coin, but not exactly, because for each $m$ the sum of $N \gg 1$ coin flips has a positive probability of falling outside $[-m N^{1/2}, m N^{1/2}]$ (though this probability decreases rapidly with $m$), while the trace is always bounded by $2q^{1/2}$.

The trick of expanding and summing $f^{(p-1)/2}$ (Theorem V.4.1, pages 148ff.) is good to know in some other contexts, notably the Chevalley–Warning theorem. An elliptic curve isn’t quite covered by Chevalley–Warning, since it is given by a homogeneous equation in 3 variables of degree equal to, rather than less than, 3; but since it is on the boundary, the same idea still gives valuable information. Note that (in odd characteristic $p$) if we write a Weierstrass equation projectively as $Y^2 Z = P(X,Z)$ for some homogeneous cubic $P$ then counting ${\bf F}_q$-points mod $p$ comes down to finding the $(XYZ)^{q-1}$ coefficient of $(Y^2 Z - P(X,Z))^{q-1}$, which is the $X^{q-1} Z^{(q-1)/2}$ coefficient of ${q-1 \choose (q-1)/2} P(X,Z)^{(q-1)/2}$; so we soon come down to the $X^{q-1}$ coefficient of the $((q-1)/2)$-th power of the affine version $P(X,1)$ of $P$, as in Theorem V.4.1. [Exercise: the binomial coefficient is $\pm 1$ and the sign exactly matches the $(-1)^{(q-1)/2}$ that we get from the $((q-1)/2)$-th power of the term $-P$ of $Y^2 Z - P(X,Z)$.]
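Here is a small experiment with the prime-field case $q = p$ of this trick, for curves in the short form $y^2 = x^3 + ax + b$ with $p \geq 5$ (a sketch only; `hasse_invariant` and `trace` are names chosen for this illustration). It compares the $x^{p-1}$ coefficient of $f^{(p-1)/2}$ with the trace $p + 1 - \#E({\bf F}_p)$ obtained by brute force; the two should agree mod $p$.

```python
def poly_mul(f, g, p):
    """Product of two polynomials (coefficient lists, constant term first) mod p."""
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] = (h[i + j] + fi * gj) % p
    return h

def hasse_invariant(a, b, p):
    """Coefficient of x^(p-1) in (x^3 + a x + b)^((p-1)/2), reduced mod p."""
    f = [b % p, a % p, 0, 1]
    power = [1]
    for _ in range((p - 1) // 2):
        power = poly_mul(power, f, p)
    return power[p - 1]

def trace(a, b, p):
    """p + 1 - #E(F_p), with the Legendre symbol computed by Euler's criterion."""
    total = 0
    for x in range(p):
        chi = pow((x**3 + a * x + b) % p, (p - 1) // 2, p)  # 0, 1, or p - 1
        total += -1 if chi == p - 1 else chi
    return -total

agree = all(
    (trace(a, b, p) - hasse_invariant(a, b, p)) % p == 0
    for p in (5, 7, 11, 13)
    for a in range(p)
    for b in range(p)
    if (4 * a**3 + 27 * b**2) % p != 0
)
print(hasse_invariant(1, 1, 5), trace(1, 1, 5), agree)
```

For $y^2 = x^3 + x + 1$ over ${\bf F}_5$ the coefficient is $2$ and the trace is $-3$, which agree mod 5.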

A rather different application is the slick proof that any list of $2n-1$ integers has a sublist of $n$ integers whose sum is a multiple of $n$. (At least $2n-1$ are needed, because $(n-1)$ 0’s and $(n-1)$ 1’s constitute a list of $2n-2$ with no suitable sublist of $n$.) By induction we reduce to the case that $n$ is a prime $p$, and then expand the sum of the ($p-1$)-st powers of all the $p$-subsums. We get $0 \bmod p$, but ${2p-1 \choose p} \not\equiv 0 \bmod p$, so at least one of the subsums must vanish mod $p$! (In fact the number of multiple-of-$p$ subsums is $1 \bmod p$ because ${2p-1 \choose p} \equiv 1 \bmod p$.)
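For small $n$ the statement about sublists whose sum is a multiple of $n$ can be confirmed exhaustively, since only the residues mod $n$ matter. The following brute-force sketch (function names are illustrative, and the search is only feasible for very small $n$) checks all lists for $n \leq 4$, the sharpness example with $2n-2$ entries, and the “$1 \bmod p$” count for $p = 3$.

```python
from itertools import combinations, product

def has_zero_subsum(values, n):
    """True if some n-element sublist of `values` sums to a multiple of n."""
    return any(sum(c) % n == 0 for c in combinations(values, n))

def holds_for_all_lists(n):
    """Check every list of 2n - 1 residues mod n."""
    return all(has_zero_subsum(v, n)
               for v in product(range(n), repeat=2 * n - 1))

def count_zero_subsums(values, p):
    """Number of p-element sublists whose sum is a multiple of p."""
    return sum(1 for c in combinations(values, p) if sum(c) % p == 0)

small_cases = all(holds_for_all_lists(n) for n in (2, 3, 4))

# Sharpness: (n-1) zeros and (n-1) ones have no suitable sublist of n entries.
sharp = all(not has_zero_subsum([0] * (n - 1) + [1] * (n - 1), n)
            for n in (2, 3, 4, 5))

# For the prime p = 3 the number of good sublists is always 1 mod 3.
count_mod_p = all(count_zero_subsums(v, 3) % 3 == 1
                  for v in product(range(3), repeat=5))
print(small_cases, sharp, count_mod_p)
```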
At the end of the proof of Theorem V.4.1 Silverman shows that $A_p = 0$ iff $A_q = 0$ (page 150). The proof shows that $A_q$ is a power of $A_p$; with a bit more work we find that if $q = p^e$ then the exponent is $\sum_{r=0}^{e-1} p^r = (q-1)/(p-1)$, that is, $A_q$ is the norm of $A_p$ in the extension ${\bf F}_q / {\bf F}_p$. This gives us a welcome “sanity check”, because $A_q$, being a trace mod $p$, must be in ${\bf F}_p$.

The polynomial $H_p(t) = \sum_{i=0}^m {m \choose i}^{\!2} t^i$ occurring in Theorem 4.1b (page 148, proved on page 150) is closely related to the Legendre polynomial $P_m(X) = 2^{-m} \sum_{i=0}^m {m \choose i}^{\!2} (X-1)^{m-i} (X+1)^i$, which is “well-known” to satisfy a second-order linear differential equation. Changing $m = (p-1)/2$ to $-1/2$ (which is congruent to $m \bmod p$), and replacing $\sum_{i=0}^m$ by $\sum_{i=0}^\infty$, yields a hypergeometric series that’s proportional to a period of the elliptic curve $y^2 = x (x-1) (x-\lambda)$ and, as a function of $\lambda$, satisfies such a differential equation too (cf. Remark 4.2 on page 151). At this point we, like Silverman, relegate further discussion to references.

The formula $\#E({\bf F}_{q^n}) = q^n + 1 - \alpha^n -\beta^n$ (Theorem V.2.3.1a, page 142) factors as $(\alpha^n - 1) (\beta^n - 1)$ because $\alpha\beta=q$. Since $\alpha,\beta$ are algebraic integers, it follows that $\#E({\bf F}_{q^n})$ is a factor of $\#E({\bf F}_{q^{mn}})$, as it must be because $E({\bf F}_{q^n})$ is a subgroup of $E({\bf F}_{q^{mn}})$. The formula for $\#E({\bf F}_{q^n})$ also yields an alternative proof of parts of Theorem V.3.1 on pages 144ff, namely the structure of $E[p^r]$, and also the equivalence of $E[p^r] = \{0\}$ with $p|a$. Start by choosing a valuation $v$ on ${\bf Q}(\alpha) = {\bf Q}(\beta)$ that extends the $p$-adic valuation on $\bf Q$. Since $v(\alpha\beta) = v(q) > 0$, at least one of $v(\alpha),v(\beta)$ is positive; assume without loss of generality $v(\beta) > 0$. Then $p|a$ iff $v(\alpha) > 0$. In this case $v(\alpha^n + \beta^n) > 0$ for each $n$, so $\#E({\bf F}_{q^n}) \equiv 1 \bmod p$ and in particular $\#E({\bf F}_{q^n})$ is not a multiple of $p$, so each $E({\bf F}_{q^n})$ has trivial $p$-torsion. Since every point of $E(\bar{{\bf F}}_q)$ is defined over some ${\bf F}_{q^n}$, it follows that $E[p] = \{0\}$, whence $E[p^r] = \{0\}$ for all $r$. If $a \not\equiv 0 \bmod p$ then $v(\alpha) = 0$, and then for each $r$ there exists $n$ such that $\alpha^n \equiv 1 \bmod p^r$. Then $\#E({\bf F}_{q^n})$ is a multiple of $p^r$. But we already know (Corollary III.6.4c, page 86) that $E[p]$ is no larger than ${\bf Z} / p{\bf Z}$. Therefore the $p^r$-torsion subgroup of $E({\bf F}_{q^n})$ is a cyclic group of order $p^r$. Again it follows that the same is true of $E(\bar{{\bf F}}_q)$.
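The divisibility can be watched in a small example. The sketch below (an illustration; the function names and the choice of the curve $y^2 = x^3 + x + 1$ over ${\bf F}_5$ are assumptions of this example) gets the trace $a$ by brute force and then computes $s_n = \alpha^n + \beta^n$ from the recurrence $s_n = a s_{n-1} - q s_{n-2}$, which holds because $\alpha,\beta$ are the roots of $\lambda^2 - a\lambda + q$.

```python
def trace_over_prime_field(A, B, p):
    """a = p + 1 - #E(F_p) for y^2 = x^3 + A x + B, by brute force."""
    sqrt_count = [0] * p
    for y in range(p):
        sqrt_count[y * y % p] += 1
    affine = sum(sqrt_count[(x**3 + A * x + B) % p] for x in range(p))
    return p + 1 - (affine + 1)

def point_counts(a, q, n_max):
    """N_n = q^n + 1 - (alpha^n + beta^n) for n = 1..n_max."""
    s_prev, s = 2, a          # s_0 = 2, s_1 = a
    counts = {}
    for n in range(1, n_max + 1):
        counts[n] = q**n + 1 - s
        s_prev, s = s, a * s - q * s_prev
    return counts

q = 5
a = trace_over_prime_field(1, 1, q)
N = point_counts(a, q, 12)

divisibility = all(N[m * n] % N[n] == 0
                   for n in N for m in range(1, 13) if m * n in N)
print(a, [N[n] for n in (1, 2, 3, 4)], divisibility)
```

Here $a = -3$ is not a multiple of 5, so this curve is in the second case of the argument above.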

About the mysterious definition of the zeta function (page 140, already prefigured in Exercise 3.32d on page 112 which was in the second problem set): Recall that the Riemann zeta function $\zeta(s) = \sum_{n=1}^\infty n^{-s}$ has the Euler product $$ \zeta(s) = \prod_p (1 + p^{-s} + p^{-2s} + p^{-3s} + \cdots) = \prod_p \frac1{1-p^{-s}}. $$ We convert the product to a sum by taking logs: $$ \zeta(s) = \exp \Bigl[ \sum_p -\log (1-p^{-s}) \Bigr] = \exp \Bigl[ \sum_p \Bigl(\sum_{m=1}^\infty \frac{p^{-ms}}{m} \Bigr) \Bigr]. $$ More generally, the Dedekind zeta function $\zeta_K$ of a number field $K$ is the sum over ideal norms $n$ (with multiplicity) of $n^{-s}$, and has an Euler product over prime ideal norms $q$ (with multiplicity) that can be written as $\exp \bigl[\sum_q \bigl( \sum_m q^{-ms}/m \bigr) \bigr]$; equivalently we can write this as $\exp \bigl[\sum_q M_q \bigl( \sum_m q^{-ms}/m \bigr) \bigr]$ where $q$ ranges over all prime powers and $M_q$ is the number of prime ideals (a.k.a. “places”) of $K$ of norm $q$. Now the analogy between number fields and function fields suggests that we construct the zeta function of a curve $C/{\bf F}_{q_0}$ (not necessarily of genus 1; indeed we won’t even use $\dim(C) = 1$) in the same way from its function field ${\bf F}_{q_0}(C)$. We get the same formula, but this time $q$ must be a power of $q_0$. The number, call it $M_n$, of places of norm $q_0^n$ is determined by $N_n = \sum_{m|n} m M_m$ where $N_n = \#C({\bf F}_{q_0^n})$. [The solution of the recurrence is $M_n = n^{-1} \sum_{m|n} \mu(n/m) N_m$ where $\mu$ is the Möbius function, but we do not need this formula.] It follows that $N_n/n = \sum_{m|n} (m/n) M_m$, so writing $n=md$ we calculate $$ \sum_{n=1}^\infty N_n \frac{T^n}{n} = \sum_{m=1}^\infty M_m \Bigl(\sum_{d=1}^\infty \frac{T^{md}}{d} \Bigr) $$ and the change of variable $T = q_0^{-s}$ (see Remark 2.5 on page 164) converts this to the exponent in our formula for the zeta function, which at last explains why we call $\exp \sum_{n=1}^\infty N_n T^n/n$ the zeta function of $C$.
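To see the relation between point counts and places in the simplest case, take $C = {\bf P}^1$ over ${\bf F}_2$ (an example chosen here for illustration, with $N_n = 2^n + 1$; the function names are likewise illustrative). Möbius inversion of $N_n = \sum_{m|n} m M_m$ should then give $M_1 = 3$ (the points $0, 1, \infty$) and, for $n > 1$, the number of irreducible polynomials of degree $n$ over ${\bf F}_2$.

```python
def mobius(n):
    """Moebius function mu(n) by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0      # square factor
            result = -result
        d += 1
    if n > 1:
        result = -result
    return result

def divisors(n):
    return [m for m in range(1, n + 1) if n % m == 0]

def places(N):
    """M_n = (1/n) * sum over m | n of mu(n/m) * N_m, for each n in the dict N."""
    M = {}
    for n in sorted(N):
        total = sum(mobius(n // m) * N[m] for m in divisors(n))
        assert total % n == 0, "place counts must be integers"
        M[n] = total // n
    return M

q0 = 2
N = {n: q0**n + 1 for n in range(1, 7)}      # #P^1(F_{2^n})
M = places(N)

# Recover N_n from the M_m as a consistency check.
recovered = {n: sum(m * M[m] for m in divisors(n)) for n in N}
print(M, recovered == N)
```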

In the proof of the crucial implication (iv) ⇒ (ii) of Theorem V.3.1 (characterizations of supersingular curves), we need to know that there are only finitely many supersingular j’s, not necessarily that they are all in ${\bf F}_{p^2}$. Silverman uses that approach because he just proved $j_E \in {\bf F}_{p^2}$ earlier on the same page (146). Alternatively we can use either part of Theorem 4.1 (page 148), each of which gives a polynomial equation that every supersingular $j$ or $\lambda$ must satisfy. (For this purpose we do not need the fact that those polynomials have distinct roots; if there were repeated roots it would only make the set of supersingular j’s smaller.)

As Silverman notes, Theorem V.3.1 comes from Deuring’s foundational paper [60], which proves much more. Notably, any integer $a$ that is not a multiple of $p$ and satisfies the Hasse bound $|a| \leq 2 \sqrt q$ is the trace of at least one elliptic curve $E/{\bf F}_q$, and the number, call it $H_a$, of such $E$ is basically the class number of ${\bf Z}[\alpha]$ where $\alpha$ is either of the roots of $\lambda^2 - a\lambda + q$. Unfortunately we, like Silverman, cannot prove it at this point because the natural proof goes through the theory of complex multiplication (CM): one constructs a CM curve $E_0 / {\bf C}$ with ${\rm End}(E_0) \supseteq{\bf Z}[\alpha]$, shows that $j(E_0)$ is an algebraic integer with ${\bf Q}(j(E_0))$ having a prime with residue field ${\bf F}_q$, and reduces $j(E_0)$ modulo this prime. For example, if $q$ is the prime 223 and $a=26$ then $a^2 - 4q = -216 = -6^3$, so we can start with $E_0 = {\bf C} / ({\bf Z} + {\bf Z}\sqrt{-6})$ and compute $j(E_0) = 2417472 + 1707264 \sqrt{2}$ which makes $j(E) = 15$ or $66 \bmod 223$, and indeed both of these are $j$-invariants of elliptic curves of trace 26 over ${\bf F}_{223}$. Years later Zagier observed that this implies that $\sum_a H_a$ equals the number of ordinary curves over ${\bf F}_q$, and it is somewhat easier to show that $H_a$ is an upper bound on the number of trace-$a$ curves (indeed that number is a priori either $H_a$ or zero), which gives an alternative proof assuming we have some other way to evaluate $\sum_a H_a$ (such an evaluation is known and reasonably elementary but not easy). Neither approach is amenable to actually computing an ordinary curve $E$ of given trace over a large finite field, which (except in the special case that ${\bf Q}(\alpha)$ has very small discriminant compared with $q$) remains an intractable computational problem, and an important one in the context of elliptic curve cryptography (for which see Chapter XI of Silverman’s text).
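
The example with $q = 223$ can be checked by brute force. The following Python sketch reduces $2417472 + 1707264\sqrt2$ modulo 223 using both square roots of 2, builds one curve with each resulting $j$-invariant, and counts points. The helper names and the particular model $y^2 = x^3 + 3kx + 2k$ with $k = j/(1728-j)$ are my choices, not taken from the text; since that model is only determined up to quadratic twist, the computed trace may come out as $+26$ or $-26$.

```python
P = 223

def legendre(v, p=P):
    """Legendre symbol of v mod p (p an odd prime), as -1, 0 or 1."""
    s = pow(v % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

def trace_of_frobenius(a, b, p=P):
    """Trace p + 1 - #E(F_p) for y^2 = x^3 + a x + b, via a sum of Legendre symbols."""
    return -sum(legendre(x**3 + a * x + b, p) for x in range(p))

def curve_with_j(j, p=P):
    """Coefficients (a, b) of one curve y^2 = x^3 + a x + b with the given j (j != 0, 1728 mod p)."""
    k = j * pow(1728 - j, -1, p) % p     # modular inverse needs Python 3.8+
    return (3 * k % p, 2 * k % p)

def j_invariant(a, b, p=P):
    """j-invariant 1728 * 4a^3 / (4a^3 + 27b^2) of y^2 = x^3 + a x + b, mod p."""
    num = 1728 * 4 * a**3
    den = 4 * a**3 + 27 * b**2
    return num * pow(den, -1, p) % p

# Reduce j(E_0) = 2417472 + 1707264*sqrt(2) using both square roots of 2 mod 223.
roots_of_2 = [r for r in range(P) if r * r % P == 2]
j_values = sorted((2417472 + 1707264 * r) % P for r in roots_of_2)
traces = {j: trace_of_frobenius(*curve_with_j(j)) for j in j_values}
```

Replacing a curve by its quadratic twist negates the trace, so each of the two $j$-invariants accounts for one curve of trace 26 and one of trace $-26$.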


Chapter VII: Elliptic curves over local fields (November 4, 6)

We are one chapter away from proving the Mordell–Weil theorem for elliptic curves over number fields $F$ (Chapter VIII). In Chapter VI we studied elliptic curves $E$ over an archimedean completion of $F$, or (for a real place) an algebraic extension, necessarily of degree 2. In Chapter VII we likewise study curves over a non-archimedean completion and some of its algebraic extensions. The “analysis” is easier (VII.1–3) but the Galois theory (starting at §VII.4) is richer.

Notation on page 185: our main interest is in non-archimedean completions $K$ of number fields, which are finite algebraic extensions of the field ${\bf Q}_p$ of $p$-adic numbers, and have finite residue fields. The same theory applies to some infinite extensions of ${\bf Q}_p$, whose residue fields are finite as well. Note that Silverman uses the normalization $v(\pi) = 1$, so the valuation $v$ takes values in $\bf Z$ (or $+\infty$), not the normalization $v(p) = 1$ that we recently used in class. Finally, because Silverman requires both $K$ and its residue field $k$ to be perfect (last sentence before VII.1), some details will fail for local fields such as the field ${\bf F}_q((t))$ of formal Laurent series over the finite field ${\bf F}_q$, because $t$ has no $p$-th root.

Definition near the top of page 186: note that “the minimal discriminant” itself cannot be defined, because if $\Delta$ is a minimal discriminant then so is $u^{12} \Delta$ for any $u \in R^*$. However, all minimal $\Delta$ have the same valuation, so “the valuation of the minimal discriminant” still makes sense. For curves over $\bf Q$, the PARI/GP command elllocalred will find a transformation to a minimal model using Tate’s algorithm (and some other information, which we shall partly explain); since $\bf Z$ has unique factorization, one can find such a transformation that works modulo all $p$ at once, and ellglobalred will compute such a transformation that also puts the curve in the standard form you’ve seen in the LMFDB (with $a_1,a_3 \in \{0,1\}$ and $a_2 \in \{-1,0,1\}$). Use ellchangecurve to apply the change of coordinates encoded by the resulting $(u,r,s,t)$ and get the corresponding minimal model.

Page 187, start of the proof of Proposition VII.1.3a (stated on page 186): this is the argument given at the beginning of this section (VII.1). Proof of part b: since either 2 or 3 must be invertible, knowing that both $4r^3$ and $3r^4$ are in $R$ implies that $r \in R$ too. (You might prefer to find $a,b \in \bf Z$ such that $4^4 a + 3^3 b = 1$ and then observe that $R \ni a(4r^3)^4 + b (3r^4)^3 = r^{12}$, whence $R \ni r$. Come to think of it, since we’ll use $r^n \in R \Rightarrow r \in R$ anyhow, it’s simpler yet to start by showing $R$ contains both $4r$ (or $2r$) and $3r$, and subtract.)

Last sentence of the first paragraph of VII.2: the point is that $v(u) = 0$ which means $\tilde u \neq 0$.

Page 189, first display: note that $(A:B:C)$ is in ${\bf P}^2$ (more canonically, the dual ${\bf P}^2$ of the projective plane where $(x:y:z)$ lives); cf. the paragraph spanning pages 187–188.

VII.3.1: Thankfully we do not actually need formal groups here, only the following observation: let $z$ be a local parameter of $E$ near the origin such that $v(z(P)) > 0$ iff $P \in E_1(K)$ (for example, $z = x/y$); then $z$ is approximately additive, so in particular $z(mP) = mz(P) + \epsilon$ where $v(\epsilon) > v(z(P))$. The more precise part (b) of Theorem VII.3.4 (p.193) does require either formal groups or division polynomials (the polynomials vanishing on $x(P)$ for nonzero $m$-torsion points $P$), so we shall not prove it here; it is not necessary for our main purpose.

Trivial textbook typo here, just before the statement of Theorem VII.3.4: “Cassels” is correct; therefore “Cassel’s” is wrong. Cf. the penultimate paragraph of Ralph P. Boas’ [sic] “Spelling Lesson” (College Math J. 15 #3 (June 1984), page 217), concerning Stokes’(s) Theorem.
P.S. I see that this is noted in Silverman’s list of errata, see bottom of page 21. Thanks to Leonardo Recanati-Kaplan for reminding me of that document (which also corrects the typo on page 80 noted above, see page 6 of the errata, and two parts of Exercise V.5.16, see bottom of page 18).

You can check that the point counts reported in Example 3.3.2 (page 192) are consistent with what you know from Chapter V and the relevant problem in PS4 :-)

Example 3.3.3 on page 193: this $E$ (LMFDB 64a4) is 2-isogenous with the curve 64a3: $y^2 = x^3 - 4x$ which has torsion $({\bf Z} / 2 {\bf Z})^2$, giving an even simpler explanation of $4 | \#E({\bf F}_p)$ for all odd $p$ (remember that $\#E$ is an isogeny invariant). Likewise, for some small odd $\ell$ there are curves $E / \bf Q$ with trivial torsion that are $\ell$-isogenous with a curve $E'$ with an $\ell$-torsion point, implying that $\ell | \#E({\bf F}_p)$ for all primes $p$ of good reduction. This all amounts to a warning that Application 3.2 (page 192) gives an upper bound on the torsion group of an elliptic curve $E$ over a number field $K$, showing that $E(K)_{\rm tors}$ is finite and reducing its determination to a finite calculation, but occasionally the upper bound is not sharp.
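
The divisibility by 4 is easy to watch numerically. The Python sketch below counts points on $y^2 = x^3 - 4x$ and on $y^2 = x^3 + x$ over ${\bf F}_p$ for a few odd primes. That the curve $E$ of the example is $y^2 = x^3 + x$ is an assumption on my part (it is the curve that the 2-isogeny formulas pair with $y^2 = x^3 - 4x$); the function name `count_points` is likewise just for illustration.

```python
def count_points(a, b, p):
    """Number of points, including the point at infinity, on y^2 = x^3 + a x + b over F_p (p an odd prime)."""
    total = 1  # the point at infinity
    for x in range(p):
        v = (x**3 + a * x + b) % p
        if v == 0:
            total += 1                          # one point with y = 0
        elif pow(v, (p - 1) // 2, p) == 1:
            total += 2                          # v is a nonzero square: two values of y
    return total

odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
# For each p: (p, count on y^2 = x^3 + x, count on y^2 = x^3 - 4x)
counts = [(p, count_points(1, 0, p), count_points(-4, 0, p)) for p in odd_primes]
```

For each odd $p$ the two counts agree, as they must for isogenous curves, and both are multiples of 4.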

Proposition VII.5.5 (page 197): Alternatively, we need only find some other elliptic curve $E'/K$ with $j(E') = j(E)$ and good reduction, and then cite the fact that any two elliptic curves with the same $j$-invariant are isomorphic over some finite extension of $K$. As usual, this is usually easy if $p \geq 5$ and $j \not\equiv 0$ or 1728 mod $\pi$, harder if $p = 2$ or 3 or $j \equiv 0$ or 1728 but not both, and hardest if $p=3$ or (especially) $p=2$ and $j \equiv 0 \equiv 1728$. Once we have Theorem 6.1 (page 200), we can also construct $K'$ as $K(E[m])$ (that is, $K$ extended by the coordinates of all $m$-torsion points of the curve) for any $m > 4$ with $v(m) = 0$, because $E_{\rm ns}$ has $m$-torsion at most ${\bf Z} / m {\bf Z}$, and if $\#(E/E_0) \geq m$ then the curve has multiplicative reduction so $j_E \notin R$.

As Silverman notes (top of page 200, before stating Theorem 6.1), Corollary 6.2 (finiteness of $[E(K) : E_0(K)]$ ) can be proved by a compactness argument in our setting of finite residue field $k$. Indeed since $k$ is finite, the ring of integers $R$ is compact, from which we soon show that ${\bf P}^n(K)$ is compact for each $n$. It then follows that $E(K)$ is a compact topological group (note that this requires checking also that the group operations are continuous); since $E_0(K)$ is an open subgroup (complement of the preimage of a point under the reduction map), it is automatically of finite index.

Here is William Stein’s scan of Tate’s original paper from Modular Forms in One Variable IV, Antwerp (Lecture Notes in Math. 476) which contains Tate’s algorithm for determining the Kodaira type (and thus $E/E_0$ etc.) of an elliptic curve over a local field, starting on page 47. Here is the PDF version of the scan, from the same source, where the algorithm begins on page 15 of 20. The display occupying the middle half or so of page 36 (4/20 in the PDF) may look familiar.

Some examples follow. Unlike Tate, we describe only some possibilities, so need not be systematic. In each case we assume the reduced curve is singular (else there is nothing to do), and we have already translated $x,y$ to put the singular point at $(x,y) = (0,0)$, so $a_3,a_4,a_6$ all have positive valuation, and $E_0$ consists of all points $(x,y)$ (including the origin) that do not satisfy $v(x) > 0$ and $v(y) > 0$. We use Tate’s notation $x = \pi^i x_i, y = \pi^i y_i$ and $a_i = \pi^m a_{i,m}$ etc., see the bottom of page 48 = 16/20.

Consider first our curves $y^2 = x^3 + a_6$ with $p \geq 5$ and $v(a_6) > 0$. The discriminant is $-2^4 3^3 a_6^2$, so $v(\Delta) = 2 v(a_6)$. Thus $y^2 = x^3 + a_6$ is a minimal model if and only if $v(a_6) < 6$: under that condition, $v(\Delta) < 12$, which we know implies minimality; conversely, if $v(a_6) \geq 6$ then the model $y_3^2 = x_2^3 + a_{6,6}$ still has integral coefficients, with discriminant $\pi^{-12} \Delta$ of valuation $v(\Delta) - 12 < v(\Delta)$. The Kodaira type, and thus $[E : E_0]$, depends on $v(a_6)$ (five possibilities) and sometimes more refined information about $a_6$.

Two multiplicative examples:
  1. The curve $y^2 = x^3 + x^2 + \pi$ has multiplicative reduction with $\Delta = -432\pi^2 - 64\pi$ (GP/PARI: ellinit([0,1,0,0,pi]).disc ), which has valuation 1 for $p$ odd; again $E=E_0$ because if $x,y$ both had positive valuation then $\pi = y^2 - (x^3 + x^2)$ would have valuation at least 2. This is a Type ${\rm I}_1$ curve. (In general ${\rm I}_n$ has $E/E_0 \cong {\bf Z} / n {\bf Z}$ with the components forming an $n$-cycle, but here this is just one curve intersecting itself at its node.)
  2. The curve $y^2 = x^3 + x^2 + \pi x$ has multiplicative reduction with $\Delta = -64\pi^3 + 16\pi^2$ (GP/PARI: ellinit([0,1,0,pi,0]).disc , or use our formula from Chapter III), which has valuation 2 for $p$ odd. Here $[E:E_0] = 2$, as promised by Theorem 6.1 (page 200). Indeed if $x,y$ have positive valuations then we divide by $\pi^2$ to obtain $y_1^2 = \pi x_1^3 + x_1^2 + x_1$. We claim that this gives a point $P = (x,y) \notin E_0$ if and only if the translate of $P$ by the 2-torsion point $(x,y)=(0,0)$ is in $E_0$. Indeed the $x$-coordinate of this translate is $\pi/x$. (I think we have seen already that on a curve $y^2 = x^3 + a_2 x^2 + a_4 x$ the sum of the 2-torsion point $(0,0)$ with any point $(x,y)$ is $(a_4/x, -a_4 y / x^2)$. ) Well, $v(x) + v(\pi/x) = v(\pi) = 1$, so exactly one of $v(x)$ and $v(\pi/x)$ is positive. Projectivizing $y_1^2 = \pi x_1^3 + x_1^2 + x_1$ and reducing mod $\pi$ yields $Z (Y_1^2 - X_1^2 - X_1 Z)$ so we get the line at infinity and a conic intersecting at two distinct points $(X:Y:Z) = (1 : \pm 1 : 0)$. If we instead started with $y^2 = x^3 + a_2 x^2 + \pi x$ for some $a_2 \in R$ with $v(a_2) \geq 1$ we would still get the same two components, but this time with additive reduction, and the reduced curve would be $Z (Y_1^2 - X_1 Z)$ which is the line at infinity and a conic tangent to it; this is type III, and is shown on page 46 (14/20) as two curves meeting at a double point that is represented by two circles.
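
The two discriminant formulas can be double-checked without GP. The following Python sketch uses the standard $b$-invariant formula for $\Delta$ and specializes to $K = {\bf Q}_p$ with $\pi = p$ for a few odd primes; that specialization, and the helper names, are my choices for illustration.

```python
def discriminant(a1, a2, a3, a4, a6):
    """Discriminant of the Weierstrass equation [a1, a2, a3, a4, a6] via the b-invariants."""
    b2 = a1 * a1 + 4 * a2
    b4 = 2 * a4 + a1 * a3
    b6 = a3 * a3 + 4 * a6
    b8 = a1 * a1 * a6 + 4 * a2 * a6 - a1 * a3 * a4 + a2 * a3 * a3 - a4 * a4
    return -b2 * b2 * b8 - 8 * b4**3 - 27 * b6 * b6 + 9 * b2 * b4 * b6

def valuation(n, p):
    """p-adic valuation of a nonzero integer n."""
    v = 0
    while n % p == 0:
        n //= p
        v += 1
    return v

for p in [3, 5, 7, 11, 13]:
    d1 = discriminant(0, 1, 0, 0, p)   # y^2 = x^3 + x^2 + pi
    d2 = discriminant(0, 1, 0, p, 0)   # y^2 = x^3 + x^2 + pi*x
    assert d1 == -432 * p**2 - 64 * p and valuation(d1, p) == 1
    assert d2 == -64 * p**3 + 16 * p**2 and valuation(d2, p) == 2
```

The valuations 1 and 2 match the types ${\rm I}_1$ and ${\rm I}_2$ of the two examples.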

Chapter VIII: Elliptic curves over number fields (November 11 ff.)

The actual title of Chapter VIII is “Elliptic Curves over Global Fields”, but at the bottom of the first page we already restrict attention to number fields. (Much but not all of the discussion carries over to elliptic curves over function fields; a few of the results, such as Lemma 1.1.1 and Proposition 1.2, do not require that $K$ be a global field at all.) We shall motivate this material by reviewing the classical approaches to $E({\bf Q})$ for the curve $y^2 = x^3 - x$ and its quadratic twist $y^2 = x^3 - 36 x$, the same curves that were our introductory examples in the first lecture (the latter curve in the form $6y^2 = x^3 - x$ that is not quite a Weierstrass model but makes it clearly a quadratic twist of $y^2 = x^3 - x$). This is due to Fermat et al., and is the basis of Mordell’s original proof; Silverman does this kind of thing and much more in Chapter X. Both examples correspond to the choice $m=2$ (note that indeed each curve has all its 2-torsion defined over ${\bf Q}$), but even knowing this it is far from obvious that they are special cases of the proof in Chapter VIII…

In each case we use the dual 2-isogenies $E \to E' \to E$ where $E'$ is $y^2 = x^3 + 4x$ or $y^2 = x^3 + 9x$ respectively, though (for now) it would not make much difference if we used their composition [2] (the multiplication-by-2 map), which is what we shall do to connect this approach with the proof in VIII.1. Warning: $E'$, unlike $E$, does not have all its 2-torsion defined over ${\bf Q}$.

Using the general formulas (p.70 in III.4) for the 2-isogenies between $[0,a,0,b,0]$ and $[0,-2a,0,a^2-4b,0]$ we see that a point $P \in E(K)$ (other than $T:=(0,0)$ or $O$) that lifts to $E'(K)$ must have $x(P) \in (K^*)^2$. Conversely, if $x(P)$ is a non-zero square then we soon check that $P$ is the image of some rational point of $E'$ under the isogeny $E' \to E$. Geometrically, $x$ is a Weil function associated to the 2-torsion point $T$. Thus, adjoining a square root of $x$ to the function field $K(E)$ yields the function field of an unramified double cover of $E$. By Riemann–Hurwitz this double cover is again a curve of genus 1; since the preimages of $O$ are rational, it is an elliptic curve $E'$, and the covering map $E' \to E$ is a 2-isogeny.
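
To see the square $x$-coordinate appear in practice, here is a small Python sketch with exact rational arithmetic. It assumes the usual explicit form of the 2-isogeny $E' \to E$, namely $(X,Y) \mapsto \bigl(Y^2/(4X^2),\, Y(r - X^2)/(8X^2)\bigr)$ with $r = a^2 - 4b$; the function names and the sample point $(9,-45)$ on $E'$ are mine.

```python
from fractions import Fraction

def on_curve(a, b, x, y):
    """True if (x, y) lies on y^2 = x^3 + a x^2 + b x."""
    return y * y == x**3 + a * x * x + b * x

def isogeny_to_E(a, b, X, Y):
    """Map a point (X, Y) with X != 0 on E' = [0, -2a, 0, a^2 - 4b, 0] to E = [0, a, 0, b, 0]."""
    r = a * a - 4 * b
    X, Y = Fraction(X), Fraction(Y)
    return (Y * Y / (4 * X * X), Y * (r - X * X) / (8 * X * X))

a, b = 0, -36                              # E : y^2 = x^3 - 36 x
a_dual, b_dual = -2 * a, a * a - 4 * b     # E': y^2 = x^3 + 144 x
X, Y = 9, -45                              # a rational point on E'
assert on_curve(a_dual, b_dual, X, Y)
x, y = isogeny_to_E(a, b, X, Y)
assert on_curve(a, b, x, y)
```

The image point has $x = 25/4 = (5/2)^2$, which is $(Y/2X)^2$ as the formula predicts.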

This suggests, though it does not yet prove, that $P \mapsto x(P)$ yields a homomorphism $\delta: E(K) \to K^* / (K^*)^2$ whose kernel is the image of $E'(K)$ under the 2-isogeny $E' \to E$. To prove it, we need to define $\delta$ on $P=O$ and $P=T$, and then check that the resulting map is indeed a homomorphism, at which point we have already checked that its kernel is the image of $E'(K)$. Now $\delta(O)$ must be 1, but $\delta(T)$ is not immediate. On a curve $[0,a,0,b,0]$, translation by $T$ takes a point $P$ (other than $O$ and $T$) to a point with $x$-coordinate $a_4 / x(P)$; this suggests that $\delta(T)$ should be $a_4$ — more precisely, the coset of $a_4$ modulo squares. Now using the group law we know that if $P_1 + P_2 + P_3 = O$ for some points $P_i \neq O$ on the curve then the $x(P_i)$ are the roots, with multiplicity, of a cubic $x^3 + ax^2 + bx = (rx+s)^2$ (where $r,s$ are the coefficients of the line $y = rx+s$ through the three points); the product of these roots is $s^2$, which is indeed a square. This reduces the proof that $\delta$ is a homomorphism to a few special cases (where one or more of the $P_i$ is $T$ or $O$), which we leave as an exercise. It turns out that in general the Weil function associated to a rational $m$-torsion point of some curve $E/K$ likewise yields a homomorphism $E(K) \to K^* / (K^*)^m$.
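
Here is a quick numerical instance of the product-of-roots argument, as a Python sketch. It takes three collinear points on $y^2 = x^3 - 36x$ (the line $y = -x + 6$, my choice of example) and checks that the product of their $x$-coordinates is trivial modulo squares. The helper names `squarefree_part` and `delta` are hypothetical; `delta` represents a class in ${\bf Q}^*/({\bf Q}^*)^2$ by a squarefree integer.

```python
from fractions import Fraction

def squarefree_part(n):
    """Squarefree integer in the same class as the nonzero integer n modulo squares."""
    sign = -1 if n < 0 else 1
    n, result, p = abs(n), 1, 2
    while p * p <= n:
        exponent = 0
        while n % p == 0:
            n //= p
            exponent += 1
        if exponent % 2 == 1:
            result *= p
        p += 1
    return sign * result * n   # the leftover n is 1 or a single prime

def delta(x):
    """Class of the nonzero rational x in Q*/(Q*)^2, as a squarefree integer."""
    x = Fraction(x)
    return squarefree_part(x.numerator * x.denominator)

# Three collinear points on y^2 = x^3 - 36 x, all on the line y = -x + 6.
points = [(-3, 9), (-2, 8), (6, 0)]
assert all(y * y == x**3 - 36 * x and y == -x + 6 for x, y in points)
product_class = delta(delta(-3) * delta(-2) * delta(6))
```

Here $(-3)(-2)(6) = 36 = s^2$ with $s = 6$, so `product_class` is the trivial class.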

So, we have an injection $\delta: E(K) / {\rm im}(E'(K)) \hookrightarrow K^* / (K^*)^2$. Applying the same argument to the dual isogeny $E \to E'$ yields an injection $\delta': E'(K) / {\rm im}(E(K)) \hookrightarrow K^* / (K^*)^2$. We shall show that in each case the image is finite, whence so are $E(K) / {\rm im}(E'(K))$ and $E'(K) / {\rm im}(E(K))$; it soon follows that $E(K) / 2 E(K)$ is finite too, which is the $m=2$ case of the weak Mordell–Weil theorem. For now we shall show this for $K = \bf Q$ and (for starters) when $E$ is one of the curves $y^2 = x^3 - x$ or $y^2 = x^3 - 36x$; the argument will adapt more-or-less routinely (for us, if not for Mordell 100+ years ago!) to any $E/K$ with a $2$-torsion point, and thus to any $E/K$ by Lemma 1.1.1 (see commentary below).

[…]


Lemma 1.1.1 (page 208): Once we have proven the full (not weak) Mordell–Weil theorem, we shall no longer need this lemma: it is well-known that any subgroup of a finitely-generated abelian group is itself finitely generated, so if we prove MW for a number field $L$ containing $K$, we automatically prove MW for $K$ (whether the extension $L/K$ is Galois or not).

Proposition 1.2(b) (page 209): It’s a bit strange to write that a pairing is “bilinear” when one of the variables, namely $\sigma \in G_{\bar K / K}$, is in a noncommutative group. But once we show that $\kappa(P,\sigma\tau) = \kappa(P,\sigma) + \kappa(P,\tau)$ it follows that $\kappa(P,\sigma\tau) = \kappa(P,\tau\sigma)$ so the pairing descends from $G_{\bar K / K}$ to its abelianization $G^{ab}_{\bar K / K}$ (which is the Galois group of the maximal abelian extension of $K$).

Remark 1.2.1 (page 210): The classical (= original) Kummer pairing to $m$-th roots of unity is “exactly analogous” because both are special cases of the Kummer pairing to $G[m]$ for any commutative algebraic group $G$ over a field $K$ containing all of $G(\bar K)[m]$. That group is ${\bf G}_m$ in the classical case, and $E$ in the Definition preceding Proposition 1.2.

Remark 1.3 (page 211): As we know by now, “$i = 1, \ldots, 6$” here means “$i = 1, 2, 3, 4, 6$”.

Proposition 1.6 (page 213): More precisely, $L/K$ is the maximal abelian extension of $K$ with exponent dividing $m$ etc.: there might not be an extension of exponent exactly $m$ that is unramified outside $S$. An example is $K=\bf Q$ and $S=\{\infty\}$ – by Minkowski there are no unramified extensions. There are further examples where $S$ does include some finite places, e.g. $K = \bf Q$ and $S = \{\infty, 2\}$ for any odd $m$. (Also, the condition $m \geq 2$ is not really necessary because for $m=1$ the claim is immediate.)

Silverman notes (Remark 1.7 on page 212) that Proposition 1.6 also follows from Minkowski’s theorem, which applies to all extensions of bounded degree, not only abelian ones. Naturally this yields much worse upper bounds on $[L:K]$. In the direction of greater precision, class field theory yields extensive information about $L$, including its degree. For $K = \bf Q$ it is enough to start from the Kronecker–Weber Theorem, which says $L$ is contained in some cyclotomic extension of $\bf Q$ (that is how I got the example with $S = \{\infty, 2\}$ above). For general $K$, the close relation between $E(K)/mE(K)$ and the extension $L/K$ described in Proposition 1.6 means that more precise study of $m$-descent requires class field theory — which is beyond our scope, being a possible topic of another semester-long graduate course in number theory. Such courses are often offered here (and at our “peer institutions”), so you may yet have the chance to learn this in another faculty member’s M223ar (or in graduate school elsewhere).


Theorem 3.1 (page 218): in fact we shall see that $h$ can be taken to be a quadratic form, which makes the proof somewhat easier. For $K=\bf Q$, Silverman proves in VIII.4 that $P \mapsto \log H(x(P))$ satisfies the hypotheses of Theorem 3.1, where $H(m/n) = \max(|m|,|n|)$ if $m,n$ are relatively prime integers; the proof is entirely elementary (and explains the choice of hypotheses) but relies on the mysterious-seeming Sublemma 4.3 (pages 222–223). We shall take a route that (at least to me) feels better motivated though it will somewhat delay the completion of the proof of Mordell–Weil.

Here is a possibly simpler way to see Theorem VIII.3.1 (p.218ff., “the descent procedure”), which we did in class Wednesday. Any $P$ can be written as $m P_1 + Q_i$ for some $i \leq r$. Under the hypotheses of the theorem, there exist a finite bound $B$ and a ratio $\rho<1$ such that $h(P_1) \leq \rho h(P)$ provided $h(P) > B$. (As in Silverman’s proof, $B$ depends on the heights of the representatives $Q_1,\ldots,Q_r$ of the cosets in $E(K) / m E(K)$.) Starting from any $P$ we reach a point $P_n$ of height at most $B$ in a finite number of steps, namely, as soon as $\rho^n h(P) \leq B$ if not sooner. Now join the end of Silverman’s proof (the paragraph spanning pages 219–220).
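
The shape of the descent can be seen in a toy model, sketched here in Python. This is only an illustration under loud assumptions: the group is ${\bf Z}$ rather than $E(K)$, the height is $h(n) = n^2$, $m = 2$, and the coset representatives are $0$ and $1$. The bound $B = 1$ is needed because the descent stalls at $-1 = 2\cdot(-1) + 1$. All names are mine.

```python
def height(n):
    """Toy height on the group Z: a quadratic form, so h(2n) = 4 h(n)."""
    return n * n

def descend(P, bound=1):
    """Repeatedly write P = 2*P1 + Q with Q in {0, 1} until the height is at most `bound`."""
    remainders = []
    while height(P) > bound:
        Q = P % 2              # coset representative of P in Z/2Z
        remainders.append(Q)
        P = (P - Q) // 2       # the next point P1
    return P, remainders

def rebuild(P_small, remainders):
    """Undo the descent: recover the original point from the small point and the representatives."""
    P = P_small
    for Q in reversed(remainders):
        P = 2 * P + Q
    return P
```

For instance, `descend(-5)` passes through $-3$ and $-2$ and stops at $-1$; every integer is thus a combination of the representatives and the finitely many points of height at most $B$, which is the conclusion of the descent theorem in this toy setting.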


Most of the general material on heights (VIII.5, starting on page 224) works just as well when $K$ is a function field; indeed it is (as usual) somewhat simpler, since there are no archimedean places. The finiteness results require that the ground field $k$ be finite (see below), and there is no analogue of the absolute height (defined for number fields on page 227) because there is no analogue of $\bf Q$: $k(x)$ doesn’t work because there are many possible $[K : k(x)]$ depending on the choice of subfield isomorphic with $k(x)$ (if $K = k(C)$ for some curve $C/k$ then these choices correspond to nonconstant functions $x \in K$). In this case all the valuations, and thus all the heights, are powers of some constant $q > 1$; all choices of $q$ yield equivalent valuations and heights, but when $k$ is finite we choose $q = \#k$ which makes the formulas consistent with the standard choices for a number field.

The definition (p.226, just before Proposition 5.4) of $H_K$ is an infinite product, but as usual all but finitely many of the factors are 1. Indeed we may assume that none of the $x_i$ vanishes, else we can remove them and evaluate the height of the resulting point in a projective space of lower dimension; but then for each $i$ there are only finitely many valuations $v$ such that $|x_i|_v \neq 1$. For every other $v$ the factor $\max\{|x_0|_v,\ldots,|x_N|_v\}^{n_v}$ simplifies to $\max\{1,\ldots,1\}^{n_v} = 1$.

Having defined the height of a point $[x_0,\ldots,x_N]$ in ${\bf P}^N(K)$, we can use the case $N=1$ to define the height of a field element $x \in K$ by $H_K(x) = H_K([x,1])$. In the function-field case, $H_K(x) = q^d$ where $d$ is the degree of $x$ as a rational function on $C$. This is the same as the degree of the rational function $[x,1]$ from $C$ to ${\bf P}^1$; in general a point in ${\bf P}^N(K)$ is the same as a rational function $C \to {\bf P}^N$, and its height is $q^d$ where $d$ is the degree of that function.
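
For $K = {\bf Q}$ the product over all places collapses to a simple recipe: scale the coordinates to relatively prime integers and take the largest absolute value. A minimal Python sketch of that special case follows; the function names are mine, and it does not attempt general number fields.

```python
from fractions import Fraction
from math import gcd, lcm   # multi-argument gcd and lcm need Python 3.9+

def height(coords):
    """Multiplicative height H_Q of the point [x_0, ..., x_N] in P^N(Q)."""
    fracs = [Fraction(c) for c in coords]
    if all(f == 0 for f in fracs):
        raise ValueError("not a point of projective space")
    common_den = lcm(*(f.denominator for f in fracs))
    ints = [int(f * common_den) for f in fracs]      # clear denominators
    g = gcd(*ints)                                   # common factor to remove
    return max(abs(n) for n in ints) // g

def height_of_number(x):
    """Height of a rational number x, defined as the height of [x, 1]."""
    return height([x, 1])
```

For example `height([2, 4, 6])` agrees with `height([1, 2, 3])`, reflecting the fact that the height does not depend on the choice of projective coordinates.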

Remark 5.5 (top of page 227, finiteness of points on ${\bf P}^N({\bf Q})$ of bounded height), and its generalization where $\bf Q$ is replaced by any number field, is called Northcott’s theorem. Both Northcott’s theorem and Schanuel’s theorem (asymptotic count of points of height $\leq C$ on ${\bf P}^N(K)$, see Remark 5.12 at the top of page 234) extend to function fields $k(C)$ over finite fields $k$.

Theorem 5.6 (p.227ff.): we proved this in class Wednesday (Nov. 13) for $K = \bf Q$ and $N = 1$. The general case is similar, though the lower bound (which already for $N=1$ was the harder inequality) requires some more machinery because the morphism condition is not as simple a matter as relatively prime coordinates. We also noted in class the special case of degree 1, which in the general case corresponds to Corollary 5.8 on p.230 (degree 1 and $M=N$ — note that necessarily $M \geq N$ else there is no nonconstant morphism ${\bf P}^N \to {\bf P}^M$). Again this has the interpretation that a coordinate change on ${\bf P}^N$ changes $H_K$ by a factor that is bounded above and below (equivalently: changes $\log H_K$ by $+O(1)$ with the $O$-constant depending on the entries of the change-of-coordinates matrix). Note too that in this case the easier upper bound yields the usually-harder lower bound for free because linear changes of coordinate are invertible. (To be sure the lower bound is not all that hard for a linear morphism ${\bf P}^N \to {\bf P}^N$.)

Combining Theorem 5.6 with Schanuel we see that if $C/K$ is a rational curve in a projective space ${\bf P}^N$ then the number of points on $C(K)$ of height at most $B$ is asymptotically proportional to a positive power of $B$ (using the height inherited from ${\bf P}^N(K)$). On the other hand, for an elliptic curve in ${\bf P}^2$ the count is asymptotically proportional to $\log(B)^{r/2}$ where $r$ is the Mordell–Weil rank. (This will follow once we show that the canonical height is a quadratic form, and still holds for an elliptic curve in any projective space.) Thus rational points on an elliptic curve, however large its rank, are asymptotically much sparser than points on any rational curve.

In the proof of Theorem 5.9 (pages 230–231) Silverman could have retained the earlier $\epsilon(v)$ and used factors of $1+\epsilon_v$ …


Start of VIII.6 (page 234): We have already used “big $O$” notation in class at least once. In general $A = O(B)$ means there exists a constant $c$ such that $|A| \leq cB$ always. (Thus $B$ must be a nonnegative real, and $A$ can live in any space where $|A|$ makes sense.) For example, the number of $P \in {\bf P}^N({\bf Q})$ with $H(P) \leq T$ is $O(T^{N+1})$. If the constant $c$ depends on some parameter $p$ then we might emphasize this by writing $A = O_p(B)$. Equivalent notations are $A \ll B$ and $A \ll_p B$. We also distinguish “effective” estimates, where the implied constants can be (at least in principle) bounded explicitly, from “ineffective” estimates where the proof cannot give a bound. For example, the Mordell–Weil theorem gives an effective upper bound $O_E(1)$ on the rank of the elliptic curve $E$ but only an ineffective upper bound $O_E(1)$ on the height of each point in a set of generators.
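
The $O(T^{N+1})$ example can be made concrete for $N = 1$. The Python sketch below counts points of ${\bf P}^1({\bf Q})$ of height at most $T$ by brute force; the function name is mine. Since there are at most $(2T+1)^2$ integer pairs to consider, the constant $c = 4$ certainly works for $T \geq 1$.

```python
from math import gcd

def count_points_P1(T):
    """Number of points [m, n] of P^1(Q) with height max(|m|, |n|) at most T."""
    pairs = sum(
        1
        for m in range(-T, T + 1)
        for n in range(-T, T + 1)
        if gcd(m, n) == 1        # coprime pairs; this also excludes (0, 0)
    )
    return pairs // 2            # [m, n] and [-m, -n] are the same point

counts = {T: count_points_P1(T) for T in range(1, 21)}
```

The four points of height 1 are $[1,0]$, $[0,1]$, $[1,1]$ and $[1,-1]$.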

Proposition VIII.6.1 (page 235) is a typical application of Northcott’s theorem.

Theorem VIII.6.2 (page 235, proved in 235–237) is possible because (i) $h(x_1) + h(x_2) = h([1,x_1+x_2,x_1 x_2]) + O(1)$ (each nonarchimedean place’s contributions to the two sides are exactly equal, so we need only check it over $\bf R$ and $\bf C$), and (ii) the configuration $\{\pm P, \pm Q\}$ has the same symmetries as $\{\pm P \pm Q\}$. Using a narrow Weierstrass equation (with $a_1 = a_2 = a_3 = 0$) is not necessary but simplifies the formulas; it also changes $x$ to some $u^2 x' + r$ (see page 44 back at the start of Chapter III), but since that is a rational function of degree 1 it changes the logarithmic height $h(P)$ only by $\pm O(1)$, and thus will not change the canonical height $\hat h(P)$.

As it stands our proof of the Mordell–Weil theorem hinges on the computation that shows that the degree-2 map $g: {\bf P}^2 \to {\bf P}^2$ is a morphism provided $\Delta \neq 0$, which is the case for any (nonsingular) elliptic curve $E$. Here is a more conceptual completion of the proof of Theorem VIII.6.2 (the parallelogram identity modulo $O(1)$ for $h$). Consider $g^2 = g \circ g : {\bf P}^2 \to {\bf P}^2$. It takes the configuration $\{\pm P, \pm Q\}$ to $\{\pm P \pm Q\}$ to $\{\pm 2P, \pm 2Q\}$. This suggests the following argument. The easy (upper-bound) part of Theorem VIII.5.6 (p.227ff.) already gives $$ h(P+Q) + h(P-Q) \leq 2 (h(P) + h(Q)) + O(1),\quad(1) $$ just because $\deg g = 2$, whether it is a morphism or not. Replacing $(P,Q)$ by $(P+Q,P-Q)$ yields $$ h(2P) + h(2Q) \leq 2 (h(P+Q) + h(P-Q)) + O(1) \leq 4 (h(P) + h(Q)) + O(1).\quad(2) $$ But we already know that $h(2P) = 4 h(P) + O(1)$ and $h(2Q) = 4 h(Q) + O(1)$. This means that equality must hold throughout (2) to within $O(1)$. In particular, the one-sided estimate (1) improves to the desired two-sided estimate $$ h(P+Q) + h(P-Q) = 2 (h(P) + h(Q)) + O(1),\ {\rm QED.} $$

Example 7.4 on page 255 is the first elliptic curve with a 7-torsion point; while $4A^3 + 27B^2 = 2^{15} 13$ for the narrow Weierstrass model $y^2 = x^3 - 43 x + 166$, the minimal model (LMFDB 26b2) has $\Delta = -2^7 13$ which is consistent with Exercise 2(iii) in the last problem set. What is this curve’s parameter $d$ in the parametrization given at the bottom of page 242 (Remark 7.8)?
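
The 7-torsion can be confirmed directly with the chord-and-tangent formulas. The Python sketch below works on the narrow model $y^2 = x^3 - 43x + 166$ with exact rationals; the starting point $(3,8)$ is my choice (one checks $27 - 129 + 166 = 64 = 8^2$), and the function names are for illustration only.

```python
from fractions import Fraction

A, B = -43, 166   # y^2 = x^3 + A x + B

def add(P, Q):
    """Group law on y^2 = x^3 + A x + B; None stands for the point at infinity O."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 + y2 == 0:
        return None                                   # P + (-P) = O
    if P == Q:
        slope = (3 * x1 * x1 + A) / (2 * y1)          # tangent line
    else:
        slope = (y2 - y1) / (x2 - x1)                 # chord
    x3 = slope * slope - x1 - x2
    return (x3, slope * (x1 - x3) - y1)

def order(P, max_order=20):
    """Smallest n >= 1 with nP = O, or None if none is found up to max_order."""
    Q = P
    for n in range(1, max_order + 1):
        if Q is None:
            return n
        Q = add(Q, P)
    return None

P = (Fraction(3), Fraction(8))
assert P[1] ** 2 == P[0] ** 3 + A * P[0] + B          # P lies on the curve
multiples = [P]
for _ in range(5):
    multiples.append(add(multiples[-1], P))           # P, 2P, ..., 6P
```

The list `multiples` is symmetric under negation ($6P = -P$, $5P = -2P$, $4P = -3P$), as it must be for a point of order 7.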

As Silverman observes (Exercise 8.6 on page 262), the proof of part (d) of Theorem VIII.9.3 (page 250: $P$ is torsion iff $\hat{h}(P) = 0$) is analogous to the proof of Kronecker’s theorem that an algebraic number has height 1 iff it is a root of unity, which is to say a torsion point on the multiplicative group. In each case there is a yet-unsolved question of how small a non-torsion point’s height can get.

Lemma 9.5 (p.269, proved in 269–270) addresses a difficulty that we already encountered for the quadratic form $\deg$ on ${\rm Hom}(E,E')$, but here the form does not take integer values so we cannot overcome the difficulty using determinants. Silverman proves that $q$ is definite using Minkowski’s “geometry of numbers” result that you may have already encountered in the study of algebraic number theory. An alternative approach (which we already used for ${\rm Hom}(E,E')$) is as follows. First, if $q(v)$ is actually negative for some $v \in V$ then it takes arbitrarily large negative values $q(Mv) = M^2 q(v)$, and then replacing $Mv$ by some nearby $v' \in L$ makes $q(v') = M^2 q(v) + O(M)$ which is negative for $M$ large enough. So (as with Silverman’s proof) the hard case is when $q$ is positive semidefinite but not definite, so descends to a definite form on some space $V_0$ of dimension $r_0 < r$, still with generators $P_1,\ldots,P_r$. Reindexing if necessary we may assume that the $P_i$ for $1 \leq i \leq r_0$ are a (real vector-space) basis for $V_0$, generating a lattice $L_0 \subset V_0$. For each $n \in \bf Z$ write $n P_r = Q_n + R_n$ with $Q_n \in L_0$ and $R_n$ mapping to a fundamental domain of $V_0 / L_0$. Then the $R_n \in L$ are pairwise distinct (their $P_r$ coefficients are all different) but their heights $q(R_n)$ are bounded (because the fundamental domain is bounded). This contradicts the Northcott hypothesis (ii).

About the minimal discriminant (§VIII.8, Nov.25): The LMFDB contains information about many elliptic curves over number fields $K$, which for each of those curves includes whether the curve has a global minimal model over $K$, and such a model if one exists; if not, the LMFDB displays a model and the prime(s) where it is not minimal, as well as both the discriminant of the model and the minimal discriminant of the curve. Silverman’s Example 8.5 (p.246) of a curve with no global minimal model is $y^2 = x^3 + 125$ or equivalently $y^2 = x^3 - 1/8$ over $K = {\bf Q}(\sqrt{-10})$. These are equivalent because $125 / (-1/8) = -10^3$ is a sixth power in $K$. The LMFDB includes this curve in yet another equivalent form, namely $y^2 = x^3 - 8$, and duly reports that it is minimal at all primes except the prime $(2,a)$ where $a = \sqrt{-10}$ (this is the prime of $K$ above the ramified rational prime 2), and that no global minimal model exists.

BTW the LMFDB’s name 2.0.40.1 for this number field $K$ means it is the 1st number field of degree 2 with 0 real places and discriminant of absolute value 40. When the LMFDB contains two or more number fields of the same degree, signature, and discriminant, their ordering is unpredictable; but in degree 2 the signature and $|d_K|$ determine $K$ uniquely, so every quadratic number field’s name ends with “.1”.

We shall also say a bit more about elliptic curves over $\bf R$, including an explanation for the viral “fruit math” Diophantine equation (🍎 / (🍌 + 🍍)) + (🍌 / (🍍 + 🍎)) + (🍍 / (🍎 + 🍌)) = 4, which is to be solved in positive integers, so permutations of $(-1,4,11)$ don’t quite work, but some multiple of that rational point in the group law of the elliptic curve must land in the positive locus. Here it turns out the curve has rank $1$ and the first multiple that works is the 9th, which means the minimal height of the solution is about $9^2 = 81$ times the canonical height of $[-1,4,11]$. See this Quora answer by Alon Amit; also Math Overflow Question #227713 for a generalization to “other values of 4”. (I figure that A. Amit’s estimate “Roughly 99.999995% of the people don’t stand a chance at solving it” has at least one 9 too many; 0.000005% of the current world population of about 8 billion comes to 400 or so, and there ought to be at least 4000 people in the world who know enough of the relevant number theory to “stand a chance at” finding a positive integral solution (🍎, 🍌, 🍍). Hopefully Math 223 added a few to this count.)
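Both small computations in this paragraph are easy to confirm with exact rational arithmetic. The following is only a sanity check; the function name `fruit` is ours.

```python
from fractions import Fraction
from itertools import permutations

def fruit(a, b, c):
    """Left-hand side a/(b+c) + b/(c+a) + c/(a+b), computed exactly."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    return a / (b + c) + b / (c + a) + c / (a + b)

# Every permutation of (-1, 4, 11) solves the equation, but none is positive.
values = {fruit(*p) for p in permutations((-1, 4, 11))}
print(values)

# 100% - 99.999995% = 0.000005% of a population of 8 billion.
remaining = Fraction(5, 10**6) / 100 * 8 * 10**9
print(remaining)
```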

More about the family of cubics $$ E_t : \frac{a}{b+c} + \frac{b}{c+a} + \frac{c}{a+b} = t: $$ We can compute that $E_t$ is smooth if and only if $t \notin \{-3, 3/2, -5/2, \infty\}$. We observed in class that $E_t$ has three rational points at permutations of $(1:-1:0)$ (so we can regard $E_t$ as an elliptic curve by choosing one of these points as the origin) and that cyclic permutations of $a,b,c$ act on $E_t$ by translations by a 3-torsion point. [Proof: let $G$ be the group of cyclic permutations, and $\chi: G \to {\bf R}^*$ the homomorphism defined by $g^* \omega = \chi(g) \omega$ for any holomorphic differential $\omega$ on $E_t$. Since ${\bf R}^*$ has trivial 3-torsion, $\chi$ must be trivial. Thus $g$ composed with translation by $-g(O)$ is an endomorphism whose image under the map ${\rm End}(E_t) \to \bf R$ (defined again by the action on $\omega$) is 1; in characteristic zero, that map is an injection, so $g$ composed with translation by $-g(O)$ is the identity, and $g$ is translation by $g(O)$, which is thus a 3-torsion point.] But Amit reports that $E_4$ has Weierstrass model $y^2 = x^3 + 109 x^2 + 224 x$, and thus also has a rational 2-torsion point. (Entering this equation into the LMFDB finds the curve 910a4, which indeed has torsion ${\bf Z} / 6 {\bf Z}$ and rank 1; thanks to the 2-torsion point we could verify rank 1 using the methods of Chapter X.) In fact $E_t$ always has a torsion point of order 2, and thus torsion at least ${\bf Z} / 6 {\bf Z}$. If we put $O$ at the fixed point $(1:-1:0)$ of the involution $\iota: (a,b,c) \leftrightarrow (b,a,c)$ of $E_t$ then $\iota$ is multiplication by $-1$ (because it is a nontrivial involution that fixes the origin), and this involution has another rational fixed point $(1:1:-1)$ (where $a/(b+c) + b/(c+a) + c/(a+b)$ again fails to be a morphism, this time of type $\infty-\infty$). 
In fact it turns out that conversely every elliptic curve with a rational 6-torsion point over some field $K$ not of characteristic 2 or 3 is isomorphic to $E_t$ for some unique $t \in K$, again excluding $t \in \{-3, 3/2, -5/2\}$ for which $E_t$ is singular; this can be described by saying the projective line with coordinate $t$ is the modular curve ${\rm X}_1(6)$ parametrizing pairs $(E,P)$ where $E=E_t$ is an elliptic curve and $P$ is a 6-torsion point on $E$; then $-3, 3/2,-5/2, \infty$ are the cusps of this modular curve, where $E_t$ degenerates.
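Several of these claims can be spot-checked on the cubic obtained by clearing denominators, $a(a+b)(a+c) + b(b+a)(b+c) + c(c+a)(c+b) = t\,(a+b)(b+c)(c+a)$. The sketch below uses our own helper names `F`, `dF_da`, and `grad`. Two of its checks go beyond what was said in class and are our additions: that $(1:1:1)$ is a singular point of $E_{3/2}$, and that the cubic for $t=-3$ factors as $(a+b+c)\bigl((a+b+c)^2 + ab+bc+ca\bigr)$. The value $t=-5/2$ is not examined here.

```python
from fractions import Fraction
from itertools import permutations, product

def F(a, b, c, t):
    """Cleared-denominator cubic for a/(b+c) + b/(c+a) + c/(a+b) = t."""
    return (a*(a+b)*(a+c) + b*(b+a)*(b+c) + c*(c+a)*(c+b)
            - t*(a+b)*(b+c)*(c+a))

def dF_da(a, b, c, t):
    """Partial derivative of F with respect to its first argument."""
    return ((a+b)*(a+c) + a*(a+c) + a*(a+b) + b*(b+c) + c*(c+b)
            - t*((b+c)*(c+a) + (a+b)*(b+c)))

def grad(a, b, c, t):
    """Gradient of F; F is symmetric, so the other partials are permutations."""
    return (dF_da(a, b, c, t), dF_da(b, c, a, t), dF_da(c, a, b, t))

special_points = list(permutations((1, -1, 0))) + [(1, 1, -1)]
sample_ts = (Fraction(4), Fraction(7, 3), Fraction(-2))
on_curve = all(F(*p, t) == 0 for t in sample_ts for p in special_points)

amit_point = F(-1, 4, 11, 4)                   # (-1 : 4 : 11) on E_4
singular_at_111 = grad(1, 1, 1, Fraction(3, 2))
gradient_at_O = grad(1, -1, 0, 4)              # nonzero: O is a smooth point of E_4

factors_at_minus_3 = all(
    F(a, b, c, -3) == (a+b+c) * ((a+b+c)**2 + a*b + b*c + c*a)
    for a, b, c in product(range(-3, 4), repeat=3))

print(on_curve, amit_point, singular_at_111, gradient_at_O, factors_at_minus_3)
```

Agreement on a $7 \times 7 \times 7$ grid determines a cubic polynomial, so the last check really is a proof of the factorization at $t=-3$.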

In general, if $E$ is any elliptic curve over $\bf R$ then $E({\bf R})$ is isomorphic (as a real Lie group) with either ${\bf R} / {\bf Z}$ or $({\bf R} / {\bf Z}) \times ({\bf Z} / 2{\bf Z})$. In the latter case, the period lattice associated to a real invariant differential is rectangular, with one real and one pure imaginary generator, say ${\bf Z}\omega \oplus {\bf Z}\varpi i$ for some positive reals $\omega,\varpi$; in this case the real locus lifts to ${\bf R} \oplus \frac12 {\bf Z}\varpi i$, and $j \geq 1728$. In the former case, the period lattice contains ${\bf Z}\omega \oplus {\bf Z}\varpi i$ with index 2, and the non-identity coset is generated by $(\omega + \varpi i) / 2$; here $j \leq 1728$. (If $j=1728$ either case is possible, depending on the sign of $a_4$ in a Weierstrass model $y^2 = x^3 + a_4 x$; note that taking $\omega = \varpi$ yields a square lattice whether or not $(\omega + \varpi i) / 2$ is included. By the way “$\varpi$” is not a variant $\omega$ but a variant $\pi$, produced with \varpi .) Now for any $P \in {\bf R} / {\bf Z}$ the multiples of $P$ are topologically dense in ${\bf R} / {\bf Z}$ unless $P$ is torsion. [There are various proofs of this; here is a simple one: let $d(\cdot,\cdot)$ be the distance function on ${\bf R} / {\bf Z}$, and define $\delta\geq0$ by $\delta = \inf_{n \gt 0} d(nP,0) = \inf_{m \neq n} d(mP,nP)$. If $\delta > 0$ then $P$ has at most (indeed exactly) $1/\delta < \infty$ distinct multiples, and is thus torsion. If $\delta = 0$ and $P$ is not torsion then $nP \neq 0$ for each $n \gt 0$, but for each $\epsilon \gt 0$ there is a multiple $nP$ of $P$ with $d(nP,0) \leq \epsilon$, and then the multiples of $P$ come within $\epsilon$ of every point of ${\bf R} / {\bf Z}$.] In particular, for any $E/\bf Q$ the topological closure of $E({\bf Q})$ in $E({\bf R})$ is either finite (when the curve has rank zero), all of $E({\bf R})$, or the identity component of $E({\bf R})$. 
So, once we found a point $P \in E_t({\bf Q})$ that is not on the identity component of $E_t({\bf R})$, we knew that some multiple of $P$ must be on the positive locus, though it might take a while to find it.
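The density argument can be watched numerically. The sketch below takes $P = \sqrt 2 \bmod 1$ as a sample non-torsion point (our choice, as are the function names), finds the first $n$ with $d(nP,0) \leq \epsilon$, and then measures the largest gap left by the multiples of $P$ computed so far. It uses floating point, so it illustrates the proof without replacing it.

```python
import math

def circle_dist(x, y=0.0):
    """Distance between x and y on the circle R/Z."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def first_small_multiple(P, eps, n_max=10**6):
    """Smallest n in 1..n_max with d(nP, 0) <= eps, or None if there is none."""
    for n in range(1, n_max + 1):
        if circle_dist(n * P) <= eps:
            return n
    return None

def max_gap(P, count):
    """Largest gap on R/Z between neighbours among 0, P, ..., (count-1)P."""
    pts = sorted((k * P) % 1.0 for k in range(count))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(pts[0] + 1.0 - pts[-1])   # wrap-around gap
    return max(gaps)

P = math.sqrt(2)        # irrational, hence not torsion in R/Z
eps = 0.01
n = first_small_multiple(P, eps)
# Stepping by nP this many times goes all the way around the circle.
steps = math.ceil(1 / circle_dist(n * P))
largest_gap = max_gap(P, n * steps + 1)
print(n, circle_dist(n * P), largest_gap)
```

For this $P$ and $\epsilon$ the search stops at $n = 70$, the denominator of the convergent $99/70$ of $\sqrt 2$.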


A bit about group and Galois cohomology ($H^0$ and $H^1$) and why we care (see Appendix B and parts of Chapter X; Dec.2)

For a $G$-module $A$ we denote by $A^G$ the fixed submodule $\{a \in A : \forall g \in G, a^g = a\}$. (Note that we have $G$ acting from the right, so $a^{gh} = (a^g)^h$ rather than the usual $h(g(a)) = hg(a)$; the entire theory can also be developed for left actions, which are equivalent for the usual reason that $g(a) := a^{g^{-1}}$ gives a left action; we just have to make sure we’re consistent.) We say a homomorphism $f: A \to B$ is a map of $G$-modules if $f(a^g) = f(a)^g$ for all $g\in G$. Clearly $f(A^G) \subseteq B^G$ in that case. This gives a functor $A \mapsto A^G$ on the category of $G$-modules. It is not in general “exact”: there can be short exact sequences $0 \to A \to B \to C \to 0$ of $G$-modules for which the sequence $0 = 0^G \to A^G \to B^G \to C^G \to 0^G = 0$ is not exact. (Simple example: $G = {\bf Z} / 2 {\bf Z}$ acts on $B = ({\bf Z} / 2 {\bf Z}) \times ({\bf Z} / 2 {\bf Z})$ by sending the involution of $G$ to the involution $(x,y) \mapsto (x,x+y)$ of $B$, and $A = B^G = \{(0,*)\}$. The action of $G$ on the quotient $C = B/A$ is trivial, so $C = C^G$, but the non-identity element of $C$ does not have a preimage in $B^G$. More generally, let $G = (k,+)$ for some field $k$, acting on $B = G \oplus G$ by $(x,y)^a = (x,y+ax)$, and again $A = B^G = \{(0,*)\}$.) However, our functor is readily seen to be left-exact: if $0 \to A \to B \to C$ is an exact sequence then so is the induced sequence $0 \to A^G \to B^G \to C^G$ (that is, if $A \to B$ is an injection then certainly so is $A^G \to B^G$, and then the kernel of $B^G \to C^G$ is just $(\ker(B\to C))^G$ which by assumption is the $G$-invariants in the image of $A$, which is the image of $A^G$ because $A \to B$ is an injection). So, the machinery of homological algebra gives us a series of derived functors $A \mapsto H^n(G,A)$ with $H^0(G,A)$ being $A^G$ itself. 
We then automatically get a description of $H^n(G,A)$ as the quotient of a certain group of $n$-cocycles by $n$-coboundaries, and a long exact sequence $$ 0 \to A^G \to B^G \to C^G \to H^1(G,A) \to H^1(G,B) \to H^1(G,C) \to H^2 (G,A) \to \cdots $$ associated to a short exact sequence $0 \to A \to B \to C \to 0$ of $G$-modules. But that is not the whole story (and it is also ahistorical, because group and Galois cohomology predates derived functors and indeed was one of the motivating examples). The cohomology groups $H^1(G,A)$ and (to a lesser extent) $H^2(G,A)$ arise naturally in some contexts, and satisfy some special properties, that are not formally predictable from the general “abstract nonsense” (yes, that is a technical term!) of homological algebra. For us, usually $G$ arises as a Galois group acting naturally on $A$, in which case $H^n(G,A)$ is called Galois cohomology. (We require the action to be continuous if $G$ is infinite.) We can also make sense of $H^1(G,A)$ (but not $H^n(G,A)$ for $n \geq 2$) when $A$ is a non-abelian group with an action of $G$; see the end of Appendix B.
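The simple example of non-exactness is small enough to enumerate by brute force. In this sketch the names `g`, `to_C`, `B_G`, `C_G`, and `image` are ours, and the quotient $C = B/A$ is modeled by the first coordinate.

```python
from itertools import product

B = list(product((0, 1), repeat=2))          # B = (Z/2Z) x (Z/2Z)

def g(v):
    """The involution (x, y) -> (x, x + y) by which the generator of G acts on B."""
    x, y = v
    return (x, (x + y) % 2)

B_G = [v for v in B if g(v) == v]            # invariants B^G, which is also A

def to_C(v):
    """Quotient map B -> C = B/A; the coset of (x, y) is determined by x."""
    return v[0]

C = sorted({to_C(v) for v in B})
# The induced action on C sends the coset of v to the coset of g(v).
C_G = [c for c in C if all(to_C(g(v)) == c for v in B if to_C(v) == c)]
image = sorted({to_C(v) for v in B_G})       # image of B^G in C^G

print(B_G, C_G, image)
```

The image of $B^G$ is a proper subgroup of $C^G$, so $B^G \to C^G$ fails to be surjective. In the long exact sequence the missing element of $C^G$ maps to a nonzero class in $H^1(G,A)$.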

Here is an appearance of $H^1(G,A)$ in pure group theory that includes the “twist” picture of Chapter X (“General theory” in X.2, “Homogeneous spaces” in X.3 [but note the erratum on p.28 of Silverman’s “Errata, Corrections, and Addenda”], and quadratic and higher twists in X.5). In general if $G$ acts on a group $A$ by automorphisms (so we have a homomorphism $G \to {\rm Aut}(A)$; sorry, I revert to left actions here), there is a semidirect product $\Gamma$, which can be defined as the set $A \times G$ with the group law $(a,g) (a',g') = (a \, g(a'), gg')$. Thus $\Gamma$ has subgroups $\{(a,1) : a \in A\}$ and $\{(1,g) : g \in G \}$ isomorphic with $A,G$ respectively, and every $(a,g) \in \Gamma$ is $(a,1)(1,g)$ while conjugation by $(1,g)$ takes any $(a,1)$ to $(1,g) (a,1) (1,g^{-1}) = (1,g)(a,g^{-1}) = (g(a),1)$. Now the same $\Gamma$ can be such a semidirect product in several inequivalent ways, corresponding to splittings $\sigma: G \to \Gamma$ of the short exact sequence $1 \to A \to \Gamma \to G \to 1$. But $\sigma$ is a splitting if and only if there is some function $s: G \to A$ such that $\sigma(g) = (s(g),g)$ for all $g \in G$ and $(s(gg'),gg') = (s(g),g) (s(g'),g')$ for all $g,g' \in G$; by the formula for the group law in a semidirect product, $g \mapsto (s(g),g)$ is a group homomorphism iff $s(gg') = s(g) \, g(s(g'))$ for all $g,g' \in G$. That is precisely the condition for $s$ to be a 1-cocycle (and works even if the group $A$ is not abelian, in the sense of Appendix B.3, pages 421,422). Two splittings $\sigma_1,\sigma_2$ are equivalent iff they are related by $A$-conjugation, that is, iff $\gamma \sigma_2(\cdot) = \sigma_1(\cdot) \gamma$ for some $\gamma \in A$ (identified with $(\gamma,1) \in \Gamma$). This unwinds to $\gamma s_2(g) = s_1(g) g(\gamma)$ for all $g \in G$, which is precisely the equivalence relation for $1$-cocycles (again, even in the noncommutative case). 
So, $H^1(G,A)$ classifies inequivalent realizations of $\Gamma$ as a semidirect product of $A$ with $G$, with the distinguished element of $H^1(G,A)$ (which is the equivalence class of the 1-cocycle $s(\cdot)=1$) corresponding to the semidirect product for the action we began with.
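For a concrete instance, take $A = {\bf Z}/4{\bf Z}$ (written additively) with $G = \{\pm 1\}$ acting by multiplication, so that $\Gamma$ is the dihedral group of order 8. This example and all names in the sketch are our own choices. The code checks that the splitting condition and the cocycle condition agree for every function $s: G \to A$, then counts cocycles and equivalence classes.

```python
from itertools import product

n = 4                      # A = Z/4Z, written additively
A = range(n)
G = (1, -1)                # G = {+1, -1} under multiplication

def act(g, a):
    """Left action of g on a: multiplication by +1 or -1 in Z/nZ."""
    return (g * a) % n

def gamma_mul(u, v):
    """Semidirect product law (a, g)(a', g') = (a + g(a'), g g')."""
    (a, g), (b, h) = u, v
    return ((a + act(g, b)) % n, g * h)

def is_splitting(s):
    """Is g -> (s(g), g) a group homomorphism from G to Gamma?"""
    return all(gamma_mul((s[g], g), (s[h], h)) == (s[g * h], g * h)
               for g in G for h in G)

def is_cocycle(s):
    """Additive form of s(g g') = s(g) g(s(g'))."""
    return all(s[g * h] == (s[g] + act(g, s[h])) % n for g in G for h in G)

functions = [dict(zip(G, vals)) for vals in product(A, repeat=len(G))]
conditions_agree = all(is_splitting(s) == is_cocycle(s) for s in functions)
cocycles = [s for s in functions if is_cocycle(s)]

def equivalent(s1, s2):
    """Additive form of gamma s2(g) = s1(g) g(gamma) for some gamma in A."""
    return any(all((c + s2[g]) % n == (s1[g] + act(g, c)) % n for g in G)
               for c in A)

classes = []
for s in cocycles:
    if not any(equivalent(s, t) for t in classes):
        classes.append(s)

print(conditions_agree, len(cocycles), len(classes))
```

There are 4 cocycles in 2 classes, matching the two conjugacy classes of reflections in the dihedral group of order 8.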

Now in X.2 the groups $A,G$ are $A = {\rm Isom}(C)$ and $G = {\rm Gal}(\bar K / K)$, so $\Gamma$ acts on the function field $\bar K(C)$, and also on $\bar K(C')$ for any twist $C'$ of $C$; the function fields are the same (because $C \cong C'$ over $\bar K$), but $\Gamma$ is realized differently as a semidirect product, with $K(C')$ fixed by another $\sigma(G)$. This accounts for Theorem 2.2 (pages 318,319) on the level of groups, and then we recover $C'$ using the Galois correspondence and the identification of curves with their function fields. For principal homogeneous spaces, we replace ${\rm Isom}(C)$ by $E(\bar K)$ acting on $E$ by translation. For twists, we replace ${\rm Isom}(C)$ by ${\rm Aut}(C) = {\rm Isom}(C,O)$. For $j \neq 0,1728$ the latter group is $\{\pm 1\}$, which is abelian with trivial $G$-action, so $H^1(G,A) = {\rm Hom}(G,A)$, which we identify with $K^* / (K^*)^2$ by Kummer theory (or Artin-Schreier theory in characteristic 2, see Exercise A.2 on page 414). If $j=0$ or $j=1728$ but we are not in characteristic 2 or 3 then $A$ is larger but still commutative and $H^1(G,A)$ is $K^* / (K^*)^4$ for $j=1728$ or $K^* / (K^*)^6$ for $j=0$, as we already saw by direct calculation. See Proposition 2.5 (p.420) for the identification of $H^1({\rm Gal}(\bar K/K),\mu_m)$ with $K^* / (K^*)^m$ when $m$ is not a multiple of the characteristic of $K$.

No, Hilbert did not actually prove $H^1({\rm Gal}(\bar K/K),\bar K^*) = 0$, let alone the more general result $H^1({\rm Gal}(L/K),L^*) = 0$ for any Galois extension $L/K$; Galois cohomology was still unknown. Hilbert’s original Satz 90 is equivalent to $H^1({\rm Gal}(L/K),L^*) = 0$ in the special case that ${\rm Gal}(L/K)$ is cyclic; according to that Wikipedia entry, that is actually due to Kummer in 1855, while the case of arbitrary finite Galois extensions is due to Emmy Noether (1933, still before the introduction of group cohomology). The justification, such as it is, for calling the more general results “Hilbert’s Theorem 90” is that once one develops enough of the machinery of group cohomology one can formally reduce the general result to the special case of a cyclic extension, which is the result in Hilbert’s text though it was not among the many mathematical discoveries of Hilbert himself.

We haven’t yet connected the cyclic case of $H^1({\rm Gal}(L/K),L^*) = 0$ with a statement that Hilbert (or Kummer) might recognize. Suppose $L/K$ is cyclic of degree $n$ with $G = {\rm Gal}(L/K)$ generated by $g$. Then a 1-cocycle is a map $\xi: G \to L^*$ such that $\xi(ab) = b(\xi(a)) \, \xi(b)$ for all $a,b \in G$. (Here left and right actions are equivalent because $G$ is abelian.) Taking $a=b=1$ gives $\xi(1) = \xi(1) \, \xi(1)$ so $\xi(1) = 1$. Then if $x = \xi(g)$ we get $\xi(g^{i+1}) = g(\xi(g^i)) \, \xi(g) = x g(\xi(g^i))$ and thus inductively $\xi(g^i) = x g(x) g^2(x) \cdots g^{i-1}(x)$. Taking $i=n$ yields $1 = \xi(1) = \xi(g^n) = \prod_{j=0}^{n-1} g^j(x) = N_{L/K}(x)$. Kummer proved that this is the case iff $x = a/g(a)$ for some $a \in L^*$. (The “if” direction is almost immediate but “only if” needed an idea.) Then $$ \xi(g^i) = \prod_{j=0}^{i-1} g^j(x) = \prod_{j=0}^{i-1} g^j(a/g(a)) = \prod_{j=0}^{i-1} g^j(a) / g^{j+1}(a) = a / g^i(a) $$ (the last product telescopes), which is exactly the statement that $\xi$ is a coboundary (of the 0-chain $a$).
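The smallest case, $L/K = {\bf Q}(i)/{\bf Q}$ with $g$ complex conjugation, brings us back to the Pythagorean example from the introductory lecture. The sketch below takes $x = (3+4i)/5$, which has norm 1, and uses $a = 1 + x$. That choice of $a$ is a standard trick for $n=2$ (valid when $x \neq -1$) and is our assumption here, not necessarily the idea referred to above; the helper names are also ours.

```python
from fractions import Fraction

# Elements p + q*i of L = Q(i) stored as pairs (p, q); g is complex conjugation.
def mul(u, v):
    (p, q), (r, s) = u, v
    return (p * r - q * s, p * s + q * r)

def conj(u):
    return (u[0], -u[1])

def inv(u):
    p, q = u
    d = p * p + q * q
    return (p / d, -q / d)

def div(u, v):
    return mul(u, inv(v))

def norm(u):
    """N_{L/K}(u) = u * g(u), returned as an element of L."""
    return mul(u, conj(u))

one = (Fraction(1), Fraction(0))
x = (Fraction(3, 5), Fraction(4, 5))        # x = xi(g), with N(x) = 1
a = (one[0] + x[0], one[1] + x[1])          # a = 1 + x (assumed choice, needs x != -1)

xi = {0: one, 1: x}                         # the cocycle: xi(g^i) for i = 0, 1
coboundary = {0: div(a, a), 1: div(a, conj(a))}   # a / g^i(a)

print(norm(x), a, coboundary)
```

Here $a = (8+4i)/5$ is a rational multiple of $2+i$, and $x = (2+i)/(2-i)$ recovers the triple $(m^2-n^2 : 2mn : m^2+n^2) = (3:4:5)$ with $(m,n) = (2,1)$.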


Final lecture (Dec.4): overview of the Selmer and Tate–Šafarevič groups (X.4, 331–341), the L-function of an elliptic curve, and the conjecture of Birch and Swinnerton-Dyer (C.16, 449–453)

We now have (almost) all the ingredients needed to define the $L$-function and the Selmer and Tate–Šafarevič groups of an elliptic curve over a number field, and to state the conjecture of Birch and Swinnerton-Dyer, often called the BSD conjecture. (The notation Ш for the Tate–Šafarevič group is a capital Cyrillic letter Sha, which is the first letter (and first syllable) of Šafarevič’s surname.)

The BSD conjecture for elliptic curves over $\bf Q$ is stated on page 452 (Conjecture 16.5); see also the list of ingredients at the bottom of the previous page (451), including the footnote. For an elliptic curve over an arbitrary number field $K$, the statement is almost the same, but $\Omega$ must be replaced by a product of real and complex periods over all archimedean places of $K$.

The quotient $R(E/{\bf Q}) / \#(E({\bf Q})_{\rm tors})^2$ appearing in the BSD conjecture is inversely proportional to the square of the constant factor $c$ in the asymptotic formula $c h^{r/2}$ for the number of rational points of height at most $h$. There is a similar interpretation (just without squaring) for the factor $R/w$ in Dirichlet’s formula $2^{r_1} (2\pi)^{r_2} R h / (w \sqrt{|D_K|})$ for the residue at $s=1$ of $\zeta_K(s)$, which appears in the asymptotic count of units of $K$ of bounded size. Likewise the order of the Tate–Šafarevič group plays the role of the class number $h$ in Dirichlet’s formula.

Two more items to add to Silverman’s list of Evidence for the BSD conjecture (pages 452 and 453): First, if $K'/K$ is a quadratic extension, then BSD for $E/K'$ is equivalent to BSD for both $E/K$ and $E'/K$ where $E'$ is the quadratic twist of $E$ associated to the quadratic extension. (This is not too hard to show for the rank prediction, starting from a factorization $L(E/K',s) = L(E/K,s) \, L(E'/K,s)$; and it should be known for the full BSD statement, including some fiddly power-of-2 factors, though I cannot give a reference.) Likewise for cyclic quartic extensions if $j_E = 1728$ and $K$ contains fourth roots of unity, and for cyclic cubic and sextic extensions if $j_E = 0$ and $K$ contains cube roots of unity. Second, the BSD conjecture has an analogue for function fields that is known to be equivalent to a conjecture (Artin–Tate) on the L-functions of algebraic surfaces over finite fields that has been proved in many cases.