Jan.27 Intro: basic definitions and questions.

Illustrated by Fano plane Π_2 of 7 "points" and 7 "lines"
(diagram again; get properties from class, including):

-- Each line has 3 points
-- Any 2 points on a unique line

That makes it a 2-(7,3,1) design.  In general:

Def. 1.1 (p.1) A t-design with parameters (v,k,λ), a.k.a. a
t-(v,k,λ) design, is a pair \D = (X,\B) where

@ X is a set of v "points"
@ \B is a collection of k-element "blocks", each a subset of X
@ any t points contained in exactly λ blocks.

E.g. Pi_2 is also a 1-(7,3,3) design, and Petersen is 1-(10,2,3);
Trivial example: any collection of b blocks is a 0-(v,k,b) design.
Is Pi_2 a 3-design?

Even more trivial: If v<k or k<t then λ=0 (and if v<t then λ
is undefined); we don't care about such examples, nor indeed of the
example where λ=0 because \B is empty.

We do care very much about the special case

λ=1 (as with Π_2);

that's a

Steiner system (a.k.a. S(t,k,v)).

[further examples: the S(2,3,81) of the game "Set";
after Prop. 1.4 and its Corollaries: the S(3,4,8) obtained
by adding a point to Π_2 etc. -- more on this later in the term]

Aim: describe all designs, up to equivalence.

But can be done, with interesting results, for many particular cases,
some of which will appear shortly.  E.g. 2-(7,3,1) is unique.

Constructing t-designs gets harder as t increases, with the exception of:

Only slightly less trivial: the set of all Bin(v,k) subsets is
a trivial design: S_v acts transitively on t-element subsets,
so each is contained in the same number (namely Bin(v-t,k-t)).
That's a complete block design; we're interested in incomplete ones.
The t-design condition says that the set of blocks is "balanced" among
all elements, pairs, ..., t-subsets.  This is sometimes called a

BIBD: balanced incomplete block design.

(At least when t≥2 -- t=1 is still too easy to satisfy without
additional conditions, e.g. k=2 just gives regular graphs;
more on that soon.)

Jan.29 Duality and the incidence matrix of a design; Fisher's theorem

Reminders: first HW due next Monday;
informal lecture notes on course webpage

[pep talk about duality in projective planes and elsewhere]

Def. 1.10 (p.4) Incidence matrix of design (or even t-structure):
columns ↔ points
rows ↔ blocks
entries = 1 if point in block, 0 else.

Example (the RED BUOY description of Π_2):

B D E O R U Y
BUD  1 1 0 0 0 1 0
BYE  1 0 1 0 0 0 1
DOE  0 1 1 1 0 0 0
DRY  0 1 0 0 1 0 1
ORB  1 0 0 1 1 0 0
RUE  0 0 1 0 1 1 0
YOU  0 0 0 1 0 1 1

warning: depends on order of X and \B; also, some authors use a
definition with rows and columns switched, resulting in transpose of
our matrix.

1.11 (p.4), generalized slightly:

If it's a t-(v,k,λ) design then:

@ each block has k points ⇔ each row dots with all-1's column to k
⇔ M * (all-1's column) = k*(all-1's column)

@ if t≥1: each point is in r=λ_1 blocks ⇔ each column dots with
all-1's row to r ⇔ (all-1's row) * M = r*(all-1's row)

The fact that rows and columns are interchangeable so far suggests
defining the dual \D\T of a design \D = (X,\B) [see p.5, I'm changing
the text's order of exposition] so that the incidence matrix of \D\T
is the transpose of the incidence matrix of \D (as the notation suggests).
Morally speaking \D\T is (\B,X).  Since X does not consist of subsets of \B,
we interpret this as

\D\T = (X\T, \B\T) = (\B, {beta_x : x \in X}) where

\beta_x := {B \in \B: x \in B}

NB in general this might be only a "structure", not a design.
A structure \D is a design if the adjacency matrix has no
repeated rows, and \D\T is a design if there are no repeated columns.
So, for instance, consider

1100
1100
0011
0011

(\D and \D\T are 1-(4,2,2) structures, both isomorphic with the
regular graph-with-multiplicity @=@ @=@), and

110000
001100
000011

(\D is the 1-(6,2,1) design @-@ @-@ @-@, and \D\T is a 1-(3,1,2)
structure; that's a degenerate example because it's twice the
complete design -- can you find a nondegenerate example?)
In any case the dual of a 1-structure is a 1-structure, and
-- as ought to be the case with a notion of duality --

(\D\T)\T is canonically isomorphic with \D.

OK: what if we actually have a 2-design, such as \Pi_2?  Check that
for \Pi_2 the dual \D\T is also a 2-design, indeed with the same parameters.
This isn't always true, but it does suggest something that is.  Let's see:

@ if also t=2: each pair of points in λ blocks ⇔ any two
*different* columns have dot product λ.  The dot product of a
column with itself is again r, so ...

1.11 iii) M\T M = (r-λ) I + λ J  (v*v matrices)

where J = all-1's matrix. (Likewise if t≥2, replacing λ by λ_2.)
Note that this excludes repeated columns unless r = λ, which is
(not surprisingly) impossible except in trivial cases as we soon see.

The first two properties in (1.11) can also be expressed in terms of J:

1.11 ii)  M J = k J (v*v matrices)
1.11 i)   J M = r J (b*b matrices)

That might seem like just bookkeeping, but (iii) lets us actually
do nontrivial things using linear algebra (and then (i) and (ii)
will help too).  We need:

Lemma 1.12 (p.4): for the identity and all-1 matrices I,J of order n
we have

det(xI+yJ) = (x+ny) x^(n-1).

In particular, xI+yJ has rank n iff x and x+ny are nonzero.

[NB it's a priori a homogeneous polynomial of degree n; also
det(I+yM) is a polynomial in y of degree rank(M), and since J has rank 1
it had to be linear in y.]

Pf.: xI+yJ symmetric ⇔ spectral theorem applies (orthonormal
eigenvectors, det = product of eigenvalues).  All-1's has eigenvalue x+ny
and its orth. complement has dimension n-1 and eigenvalue x, etc.

[Yes, Lemma 1.12 can also be obtained directly by row operations;
but we'll soon make similar applications of the spectral theorem
in settings where a more direct approach is not available.]

In our case x+ny is certainly nonzero: it's r+(v-1)λ and both terms
are positive.  What about x=r-λ?  Well we saw that

(1.9)  r (k-1) = (v-1) λ

so,

(r-λ) (k-1) = (v-1) λ - (k-1) λ = (v-k) λ

so as long as k>1 (remember we must have k≥t for nontriviality)
and k<v (it's not just the complete design (X,{X})) we have r-λ > 0.
Hence:

Theorem 1.14 (p.5) [Fisher's inequality] In a nonempty 2-design with k<v
we have b≥v.

Proof: We have written a nondegenerate v*v matrix as a product of two
matrices each of rank at most b.  QED

What if equality holds?  First of all M is a square matrix, and (1.8) bk=vr
means that b=v iff k=r, so (1.11) means M and J commute.  But then M
commutes with M\T M since that's a linear combination of I and J, so
M M\T M = M\T M M, and since M is invertible we conclude M M\T = M\T M,
which is already known (1.11 again) to be (r-λ) I + λ J.
But this means that every two *blocks* have λ points in common!
That in turn means that \D\T is also a 2-design.  To summarize:

Theorem 1.15 (p.5) In a 2-design with k<v, TFAE:
(a) b=v
(b) r=k
(c) any two blocks have λ points in common
(d) any two blocks have a constant number of points in common.
[(e) \D\T is a 2-design.]

Pf.  a ⇔ b  follows from bk=rv (b/v=r/k).  We've just shown that
these imply c, and c ⇒ d ⇔ e  are obvious.  Finally for  e ⇒ a,
if \D\T is a 2-design then it satisfies the hypotheses of Fisher's
inequality (notably the condition k<v becomes r<b, which is equivalent
because bk=rv), so we may apply Fisher to both \D and \D\T to get
b ≥ v ≥ b,  whence b=v.  QED

A 2-design satisfying these conditions is said to be square
(because the incidence matrix is square; again the book warns that
the literature also contains other terminology for this).  You might
find problem 2 on the homework easier now.

It's harder as t increases because

Cor. 1.6 (p.3): a t-(v,k,λ) design is automatically also an
s-(v,k,λ_s) design for each s \leq t, with λ_s given by:

Prop. 1.4: If S is a subset of X with s = |S| ∈ [0,t] then
the number of blocks containing S is

λ_s := (Bin(v-s,t-s) / Bin(k-s,t-s)) λ.

Proof: double count pairs (B,T) where B is a block and
S \subset T \subset X with |T|=t.

Cor. 1.7: If a t-(v,k,λ) design exists then

Bin(k-s,t-s)  |  Bin(v-s,t-s) λ

for each s=0,1,...,t-1.

Example:

b := λ_0 = # blocks and
r := λ_1 = # blocks containing a given point [if t>0...]

satisfy

b k = v r.

Likewise if t=2 then we have

r = (Bin(v-1,1) / Bin(k-1,1)) λ = (v-1)/(k-1) λ

so  (v-1)λ = r(k-1)  [p4, (1.9)].

Even for λ=1, infinitely many examples known with t=2,3
but only finitely many for 4,5 (we'll see all the "classical" ones)
and none known with 6 or more!  [Update: though thanks to Keevash,
as of 2014 we know they exist.]

What does "up to equivalence" mean?

p.3 isomorphism from (X,B) to (X',B'): bijection f: X → X'
that takes B to B'.

Automorphism of (X,B) = isomorphism (X,B) → (X,B);
these form a group.

If G is t-transitive, or even transitive on t-subsets, then
(X,B) is automatically a t-design.  Many t-designs explained
this way, including Π_2 (and Petersen).

Finally, as promised: why don't we work with "designs with multiplicity",
a.k.a. "t-structures" (see p.1)?  Even the bridges-of-Königsberg graph
has multiplicity...  But again it's too easy to construct t-structures:

Prop. 1.2 (p.2): If k ∈ (t,v-t) [that is, if t < k < v-t]
then there is a t-(v,k,λ) structure for some λ,
in which not every k-set is incident with a block.

Proof: more variables than Q-linear equations; clear denominators and
add the smallest multiple of the all-1's vector.

[another way to say this: the variables are the multiplicities m_B of the
Bin(v,k) k-subsets, plus λ itself; the equations are the Bin(v,t)
conditions that each t-subset be contained in λ blocks counted
with multiplicity; so there's a >1 dimensional space of solutions,
which includes the vector x_1 where all the m_B equal 1.  Let x_2
be an independent vector; multiply by a common denominator to get
all the coordinates of x_2 to be integers; and subtract m*x_1, where
m is the minimal coordinate m_B of x_2, to get a nonzero solution where
all the m_B are nonnegative and at least one vanishes.]

[The vector space of solutions with λ=0 may still be of interest
for the construction of irreducible representations of the symmetric group
of permutations of v objects, but that's a topic for a different kind of
Math 155r.]

Feb.1 More about square designs: BRC theorem; Fisher re-proved;
duality and polarity.

Recall: Fisher's inequality (1.14) says a 2-(v,k,λ) design with
k<v has b \geq v; proved using

1.11 iii) M\T M = (r-λ) I + λ J  (v*v matrices)

and the fact that the RHS is invertible.  We showed that equality holds
if the dual design \D\T is also a 2-design, and noted that in that case
M is a square matrix.  A further necessary condition (part of the
project of deciding which parameters v,k,λ are realized by a t-design):

Theorem 1.21 (Bruck-Ryser-Chowla; p.7)  Suppose there exists a
square 2-(v,k,λ) design.  Then:

i) v even ⇒ k-λ is a square
[NB *not* "k is square"; that's a typo for "n is square"
where CvL define n=k-λ]

ii) v odd ⇒ there exist nonzero integers x,y,z such that
z^2 = (k-λ) x^2 + (-1)^((v-1)/2) λ y^2.

As you can imagine from the statement, (i) is much easier.
We'll show only that case; you might base your final project on
one or more of the known proofs of (ii), about which we'll make some
remarks to give a bit of context to that strange-looking condition
(but you still needn't memorize it for an exam).

Proof of (i):  Recall that  det(xI+yJ) = (x+yn) x^(n-1) [Lemma 1.12].
If xI+yJ = M\T M for a square matrix M then

det(xI+yJ)  =  det M\T  det M  =  det M  det M  = (det M)^2

and since M is an integer matrix det(M) is an integer.  So, a
necessary condition is that det(xI+yJ) be a square.  Now in our case
n = v,  and  x + yn = r-λ + vλ = r + (v-1) λ, and we already
saw (1.9) that (v-1) λ = r(k-1) so it's rk which for a square design
is k^2 (recall 1.15b).  So  det(M) = k (k-λ)^((v-1)/2), which is
an integer iff v is odd or k-λ is a square.  So if v is even then
k-λ must be a square, as claimed.

As for (ii): the rows are vectors in Z^v whose Gram matrix of
inner products is the invertible matrix  (r-λ) I + λ J.
So a necessary condition is that such vectors exist in Q^v,
which is to say that the quadratic form with Gram matrix
(r-λ) I + λ J  be equivalent to the standard form with
Gram matrix I.  A necessary condition is square determinant,
condition (ii).  The names BRC are not in alphabetical order
because BR proved it in 1949 for the special case λ=1 of
square Steiner systems (= projective planes of order q=k-1),
and Chowla generalized in 1950 to arbitrary λ.
That condition can in turn be checked "locally" modulo powers of
primes dividing k-λ and λ; for instance if λ=1, so
(v,k) = (q^2+q+1, q+1), we find that (-1)^((v-1)/2) = (-1)^((q^2+q)/2)
is 1 if q is 0 or 3 mod 4, and -1 if q is 1 or 2 mod 4; in the former
case the condition is automatically satisfied (let x=0, y=z) and
in the latter we ask that q = (y/z)^2 + (x/z)^2, which by Fermat
is equivalent to q = sum of two integer squares and can be checked from
the prime factorization of q.  This condition is necessary but
not sufficient (q=10, apparently still the only known case where BRC
allows parameters that have been proved impossible).

We can re-prove Fisher without linear algebra as a special case of:

Thm. 1.20 (p.6): let B be a block of a 2-(v,k,λ) design.
Then there are at least  k(r-1)^2 / ((k-1)(λ-1) + (r-1))
blocks intersecting B nontrivially [other than B itself];
equality ⇔ every block other than B meets B in a constant
number of points, in which case that constant is 1 + (k-1)(λ-1)/(r-1).

[Note that for a square design, k=r simplifies that constant to λ
as expected.  To recover Fisher, show that the lower bound of Thm. 1.20
is at least b-1 using (1.8) bk=vr and (1.9) r(k-1)=(v-1)λ, see top
of page 7.]

Happily it's not the specific formula that we'll use (we won't cover
chapter 7) -- you certainly need not memorize that result for the exam!
-- but you should know the technique (called the "variance trick" by the
textbook's authors): obtain the first few "moments" sum(1), sum(i), sum(i^2)
(usually by some kind of double counting) to get at the mean and variance,
which must be positive (in other words, apply Cauchy-Schwarz); since we have
only a 2-design we can get only the zeroth, first, and second moment, but
in other contexts one may be able to calculate an exploit higher moments too.

Proof of Thm. 1.20: For a block B' other than B, let i(B') be the size of
the intersection of B with B'.  Let d ≤ b-1 be the number of B' for which
i(B') > 0, so the intersection is nonempty.  We'll find the sum of
1, i, i^2 over those B' to get the mean and variance.   (The book lets
n_i be the number of solutions of i(B)=i, and finds the sum over i of
n_i, in_i, and (i^2-i)n_i; that's the same as summing 1, i, i^2-i,
and thus yields the sum of 1, i, i^2 respectively).

@ The sum of 1 is of course d.
@ The sum of i is the number of pairs (B',x) with x contained in both B and B';
there are k choices of x, each giving r-1 choices of B', so the sum is k(r-1).
@ Likewise the sum of i^2-i is the number of triples (B',x,y) where x,y are
distinct elements of both B and B', so their number is (k^2-k)(λ-1).

Thus the sum of i^2 is (k^2-k)(λ-1) + k(r-1) = k ((k-1)(λ-1) + (r-1)).
Now Cauchy-Schwarz says

sum(1)sum(i^2) ≥ sum(i)^2,  with equality iff all i are equal.

Therefore  d = sum(1) ≥ sum(i)^2 / sum(i^2),  which is exactly what
we claimed; in the case of equality, the constant value of i is
sum(i^2) / sum(i) = k ((k-1)(λ-1) + (r-1)) / (k (r-1)),
which is exactly 1 + (k-1)(λ-1)/(r-1) as claimed.  QED

Duality and polarity of a square design (1.18, 1.19 on page 6):
If \D is a square design that is isomorphic with the dual design \D\T,
then we can call \D self-dual, and any such isomorphism is
called a duality of \D; explicitly, it consists of bijections
\sigma: X → \B,  \tau: \B → X such that x is in B iff
\tau(B) is in \sigma(x).  [NB the book uses exponential notation
B^\tau, x^\sigma, which switches the order of composition of bijections;
also there's a typo in the first display of p.6: each \B (the collection
of blocks) should be plain B (a single block of \D).]  In terms of
the incidence matrix M,  \sigma and \tau switch rows and columns;
alternatively, they are a permutation of the rows and columns
that takes M to M\T.  Composing two dualities (\sigma,\tau) and
(\sigma',\tau') yields an automorphism (\tau' \sigma, \sigma' \tau)
of \D = (X,\B); also, a duality can be composed from either side with
an automorphism of \D to yield another duality.  So the dualities of a
self-dual design \D extend Aut(\D) to a bigger group containing Aut(\D)
with index 2.

A duality that's an involution in this group, i.e. for which
\tau\sigma is the identity permutation of X (and thus \sigma\tau
is the identity permutation of B), is called a polarity.
In terms of M, there's a polarity iff we can permute the rows
(and columns), to get a symmetric matrix -- this makes the n-th block
the image under \sigma of the n-th point.  For example, Π_2 is polar
(HW2 hint!):

1 1 0 1 0 0 0
1 0 1 0 0 0 1
0 1 0 0 0 1 1
1 0 0 0 1 1 0
0 0 0 1 1 0 1
0 0 1 1 0 1 0
0 1 1 0 1 0 0

Feb.3 & 5 Important examples of designs I:
projective planes and higher-dimensional projective spaces

Usually when we introduce a new kind of mathematical structure
we give a selection of important examples, and transformations that
construct new examples from known ones.  But for t-designs,
so far we've given several necessary conditions on the parameters
but precious few examples with t>1 -- only Π_2, a couple of
similar designs relegated to the first problem set, and the
3-(8,4,1) design constructed from Π_2 -- and no
transformations except for the facts that a t-design is also an
s-design for each s<t, and the dual of a square 2-design is a
square 2-design, which have yet to give us any new examples.
Today we remedy this by introducing the first of several batches of
important examples of 2-designs: algebraic projective spaces, including
notably projective planes; and square designs coming from Hadamard matrices.

While there's (still?) only one example is known of parameters for
a square 2-design that pass the BRC condition (1.21, p.7) but
do not arise for any design, there's also (still!?) no value of
λ known to arise infinitely often -- except for λ=1.
A square 2-design with λ=1 (i.e. a square 2-design that's also
a Steiner 2-system) is called a (combinatorial) projective plane,
and its blocks are called lines, because (as with the prototypical
example of Π_2) any two of its points determine a unique line,
and any two lines meet in a unique point.  In a square 2-(t,k,1) design
let k=n+1, so t=n^2+n+1, either using r(k-1)=(v-1)λ or directly
because a point determines n+1 lines each of which contains n further
points and no two of these n(n+1) points coincide [that's actually
a special case of our argument for r(k-1)=(v-1)λ].  We then say that
the projective plane has order n; e.g. Π_2 has order 2.  (Not order
n+1 as might be expected; we already saw that n=r-λ is an important
parameter in Fisher's inequality (where it was the coefficient of I in
M\T M = (r-λ I) + λ J) and in the BRC theorem, and this is
even more decisively the case for projective planes.

n=1 is possible but yields the complete 2-(3,2,1) design.  For
n=2, n=3, n=4, and n=5 we shall show that there is a unique
projective plane of order n and determine its automorphism group.
These will all be special cases of the following construction, and
applicable whenever n is a prime power, producing a projective plane with
more than n^8 automorphisms.  Warning: for some such n it is known
that there are non-isomorphic projective planes of order n.
(It seems that the first example is for n=9, when there are four
such planes [Lam, Kolesova, and Theil 1988, computer-assisted]; see
http://www.uwyo.edu/moorhouse/pub/planes .)  It is a famous open
question whether there is a finite projective plane whose order is
not a prime power; the first open case is n=12 (BRC disposes of 6
but not 10).

Construction of Π_2 that exhibits all its automorphisms.
We claimed that there are 168 automorphisms, forming a group
isomorphic with GL_3(Z/2Z).  [Recall the formula for the size of
GL_n(k) for a finite field k; here that gives (8-1)(8-2)(8-4)
which indeed is 7*6*4 = 168.]  So, let k be the 2-element field Z/2Z,
and let V be a 3-dimensional space over k.  It has 2^3 = 8 vectors,
including zero.  So, the points X of Π_2 will be the 7 nonzero vectors,
and the lines will be triples {x,y,z} of vectors with x+y+z = 0.
Check by explicit identification that this is equivalent with our
initial picture of Π_2.  Because the definition uses only
the vector space structure, any automorphism of V yields the same
structure, and the group of these automorphisms is GL_3(Z/2Z).

Proof of uniqueness of the 2-(7,3,1) design: as is often the case
with highly symmetric combinatorial structures S, we can identify
any other structure S' with the same parameters with S even if we
specify some parts p, p' of S and S' respectively and require tha
the identification take p' to p.  Taking S'=S, this then shows that
the automorphism group Aut(S) "acts transitively on the p-s",
and lets us count the automorphisms.  In particular, if p pins down
S completely then the number of p-s equals the number of automorphisms
(which then "acts simply transitively" on the p-s).  If we already
have a group G acting on S, we can then check whether G=Aut(S) by
comparing cardinalities.

Here we argue as follows.  Let D be any 2-(7,3,1) design.  We'll show
D is isomorphic to Π_2.  Fix a point P of D, an ordering (l1,l2,l3)
of the three lines through P, and one of the other lines l' of D.
This can be done in 7*3!*4 = 168 ways.  Map them arbitrarily to
corresponding elements of Π_2.  This tells us the images of all
seven points: there's P itself; the points P1,P2,P3 where l' meets
l1,l2,l3 respectively; and the points Q1,Q2,Q3 of l1,l2,l3 which are
neither p nor on l'.  We've also found four of the lines, and readily
check that the others are {P1,Q2,Q3}, {Q1,P2,Q3}, {Q1,Q2,P3}.
This gives an isomorphism of D with Π_2 (and a construction of Π_2
if we don't know one already) and shows that |Aut(Π_2)| = 168.
Moreover we already know that Π_2 has automorphisms by GL_3(Z/2Z)
[the group of invertible 3*3 matrices mod 2 -- GL stands for
"General Linear"].  Thus these are all the automorphisms, and we're done.

On to projective planes of arbitrary prime-power order.  We define the
(algebraic) projective plane over any field k as follows.  [See pages
7-8 of the textbook. You may well have seen this already, as well as
projective spaces of other dimensions, at least over the real and
complex fields; projective spaces are ubiquitous in complex analysis,
algebraic topology, Lie groups, and of course algebraic geometry.
Before the end of the week we'll define projective space of dimension d
for all d.]

The plane is called \P^2(k) or k\P^2 (the latter notation is more
common when k is the field of real or complex numbers; please
don't use the textbook's PG(2,q)).  Fix a 3-dimensional vector space
V over k.  The "points" of \P^2(k) are the 1-dimensional subspaces of V
("lines through the origin", a.k.a. "nonzero vectors modulo scaling"),
and the "lines" are 2-dimensional subspaces of ("planes through the origin"),
each consisting of the "points" that are its 1-dimensional subspaces.
The design properties are then consequences of linear algebra in V:
any two distinct 1-dim. spaces are contained in a unique 2-dim. space
(their span), and any two distinct 2-dim. spaces W,W' intersect in
a 1-dim. space because

dim(W) + dim(W') = dim(W+W') + dim(W\cap W')

and here the LHS is 2+2=4 and the first term on the RHS is 3
(since W is distinct from W', the vector space sum W+W' is all of V).

Now suppose k is finite (see handout for basics about finite fields),
of order n=q.  Then we get a finite design, whose number v of points is
#(V-{0}) / #(1-dim. space - {0}) = (q^3-1) / (q-1) = q^2+q+1, while
Likewise k = (q^2-1)/(q-1) = q+1.  (We did this already when q=2
but didn't see the "modulo scaling" part because there q-1 was 1
so there was no nontrivial scaling!)  The fact that b=q^2+q+1=v
can then be obtained in several ways, but the nicest is to identify
\P^2(k) with its combinatorial dual via (yes!) the dual vector space.
Recall that for every finite-dimensional vector space V over a field k
we define a dual vector space

V* = Hom(V,k),

a k-vector space of dimension equal to dim(V), and thus isomorphic
(non-canonically) with V; and that the annihilator of any subspace
W of V is the subspace W\perp of V*, defined by

W\perp = {v* in V* : v*(w) = 0 for all w in W},

with

dim(W) + dim(W\perp) = dim(V)

(so the dimension of W\perp is the "codimension" of W and vice versa).
Duality reverses inclusions of subspaces:

W \subset W' ⇔ W\perp \superset W'\perp .

Now if V is 3-dimensional then so is V*; the 1- and 2-dimensional
subspaces of V correspond respectively [sic!] to 2- and 1-dimensional
subspaces of V*, with inclusions reversed.  So the projective plane
obtained from V* is the combinatorial dual of the one obtained from V.

V                                    V*

points p: 1-dim. subspaces   ↔   2-dim. subspaces (lines) p\perp
lines l: 2-dim. subspaces   ↔   1-dim. subspaces (points) l\perp
l contains p            ⇔   l\perp contained in p\perp

In particular the number of 2-dimensional subspaces of V equals the
number of 1-dimensional subspaces of V*, which is (q^3-1)/(q-1)=q^2+q+1

[Feb.5 interlude: visualizing the projective plane as an extension of
the familiar ("affine") plane, by intersecting lines through (0,0,0)
with the plane {(1,x1,x2) | x1,x2 in F}, when F = real numbers.
(What familiar topological surface is the complement of a small disc
in the real projective plane?) Generalization to higher n gets the
n-dimensional projective plane as the disjoint union of an n-dimensional
affine plane with an (n-1)-dimensional projective one "at infinity".]

We can get higher-dimensional projective spaces P^n(k) starting from
V of dimension n+1.  (n=0 and n=1 works too, but P^0 is not very
interesting -- just a point -- and P^1 isn't big enough to make
W of V yields an m-dimensional projective subspace P(W) = (W-{0})/k*
of the projective space P^n(k) = P(V).  For each m=2,3,...,n-1
the m-dimensional subspaces are the blocks of a 2-design whose
points are the points (0-dim. subspace) of V.  This design is
Steiner iff m=2 (two points determine a unique line), and square iff
m=n-1: the hyperplanes are again points of a dual projective space.

V                                    V*

points p: 1-dim. subspaces   ↔  (n-1) dim. proj. subspaces p\perp
lines l: 2-dim. subspaces   ↔  (n-2) dim. proj. subspaces l\per.
...
hyperplanes h: (n-1)-dim. subsp. ↔    points h\perp
P(W) contains P(W')       ↔   P(W\perp) contained in P(W'\perp)

The nice way to explain why these are all 2-designs is that all
pairs of distinct points in projective space are equivalent:
for any two pairs (p,p') and (q,q') of 1-dimensional subspaces of V,
there's an invertible linear map  T : V → V  such that A(p)=q and
A(p')=q'.  Since the projective space is defined using just the
linear structure, T yields an automorphism of each of our designs
(that is, the design obtained for each value of m=2,3,...,n-1),
and shows that there are as many blocks containing p and p'
as there are containing q and q'.  We say that the automorphism group
acts "2-transitively" on the points; more on this later.  For any d,m
we can also construct a design with points and blocks being d- and
m-dimensional projective subspaces, but for 1<d<m<n that's only a
1-design, not a 2-design, and indeed the automorphism group is
1-transitive but not 2-transitive; e.g. you can't map lines l,l' to
lines m,m' if l and l' intersect while m and m' are skew.  [Warning:
t-transitivity of Aut(D) on points is a sufficient condition for
a design D to be a t-design, but it is not necessary, though
it might look that way from the nice examples we'll show.]

Coordinates on projective space P^n(k):  (x_0,x_1,...,x_n)
[NB we start the index at 0, so n+1 coordinates], not all zero,
but regarded as the same as (x'_0,x'_1,...,x'_n) if proportional,
i.e. if there exists c in k* such that  x'_i = c x_i  for each i
(notation:  k* := {c in k: c \neq 0} = multiplicative group of k).
To emphasize that the coordinates are in effect a ratio, we use
colons rather than commas to separate them: (x_0 : x_1 : ... : x_n).
These are known as projective (or homogeneous) coordinates.
For example, n+1 points

(x0_0 : x0_1 : ... : x0_n),
(x1_0 : x1_1 : ... : x1_n),
...,
(xn_0 : xn_1 : ... : xn_n)

are on a projective hyperplane (of dimension n-1) ⇔ the corresponding
vectors in k^(n+1) are on a vector hyperplane (of dimension n) ⇔
the order-(n+1) determinant of xi_j vanishes.  NB this is well-defined:
changing any xi to ci*xi with nonzero ci multiplies the determinant by
c0 c1 c2 ... cn, but preserves the condition that it be zero.  Likewise
the coordinates of an arbitrary vector in x* in V* are (x0*,x1*,...,xn*),
acting on V via the usual inner product:

(x0*,x1*,...,xn*) (x0,x1,...,xn) = x0* x0 + x1* x1 + ... + xn* xn

and (x0* : x1* : ... : xn*) are homogeneous coordinates on the dual
projective space, and the hyperplane x* contains the point x iff x* x = 0
(again this makes sense even though "x* x" needn't be a well-defined
element of k).  One more example: a hypersurface of degree d in P^n
is the locus of P(x0,x1,...,xn) = 0  where P is a (usually irreducible
nonzero) homogeneous polynomial of degree d; this condition
(though not usually the value of P) is again invariant under scaling.
For example, a hypersurface of degree 1 is a hyperplane.

Automorphisms of P^2(k), and more generally any of the n-1
2-designs from P^n(k) (n>1): GL(V) = GL_{n+1}(k) acts, but
scalar matrices act trivially; indeed it's not hard to see that
these are the only matrices that act trivially -- that is, these
are the kernel of the map GL(V) → Aut(D).  (Indeed it's a normal
subgroup, isomorphic with k*.)  So we have an action of GL_{n+1}(k)/k*,
known as the projective general linear group PGL_{n+1}(k).  If
#k = q^f with f>1 we also get an action of the field automorphisms
(we'll later give more details about how these automorphisms interact
with PGL_{n+1}(k)).  That turns out to be the full automorphism group.
We won't prove it (except for projective planes of order 2, 3, 4, 5),
because it depends on results that the book cites (1.23 and 1.24
on page 8) but does not prove or even fully state.  Basically one
"coordinatizes" D, reconstructing the structure of k and coordinates
(x0:...:xn) from the combinatorics of D.  That could make a good
final project if you're interested.

Feb.5 and 8 Important examples of designs, II:
Hadamard matrices, particularly Sylvester and Paley

These can be viewed as another generalization of Π_2, from a different
viewpoint.  Take its intersection matrix and replace each 0 with a -1
(i.e. 2M-J).  All rows (or columns) now have norm [self-inner-product] 7,
and any two have inner product -1.  So if we extend with a row and
columns of +1's (and abbreviate +1 and -1 by just + and -):

+ + + + + + + +
+ + + - + - - -
+ + - + - - - +
+ - + - - - + +
+ + - - - + + -
+ - - - + + - +
+ - - + + - + -
+ - + + - + - -

we get an 8x8 matrix H with

(i)  all entries +1 or -1,
(ii)  any two columns' inner products equal 0.
(iii)  any two rows' inner products equal 0,

An nxn matrix with these properties is said to be a Hadamard matrix
or H-matrix for short (see 1.27 on p.9) [though Sylvester already
gave some examples 25 or so years earlier].  We already know (ii) means
H\T H = nI, so (as with M, but easier because no J's) readily deduce that
either (ii) or (iii) may be omitted because they are equivalent given (i).
Since the rows (or columns) of H are orthogonal, and all with the
same norm n, we see that H/sqrt(n) is an orthogonal matrix.
It follows that (as with M) we can find det(H) up to sign: it's
a square root of det(nI) = n^n.  That was the origin of Hadamard's
interest in such matrices:

If A is a real nxn matrix with entries of absolute value at most 1
then |det(A)| ≤ n^(n/2), with equality iff A is an H-matrix.

Pf. This follows from:

Thm. The absolute value of the determinant of any real matrix
is at most the product of the lengths of its row vectors,
with equality iff those vectors are pairwise orthogonal.

This might feel geometrically intuitive, but it requires proof, especially
as we shouldn't trust our geometric intuition of really high dimensions.

Pf. Let  v_1, v_2, ..., v_n  be the rows of the matrix.  Assume they are
linearly independent, else the determinant vanishes and the inequality
is vacuous.  Define u_1, u_2, ..., u_n as follows (cf. Gram-Schmidt):
u_1=v_1, and for each k the vector u_k is the projection of v_k to the
orthogonal span of the complement of {u_j | j=1,2,...,k-1}.  Then:

(a) u_k = v_k + linear combination of u_j with  j<k

(b) |u_k| ≤ |v_k| with equality iff u_k = v_k
(because u_k is an orthogonal projection of v_k)

(c) |det(u_1,u_2, ..., u_n)| = |u_1| |u_2| ... |u_n|
(by the A\T A trick)

The desired inequality now follows because (a) implies that
det(v_1,v_2, ..., v_n) = det(u_1,u_2, ..., u_n)  (row operations).

This also proves Theorem 1.26 .

Remark: Hadamard's inequality is true for complex matrices too,
with much the same proof; but complex Hadamard matrices are
much easier to construct, e.g. let the (i,j) entry be z^(i-j) where
z is an n-th root of unity (that's basically discrete Fourier series).
Real Hadamard matrices are more subtle.  A basic observation (p.9):

If H is a Hadamard matrix, so is any matrix H' obtained from it by
-- multiplying some subset of rows or columns by -1
-- permuting rows or columns
H' is said to be equivalent to H.  (NB we don't define H\T to be
equivalent to H [unless it can also be obtained from H by row and column
operations as above], even though H\T is also an H-matrix.)

So any H has an equivalent matrix (even without permuting) whose
first row and first column are the all-1's vector (a.k.a. J).

Lemma (unnumbered, p.9-10) if an H-matrix exists then either n=1, n=2, or 4|n.

Proof of lemma: assume n≥2.  Go to equivalent H-matrix with
top row J.  In rows 2,3 let a,b,c,d be the number of columns
of type ++, +-, -+, --.  Then:

a + b + c + d = n    (there are n columns)
a + b = c + d = n/2  (2nd row orthogonal to 1st)
a + c = b + d = n/2  (3nd row orthogonal to 1st)
a + d = b + c = n/2  (2nd row orthogonal to 3rd)

Equivalently, a = b = c = d = n/4, QED.

Now our construction of an 8x8 H-matrix from Π_2 worked because
it was a square 2-design with (k,v,λ) = (4m-1, 2m-1, m-1) with m=2.
Indeed, the above analysis shows:

Prop. 1.28 (p.10) there is a Hadamard design of order n=4m iff there is
a square 2-design with (k,v,λ) = (4m-1, 2m-1, m-1).

Pf.  ⇐ extend 2M-J with a row and column of +1's as before.
⇒ go to an equivalent H-matrix with first row and column of +1's,
delete them, and use a=b=c=d=m.

Definition: a square 2-(4m-1,2m-1,m-1) design is called a

Warning (Remark 1.29) isomorphic designs yield equivalent H,
but not (always) conversely [because of the extra choice of
which row and column to remove; we'll see that what H *does*
uniquely determine is a 3-design].

Example (1.31) [Sylvester 1867] The points and hyperplanes of
of order 2^(r+1).  Equivalently, if d=r+1 and V is a d-dimensional
vector space over F_2, the rows and columns of H are labeled by
vectors of V (this time including zero!), and the i,j entry is
1 or -1 according as the inner product of the i-th and j-th vector
is 0 or 1.  [We're exploiting the fact that q=2 so there's no
k* ambiguity in identifying a point or hyperplane with a nonzero
vector in V or V*.  Oh, sorry about the inevitable double use of
"*" for both "multiplicative group of k" and "dual vector space of V".
In more formal writing the latter * is a superscript, V*,
which you can make in TeX with "V^*" as usual.  The multiplicative group
should really be k× (TeX: k^\times) but often the * is used there too.
While I'm at it: "k" for German Körper = field.]

See also Problem 4 on the current problem set (= CvL p.25 Ex.2).

The next important example is presented in the text but not proved.
We'll supply the proof.

Example (1.30, still on p.10) [Paley 1933]  If  n = 4m = q+1
for some prime power q, and k is a finite field of order q,
then the set S of nonzero squares in k has size (q-1)/2 = 2m-1
and is a "perfect difference set" (in a sense naturally
extending what you saw in the first problem set): any nonzero
c in k can be written as s-s' for m-1 choices of s,s' in S.
Hence the collection of translates {S+x | x in k} of S
is the set of blocks of a Hadamard design.

The implication is proved as before: given distinct a,b in k,
the number of x such that S+x contains both a and b is
the number of s,s' in S s.t. s+x=a and s'+x=b.  Necessarily
s-s'=a-b, and given such s,s' we solve uniquely for x=a-s=b-s'.

To prove that S is a perfect difference set, recall the
following properties of squares in finite fields
(generalizing familiar results of Legendre etc.):

Proposition: Let k be a finite field of odd order q=2r+1.  Then:
(i) Every nonzero square s=x^2 in k is the square of exactly two
elements of k, namely x and -x.
(ii) The r-th power of any s in k* is either 1 or -1.
s is a square iff s^r = 1.

Proof:
(i) Certainly f s=x^2 then s=(-x)^2 also, and -x is distinct from x
because x is nonzero (this is where we use that q is odd); and
if y is any square root then x^2=y^2, so (x-y)(x+y)=0, which
in a field implies x-y or x+y vanishes so y is one of the known
square roots x and -x.
(ii) (s^r)^2 = s^(q-1) = 1 because s^q=s and s is nonzero.
So s^r is a root of X^2-1 = (X-1)(X+1), so equal to 1 or -1.
If s = x^2 then s^r = x^2r = x^(q-1) = 1 by the same argument.
This gives r solutions of the degree-r polynomial equation x^r=1,
so the squares in k are the only solutions of that equation.

Corollaries:

(i) r of the 2r=q-1 nonzero elements of k are squares of elements of k.
[Proof: the squaring map k* → k* is 2:1 onto its image.]

(ii) -1 is a square iff r is even.  [Proof: that's the condition on r
for (-1)^r to equal 1; note that, since q is odd, 1 and -1 are distinct.]

(iii) If -1 is not a square then for any nonzero c either c or -c
is a square but not both.

Remark: if q is even then every element of k has a unique square root
in k.  Proof: the map k → k taking x to x^2 is injective because
x^2-y^2 = (x-y)^2  so  x^2=y^2 iff x=y;  since k is finite it follows
that the map is surjective.

Now to prove that the Paley construction actually gives a Hadamard design.
In our setting q=4m-1 so r=2m-1 is odd and -1 is not a square.
Corollary (i) says |S|=2m-1 so our blocks are of the right size.
To prove it's a 2-design, we've seen already that we must show that
if a,b are distinct elements of k we must count solutions s,s' in S of
s-s'=a-b.  Let c be the nonzero field element a-b.  By part (i) of
the Proposition, the count is 1/4 the number of solutions in k* of
x^2-y^2=c.  Now there are q-1 solutions in k, because we have
x^2-y^2=(x-y)(x+y) and there are q-1 factorizations, each of which
yields unique (x,y) because 2 is invertible in k.  Of those we must
throw out solutions with x=0 or y=0.  The number of x=0 solutions is
2 or 0 according as -c is or is not a square; for y=0 the count is
2 or 0 according as +c is or is not a square.  But by (ii) exactly
one of these two happens.  Hence the number of solutions of x^2-y^2=c
in (k*)^2 is (q-1)-2 = q-3 = 4(m-1) and we're done.  QED

So for n other than 1 and 2 we have Hadamard matrices of order n if

@ n is a power of 2, or
@ n = q+1  for some prime power q congruent to -1 mod 4.

We shall see (Problem 4 = CvL p.25 again) that we also have
Hadamard matrices of order n = n_1 n_2 for any known n_1, n_2
for which we've constructed a Hadamard matrix (including 2).

This gives n=4m for m=1,2,4,8,16,... and powers of 2 times
1,2,3,5,6,7,8,11,12,15,17,18,...  so of m≤18 we're missing
only 9 and 13.  That's misleading: asymptotically 0% of m that
are not multiples of 4 arise, because either m is a power of 2
(very rare) or 4m-1 or 2m-1 must be a prime power and that happens
with probability asymptotic to 1/log(m).  [But as noted above,
other constructions are known, and Hadamard designs of order 4m
have been found for all m ≤ 166 (= 664/4).]

For  n = 4, 8, 32, 128, ..., 2^57885161, 2^74207281, ...(?),
we get two constructions (even without taking Kronecker products)
since n is both 2^e and q+1.  The resulting Hadamard matrices are
the same for n=4 and 8 (somewhat remarkably -- we'll return to this)
but not for higher n [CvL Ex.15 on p.27], nor is Paley_128 isomorphic
with  H_4 x Paley_32.  So Hadamard matrices of order n need not be unique
when they exist.  They are unique at least for n=1,2,4,8,12, and this
will be important to us when we study their automorphism groups.

Feb. 10 new designs from old:

NB again we choose a different (but still logical) order of
presentation from the textbook's.

Complementary design (p.13, eqs. 1.37-1.38 and Prop. 1.39):
if \D = (X,\B) is a t-(v,k,λ) design, let

\B@ = { X \ B : B in \B }

be the collection of complements of blocks, and define \D@ = (X,\B@),
the complementary design of \D.  (Of course \D@@ = \D: the
complement of the complement is the original design).  We claim
that this too is a t-design:

Prop. 1.39: \D@ is a t-(v,v-k,λ@) design where

1.37  λ@ = sum( (-1)^s Binomial(t,s) λ_s,  s = 0,1,...,t).

(Recall that λ_s is the number of blocks containing any s points of X,
and was computed on page 3, see (1.5)).

Proof: Certainly \D@ consists of blocks of size v-k.  We must show:
for any set T of t elements of X, the number of blocks of the original
design \B that are *disjoint* from T is the same, namely λ@.
This is done by inclusion-exclusion (see Appendix to chapter 1,
Thm. 1.56 and Cor. 1.57 on pages 24 and 25).  For any subset T' of T
let a(T') be the number of blocks B containing T'; inc-exc says that
that the number of blocks disjoint from T is the sum of (-1)^#(T') a(T')
over all 2^t subsets T'.  But a(T') is just λ_s where where s = #(T'),
and each s occurs Binomial(t,s) times, so we're done.

e.g. if t=2 then λ@ = λ - 2r + b  (1.38, see Prob.2 which you
just handed in); in any case the incidence matrix of \B@ is just
M@ = J - M  where M is the incidence matrix of \B and J is the
all-1's matrix of the appropriate size.  Of course \B@ is square
iff \B is.

Are there interesting self-complementary designs?  Interesting or not,
we must have v=2k (so in particular v had better be even).
There are various boring examples, and no square examples
exept the trivial 2-(2,1,0) design (since a square design has
(v-1)λ = k^2-k, and if v=2k then 2k-1 is a factor of k^2-k,
which doesn't happen for k>1).  But we get some self-complementary
3-designs by extending certain square 2-designs -- including indeed
our prototypical example Π_2.

Let \E be the following design: there are 8 points, X together with
a new point called 0; the blocks each consist of 4 points, and are
line complements in Π_2 together with sets of the form "line u {0}".

Claim: This is 3-(8,4,1) design (a.k.a. Steiner (3,4,8) system),
and is self-complementary by construction.

Proof: Self-complementarity is clear.  For the design property:
[NB we did this already in class, so didn't re-do it today;
but nor was it in the notes, so I recite the argument here.]
Let T be any 3-element set of X.  If T contains 0 then any block
containing T must be "line u {0}", and conversely such a line exists
and is unique, namely the line determined by the other two elements of T.
If T does not contain 0, we use our result from the 1st problem set:
either T contains a line or is disjoint from some line, but not both.
For starters this means T cannot be contained in both a line and
a line complement.  If T is in a line then it *is* that line, so
we get our block as  T u {0},  and it is clearly unique.  Otherwise
T is contained in a line complement, which is again unique because the
union of any two lines contains 3+3-1=5 points (a simple application of
inclusion-exclusion), so cannot be disjoint from the 3-element set T.  QED

Alternative description: X is a 3-dimensional vector space over k=Z/2Z;
the blocks are the 14 four-element subsets that sum to zero, or
equivalently the 14 affine planes (planes translated by a vector).
Either description generalizes to give a Steiner (3,4,2^d) system
for every d>2, but it's not self-complementary once d>3.

Right now we're more interested in self-complementarity than the
Steiner property, so instead we generalize as follows.  For any
starting from the Hadamard design of points and hyperplanes in P^(d-1) k,
obtaining a  3-(2^d, 2^(d-1), 2^(d-2) - 1)  design whose points are the
vectors in k^d and whose blocks are the 2^(d+1)-2 affine hyperplanes.

Automorphisms include (and will turn out to equal) the
affine linear group generated by GL_d(k) and translations by V,

In terms of the associated Hadamard-Sylvester matrix: if the first row
is (+1, +1, +1, ..., +1) then each of the n-1 other rows has n/2 +1's
and n/2 -1's, giving a complementary pair of blocks, for a total of 2(n-1).
We claim (p.12) that these form a  3-(n,n/2,(n/4)-1) design --
and this works for any Hadamard matrix of order n>4.  [Naturally
such designs are called "Hadamard 3-designs".]  This is a special case
of CvL Ex.5.  Another proof: choose any three columns; we want to show
that there are  (n/4) - 1   rows other than the first where these columns
give either +++ or ---.  Since the first row has +++, it is equivalent to
show that there are n/4 rows, including the first, with either +++ or ---.
Choose some other column, and multiply each row by 1 or -1 to make
that column all +1's (this does not change our design).  We've seen already
that in each pair of columns each of ++, +-, -+, and -- arises n/4 times.
This gives 12 linear conditions on 8 counts; they're not independent
but the only freedom is to add some constant c to each count with an
even number of minus sign, and subtract c from all the other four counts,
and this does not affect the sum of +++ and --- counts, etc.

DERIVED DESIGNS

Building up a 2-design to form a 3-design is quite special.
The other direction is easier -- we already saw that a t-design
is already a (t-1)-design, and more interestingly we can reverse our
above construction as follows (Def. 1.32, p.11): if \D = (X,\B)
is a t-(v,k,λ) design then for any point p in X we get a t-1 design
\D_p, the derived design with parameters (v-1, k-1, λ)  whose
point set is  X \ {p}  and whose blocks are  B \ {p}  for every
block B in \B that contains p.  Note that in general \D_p depends on p
(the main exception is when Aut(\D) acts transitively on X).

If a given t-design \D is derived from a (t+1)-design \E then
\E is said to be an extension of \D, necessarily with parameters
(v+1,k+1,λ).  For instance, a Hadamard 3-design is an extension of
each of the Hadamard 2-designs obtained from the same matrix.  Again,
extendability -- going from a t-design to a (t+1)-design -- is unusual;
applying to \E the identity bk=vr of (p.4, 1.8) already imposes a
nontrivial necessary condition (Prop. 1.33, p.11): k+1 must divide b(v+1).
[NB as noted already, there's a typo on p.11: the reference preceding
the statement of this proposition should refer to (1.8), not (1.7).
To prove it, note that \E's "r" is \D's "b".]

For example: Prop. 1.34- [p.11, Hughes 1961]: if a projective plane
of order n is extendable then n is 2, 4, or 10.

[n=2 we saw already.  n=10 is impossible -- and it turns out that it is
easier, though still not trivial, to prove there's no 3-(112,12,1)
design, but we won't even do that.  n=4 is possible, but we don't
know this yet, though the Proposition is written as if we do.]

Proof: n+2 must divide (n^2+n+1)(n^2+n+2).  But then n+2 divides the
value of the RHS at n = -2, which is 3*4 = 12, and we're done since we
don't allow n=1.

This Diophantine technique will recur several times in the course.

More generally:

Thm. (Cameron 1973, 1.35 on page 11) if \D is an extendable square
2-(v,k,λ) design then:

(b) v = (λ + 2) (λ^2 + 4λ + 2)  and  k = λ^2 + 3 λ + 1,
or
(c) (v,k,λ) = (495,39,3).

Example: in (b), λ=1 gives (v,k,λ) = (21,5,1): a projective plane
of order 4.  We'll see that all such planes are isomorphic, and that
they are extendable (which follows from the uniqueness result plus the
existence of a Steiner (3,6,22) system which you've proved in the last
problem set).  The next case is impossible (Bagchi's result cited on p.13);
beyond that I don't know.

[proof postponed]

Pf.: let \E be the purported extension, and redefine v,k to be those of \E
(so we'll have to decrement by 1 at the end).  If some \E_p is square
then every \E_p is square (same parameters).  So any two blocks are
either disjoint or meet in λ+1 points (if they both contain some p,
consider their residues in \E_p).  Conversely if \E satisfies this
condition then each \E_p is square by characterization 1.15d of
square designs (any two blocks have the same-sized intersection).
Also, since now (v-1,k-1,λ) are the parameters of a square design
the identity (1.9) becomes

(*)  (k-1)(k-2) = (v-2)λ.

Claim: fix a block, and consider \E^0 := (X^0, \B^0) where
X^0 = complement of B and \B^0 = blocks disjoint from B.
Then \B^0 is a 2-design.  Proof: fix p,q in X^0.  For each r
in B there are λ blocks B' containing p,q,r, and each r appears
in λ+1 of them.  So double-count pairs (B',r): for each of k
r's there are λ choices of B'; for each B' there are λ+1
choices of r, so the number of choices for B' is  k λ / (λ+1).
Hence the number of blocks containing p,q but disjoint from B is
constant, as claimed.  Indeed it is  λ_2  -  k λ / (λ+1),
and λ_2 = k-1  (apply (*) to the identity λ_2 = ((v-2)/(k-2)) λ
which holds in any 3-design); so the count is now
k-1 - k λ/(λ+1) = (k-λ-1) / (λ+1)  [I wish the book
spelled this out...].

We now apply Fisher's inequality to the 2-design \E^0, which has
parameters (v-k, k, (k-λ-1)/(λ+1)) and thus has

(v-k) (v-k-1) (k-λ-1) / (k (k-1) λ+1).

We must exclude the degenerate case v-k = k.  In this case
\D is a square design whose blocks have size (|X|-1)/2,
and is thus a Hadamard design; that's case (a).

Otherwise, some manipulation gives  k \geq (λ+1) (λ+2).
k = (λ+1) (λ+2)  gives case (b) after decrementing to
recover the parameters of \D (and then \E^0 is square too,
so we can hope to undertake further analysis; e.g. for λ=1
that's a square (16,6,2) design).

On the other hand, consider the integrality condition on the number
b = λ_0 of blocks of \E.  That number is v(v-1)(v-2) λ / k(k-1)(k-2),
which (*) simplifies to v(v-1)/k.  Using (*) to express this in terms of
k and λ rather than k and v we get

b = ((k-1)(k-2)+λ) ((k-1)(k-2)+2λ) / (kλ^2)

so k divides the numerator and thus divides 2(λ+1)(λ+2).
A factor of 2N that's at least as large as N is either N or 2N.
We've dealt with the former case already.  In the latter, we get
λ^2 b = 8λ^6 + ... + 9,  so λ | 9, and λ=9 yields
fractional b.  So λ=1 or 3.  The former case gives k=12, v=112
and thus an extension of a projective plane of order 10, which was
already known to be impossible in 1973.  The latter gives k=2*4*5=40,
v=496 and thus case (c).  QED

Feb. 12 -- intro to arcs and ovals; intersection triangles

You're already had to look up "oval" on page 17 to construct a
(3,6,22) Steiner system from the square Paley 2-(11,5,2) design.
An arc in a 2-design is a set S of points no three of which are
contained in a block.  (The book uses "n-arc" for an arc with
n points, and limits the definition to square designs.)
Equivalently, the intersection of S with any block B has size
0, 1, or 2.  For geometrically obvious reasons (at least in the
context of projective planes where the blocks are called "lines"),
B is said to be "tangent" to S if the intersection has size 1,
and "secant" if it has size 2; a block disjoint from S is said to be
"passant" to S.

Why do we care?  It turns out to be a useful notion in several contexts
when \D is a square design; in particular it will let us classify and
analyze projective planes of small order.  For starters (see p.18),
suppose we ask whether a projective plane \D = (X,\B) is extendable:
is there a 3-design \E such that \D is \E_p?  If so, we already know
the points of \E -- namely X u {p} -- and some of the blocks, namely
B u {p} for any B in \B.  Since B is square, any two blocks meet in
λ=1 point.  For each x in X, \E_x has the same parameters as \E_p,
so is also square with the same value of λ; hence any two of its blocks
meet in a unique point too.  Thus any two blocks of \E are either disjoint
or meet in 2 points.  (That's a special case of the argument at the
beginning of the proof of Cameron's Theorem 1.35 .)  So, any block of \E
that does not contain p is an arc.  Indeed we'll see it's an arc of
maximal size, which we'll call a "hyperoval".

How big can an arc be?

Prop. 1.46: A point of an n-arc in a square 2-(v,k,λ) design
lies on (n-1)λ secants and k-(n-1)λ tangents.  In particular
n ≤ 1 + (k/λ) [because the count of tangents is nonnegative].

Proof: let S be an arc and p be one of its points.  There are
n-1 other points of S, hence n-1 pairs {p,q} in S, each lying in
λ blocks B; each such B is a tangent, which determines q uniquely,
so there are indeed (n-1)λ such B's.  In a square design each point
is in a total of k blocks, and if such a block is not one of the
(n-1)λ secants then it's a tangent.  QED

An oval is an arc for which the tangent count is either 1 or 0;
that is, for which n is either  1 + (k-1)/λ  (an oval of Type I)
or  1 + kλ  (Type II) respectively.  You must know the distinction
but need not memorize which is which.  You should, however, remember that
ovals that have no tangents (and thus intersect each block in either
0 or 2 points) are called hyperovals.  Example: a line complement
in Π_2 is a hyperoval; removing any of its 4 points yields a Type I oval.
In the Paley 2-(11,5,2) design an oval has Type I and  1 + (5-1)/2 = 3
points; there are 11*10*3/3! = 55 ovals.  Type I is geometrically
familiar -- and we'll see (though not fully prove in class) that this
familiarity extends to algebraic projective planes P^2(k).

Observe that

Type I exists ⇒ k is 1 mod λ
Type II exists ⇒ k is 0 mod λ
Both exist ⇒ λ = 1 (so projective plane).

As usual, none of these is ⇔ !  For example:

Prop. 1.47 (p.18) If a square 2-(v,k,λ) design has a hyperoval,
and (v,k,λ) ≠ (3,2,1), then k ≡ λ mod 2.

Pf.: Fix a point p outside S -- such must exist, else S=X so k=2
and we have the trivial 2-(3,2,1) design (which indeed is an exception).
The number of secants containing p is  n λ / 2  (why?), which is
(λ + k) / 2,  so this must be an integer, QED.

Example: if a projective plane of order q has a hyperoval then 2|q.

In fact we can say more.  If there's a hyperoval in a projective plane
then removing any point yields an oval.  Conversely, an oval yields
a hyperoval iff all its tangents meet in a point.

Prop. 1.48 (p.18): If  k ≡ λ mod 2  and there's a Type I oval S
then any point on the design lies on either one or all tangents.

Cor.: in a projective plane with q even, all the tangents meet
in a unique point p (proof: intersect any two of them), called
its "center" (the book says "nucleus", which feels rarer to me,
or "knot", which I don't think I've seen elsewhere).  So we recover
a hyperoval  S u {p}.

Proof: We have  k(k-1) = (v-1) λ  (because it's a square 2-design)
and  n = 1 + (k-1)/λ.  Claim: k, λ, n are odd.  Proof: if λ
were even then k-1 would be odd, contradicting  λ | k - 1.
Since λ is odd, k-1 is even, so k is odd and so is n = 1 + (even)/(odd).
It follows that the number of tangents through each point is odd
(so in particular not zero).

Apply "variance trick" again, computing the first and second moments of
the numbers of tangents going through the points of X; this time it will
turn out that the variance must be as large as possible.  Let N_i be the
number of points that lie on exactly i tangents.  [NB the book calls it n_i,
which would lead to the monstrosity n_n.]  Then the N_i sum to v = #X;
i N_i  sums to  (# tangents) * (# points per tangent) = n k; and
i (i-1) N_i  sums to (# pairs of tangents) * (# points per pair) = (n^2-n) λ.
Rearranging yields  sum_i (i-1) (i-n) N_i = 0.  Since N_i = 0 for i=0 and i<n
it follows that the only nonzero N_i are N_1 and N_n, QED.

Exercise: use these counts to verify that N_n = λ, so one might say that
in general there are λ "nuclei".

Examples: in P^2, "conics" = zeros of "nonsingular quadratic forms" =
zeros of homogeneous quadratic polynomials P(x,y,z) over k that are
irreducible even over the algebraic closure.  [That's a necessary proviso:
consider x^2 + xy + y^2  mod 2,  or  x^2 + y^2  mod 3.]  Note/recall that
the set of zeros of a homogeneous P is well-defined even though its value
at a point of P^2 in general is not.  A conic is clearly an arc because
restricting to a line yields a homogeneous quadratic polynomial in two
variables that is not identically zero (else P had a linear factor).
In fact it is known -- though we will not prove this (yet another possible
paper topic) -- that:

(i) all conics are equivalent under PGL_3(k);
(ii) all conics are Type I ovals (that is, they have q+1 points);
and remarkably, one can prove in general that
(iii) if q is odd then every oval is a conic (Segre [1954]).

Given (i), we can prove (ii) by checking it for one conic, and it is
convenient to use the conic xz-y^2; indeed the general point is
(r^2:rs:s^2) for some (r:s) in P^1(k) (check that this makes sense;
algebraic geometers recognize this as the first nontrivial example of
a "rational normal curve", or the image of the "degree-2 Veronese map"
P^1 → P^2).  So a conic is identified with of the q+1 points of P^1.

When q is even: (iii) is true for q=2 and q=4 (we'll say more about
center, and remove some other point to get a non-conic oval; and for q>8
(except q=32) there are even hyperovals not of the form "conic + center".

Finally I owe you some words about "intersection triangles" (p.21).
That's a convenient way to do a bunch of inclusion-exclusion calculations
at once without an explicit sum of binomial coefficients etc.
The general setup is: we have a subset Y of X such that, for each
s up to l=#Y, the number of blocks containing any s-element subset S of Y
depends only on s; and we want to count the blocks whose intersection
with Y is exactly S.  The hypotheses always hold if \D is a t-design
with l ≤ t, and in some other cases, e.g. if Y is an arc, or notably
a block of a Steiner system, plus one other case explored in problems
6 and 7.  The method is as follows, and illustrates the paradox that
sometimes a problem is made easier by generalizing it (even though
one might expect that proving a more general result should be
at least as hard).  Here we ask: given nonnegative i and j with
j ≤ i ≤ l, an i-point subset I of Y, and a j-point subset J of I,
how many blocks intersect I in J?  (The problem we started with is
the special case I=Y.)  This depends only on i and j; the count ν_{i,j}
[ν = Greek letter nu] can be computed recursively by starting from
the known values ν_{i,i} (e.g. ν_{i,i} = λ_i  for i ≤ s) and then using

ν_{i,j} = ν_{i-1,j} - ν_{i,j+1}

to fill each row of a triangle from right to left.  See the example
on page 21 (Table 1.1); yes, the Gross responsible for Theorem 1.55 (1974)
on page 22 (which gives necessary conditions for the bottom row of such
a triangle in the Steiner case to contain a zero) is our recently retired
Benedict Gross.

Feb. 17 -- affine and inversive planes

There are two topics remaining in chapter 1 that we'll cover
(and several that we won't).  One is the construction of the
star examples of design theory: the Steiner systems whose
automorphism groups are the sporadic Mathieu groups (or, for
the (3,6,22) system, a group containing M_{22} with index 2).
This is described on pages 22-24, but requires various notions
that are postponed to later chapters (e.g. chapter 6 for the
uniqueness of Pi_4).  We'll get to them later in the course.
The other topic is our concern today (pages 13-15).

AFFINE PLANES

Recall that P^2(k) is obtained from the affine plane k^2
by adjoining a line at infinity.  It's called an "affine plane"
because there's no choice of origin -- imagine an unmarked blackboard
or flat sheet of paper extended out to infinity for the familiar case
of k=R, the real numbers.  Of course this means that we can recover k^2
by starting from P^2(k) and removing a line and all its points.
This is still a Steiner system: any two points in the affine plane
determine a unique line.  This follows just from the axioms of a
projective plane, so we don't need any algebra on k to check it.
If  \D = (X, \B)  is a combinatorial projective plane of order q,
and B is any of its lines, then the residual \D^B := {X \ B, \B \ {B}}
is a finite affine plane.  (More precisely, the collection of blocks
of \D^B isn't quite \B \ {B}, because a block other than B needn't be
a subset of X \ B; the blocks are  B' \ B  for blocks B' of \B other
than B itself.)  This \D^B is a 2-(q^2, q, 1) design with q^2+q blocks,
called an affine plane of order q.  If the projective plane was algebraic
then we recover the affine plane k^2, an "algebraic affine plane"
(which the book will call AG(2,q), a notation I'm deprecating).

We can try this for any design \D and block B, but for \D^B to be
itself a design we must have each B'\B of the same size, and if that's
to be true for all B then \D must be a square design.  Conversely,
the residual of a square 2-(v,k,λ) design is a 2-(v-k, k-λ, λ) design.

Warnings:

(i) Different choices of B may yield non-isomorphic residuals \D^B
with the same parameters.

(ii) \D^B could have repeated blocks  But then two blocks of \D meet in
at least k-λ points, whence k ≤ 2λ.  But since k(k-1)=vλ, this means
v \leq 2(k-1)  [the book has the slightly weaker bound 2k-1, which is
good enough], and this cannot hold for both a design and its complement.
So either \D or its complement \D@ (whichever has the smaller blocks) has
a residual design without repeated blocks.

Writing  (v-k, k-λ, λ) =: (v^B, k^B, λ^B)  we solve for
λ = λ^B,  k = k^B + λ^B,  v = v^B + k^B + λ^B.
Then the condition  k(k-1) = (v-1)λ  on the parameters of
the square design \D becomes

k^B (k^B + λ^B - 1) = v^B λ^B

(p.13; check this manipulation, and that it works for the parameters
(v^B, k^B, λ^B) = (q^2, q, 1) of a finite affine plane of order q).

A 2-design satisfying these constraints is said to be quasi-residual.
Not every such design need be the residual of an actual square 2-design,
but nicely enough the case of affine planes (with λ=1) always works!
See p.14, Prop. 1.40.  The proof: in such a design, any two lines
meet in either 0 or 1 points.  If the design is \D^B then two lines are
disjoint iff they were obtained from lines that meet at a point of B.
To reconstruct \D, we need an equivalence relation || on affine lines L
such that L||L' iff L||L' came from projective lines through the same
point of B.  So we've seen L||L' iff L=L' or L,L' are disjoint.
We say that L,L' are parallel in this case.  Reflexivity and
symmetry are clear, but why is this transitive?

Lemma: for every line L and point p of a finite affine plane
there's a unique line L' containing p and parallel to L.
(That is, "Playfair's axiom" holds for finite projective planes.)

Pf.: if p in L then L'=L works, and no other line does because then
two distinct parallel lines would meet.  (We use n>1 here; if n=1
there's only one choice for L' anyhow...)  If p not in L, there are
q+1 lines through p, of which q meet L (one for each point of L),
so a unique line not meeting it.

Corollary: || is an equivalence relation.

Pf.: The only nontrivial case is three distinct lines L,L',L" with
L||L'||L".  If L and L" were not parallel they would meet at some point
p which has two parallels to L', namely L and L", contradiction.

Thus: Prop. 1.40 -- a quasi-residual Steiner 2-design (=affine plane)
is residual; thus there's a finite projective plane of order q
iff there is a finite affine plane of the same order.

INVERSIVE PLANES

Remarkably k^2 always extends to a Steiner 3-design; this gives us
our second infinite examples of Steiner 3-designs, and again with a
3-transitive automorphism group.  The necessary condition (1.33)
[p.11: k+1 must divide b(v+1)] is always satisfied by a combinatorial
affine plane [q+1 | (q^2+q)(q^2+1)] but we already know that such
integrality conditions are rarely sufficient.

Again the construction is nicely motivated by a classical geometrical
picture over the real numbers.  The (2-)sphere S can be considered a
3-design whose blocks are the circles (not just "great circles").
Stereographic projection from the north pole P identifies this with
the classical inversive plane: the blocks become Euclidean circles
as well as lines ("circles through infinity" -- NB this is a different
compactification of R^2, with just one point at infinity rather than
a whole line).  It's called the "inversive plane" because inversions
are automorphisms, taking any block (circle of the sphere) to another
block.  You've seen that S is also the "Riemann sphere" C u {infinity},
which we now know is also \P^1(C), the "projective line over C".
The circles are just the images of its subset \P^1(R) under PGL_2(C).
This works just as well over a finite field k = F_q, since there is
a quadratic extension k' = F_{q^2} to take the place of C: the images
of \P^1(k) under PGL_2(k) are the blocks, and the blocks through the
point at infinite (0:1) are just the affine k-lines on k' which may be
identified with k^2 by an arbitrary choices of k-basis for k'.

The same construction works for any d>1 to give a 3-(q^d+1,q+1,1)
design whose points are the points of the projective line over
a field of q^d elements and whose derived design is affine d-dimensional
space over a field of q elements.  Your first problem on the next
problem set is to prove that this actually works.

The book notes another description via "ovoids" in P^3(k).  This
then you might do your final project on that construction, and maybe
also the Suzuki-Tits ovoids for q=8,32,128,... noted on page 16.

Mild warning: the 3-(q^d+1,q+1,1) designs coming from P^1(F_{q^d})
are sometimes called "spherical geometries".  Indeed over R
there are inversive spaces of all dimensions d, each of which
gives an infinite Steiner 3-design -- namely, the d-sphere with
its circles -- and the derived 2-design is R^d with its lines.
But once d>2 there's no directly analogous construction in the
finite case: R has no algebraic extension of degree d, while
quadratic forms in d+2>4 variables over a finite field don't work
because they have isotropic lines whereas the Euclidean sphere
contains no Euclidean lines.

Feb. 22 -- intro to strongly regular graphs (chapter 2)

Def. 2.1 (p.29): graph G = (V,E) where V is a finite set of "vertices"
(singular = "vertex") and E is a collection of two-element
subsets of V ("edges").  Equivalently, a finite set with a
relation that is symmetric and never reflexive.  So we're
dealing only with undirected graphs -- edge {v,v'} is the same as
{v',v} -- without loops {v,v'} or multiple edges.  We say v is
adjacent to v', or a neighbor of v', in the graph iff
{v,v'} is an edge; the latter term suggests neighborhood of v =
set of all vertices v' adjacent to v, denoted G(v).  Also we can say
v is "incident" with an edge e iff v is one of its two vertices.
[Isomorphism and automorphism of graphs as expected: isomorphism
G=(V,E) → G'=(V',E') is a bijection V → V' that induces a bijection
E → E'; if G=G' this is an automorphism, and automorphisms form a group.]

So a graph is at least a 0-(n,2,#E) design, and never a 2-(n,2,λ)
design except in the trivial cases of the null graph (E is empty, λ=0)
or the complete graph (E consists of all 2-element subsets, λ=1).
Graphs that are 1-(n,2,d) designs are called regular of degree d.
(In general the  degree  of a vertex v of G is the number of edges
containing it; the text calls this the "valency" (p.30), which makes
some sense if you think about organic chemistry and forget double bonds.
So a graph is regular iff all vertices have the same degree.)  But
we already know that we don't expect to do much with 1-designs.
How to impose more structure than t=1 (too flabby, except for degree
1 or 2 and their complements) but less than t=2 (only trivial examples)?

As suggested by the Petersen graph:

Def. 2.4 (p.32) a graph G that is neither null nor complete is
strongly regular iff the number of common neighbors of any two
vertices v,v' depends only on whether v and v' are equal, adjacent or

In the first case that number is just deg(v) so the graph must be
regular.  So we have two parameters: n=#V, k=degree.  The other two
cases give us two more parameters.  We say the graph is regular with
parameters (n,k,λ,μ) if:

n vertices;
regular of degree k;
any two adjacent vertices have λ common neighbors;
any two non-adjacent vertices have μ common neighbors.

[NB: i) excluding null or complete graphs relieves us from the
conundrum of an ill-defined λ or μ respectively.
ii) The book uses Γ, and you're welcome to do so too, but
G is easier to type or TeX (though no easier to handwrite).
iii) I use American spellings but you won't be penalized for
using the text's British spellings such as "neighbours" ;-) ]

Examples:

The complement of G@ of a graph G = (V,E) is (V,E@) where
E@ is the set of all two-element subsets of V not in E.
(Same as the complement of a t-design.)  G is regular of degree k
iff G@ is regular of degree #V-k-1.  G is strongly regular with
parameters (n,k,λ,μ) iff G@ is strongly regular with parameters
(n, n-k-1, λ@, μ@) with

λ@ = n-2 + μ - 2k,
μ@ =  n + λ - 2k

on the present problem set.  As the text notes, since λ@ and μ@ are
nonnegative we must have n ≥ 2k-λ, n ≥ 2k - μ + 2.)

(2.8)
i) The "triangular graph" T(m) has (m^2-m)/2 vertices, indexed by
2-element subsets of an m-element set S; two are adjacent iff
they have an element in common.  (So the complement has two pairs
adjacent if disjoint.)  Parameters: ((m^2-m)/2, 2m-4, m-2, 4) [why?].
We must take m≥4, else the graph is complete (and also null if m<3...).
T(4) is the octahedron, complement of 3.K_2:  12--34  13--24  14--23.
T(5) is the complement of the Petersen graph, as you can check from
its labeling in the introductory handout.
These examples may be misleading: for m large enough T(m) has smaller
degree than its complement (how large is large enough?).  See the
second picture on p.35 (Fig. 2.3) for T(9), and note the convention
(p.34) for drawing this graph so the picture doesn't become an ugly clutter.
Automorphism group contains S_m, and usually equal to it, though with a
couple of exceptions -- trivial for m=2, a bit more interesting for m=4.
We'll use this to study the structure of S_m later in the course.

ii) "square lattice graph" L_2(m): vertices = S x S with #S = m;
adjacent iff common coordinate; parameters (m^2, 2m-2, m-2, 2).
Why doesn't S x T work if #S does not equal #T?  Must have m>1;
L_2(2) is a 4-cycle and the complement of 2.K_2 (the only cycles
that are strongly regular with our definitions are the 4- and 5-cycle);
for L_2(3) see Figure 2.2 (p.34); this graph is its own complement
(Problem 5ii = page 45, exercise 4ii).  Automorphism group?
It contains (and we'll soon easily show it equals) the wreath product
of S_m with S_2, with 2*m!^2 elements.

iii) (2.9, p.34-35) for r>1 and m>1 the disjoint union r.K_m of
r complete graphs of order m is strongly regular with parameters
(rm, m-1, m-2, 0) -- the only strongly regular graphs with μ = 0
(problem 2).  See page 34-35 for some further quaint terminology
involving some of these graphs and their complements.

iv) (2.10, p.35: what happens to Paley's construction of Hadamard
matrices when q is 1 mod 4?)  If q is a prime power congruent to 1 mod 4:
Paley graph P(q) with vertices V=k=F_q and x,y adjacent iff x-y is
a nonzero square (recall -1 is a square here so this relation is symmetric).
Strongly regular with parameters (q, (q-1)/2, (q-5)/4, (q-1)/4).  Proof is
problem 5 of homework 4 (= CvL problem 6 on page 45).  First few examples:
P(5) = pentagon; P(9) = L_2(3).

[in class we did this in the order iii, i, ii, and postponed iv = Paley
till Wednesday.]

As with 2-designs, we aim to describe, at least in some cases,
which parameters (n,k,λ,μ) are feasible, and to describe
an easy algebraic condition:

Proposition 2.6 (p.33): If a strongly regular graph has parameters
(n,k,λ,μ) then

k (k-λ-1) = (n-k-1) μ.

Proof: double count, of course.  Fix x, count edges {y,z} where
y is a neighbor of x but z isn't (and does not equal x either).
There are k choices for y; for each of them there are k neighbors,
of which one is x and λ are the common neighbors of x and y,
so there are k-λ-1 choices for z.  On the other hand there are
n-k-1 choices for z, and for each of them there are μ choices for y, QED.

You should verify that the parameters of G@ satisfy this necessary condition
iff those of G do, and that the parameters of the graphs we've seen so far
(triangular, square lattice, r.K_m, and Paley) satisfy it as well.

Next time we'll introduce an adjacency matrix of a graph, and use it
to obtain a further necessary condition on n,k,λ,μ that is
considerably subtler and quite powerful -- e.g. it's the source of the
result noted in the introductory lecture that if there is a Moore graph
of degree d then d is in {2, 3, 7, 57}.

Feb. 24 -- the adjacency matrix and the integrality condition (p.36-38)

Let G=(V,E) be a graph with n vertices.  Choose a labeling v_1,...,v_n
of the vertices.  A basic construction for much of graph theory:
The adjacency matrix A=A(G) is the matrix whose (i,j) entry a_{i,j}
is the number of edges from v_i to v_j.  As with the incidence matrix
of a design, this will turn out to be much more than mere bookkeeping.
Strictly speaking A depends not just on G but on the labeling, but
different labelings yield similar matrices  P^{-1} A P  for some
permutation matrix P (more on this below).  Also:

The entries a_{i,j} are nonnegative integers (this might change for
weighted graphs, but we're not going in that direction).

No loops: the diagonal entries a_{i,i} are all zero.  (In particular
the trace of A is zero.)

Edges are undirected:  A is symmetric.  So we can use results special to
real symmetric matrices, notably the decomposition of R^n as an
orthogonal direct sum of eigenspaces.

No multiple edges: each a_{ij} is either 0 or 1.

Since we use {1,2,...,n} to index both vertices and rows/columns,
it's useful to think of the (i,j) entry a_{i,j} as being really the
(x,y) entry where  x = v_i,  y = v_j.  This makes it canonical:
A is really a linear operator (self-transformation) on a vector space R^V
with an orthogonal basis e_x (x in V), and it takes e_x to the sum of e_y
over y in the neighborhood of x -- morally speaking it takes x to G(y).
For example, the degree of x is the inner product of Ae_x with the
all-1's vector \j, which (since A is symmetric) is also the inner product
of e_x in A\j.  A graph is regular of degree k iff A\j=k\j, i.e. \j is a
k-eigenvector.  The adjacency matrix of the complementary graph G@ is
J-I-A (where J is the all-1's square matrix of order n, as before).

So what does A^2 do?  It takes e_x to the sum of e_y, and then each
e_y to a sum of its e_z's; so the (x,z) entry is the number of paths
(x,y,z) of length 2 (note that x=z is possible).  Likewise the (x,x')
entry of A^3 is the number of length-3 paths (x,y,z,x') (again with
repeats and trackbacks allowed), etc.  For that matter the (x,y) entry
of A^1 is the number of length-1 paths, i.e. the number of edges {x,y};
and the (x,y) entry of A^0 is 1 if x=y and 0 else, so it does behave
like the number of "length-0 paths"...  For an undirected graph, the
length-m paths from x to x' are in bijection with the length-m paths from
x' to x (just retrace the path backwards), so A^m is symmetric, as
it should be since A is.

Now we know from linear algebra that we can analyze A^k via the
"spectrum" (i.e. eigenvalues) and eigenspaces of A.  This is one reason
why the adjacency matrix and its spectrum is such an important graph invariant.
[Note that the spectrum really is an invariant: reindexing the vertices
changes A to a similar matrix but does not change the spectrum; equivalently,
it's just a different indexing of the coordinates for the linear operator
on R^V.]  A further example: the diagonal entries of A^m are the number of
closed paths (x_1,x_2,...,x_m) of length m going through a given vertex x_1;
thus the sum is the number of closed paths with no restriction on the x_m
(and with a choice of initial point x_1 and direction, e.g. each pentagon
counts 10 times).  But the sum of the diagonal entries of a matrix is its
trace, and the trace of A^m is the sum of the m-th powers of its eigenvalues
counted with multiplicity.

There's even a remarkably fruitful analogy with the spectrum of the
Laplacian in analysis (the motivation is that the Laplacian can be
visualized as taking x to a(x)-x, where a(x) is the average of an
infinitesimal neighborhood of x); this connection has been productive
in both directions, with ideas and results in graph theory suggesting
analogues in analysis -- some proved, some open -- and vice versa.

===

Back to the adjacency of a graph and its spectrum.
It's particularly nice when A is a strongly regular graph.  We already saw
that A is regular of degree k iff A\j=k\j; as happened in Chapter 1, this
is tantamount to AJ=kJ, and since here A and J are symmetrical we deduce

(p.37, 2.14)  A J = J A = k J.

If G is strongly regular then we also get to describe A^2: its (x,z) entry
is the number of paths (x,y,z), which is just the number of common neighbors
y of x and z.  And we already know what that is:

k if x=z;
λ if x is adjacent to z; and
μ if x is disjoint from z.

This means

(p.37, 2.13)  A^2 = k I + λ A + μ (J-I-A).

Conversely a graph is strongly regular iff it satisfies (2.13) and (2.14)
(and is neither complete nor null).  Applying (2.13) to \j yields

k^2 = k + λ k + μ (n-k-1),  i.e.  k (k-λ-1) = μ (n-k-1)

as seen already (2.6).  Since \j is an eigenvector and A is symmetric,
A acts on the orthogonal complement of \j (recall the usual argument:
if ⟨u,\j⟩ = 0 then ⟨Au,\j⟩ = ⟨u,A\j⟩ = 0 because A\j is a multiple of \j).
This orthogonal complement is sometimes denoted R^V_0: it's the space of
vectors in R^V whose coordinates sum to zero.

What could be the eigenvalues of this operator on R^V_0?
Since the letter λ is already in use, let ρ be an eigenvalue, and u
a (nonzero) eigenvector, so  Au = ρu.  Since R^V_0 is the kernel of J
we have Ju=0, so (2.13) yields

ρ^2 u = (k + λρ - μ(ρ+1)) u

and since u is nonzero this is equivalent to the quadratic equation

ρ^2 - (λ-μ) ρ + (μ-k) = 0.

let r and s be the roots.  Note that they are always real -- as they
must be for a self-adjoint transformation -- because  r s = μ - k ≤ 0.
Moreover they're distinct, else λ=μ=0 and we have a null graph.
So we may, and always do without loss of generality, order them so r>s:

r = (λ - μ + \sqrt(D)) / 2,
s = (λ - μ - \sqrt(D)) / 2.

where  D = (λ-μ)^2 + 4 (k-μ).

Let f and g be the multiplicities.  We compute them by solving
linear equations: since R^V_0 is the direct sum of the r- and s-eigenspaces,
we compare dimensions to get  f + g = n - 1;  and since A has trace zero,
fr + gs + k = 0  (remember that A also has the 1-dimensional k-eigenspace
spanned by j).  These equations are independent because r and s are
distinct, and we find an elementary if somewhat offputting formula
(bottom of page 37):

f, g = (1/2) ( n - 1 ± ((n-1)(μ-λ) - 2k) / sqrt(D) )

with D as above.

Theorem (p.37, 2.16 = integrality condition)  f and g are
nonnegative integers.

[Check that this is the case in each of our arsenal of examples so far.
How are the eigenvalues r,s and multiplicities f,g of the complementary
graph related with those of G?]

It almost follows that D is a square; that is already a strong
condition, and already implies that f and g are rational -- which is
why Theorem 2.16 is sometimes called the rationality condition.
Why "almost"?  Because it could be that the numerator vanishes,
i.e. that (n-1)(μ-λ) = 2k !  This can indeed happen, and
already is the case for the pentagon with (n,k,λ,μ)=(5,2,0,1)
(and indeed for all Paley graphs, of which the pentagon is the first
example).  In that case we say G is of Type I.  We then have

n = 1 + 2k/(μ-λ)

but also (since G is not complete)

n > 1 + k

so the denominator μ - λ is positive but less than 2.
Hence  λ = μ - 1  and  n = 1+2k; the condition
k (k-λ-1) = μ (n-k-1) then yields  k = 2μ  so

Type I ⇔ (4μ + 1,  2μ,  μ - 1,  μ)

(and D = n = 4μ+1,  f=g=2μ), as with Paley.  Similar to BRC (Bruck-Ryser Chowla,
Thm. 1.21 on p.7), we then have (but don't prove) a further condition,
here due to Van Lint and Seidel 1966, that forces n=4μ+1 to be a sum of
two squares -- so for n<45 we get 5, 9, 13, 17, 25, 29, 37, 41, all
prime powers so there are Paley graphs, while 21 and 33 are excluded;
but 45 is allowed, and -- unlike the situation with finite projective planes
-- actually arises (Mathon 1975).

If not Type I then D is a square; such graphs are said to be of Type II.
These are not mutually exclusive: already L_2(3) = P(9) has both
properties (in general Type I is also Type II iff n is a square).

Feb. 26 -- Moore graphs; two more conditions: "absolute" and Krein (pages 41-42)

Moore graphs: Suppose G is a regular graph of degree k on n vertices.  Then:

G has diameter at most 2 ⇒ n ≤ k^2 + 1;
G has girth at least 5 ⇒ n ≥ k^2 + 1;
Equality in either implies both conditions hold.

In the case of equality we say G is a Moore graph
of girth 5.  (In the last problem of PS3 we gave a similar
analysis of minimal graphs of degree d and girth 6.
For similar results for other values of the girth, see e.g.
Bollobás's book Extremal Graph Theory.)  Equivalently,
a Moore graph of degree 5 is a strongly regular graph with
parameters (k^2+1, k, 0, 1).

Recall: the adjacency matrix A of a strongly regular graph with
parameters (n,k,λ,μ) satisfies A\j = k\j and

(p.37, 2.13)  A^2 = k I + λ A + μ (J-I-A).

Thus any eigenvalue ρ of an eigenvector orthogonal to \j is

ρ^2 u = (k + λ ρ - μ(ρ+1)) u

whence the eigenvalues are the distinct real numbers

r = (λ - μ + \sqrt(D)) / 2,
s = (λ - μ - \sqrt(D)) / 2.

where  D = (λ-μ)^2 + 4 (k-μ),

with  multiplicities

f, g = (1/2) ( n - 1 ± ((n-1)(μ-λ) - 2k) / sqrt(D) )

with D as above.  Hence

Theorem (p.37, 2.16 = integrality condition)  f and g are
nonnegative integers.

Example (see exercise 10, page 46): a Moore graph (of girth 5)
is strongly regular with parameters (k^2+1, k, 0, 1).  So Type I
iff k=2 (seen already), and otherwise D = (0-1)^2 + 4(k-1) = 4k-3 is
a square, necessarily odd so 4k-3=(2m+1)^2 and k=m^2+m+1 for some m>1.
m=1 and m=2 yield k=3 (Petersen) and k=7 (Hoffman-Singleton) respectively.
Any m is allowed by rationality, but integrality limits us to m=1,2,7
so k=3, 7, or 57.  Here the numerator  n - 1 + ((n-1)(μ-λ) - 2k) is
k^2 - 2k = (m^2+m+1)(m^2+m-1), while sqrt(D) = 2m+1, so we need

2m + 1 | (m^2+m+1)(m^2+m-1).

We've already seen the technique: write the RHS as a multiple of the LHS plus
a remainder, which is a constant of which 2m+1 must be a factor.  Trouble is
the remainder here is -15/16 (the value at the root m=-1/2).  OK, but if we
multiply the RHS by 16 then it's still a multiple, and then

16(m^2+m+1)(m^2+m-1) = (8m^3+12m^2+2m-1) (2m+1) - 15

so  15 | 2m+1,  whence 2m+1 is 1, 3, 5, or 15, so m = [0,] 1, 2, or 7
as claimed.  We can check that in each case the division by 16, and
later by 2, introduces no denominators -- indeed the division by 16
had to work because 2m+1 is odd -- so the integrality criterion allows
any of 2, 3, 7, 57 and as noted already the first three are known to
arise, each uniquely up to automorphisms, and the last is a famous
open problem.

======================================================================

Two more conditions.  The first arises from the observation that
we can project the n standard unit vectors to the r- and s-eigenspaces,
obtaining a configuration of vectors of the same length with only
two different inner products (other than the squared length), depending
on whether the corresponding vertices are adjacent or disjoint.
(The text outlines a difference construction.)  Indeed if we write each
e_x as (1/n) \j + u_x + v_x  with u,v in the r- and s-eigenspace we have

1 = ⟨e_x,e_x⟩  =  1/n  +   ⟨u_x,u_x⟩ +  ⟨v_x,v_x⟩
0 = ⟨e_x,Ae_x⟩ =  k/n  +  r⟨u_x,u_x⟩ + s⟨v_x,v_x⟩

which determines ⟨u_x,u_x⟩ and ⟨v_x,v_x⟩ likewise for distinct x and y

0    = ⟨e_x,e_y⟩  =  1/n  +   ⟨u_x,u_y⟩ +  ⟨v_x,v_y⟩
a_{x,y} = ⟨e_x,Ae_y⟩ =  k/n  +  r⟨u_x,u_y⟩ + s⟨v_x,v_y⟩

(where the entry a_{x,y} of A is 1 or 0 according as x,y are adjacent or
disjoint), and this determines ⟨u_x,u_y⟩ and ⟨v_x,v_y⟩ in terms of a_{x,y}.

Theorem (2.23+, p.14) [Delsarte-Goethals-Seidel 1977]: if there is
a set S of unit vectors in some inner product space R^f for which
⟨v,v'⟩ (v,v' distinct) takes on only two distinct values then #S ≤ f(f+3)/2.

Proof: Let b,c the two values of ⟨v,v'⟩; necessarily b,c < 1.
[The text uses Greek letters β, γ for our b,c.]
Let f_v be the quadratic polynomial functions on R^f:

f_v(x) = (⟨v,x⟩ - b) (⟨v,x⟩ - c) + b c (⟨x,x⟩ - 1)

on R^f.  Thanks to the last term they live in a space of dimension
f(f+3)/2 [the constant term vanishes].  But evaluation on v' in S
shows that they are linearly dependent.  Hence their number |S|
is at most f(f+3)/2 as claimed.

[Similarly, for m distinct inner products there's an upper bound on |S|
that grows as f^m/m!.  The best bound of this form requires some
finesse with adjustments analogous to "+ b c (⟨x,x⟩ - 1)"
but more involved and we won't say more about it here.

Yet another variant: suppose we have a set S of n pairs {v,-v} of
unit vectors in R^f, with only one pair {c,-c} of inner products
other than 1 and -1.  Then  f_v(x) = ⟨v,x⟩^2 - c^2 + c^2 (⟨x,x⟩ - 1)
is an even quadratic polynomial without constant term (and thus
homogeneous) that vanishes on all of S except {v,-v}, so the number of
such pairs is at most the dimension (f^2+f)/2 of the space of
homogeneous quadratic polynomials in f variables.  Here there are
more examples of configurations attaining the bound, starting with the
3 opposite pairs of vertices of a regular hexagon (f=2) or the 6 of
a regular icosahedron (with f=3).

Back to strongly regular graphs...]

Corollary ("absolute bound"): in a strongly regular graph n is at most
min(f(f+3)/2, g(g+3)/2).

Proof: apply 2.23 to the normalized vectors ⟨u_x⟩/|u_x| in R^f and
⟨v_x⟩/|v_x| in R^g.

Example: the pentagon gives an example of equality with f=2
(it really is a regular pentagon or pentagram in R^2 !).
The text gives another example where otherwise plausible
parameters are excluded.  In general equality is rare because
f and g are typically both near their average (n-1)/2.
I think only two further examples are known: (f,n)=(6,9)
[the Schläfli graph or its complement, related with the
exceptional E6 root system -- coming attraction] and (22,275)
[the McLaughlin graph or its complement, related with a sporadic
simple group in the Conway/Leech family; that's too exotic for us].

The Krein condition or Krein bound (Thm. 2.26, p.42) exploits:

(i) a formula for the projection  E_r: x → u_x  as a linear
combination of J, I, and A, and

(ii) a trick involving Kronecker products.

These are individually of interest but I have no special wisdom to
impart on the final result, except that it lets us exclude a few more
possible parameters as you're doing for Exercise 3 on page 45 (= prob.4).
For the record the bound says  (r+1) (k+r+2rs) ≤ (k+r) (s+1)^2
and likewise with r and s switched.

(i) since we have

Je_x = \j
Ie_x = (1/n) \j + u_x + v_x
Ae_x = (k/n) \j + r u_x + s v_x

and r,s are distinct, we can extract the map  E_r: x → u_x  as an
explicit linear combination of J, I, A, namely

[((k-s)/n) J + (sI-A)] / (s-r).

In particular the (v,v') entry again depends only on whether v,v' are
the same, adjacent, or disjoint.  But this matrix is positive semidefinite:
⟨v, E_r v⟩ ≥ 0 for all v (equality iff v in the span of \j and the s-eigenspace).
Now use:

Lemma 2.25 (p.42): if a matrix M=(m_{ij}) is positive semidefinite then
so is  M o M = (m_{ij}^2).

Proof: M o M is symmetric, and the self-adjoint linear transformation
associated to it is the restriction to a coordinate subspace of the
tensor square of the transformation M, QED.

Now write M o M as a linear combination of J, I, A and compute its
eigenvalues (it acts on each eigenspace of A by scalar multiplication),
obtaining the Krein bound from the condition that these eigenvalues
be nonnegative.

Hint / Word to the wise: the decomposition  x = (1/n) \j + u_x + v_x
and the associated inner-product computations have other uses; e.g. try
the sum of u_x or v_x over x in a (co-)clique.

Feb 29: Groups plan; uniqueness and automorphism group of
Π_2 (again) and Π_3

The second part of this course focuses on (usually finite) groups that
can be studied using some of the designs or strongly regular graphs
we've encountered and/or can shed light on them.  The groups we will
focus on are:

@ The symmetric and alternating groups S_n and A_n;

@ The linear and groups GL_n(k), SL_n(k) and projective linear groups
PGL_n(k), PSL_n(k) (usually k is finite); and

@ Mathieu's sporadic highly transitive subgroups M_n of S_n
(n=11, 12, 22, 23, 24).

In particular we'll prove the simplicity of A_n (n>4) and
PSL_n(k) (n>2, or n=2 and #k>3), explaining how simplicity fails
in small cases like A_4 and PSL_2(F_3), and also of M_n (which will
lead us to the corresponding Steiner systems); and use some of our
designs and graphs to explain some other remarkable behavior like
coincidences among some of these groups (listed below) and the
automorphism group of S_6 (which, uniquely among all S_n, is
larger than its group of inner automorphisms).  We'll temporarily
leave our textbook, relying on my text and TeX notes and some other
sources, except for using some of Chapter 6 (on Aut(S_6)).

The isomorphisms we'll address are as follows (listed according to
group size):

12  A_4 ≅ PSL_2(F_3)
'  24  S_4 ≅ PGL_2(F_3)
60  A_5 ≅ (P)SL_2(F_4) ≅ PSL_2(F_5)
' 120  S_5 ≅ ΣL_2(F_4) ≅ PGL_2(F_5)
168  (P)SL_3(F_2) ≅ PSL_2(F_7)
' 336  Aut((P)SL_3(F_2)) ≅ PGL_2(F_7)
360  A_6 ≅ PSL_2(F_9)
20160  A_8 ≅ (P)SL_4(F_2)

Each ' denotes a group G' that is not simple but is Aut(G) where
G is the simple group in the preceding line [G':G] = 2.
For G=(P)SL_3(F_2), an outer automorphism is inverse transpose,
corresponding to the action of G on the dual 3-dimensional space over F_2.
As we'll see Aut(A_6) is more complicated, containing A_6 with index 4
rather than 2; in the F_9 picture it's "PΓL_2(F_9)".
"(P)SL_n(k)" means the group is both PSL_n(k) and SL_n(k),
which as we'll see happens whenever the group of nth roots of unity
in k (i.e. the n-torsion group of k*) is trivial, which in turn
means n is relatively prime to |k*| = |k|-1.  The full list of
exceptional isomorphisms is about thrice as long, but includes
orthogonal, unitary, and symplectic groups, which we won't get to
in Math 155.  The ATLAS lists them all (though not, I think,
in a single place).

We'll introduce our approach by proving the uniqueness up to
automorphisms of the projective planes of order at most 5
and identifying their automorphism groups.  In each case we
do this by showing that all (hyper)ovals are equivalent, i.e.
by showing that a combinatorial projective plane Π must have a
(hyper)oval O and that Π can be reconstructed from O.

Today we review Π_2 and prove the result for Π_3.

For Π_2 we have:

Theorem: Let Π be a projective plane of order 2.  Then Π has
7 hyperovals, namely the line complements.  If O is a hyperoval of Π,
and O' is a hyperoval of some other projective plane Π' of order 2,
then any bijection f: O → O' extends uniquely to an isomorphism
Π → Π'.

Proof: The complement of a line l is a hyperoval because it has the
correct size (7-3 = 4 = 2+2) and meets every line l' in either 0 or 2
points (the former iff l'=l).  Conversely any hyperoval O has Bin(4,2)=6
secants, so 7-6=1 passant line l, and since the complement of l has size
4 = |O| it must be the same as O.

Now given f: O → O' we know the images in Π of the four points of O
and the six secants; thus the remaining line of Π, the unique passant
of O, must go to the unique passant of O'.  Since we know the images of
all lines, the images of all points are determined (since each point
is the unique intersection of the 3 lines containing it, or indeed of
any 2 of them).  This shows that the extension of f, if it exists,
is unique.

If O = {p1,p2,p3,p4} we then have the following roster of points and
lines in Π: the lines are

6 secants s_{ij} determined by p_i and p_j
1 unique passant line l
7 total

4 points of O: p1, p2, p3, p4
3 points off O: intersections s_{12} s_{34}, s_{13} s_{24}, s_{14} s_{23}
7 total

Moreover we know the 3 points on each line: s_{ij} contains s_i, s_j,
and an "off O" point where s_{ij} intersects another secant; and l
contains the three "off O" points.  These satisfy the axioms of a
finite projective plane of order 2, as can be checked either directly
or by observing that we already have one example.  So we can reconstruct
O' from f(p1), f(p2), f(p3), and f(p4) to complete the isomorphism f,
QED.

Corollary: i) Π_2 is the unique projective plane up to isomorphism.
ii) Aut(Π_2) = (P)GL_3(F_2) = (P)SL_3(F_2), and this group acts
transitively on hyperovals of Π_2.

Proof: (i) is clear: fix a hyperoval O in Π_2; for any projective
plane Π' of order 2 we can choose any of its 7 hyperovals O',
and any bijection f : O → O', and extend f to an isomorphism
Π → Π'.

(ii) Take Π' = Π_2.  Then there are 7 choices of O and 4! bijections,
whence |Aut(Π_2)| = 168.  Since Aut(Π_2) contains (P)GL_3(F_2),
a group of order 168, this must be the full group of automorphisms.
By our theorem these include automorphisms taking any oval O to
any other oval O'.  QED

Moreover: Fix a line l, let O be its complement, and consider the
stabilizer H, which is the subgroup of Aut(Π_2) consisting of
automorphisms g such that g(O)=O (equivalently: such that g(l)=l).
Our Theorem gives an isomorphism from H to the group S_4 of permutations
of O.  On the other hand, H acts transitively on l; this recovers the
group homomorphism S_4 → S_3.  Explicitly: if we label the points of O
p_1, p_2, p_3, p_4 then each point p of l is on two secants, and thus
determines a partition of O into two pairs -- and this partition
uniquely determines p as the intersection of the corresponding secants.
But there are only 3 partitions, so each one arises -- and this is
tantamount to the usual construction of the homomorphism S_4 → S_3
by the action of S_4 on such partitions.

So what happens for Π_3?

Theorem: Let Π be a projective plane of order 3.  Then Π has
234 ovals.  If O is a oval of Π, and O' is a oval of some other
projective plane Π' of order 3, then any bijection  f: O → O'
extends uniquely to an isomorphism Π → Π'.

Proof: We count ordered ovals (p1,p2,p3,p4).  There are:

13 choices for p1;
12 = 13-1 choices for p2 given p1;
9 = 13-4 choices for p3 given p1 and p2
(namely the complement of the line joining p1 and p2); and
4 = 13 - 3 - 3*2 choices for p4 given p1, p2, p3
(namely the complement of the three lines joining two of
p1, p2, and p3, which are distinct because of how we chose p1,p2,p3)

for a total of 13*12*9*4 = 5616.  Since that's 4! times the number of
unordered ovals, the count is 5616/234 as claimed.

Now fix an ordered oval O = {p1,p2,p3,p4}.  It has four tangents,
call them t1,t2,t3,t4, and six secants, call them s_{i,j} for distinct
i,j in {1,2,3,4}.  Given f: O → O' we know their images in Π,
call them

p1', p2', p3', p4';  t1', t2', t3', t4'; and s'_{i,j}).

There are six pairs of tangents, each determining an intersection point.
We claim that these are all distinct.  Indeed each point of Π - O
lies on either 0 or 2 tangents -- not 1 or 3 by parity, nor 4 because
(aha) Π has no hyperovals.  So once two tangents meet at a point
there are no others.  So we've located six more points of Π
(the intersections of tangent pairs), and thus their images in Π'.
The remaining 13-4-6 = 3 points of Π each lie on two secants, so
determine a partition of O into two pairs -- and as in our analysis of
Π_2 it follows that each of the three partitions arises once.
So we've determined the images of all 13 points, and this fixes
all the line images as well.  This proves that f, if it exists,
is unique.

The remaining 3 lines can be identified as follows.  Consider
the intersection of t1 and t2.  The four lines through it are
t1, t2, s(34), and a passant line l.  What other points are on l?
Not the intersection of t1 with t3 or t4 (because we already know
where l meets t1), nor the intersection of t2 with t3 or t4
(for the same reason).  But there are only three l's to be shared among
six such points, so l must also contain the intersection of t3 and t4.
Each of the other two points of l lies on two secants.  There are 3 such,
and we cannot use the intersection of s(34) and s(12), because we
already know where each of these secants meets l.  So it must be
the other two.  This completes the description of Π_3 starting from
a labeled oval; we could now check directly that it satisfies the
axioms of a projective plane but it is simpler to note that we already
have P^2(F_3) so we must have reconstructed it.  So we can complete the
reconstruction of f: Π → Π' knowing O' and the images on O' of
p1, p2, p3, p4, QED.

Corollary: i) Π_3 is the unique projective plane up to isomorphism.
ii) Aut(Π_3) = PGL_3(F_3) = PSL_3(F_3), and this group acts
transitively on ovals of Π_3.

Proof: (i) is clear: fix a hyperoval O in Π_3; for any projective
plane Π' of order 3 we can choose any of its 234 hyperovals O',
and any bijection f : O → O', and extend f to an isomorphism
Π → Π'.

(ii) Take Π' = Π_3.  Then there are 234 choices of O and 4! bijections,
whence |Aut(Π_2)| = 13*234 = 5616.  But Aut(Π_3) contains PGL_3(F_2),
a group of order 26*24*18/2 = 5616.  So this must be the full group of
automorphisms.  By our theorem these include automorphisms taking any
oval O to any other oval O'.  QED

Remark:  The fact that every point of Π - O is on 0 or 2 tangents
(while of course every point of O is on exactly 1 tangent) means that
the tangents constitute an oval O* in the dual projective plane Π*.
The tangents of O* in turn are the points of O, so O**=O as should be
the case for a duality.  This works for a conic over any field not of
characteristic 2 (we already saw that in characteristic 2 the tangents
to an oval all meet at a point).

March 2: Uniqueness and automorphism group of Π_4 (day 1)

We just finished showing: for q=2 and 3, a projective plane Π
of order n has respectively 7 hyperovals or 234 ovals O, so
4!*7=168 and 4!*234=5616 ordered hyperovals, and Aut(Π) acts
simply transitively on them; moreover Π is unique up to isomorphism:
for any other Π' and O', any bijection  O → O'  extends uniquely to
an isomorphism  Π → Π'.

Can we expect this to hold in general?  Not if the automorphism group
is no larger than PGL_3(F_q): the size of this group is
(q^3-1) (q^3-q) (q^3-q^2) / (q-1)  or about q^8,  which for
large q is much smaller than the number (q+1)! or (q+2)! of
permutations of an oval, let alone that times the number of ovals.
Nevertheless it still works for n=4 (with 168 hyperovals), though
in a special way.  (For n=5, you'll see that it doesn't quite work,
but as many as 120 of the 6!=720 possible bijections lift.)

For starters, how many ordered ovals are there?  The number of
ordered n-arcs in a projective plane of order q is:

n=1: q^2+q+1
n=2: (q^2+q+1) (q^2+q)
n=3: (q^2+q+1) (q^2+q) (q^2)
n=4: (q^2+q+1) (q^2+q) (q^2) (q-1)^2
n=5: (q^2+q+1) (q^2+q) (q^2) (q-1)^2 (q^2-5q+6)
n=6: (q^2+q+1) (q^2+q) (q^2) (q-1)^2 (q^2-5q+6) (q^2-9q+21)

and for n=7 and higher we can't tell because we don't know how many
coincident triples of secants there are.  So this technique cannot work
(at least not in this form) past q=5.

Also, just how transitively do we expect Aut(Π) to act?
Suppose Π = P^2(F_q) and consider PGL_3(F_q).  It acts
2-transitively; not 3-transitively because of collinearity,
but that's the only obstruction even up to 4 points.  Indeed
for any P_1,P_2,P_3,P_4 no three of which are collinear,
and any Q_1,Q_2,Q_3,Q_4 no three of which are collinear,
there is an automorphism in PGL_3(F_q) taking P_1 to Q_1,
P_2 to Q_2, P_3 to Q_3, and P_4 to Q_4 -- and this automorphism
is unique!  More generally we have:

Proposition:  For any field k, the group PGL_n(k) acts
simply transitively on ordered (n+1)-tuples of points in P^(n-1)(k)
in general linear position.

"Simply transitively" means that for any two such (n+1)-tuples
(P_0,P_1,...,P_n) and (Q_0,Q_1,...,Q_n) there is a unique group element
taking one to the other.  "General linear position" means no 3 points
on a line, no 4 on a plane, ... no n on a hyperplane.  This last
condition is actually sufficient: if we have k points on a projective
subspace of dimension k-2 then any n points containing them are on a
subspace of dimension n-2.

Proof: It is enough to prove this for one choice of (P_0,P_1,...,P_n).
Let P_0 be (1:1:1:...:1) and for each i=1,2,...,n let P_i be the
(1-dimensional subspace of k^n generated by) the i-th unit vector.
Let v_0,...,v_n be nonzero vectors in k^n corresponding to Q_0,...,Q_n.
General linear position ⇔ any n of these are linearly independent
⇔ any n of these form a basis.  So there's a linear map T taking
e_i to v_i.  This map is unique, so how do we get P_0 to go to Q_0?
Projectively, P_1,...,P_n go to Q_1,...,Q_n iff each e_i goes to
c_i v_i for some nonzero scalar c_i, and any choice of (c_1,...,c_n)
gives a unique T; conversely T determines (c_1:...:c_n) [note the
colons: these are projective coordinates].  Now v_1,...,v_n is a basis
so  v_0 = a_1 v_1 + ... + a_n v_n  for some scalars a_1,...,a_n;
moreover none of them are zero because then there'd be a linear
dependence among v_0 and the other n-1 v_i's.  So choose
(c_1:...:c_n) = (a_1:...:a_n)  and we're done.

Note that it should follow that #(PGL_n(F_q)) is the number of
ordered (n+1)-tuples of points in general linear position;
this can be checked directly (as we in effect did for q=n=3).

OK -- so we expect transitivity on n-arcs for n=4 but not beyond.
Yet a hyperoval in Π_4 has 6 points.  How can this be?

Let's count: taking q=4 in the above n-arc counts, we find that
the number of ordered hyperovals is

21 * 20 * 16 * 9 * 2 * 1 = 6! * 168

in particular, having chosen the first 4 points we have no choice
about the last 2 except for their order.  We can give explicit
coordinates starting from those of our proof above for the first
4 points: each of the remaining two must have nonzero coordinates,
all distinct.  Scaling the first to 1, there are only two choices:
(1:a:b) and (1:b:a), where a,b are the elements of F_4 other than
0 and 1, satisfying  ab = a+b = a^3 = b^3 = 1.  So we have the
hyperoval

(1 : 0 : 0)
(0 : 1 : 0)
(0 : 0 : 1)
(1 : 1 : 1)
(1 : a : b)
(1 : b : a)

and now it's clear that the remaining two points are switched by
the automorphism of P^2(F_4) that takes each coordinate to its
Galois conjugate: 0,1,a,b go to 0,1,b,a respectively.

So in fact it is reasonable to expect even for n=4 that every
bijection of hyperovals lifts to a unique isomorphism of projective
planes (provided it's true that Π_4 is unique).  That's what we'll prove.

Before starting this, yet another way that n=4 is special.  We know
one way of getting hyperovals in an algebraic projective plane P^2(k)
when #k is even: use a conic plus its center.  If all *ordered*
hyperovals are equivalent then any point can be the center.
How can this be?  Two conics can meet in at most 4 points (Bezout).
But that's just enough for center-switching to work for #k=4.
(It also works for #k=2 but that's not as surprising...)

OK: start with any order-4 projective plane Π and one of its
6!*168 ordered ovals  O = {p1,p2,p3,p4,p5,p6}.  This lets us
identify 15 of the lines: the 15 secants s_{i,j}.  That leaves
15 points and 6 passant lines to account for.  Each of the 15 points
is on 3 secants and 2 passants.  This immediately tells us that the
passant lines form a hyperoval O* in the dual projective plane Π*,
whose secants are the 15 points off O.  How to label O*?  Since we'll
show that any permutation g of O lifts uniquely to Aut(Π), we have
a homomorphism from the permutations of O to those of O* -- and that
turns out to be an outer automorphism from S_6 to S_6 !  We'll see how
this happens starting next time.

March 4: Uniqueness and automorphism group of Π_4 (day 2)

Recall: we began with any order-4 projective plane Π and one of its
6!*168 ordered ovals  O = {p1,p2,p3,p4,p5,p6}.  This let us
identify 15 of the lines: the 15 secants s_{i,j}.  That leaves
15 points and 6 passant lines to account for.  Each of the 15 points
is on 3 secants and 2 passants.  This immediately tells us that the
passant lines form a hyperoval O* in the dual projective plane Π*,
whose secants are the 15 points off O.

Consider one of these points q.  It is contained in 6/2=3 secants of O,
and thus 5-3=2 passant lines.  Now the secants determine a partition of
O into three pairs.  As noted already, a partition of a six-element set
into three pairs is called a syntheme in this context.  For any n,
the number of partitions of a 2n-element set into n pairs is

1 * 3 * 5 * 7 * ... * (2n-1) = (2n)! / (2^n n!)

[proofs: i) fix an element: it has 2n-1 possible partners; for each
choice we have to count pairings of the remaining 2(n-1) elements, etc.;
ii) standard double-counting; or iii) all are equivalent under S_{2n}
and the stabilizer has order 2^n n! -- yes, this is more-or-less
equivalent to (ii)].  This count is 1, 3, 15, 105, 945, ...  for
n=1,2,3,4,5,... .  For n=2 this was already seen to give the exceptional
homomorphism S4 → S3.  For n=3 it will lead us to the exceptional
(outer) homomorphism S6 → S6: the count is still small enough to
coincide with the number Bin(2n,2) of pairs.

We begin by observing that 15 is just big enough to account for all
the points not on O (clearly the syntheme determines the point).
So for any syntheme, the corresponding three secants are coincident.
This is the last time that can happen: already 105 is too large for
us to hope that each 1-factor of an oval in Π_7 yields four secants
that meet at a point.  (As you'll see in the present homework,
even for Π_5 it doesn't quite work even though there are still
only 6 points on the oval.)  Getting back to Π_4: we've now accounted
for all 21 points, namely the 6 points of the oval and one point for
each of the 15 synthemes; we've also accounted for 15 of the 21 lines,
namely the secants (one of each of the 15 pairs).

So we need only figure out how to group the 15 syntheme points into
6 passant lines, each of which will contain exactly two of them.
We've already noted that the passant line forms an oval O* in Π*;
its secants are the points where two passants intersect -- necessarily
syntheme points -- so we're in effect seeking to identify synthemes in O
with pairs in O*.

Now the key point is that two syntheme points q,q' are on a passant line
⇔ the unique line through q and q' is not a secant ⇔ the synthemes
associated to q and q' have no pair in common.  Well a passant line l
contains 5 points that determine 5 synthemes, each of which consists of
3 pairs, and all 3*5=15 of these pairs are distinct.  But there are only
Bin(6,2)=15 pairs to go around.  So each of the six passant lines l determines
a partition of the pairs into 5 synthemes -- what we'll call a total,
short for "synthematic total" (and the text calls a "factorization",
short for "1-factorization").

Lemma: there are six totals.

Proof: count triples (t,s,s') where t is a total and s,s' two of its
synthemes.  Given t there are 5*4=20 choices for (s,s').  We can check
that any two disjoint synthemes lie in a unique total.  (There are
4 other synthemes disjoint from both s and s'; one has a pair in common
with each of the other three, but the other three are pairwise disjoint.)
On the other hand, given s there are 4*2=8 choices for s', so the
number of (s,s') pairs is 15*8=120.  Hence there are 120/20=6 totals, QED.

So we have a unique choice for completing our (re)construction of Π from O:
there's one passant line for each of the 6 totals.  To check that this
works, we can either verify the axioms directly, or note that the
algebraic Π_4 is already known to satisfy them and we just showed
that starting from any of its 168 hyperovals we get our labeling of
all 21+21 vertices and edges.

We can thus formulate a theorem for projective planes of order 4
analogous to what we've done for order 3 and 2: let Π, Π' be
projective planes of order 4, and O, O' hyperovals on them;
then every bijection  O → O'  extends uniquely to an isomorphism
Π → Π'.  Taking Π'=Π we also find that Aut(Π) has size
6!*168 = 120960.  Identifying Π with P^2(F_4) -- which we can do
now that we've shown all these projective planes are isomorphic --
we also deduce that every automorphism is either a projective linear
transformation (i.e. an element of PGL_3(F_4)) or such a transformation
composed with the nontrivial automorphism of F_4.

Now what about Π'=Π* and O'=O*?  This gives an isomorphism between
the symmetric groups of permutations of O and O*, both S_6.  But it's
not an inner automorphism!  We noted already that for any group G
the group of inner automorphisms is G/Z(G) [where Z(G) = center of G];
for G=S_n with n>2, the center is trivial and conjugation by g is just
relabeling the objects that S_n permutes.  In particular an inner
automorphism of S_n preserves the cycle structure of each element.
But here we can see explicitly that this is not the case.  Recall that
we have projective coordinates

(1 : 0 : 0)
(0 : 1 : 0)
(0 : 0 : 1)
(1 : 1 : 1)
(1 : a : b)
(1 : b : a)

for the points in O.  Using these coordinates, O* consists of
the six lines (y:x:x), (x:y:x), (x:x:y) where x and y are
distinct nonzero elements of F_4.   Now a cyclic permutation of
the three coordinates has cycle structure 3,1,1,1 on O --
to see that it fixes (1:a:b), note that b=a^2 and a^3=1, so

(1:a:b) = (1:a:a^2) = (a:a^2:a^3) = (a:a^2:1) = (a:b:1),

and likewise for (1:b:a); but there are no fixed points in O*
so the cycle structure of the action on O* is 3,3.

We'll see that this is essentially the unique way to get an outer
automorphism of any S_n, in that n must equal 6 and the outer
automorphism must be the one relating O and O* up to conjugation
(= relabeling of O, or equivalently of O*).  And we'll exploit
the strongly regular graph T(n) -- recall that this graph has
vertices = pairs taken from an n-element set, and edges = pairs
that intersect in one element.  We'll prove that Aut(T(n)) = S_n
for n>4, and that for n other than 6 any automorphism of S_n induces
an automorphism on T(n) via the action on involutions.

March 9: more about the outer automorphism of S_6,
and Aut(S_n) in general  (cf. Chapter 6 of the textbook)

Last time we constructed a total by starting from two disjoint
synthemes, finding the four disjoint from both of them, and
showing that one is a dead end while the other three are pairwise
disjoint and thus complete the total.  Does that ring a bell?
We have 15 synthemes, each disjoint from 8 others; form the
relevant regular graph with n=15 and k=8, and then in effect
we're looking to extend an edge to form a maximal clique of 5,
succeeding provided we avoid a wrong turn at the beginning.
That's just what happened with cliques of maximal size m-1
on the triangular graphs T(m) for m>4; here m=6, and indeed
15 and 8 are the correct parameters.  Check: the syntheme graph
is indeed strongly regular with the same parameters (15,8,4,4).
In fact it's isomorphic with T(6), and we can use this isomorphism
to explain or construct an outer automorphism of S_6 if we identify
pairs (the vertices of T(6)) with simple transpositions and
synthemes (the vertices of the syntheme graph) with triple transpositions.
Indeed it's somewhat surprising that our text doesn't make this connection
(see Chapter 6 for its development of Aut(S_n) and the hyperoval picture).
We give this approach today -- it's basically a standard approach,
but "T(m)" and "cliques" provide a convenient link with something

What lets us connect these graphs with the structure of G is that
the 15 simple transpositions constitute a conjugacy class in S_6,
and the triple transpositions constitute another one.  Recall:
Elements x and y of a group G are said to be conjugate
iff there exists g in G such that  y = g^{-1} x g;  this is an
equivalence relation (why?), and the equivalence classes are called
conjugacy classes.  Any automorphism of a group must permute
the conjugacy classes -- and if a class is finite (as it must be
in a finite group) then its image must be a conjugacy class of
the same size.

Now in a symmetric group S_n we have

conjugacy class ↔ cycle structure ↔ partition of n.

For example, 6 has 11 partitions; here they are, each together with
the parity and size of its conjugacy class:

*   1   111111
15   21111
*  45   2211
15   222
*  40   3111
120   321
*  40   33
90   411
*  90   42
* 144   51
120   6

Stars mark even classes.  For example, the number of 6-cycles g is 5!:
they are in bijection with permutations ( g(1), g^2(1), ..., g^5(1) )
of {2,3,4,5,6}.  We'll soon explain how to get such counts for S_n
in general; for now we check our arithmetic by verifying that
the counts add up to #(S_6) = 6! = 720, with the even and odd classes
each adding up to half the total, e.g. 15+15+120+90+120 = (30+90)+120+120
= 360 as it should be.  Last time we saw that our outer automorphism takes
cycle structure 3111 to 33; it is comforting that these classes have the
same size, 40.  We note three further such coincidences: we've already seen
the count 15 for both 21111 and 222, and there's also 120 for both 321 and 6
and 90 for both 411 and 42.  These all look like candidates for switching:
they have not just the same size but also the same exponent, respectively
2, 6, and 4.  We next show that the outer automorphism switches
21111 ↔ 222 and 321 ↔ 6 but leaves 411 and 42 alone.

The first of these we already expect: our isomorphism Π → Π*
extending a bijection O → O* takes secants (⇔ pairs in O, also
lines outside O*) to syntheme points (points outside O).  To prove it,
note that for any 3-cycle c there's a simple transposition commuting with it
(switch any two of the three fixed points); so an outer automorphism takes
a simple transposition to an involution that commutes with a double 3-cycle,
which a simple transposition can't do (check this).  So it must go to a
triple transposition as claimed.  (Can you find a double 3-cycle that
commutes with a triple transposition?)

Similarly c commutes with a 321 permutation g (namely the product of c with
the simple transposition from the previous paragraph) but a double 3-cycle
c' cannot (e.g. if p is the unique fixed point of g then c' moves p, so
the conjugate of g by c' doesn't fix the same p).  So the conjugacy class
[g] must go to the 6-cycles as claimed.

Finally, since a 411 permutation is odd it is the product of an odd number
of simple transpositions (three suffice), so its image under an outer
automorphism is an odd product of triple transpositions.  But that's still
an odd permutation, so it cannot have cycle structure 42, QED.

So now what about S_n in general?  For a partition (or cycle structure) with
a1 1's, a2 2's, a3 3's, etc., the conjugacy class has size

(#)   n! / (a1! * a2! 2^a2 * a3! 3^a3 * ... )

(of course the product is finite because for k>n the k-th factor is
0! k^0 = 1).  This can be proved by generalizing either of our two
arguments last time for the special case that a2 = n/2 (and all other
a's vanish).  Fortunately we don't have to prove that for n>6 there are
no coincidences: it's enough to work with the simple transpositions.
Indeed we know that these generate S_n (any permutation can be obtained --
or if you prefer undone -- by a sequence of pairwise switches), so an
automorphism of S_n is determined by its action on simple transpositions.
For future reference we observe that S_n is even generated by the n-1 simple
transpositions (12), (13), (14), ... (1n): reduce to the previous claim
by writing an arbitrary simple transposition (ij) as (1i)(1j)(1i).

In turn we have a bijection from the simple transpositions to the vertices
of T(n).  I claim that any automorphism that takes this conjugacy class to
itself yields an _automorphism_ of T(n), i.e. it takes edges to edges.
This follows from the observation that two simple transpositions
do *not* commute if and only if they overlap on exactly 1 letter --
that is, iff the corresponding vertices of T(n) are adjacent.

Digression:

i) in general the product of two simple transpositions has order
1 if they coincide, 2 if they commute, and 3 if not; in group-theory lingo
this is sometimes expressed by saying that the simple transpositions form
a "conjugacy class of 3-transpositions" -- quite a few of the groups
in the ATLAS have such a description.

ii) the syntheme graph has a similar description, because two triple
transpositions don't commute iff they have *no* pair in common.
(If there's a common pair we're basically in the 4-element
normal subgroup of S_4 consisting of the identity and the three
double transpositions.)  Here too we see that these 15 involutions
as 3-transpositions -- which is what we expect since this conjugacy
class is equivalent with the simple transpositions under Aut(S_6).

Back to T(n): an automorphism of T(n) must permute its maximal cliques.
But for n>4 there are exactly n of them, and each vertex is contained
in exactly 2.  This gives us a homomorphism from Aut(T(n)) to S_n
and shows that the kernel is trivial (if we know where each maximal clique
goes then we can identify each vertex by intersecting the two cliques
it is in).  Since the map is clearly surjective, we deduce that
Aut(T(n)) = S_n for n>4.  We already saw that for n=4 there are twice as
many maximal cliques and indeed Aut(T(4)) is twice as large as S_4;
for T(3) it is easy to see Aut(T(3))=S_3, while Aut(T(2)) and Aut(T(1))
are clearly trivial.

So far we have:

Proposition: an automorphism of S_n takes the conjugacy class of
simple transpositions to itself if and only if it is inner.

Proof: given the automorphism f, it is enough to find an inner automorphism
that agrees with f on the simple transpositions.  If n is not 4, use the
image in S_n of the automorphism of T(m).  If n=4, the construction still
works because f cannot induce an automorphism of T(4) that switches the
two kinds of maximal clique: one gives three elements of S_4 that generate
all of S_4, the other generate only a subgroup S_3.  This completes the proof.

In particular for S_6 every automorphism is either inner or our one
outer automorphism followed by an inner one; so Aut(S_6) contains S_6
as a subgroup of index 2.

Theorem: For n other than 6, every automorphism of S_n is inner.

Proof sketch: For n=1 and n=2 the automorphism group is trivial.
For n=3,4,5 there is no other conjugacy class whose size is
the same as that of the simple transpositions.  For n>6
the same is true because the class of simple transpositions
is the smallest non-identity class, as one can check from the
formula (#) for the size of an arbitrary conjugacy class in S_n:
it's almost clear that for large n the size of any other class
grows at least as n^3, which does it for sufficiently large n,
and a bit of routine but unedifying work turns "sufficiently large"
into "at least 7".

March 21: the (5,6,12) Steiner system via Aut(S_6)
(cf. Chapter 6 of the textbook; also p.27, Exercise 13)

Recall that we found in Π_4 an oval O and a dual oval O*
permuted by S_6 in different ways, and showed that the resulting
outer automorphism of S_6 is essentially unique (i.e. unique up to
inner automorphism).  Such 6-element sets O and O* with non-conjugate
actions of S_6 are said to be "dual".  This duality gives:

@ a bijection between 15 pairs in O and 15 synthemes in O*
@ a bijection between 15 synthemes in O and 15 pairs in O*
@ a bijection between 10 even splits of O and 10 even splits of O*

The last of these is constructed as follows.  There are Bin(6,3)/2 = 10
ways to partition O into two triples.  Fix one of these, and let c and c'
be 3-cycles permuting them.  Then c and c' commute and generate a subgroup
A ≅ (Z/3)^2 over S_6 (it's a 3-Sylow because 3^2 || 6!).  We saw already
that each of c and c' maps to a double 3-cycle of O*.  Since these commute
but are neither equal nor inverses of each other, those double 3-cycles
must come from the same even split of O*, say c → c* c'*  and
c' → c* (c'*)^(-1).  So we get a well-defined bijection as claimed.
Moreover cc' maps to c*^(-1) and c^(-1)c' maps to c'*, so if we use
the same rule starting from even splits of O* we get the same bijection
(and A goes to a subgroup conjugate to A, as it must by Sylow's theorem).

What's more, these bijections are related as follows:

@ if a pair p in O is contained in a syntheme s, then the dual syntheme p*
contains the dual pair s*.  Proof: p and s commute as elements of order 2
of S_6, and that's the only way two elements in their conjugacy classes
to commute.  [NB The three pairs in a given syntheme constitute a maximal
coclique in the triangular graph T(6); on the syntheme graph the
corresponding coclique is the three synthemes containing a given pair.]

@ an even split {c,c'} splits each pair in a syntheme s* iff its dual
{c*,c'*} does not split the pair dual to s.  Proof: that's the only way
a triple or simple transposition can commute with a nontrivial element of
the 3-Sylow group A.

We can use this to obtain:

Theorem (6.7, page 87-88, extended)

The automorphism group M_12 is "simply 5-transitive":
for every pairwise distinct x_i (i=1,...,5) and pairwise distinct
y_i (i=1,2,3,4,5) there exists a unique automorphism taking each
x_i to y_i.  In particular |M_12| = 12*11*10*9*8 = 95040.

Proof: Let D be such a design.  It has  Bin(12,5)/Bin(6,5) = 132
blocks, and the intersection triangle for the six points of a block is

132
66    66
30    36    30
12    18    18    12
4     8    10     8     4
1     3     5     5     3     1
1     0     3     2     3     0     1

(this is page 87, Table 6.1 but with a typo corrected: the text has
2,3,2 instead of 3,2,3 in the bottom row).  In particular the complement
of a block is again a block.  This suggests the following construction.
Fix a block B with complement B@.  Choose an isomorphism between the
symmetric groups of B and B@ that realizes the outer automorphism
(i.e. choose a bijection between B@ and the totals of B).  The design
then consists of the following blocks B', ordered according to the
size of their intersection with B:

(6) B' = B itself.

(4) For any four-element subset S of B, the union of S and any of the
3 pairs in the syntheme of B@ associated to the complement of S

(3) For any three-element subset S of B, the union of S with either of
the two parts of the even split of B@ corresponding the even split of B
that contains S

(2) For any two-element subset S of B, the union of S with the
complement in B@ of one of the three pairs of the associated syntheme

(0) B' = B@.

This gives 1 + 3*15 + 2*20 + 3*15 + 1 = 132 blocks B' as needed (and
of course consistent with the bottom row of the intersection triangle).
The fact that this is actually a (5,6,12) Steiner system -- that is,
that any 5-element set P is contained in a unique block B' -- can be
recovered from our above description of the duality of pairs, synthemes,
and "even splits"; the only tricky case is when the intersection of P
with B has size 2 or 3, and we can cut down on the case analysis
a bit by proving that every P is contained in at least one B', and then
double-count pairs (P,B'): the count is at least #P = Bin(12,5),
but also equal to Bin(6,5) * #B' = 132*Bin(6,5), but that equals
Bin(12,5), so each P must be in a unique B'.  (Likewise it is enough
to prove each P is contained in at most one B'.)

Now suppose we have any (5,6,12) Steiner system D1.  Fix a block B1
of the design and a bijection of that design with B.  This can be done
in 132*6! = 12*11*10*9*8 ways.  We claim that this extends to a unique
isomorphism of Steiner systems.  We shall repeatedly use the following
fact: any 4-element subset T of the points of a (5,6,12) Steiner system
is contained in four blocks, each the union of T with a pair, and
these pairs partition the 12-4=8 points in the complement of T.
(Because for each point outside T, that point together with T
makes 5, which are contained in a unique block.)

First let T be one of the 15 four-element subsets of B1, i.e. the
complement of a pair P.  The blocks containing T are B1 itself and
three others which partition B1@.  This gives a map

f: pairs in B1 → synthemes in B1@

which takes P to the syntheme determined by the complement of P in B1.

Taking complements, it follows that for each pair P the three blocks
that intersect S1 in P are obtained from the union of P with B1@
by removing one of the pairs in f(P).

Now we know that either pairs or synthemes can be identified with the
vertices of the strongly regular graph T(6).  We next show that the
bijection preserves the edges.

If pairs P,P' overlap in a point then f(P) and f(P') are disjoint:
if f(P) and f(P') share a pair then its complement in B1@, together with
the three points of B not in P or P', are five points contained in
two blocks.

So, edges in the pair graph go to edges in the syntheme graph.
Since the two graphs have the same degree, it follows that
non-edges go to non-edges.  But we've shown that for any n>4
every automorphism of T(n) comes from the action of S_n.  So we've
identified B@ with the totals of B, and B1@ with the totals of B1,
and thus B1@ with B@.

We proceed to collect enough structure about the Steiner systems D,D1
to show that the bijections from B1 to B and B1@ to B@ extend to
an isomorphism from D1 to D.

Let T consist of three points in B1 and one in B1@.  The resulting
partition must pair each of the remaining points of B1 with a point in B1@:
if two points in B1 were paired then they together with T would be a block
that meets B1 in exactly 5 points.

Now the intersection triangle tells us that for any 3-element subset
S1 of B1 there are exactly 2 blocks that intersect B1 in S1.  I claim
that they do not intersect in B1@.  Indeed if they did then they would
meet in one point, and letting T be that one point union S1 we get
a contradiction.  So, S1 determines an even split of B1@.  Take the
complements of those two blocks (which are also blocks of the design)
we see that the complement of S1 in B1 determines the same even split.
So we get a well-defined bijection

g: even splits in B1 ↔ even splits in B1@ .

Moreover, E is the even split containing S1, and S1 contains pair P,
then each pair in the syntheme f(P) is split by g(E).  Proof: else
two of the pairs in f(P) contain one of the parts E' of the even split
g(E), and their union with P is a block that meets the block {E union E'}
in exactly five points.

We next deduce that f is a bijection.  Say that an even split E is "neighbor" of
a pair iff the pair contained in one of the parts of E, and is a "neighbor"
of a syntheme iff it corresponds to *none* of the syntheme's component pairs.
Then E is a neighbor of 4 pairs and 4 synthemes.  Exercise: these 15+15 = 30
neighborhoods constitute a (3,4,10) Steiner system.  In particular (and this
is much easier, and all we'll need): different pairs have different
neighborhoods.  But we saw in effect that g maps the neighborhood of P
to a neighborhood of f(P).  Since g is a bijection, it follows that
f is a bijection as well.  (Whew.)

Thus f is an isomorphism from the triangular graph T(6) of pairs in B1
to the syntheme graph of B1@.  We saw already that such an isomorphism
is tantamount to an identification of B1@ with the totals of B1.
(A point p in B1 goes to the total consisting of all synthemes f(P)
for P containing the point.)  That identification identifies B1@ with
B@, the complement of B in the (5,6,12) Steiner system we designed previously.
Using the properties of f and g that we've developed we now quickly confirm
that this bijection from B1@ to B@, together with the given bijection from
B1 to B, yield an isomorphism from D1 to our known D and that this
isomorphism is unique.

As noted already, this means there are 132*6! isomorphisms.  Moreover,
any bijection from a 5-point set F1 in D1 to 5-point set F in D
extends uniquely to a bijection from the unique block containing F1
to the unique block containing F, and thus to a unique isomorphism from
D1 to D.  In particular, taking D1=D we see that Aut(D) is simply
5-transitive, QED.

It also follows that, given D and B, we can recover Aut(S_6) as
the index-66 subgroup of Aut(D) as the group of automorphisms that fix
the decomposition of the 12 points into B and B@.  This may seem
circular, but there are other constructions of D (e.g. from the
affine plane over the 3-element field, or a Hadamard matrix of order 12,
or the projective line over the 11-element field), and each of those
can give us an alternative route to Aut(S_6).

March 23: extending Π_4 leads us to simplicity of A_n

[See page 22 of the textbook; a reference for practically all the
group theory we shall need, and much more, is Joseph J. Rotman's
An Introduction to the Theory of Groups (Springer 1995),
hollis.harvard.edu on a Harvard computer.]

We saw(*) that if a projective plane Π of order n is extendable
then either n=2 or n=4, and in both cases Π is algebraic.
For n=2, the extension is the Hadamard 3-(8,4,1) design.
For n=4 it must be a 3-(22,6,1) Steiner system.  We investigate
this case nex; it also leads us to the 4-(23,7,1) and 5-(24,8,1)
systems and the Mathieu groups, via a construction that the text
attributes to Lüneburg (page 22).

(*) assuming the nonexistence of a projective plane of order 10,
or at any rate of an extendable one.

A 3-(22,6,1) design D has 77 blocks.  Choosing a point p and identifying
the derived design D_p with Π_4, we see that the blocks are:

21: p + line
56: hyperoval

The intersection triangle is as follows (see Table 1.1 on page 21,
and ignore for now the leftward diagonals that start 759 and 253):

77
56  21
40  16   5
28  12   4   1
20   8   4   0   1
16   4   4   0   0   1
16   0   4   0   0   0   1

in particular any two of the 56 hyperovals intersect in 0 or 2 points.
It turns out that this is an equivalence relation on the 168 hyperovals
(that is: if we say O ~ O' if O,O' are hyperovals that are equal or
intersect on 0 or 2 points, then ~ is an equivalence relation),
with three equivalence classes.  We could check this combinatorially,
but what does it "mean"?

Recall that all hyperovals are equivalent under Aut(Π_4), and
every element of Aut(Π_4) is either in PGL_3(F_4) or is
the Galois conjugate of an element of PGL_3(F_4).  So say we have
two hyperovals O and O'.  We can get from O to O' by some g0 in Aut(Π_4).
Then g in Aut(Π_4) takes O to O' iff  g = g0 h  with h(O)=O, i.e. with
h in the stabilizer Stab_O.  We have already identified Stab_O with the
group S_6 of permutations of O; and we've also seen that a permutation
pi of O comes from PGL_3(F_4) [without a Galois conjugation] iff pi is even.
So, composing with an odd permutation if necessary, we can assume that
g0 is in PGL_3(F_4), and then g in PGL_3(F_4) takes O to O' iff
g = g0 h  for some h in the subgroup A_6 of Stab_O.

Now I claim:

1) There exists a well-defined surjective homomorphism
det: PGL_3(F_4) → F_4^*

2) A_6 is in the kernel PSL_3(F_4) of det

whence it follows that

3) All g taking O to O' have the same determinant.

Thus we can define an equivalence relation on the 168 hyperovals by defining

O ~ O' ⇔ there exists g in PSL_3(F_4) taking O to O'

that partition the hyperovals into 3 equivalence classes each of size 168/3=56.

It turns out that this is the equivalence relation we need to extend Π_4.

Proof of (1): we know from linear algebra that there is a surjective
homomorphism det: GL_3(F_4) → F_4^*.  The point is that it takes
every scalar 3x3 matrix to 1, and thus "factors through the quotient":
for any nonzero scalar c and 3*3 matrix g we have det(cg)=det(g),
so the determinant can be defined even when we identify cg with g.

Proof of (2): [ad hoc] A_6 is generated by 3-cycles; indeed A_n is
generated by 3-cycles for all n (yes, even n<3...): we saw that
S_n is generated by simple transposition *that move a fixed element*,
so A_n is generated by products of pairs of such transpositions,
and any such product is either a 3-cycle or the identity.  So, it is
enough to show that all 3-cycles have determinant 1.  Indeed it is
enough to show that *one* 3-cycle has determinant 1 because any two
3-cycles are conjugate in A_6 (see below).  But we already constructed
such a 3-cycle, the coordinate permutation of our hyperoval consisting of
the three unit vectors plus (1:1:1), (1:a:b), (1:b:a), and this linear
transformation visibly has determinant 1.  [We could also have used
diag(1,a,b), which together with that coordinate permutation yields a

Better proof of (2): The kernel of det: A_6 → F_4^* is normal
and of index 1 or 3.  But A_6 is simple, and of order >3.
Therefore ker(det) is all of A_6, as claimed.

But how do we know that A_6 is simple?  In fact we prove that A_n is
simple for n ≥ 5.  We follow Rotman... [including conjugacy of all
3-cycles in A_n for n ≥ 5, promised above for n=6.  NB for our
purposes it's enough that for any 3-cycles c and c' either c or c^{-1}
is conjugacte to c' in A_n, and that's easier and true even for
n=3 and n=4 (and vacuously for n<3...).  Further simplification:
at the top of page 51, instead of citing a "second isomorphism theorem"
show directly that if H is a normal subgroup of G and F is any subgroup
then the intersection of H with F is normal in F.]  Similar techniques
are used to prove the simplicity of the other families, as we'll see
for PSL_n(F_q).

So, we've narrowed down the 3-(22,6,1) to one choice,
but didn't yet show that it works (except indirectly since
we've already constructed a 3-(22,6,1) in a homework exercise
via the 2-(11,5,2) Paley biplane).

To that end (and because we won't have the indirect method available
for the 23- and 24-point designs), we argue as follows.  Fix any
three points.  We claim they're in a unique block.  If one of the three
is ∞_1, the other two lie on a unique line and we're done.
If not, but they're collinear, we're still done (they can't be on a
hyperoval because no hyperoval meets a line is as many as three points).
Finally, if three non-collinear points, put them at the three unit
vectors, and then exactly one of the three hyperovals through them works.
Indeed if O is any oval then diag(1,1,a) and diag(1,1,b) takes O to
some O', O" in the other two equivalence classes, so exactly one of
O, O', O" is in the class we chose.  So there exists a 3-(22,6,1) design
D_{22}, as desired.  Moreover it is unique up to isomorphism, and indeed
for any 3-(22,6,1) design D' and any point of D' we may choose
an isomorphism D' → D_{22} taking that point to ∞_1.
We shall see next week that by taking D' = D_{22} we can conclude
that Aut(D_{22}) is 3-transitive on points (and thus transitive
on blocks).

March 25: Interlude: recognizing S_n and A_n from their index-n subgroups
S_{n-1} and A_{n-1}.  Also: The 4-(23,7,1) Steiner system D_{23}

Suppose G is any finite group, and H a proper subgroup, so
n := [G:H] is at least 2.  [Recall that [G:H], called the
index of H in G, is the number of (left) cosets gH
of H in G, and is equal to  |G| / |H|  when G is finite;
the fact that this count n is an integer yields Lagrange's
theorem that |H| is a factor of |G|.]  Then G acts transitively
on the n cosets, and this action gives a homomorphism from G to
the group of permutations of these cosets, which is isomorphic with S_n;
moreover H, considered as its own principal coset, is also its own
stabilizer under this action.  The kernel of this homomorphism
G → H  is thus a normal subgroup of G that is contained in H
and thus has index at least n (equivalently, has size at most |H|;
in fact you can check that it is the intersection of all the
conjugates of H).  Therefore, if we know somehow that G has no such normal
subgroups other than the identity, then the homomorphism has trivial kernel,
and thus is an embedding of G into S_n as a transitive subgroup,
with point stabilizer H.

In particular, if under these hypotheses |G| = n! then we have an
isomorphism of G with S_n, which takes H to S_{n-1}.
Likewise, if  |G| = n! / 2 then G is a subgroup of S_n of index 2,
and thus normal in S_n; so the image is A_n (see the exercises on
page 51 of Rotman; for n>4 that's a consequence of the simplicity
of A_n, but it's not hard to check that the result still holds for n≤4).
Then H is A_{n-1}.  This will give us a way to identify some groups
G or H with alternating or symmetric groups once we have shown in some
other way that G is simple or nearly so.

For example: let G be S_6 itself, but let H be the group PGL_2(F_5)
acting on the 5+1 points of the projective line over F_5.  This group
has order (25-1) (25-5) / (5-1) = 120 = 5!, so [G:H] = 6.  Since
G has no nontrivial normal subgroup of index greater than 2, we obtain
an isomorphism S_6 → S_6 whose point stabilizer is a transitive
subgroup of the (first) S_6, indeed a sharply 3-transitive subgroup.
So we must have found a new construction of an outer automorphism of S_6.
Moreover we've identified (H =) PGL_2(F_5) with S_5 -- so we've obtained
one of the promised exceptional isomorphisms between nearly-simple groups.

A related exceptional isomorphism identifies PSL_2(F_5) with A_5.
In general, if q is an odd prime power then the determinant gives
a map from PGL_2(F_q) to a 2-element group (this uses the fact that
the units of F_q form a cyclic group), and the kernel of this map is
PSL_2(F_q).  [When q is even, ever element of F_q is uniquely
a square, and PGL_2(F_q) = PSL_2(F_q).]  So, PSL_2(F_5) is an
index-2 subgroup of PGL_2(F_5), and we know PGL_2(F_5) ≅ S_5;
this yields another exceptional isomorphism: PGL_2(F_5) ≅ A_5.
[It can be shown that in general if q is odd then PSL_2(F_q) is the
intersection of PGL_2(F_q) with the group of even permutations of
the q+1 points of the projective line over F_q.]

===

Back to building the 5-(24,8,1) design via it's repeatedly derived
designs with parameters 4-(23,7,1), 3-(22,6,1), 4-(21,5,1).
The last of these is a Π_4, which we already know is unique.
Last time we've narrowed down the 3-(22,6,1) to one choice,
and then showed that it works (which as it happens we've already
done indirectly, by constructing a 3-(22,6,1) in a homework exercise
via the 2-(11,5,2) Paley biplane).

The next step is 4-(23,7,1).  We extend the intersection triangle from
the 3-(22,6,1) to get

253
176  77
120  56  21
80   40  16   5
52   28  12   4   1
32   20   8   4   0   1
16   16   4   4   0   0   1
0   16   0   4   0   0   0   1

So in a 4-(23,7,1) design any two blocks meet in an odd number
of points (whereas in the 3-(22,6,1) the intersection was even,
as we'll see happens also for the 5-(24,8,1) ).  Fix two points
∞_1, ∞_2.  [That's the text's notation on page 22-23;
I've also seen I, II (and III for the 5-(24,8,1)), which is
probably better because we already have "points at infinity" in Π_4.]
The doubly derived design is Π_4, and we've already accounted for
each of the singly derived designs last time.  This tells us that
the 253 blocks include

21: ∞_1 + ∞_2 + line
56: ∞_1 + hyperoval in one equivalence class
56: ∞_2 + hyperoval in another equivalence class

That leaves 120, which must be 7-element subsets B of the points of Π_4
that meet each line, and each of the 56+56 hyperovals already used,
in an odd number of points.

We already outlined Monday (though it wasn't in the notes)
what such B must be.  Fix a point p in B.  Then each of the
5 lines through p meets B in 0, 2, or 4 other points.
4 is not possible: then B is a line plus two other points q,r
and a line through q but not r (or vice versa) meets B in exactly
2 points.  This can be done also using the "variance trick":
n1 + n3 + n5 = 21,  n1 + 3*n3 + 5*n5 = 7*5 = 35,  and
3*2*n3 + 5*4*n5 = 7*6 = 42 yields (n1,n3,n5) = (14,7,0).
Either way, we find that each p is on three lines that
go through two other points of B, and any two points of B
determine a unique such line.  That is, B is a 2-(7,3,1) design,
hence a copy of Π_2 in Π_4.  This is called a Baer subplane
(so the name B for such a Block is doubly apposite).  Conversely,
any Baer subplane satisfies our condition that every line meet it
in one or three points.  If it's at least two then they determine
a line of B, so exactly 3.  To exclude the possibility of 0
we can either count 7+14 lines through B, or argue by linear algebra:
F_4^3 is a 6-dimensional vector space over F_2; an F_4 line ax+by+cz=0
comes from a 4-dimensional subspace over F_2, and B comes from a
3-dimensional subspace -- and 3+4>6, so there's a nontrivial intersection.

Since the finite field F_4 contains F_2, there's an obvious Π_2 in Π_4:
the points (x:y:z) with F_2 coordinates (more properly, whose coordinates
can all be made to lie in F_2 after a suitable F_4^* scaling).
I claim that every Baer subplane is equivalent to this one under some
choice of coordinates -- i.e. under the subgroup PGL_3(F_4) of Aut(Π_4).
Indeed let {p,q,r,s} be one of the hyperovals of B.  We have seen that
for any field k the group PGL_3(k) acts transitively on 4-tuples of
points in general linear positions on P^2, and there's a unique choice of
coordinates that puts the 4-tuple at ( (1:0:0), (0:1:0), (0:0:1), (1:1:1) ).
Then B must also contain the intersections of lines pq and rs, pr and qs,
and ps and qr.  The first two of these are z=0 and x=y, i.e. (1:1:0);
likewise the others are (1:0:1) and (0:1:1), and there's our copy of Π_2
in Π_4.

(Digression: we can try this construction over any field k; it works iff
the last three points are collinear, i.e. iff the determinant

| 0 1 1 |
| 1 0 1 |
| 1 1 0 |

vanishes.  But this determinant is 2 -- we don't even need our formula
[Lemma 1.12 on p.4] for det(xI=yJ) here... -- so this always works if 0=2
in k, and always fails otherwise.  This can be interpreted classically
in terms of Ceva's and Menelaus's theorem: the ratio product must be both
1 and -1.  [BTW: Ceva's theorem was anticipated several centuries
earlier by "Yusuf Al-Mu'taman ibn Hud, an 11th-century king of Zaragoza"
according to the Wikipage for Ceva's Theorem (if the URL
"http://en.wikipedia.org/wiki/Ceva's_theorem" fails, replace the ' by %27) .]

So how many Baer subplanes are there?  PGL_3(F_4) acts transitively,
and the stabilizer is PGL_3(F_2), so the count is the ratio of the
two groups' orders, which is (63*60*48/3) / 168 = 360.  [Or if you prefer,
there are 21*20*12*9 choices for p,q,r,s, and each B is obtained from
168 of them -- but that's basically the same argument.]  We need to
use 120 of them, which is exactly 1/3.  It works almost exactly as for
the hyperovals, except a bit more easily.

We just saw that for any two Baer subplanes B, B' there's
g0 in PGL_3(F_4) taking B to B' -- this time we didn't have to go
through the twice-as-large group Aut(Π_4).  Any other such g is g0 h
with h(B)=B, i.e. h in Aut(Π_2) = GL_3(F_2).  This last "=" hides
something that we next make explicit.  As with hyperovals,
any combinatorial automorphism of B extends to Aut(Π_4); here
the extension is not unique, but there are two choices and exactly
one is in PGL_3(F_4).   To see this, consider the maps:

Stabilizer in Aut(Π_4) of B → Aut(Π_2)
Stabilizer in PGL_3(F_4) of B → Aut(Π_2)

both maps are is surjective.  The kernel of the second is {id}.
by applying the four-points-in-general-position lemma to PGL_3(F_4)
rather than GL_3(F_2).  So the second map is an isomorphism.
The kernel of the first map is either {id} or a 2-element group,
and we see it's the latter because it contains the field involution.

Now I claim that  det(g) = det(g0 h).  That is, I claim that
the restriction of det to

det: Stabilizer in PGL_3(F_4) of B → F_4^*

is trivial.  We could show this by proving that the stabilizer,
a 168-element group, is simple.  But it's even, um, simpler to note
that with our choice of coordinates every element of the stabilizer
manifestly has determinant 1 because it has a representative in PGL_3(F_2).

So we can define an equivalence relation:  B ~ B'  ⇔  det(g) = 1
and this splits the 360 Baer subplanes into 3 equivalence classes
of 120 each.  Again there is a combinatorial description: B ~ B'
iff  |B intersect B'|  is odd.  (Note that this holds even if B = B',
as for the O equivalence relation.)

Finally, how do hyperovals and subplanes interact, i.e. intersect?
We want to use one B class and two O classes such that each B and O
have odd intersection.  So, we have to extend ~ to an equivalence
relation that can mix hyperovals and subplanes in such a way that
each B class corresponds to a unique O class which consists of
the hyperovals that meet B in an even number of points.  Our
standard O containing (1:0:0), (0:1:0), (0:0:1), (1:1:1) meets
our standard B in those four points and no others; its images under
diag(1,1,a) and diag(1,1,b) meet that B in only three points
(the unit vectors but not (1:1:1)).  So we declare that
g (standard O) ~ g' (standard B)  iff det(g) = det(g').
This does not depend on the choice of g and g', and you'll check
that it does what we want: |B intersect O| is even iff B ~ O.

This completes our census of blocks of a 4-(23,7,1) design:

21: ∞_1 + ∞_2 + line
56: ∞_1 + hyperoval in one equivalence class
56: ∞_2 + hyperoval in another equivalence class
120: Baer subplane in the third equivalence class

There are six choices; each is stable under PSL_3(F_4), and
they're permuted transitively by PGL_3(F_4) and switching
∞_1 ↔ ∞_2, so all are equivlent.

It remains to check that this works.  This time we don't have
the shortcut of having constructed a 4-(23,7,1) [or 5-(24,8,1)]
design some other way.  But it's not hard.  Given four points:

@ If two of them are ∞_1 and ∞_2, we must use a line,
which is the unique line through the other 2.

@ If one of them is (say) ∞_2 and the other isn't, use
the above argument for verifying the 3-(22,6,1) design.

@ If all four are in Π_4, and are collinear, use that line plus
∞_1 and ∞_2.  There is no other choice because neither
a hyperoval nor a subplane can meet a line in four points.

@ If no three are on a line, they are equivalent to our four
standard points.  Those are contained in a unique subplane B
and a unique hyperoval O with B~O.  Hence exactly one of B and O
is available.

@ Finally, if three are on a line l but the fourth isn't,
we must use a subplane (lines can't work and a hyperoval meets l
in at most two points).  We must show that (i) there are three subplanes
B containing our four points, (ii) one in each equivalence class.
This will give us what we need.  For (i), say our points are p,q,r,x
with r not on l.  Let l' be the x-r line.  B must contain a unique one of
the three points on l' other than r and x.  Each of those points, call it s,
together with p,q,r gives four points in general position and we already
know that these are contained in a unique B.  For (ii), any two of the
three B choices meet in at least four points, and they cannot meet in more
lest they coincide (any five points include a (unique) hyperoval, from which
we can reconstruct the subplane as above).  So they meet in exactly 4
and are thus inequivalent.  [For (ii), alternatively -- and more honestly,
since this had to be done in the course of proving the properties of our
equivalence relation -- we may choose coordinates so that p,q,r,x
are at (1:0:0), (0:1:0), (0:0:1), (1:1:0).  Then one of the three B's
is our standard one, and the others are its images under diag(1,1,a)
and diag(1,1,b), which are inequivalent.  Yes, this gives you part of
Exercise 2 of the current problem set.  It also proves that the
construction actually yields a 4-(23,7,1) design using only the
determinant definition of the equivalent classes, without having
to verify that it is equivlent to the combinatorial one using
the parity of intersections.]

In conclusion: we have proved (again modulo Exercise 2) that
our construction gives a 4-(23,7,1) design D_{23}, and any 4-(23,7,1)
design D' is isomorphic with it -- and indeed we may choose the
isomorphism to take any ordered pair of distinct points of D' to
(∞_1, ∞_2).  We shall see next time that by taking
D' = D_{23} we can conclude that Aut(D_{23}) is 4-transitive on points
(and thus transitive on blocks).

March 28 and 30: The 5-(24,8,1) Steiner system D_24; its automorphism
group M_24, and its subgroups M_23, M_22, and [PSL_3(F_4)=]M_21;
the automorphism groups of D_23 and D_22

With the description of hyperovals and Baer subplanes we developed
to construct and prove the uniqueness of D_22 and D_23,
we now easily show D_24 is unique if it exists, and with a bit
more effort verify existence and the number of automorphisms.
We then describe Aut(D_24), which is M_24, the largest of
Mathieu's 5 sporadic groups, and its k-point stabilizer M_{24-k}
for k=1,2,3 (the last of which coincides with the normal subgroup
PSL_3(F_4) of Aut(Π_4)).

Recall the intersection triangle of a 5-(24,8,1) Steiner system
(Table 1.1 on p.21 of the text):

759
506  253
330  176  77
210  120  56  21
130  80   40  16   5
78  52   28  12   4   1
46  32   20   8   4   0   1
30  16   16   4   4   0   0   1
30   0   16   0   4   0   0   0   1

So in a 5-(24,8,1) design any two blocks meet in an even number
of points (which also follows immediately from the odd-intersection
property of the derived 4-(23,7,1) design.  Fix three points
∞_1, ∞_2, ∞_3.  [See the March 25 notes for the alternative I, II, III.]
The doubly derived design is Π_4, and we've already accounted for each of the
singly derived designs last time.  This tells us that the 759 blocks include

21: ∞_1 + ∞_2 + ∞_3 + line
3*56: ∞_j + ∞_k + hyperoval in equivalence class i ({i,j,k}={1,2,3})
3*120: ∞_i + Baer subplane in equivalence class i

That leaves 210, which must be 8-element subsets S of the points of Π_4 that
meet every line, hyperoval, and Baer subplane in an even number of points.

Proposition: Suppose S is a set of 8 points in Π_4.  Then TFAE:
(i) S intersects every line, hyperoval, and Baer subplane in an even number of points.
(ii) S intersects every line in an even number of points.
(iii) S is one of the  Bin(21,2) = 210 symmetric differences of line pairs.

Thus D_24 must use all 210 of them, so the structure of D_24 is
completely determined by the choice of ∞_i and a bijection between
those 3 points and the 3 equivalence classes of hyperovals and subplanes.

_Proof of Proposition_:
(i) ⇒ (ii) is obvious.  (iii) ⇒ (i) is easy, because the intersection of
every {line, hyperoval, subplane} with a line always has {odd, even, odd} size.
This leaves (ii) ⇒ (iii).  Applying (ii) to the five lines through a
given p point of S, we see one of these lines has four points of S, and
the others have two each.  (If 7=8-1 is a sum of 5 odd natural numbers
then one of them is 3 and the other four are 1.)  Call that line l.
Let p' be a point of S not in l.  The same argument shows p' is contained in
a line l' containing four points of S.  The intersection of l and l',
call it p", is not in S, else p" would lie on two four-point lines.
Hence S is the symmetric difference of l and l', and we're done.

Theorem: the resulting D_24 is indeed a (5,8,24) Steiner system.

Proof: We show that any five points are contained in a unique block.
If at least one of them is among the three ∞_i points then
we have done this already in our analysis of the derived designs
D_23, D_22, or [Π_4=]D_21.  So assume all five are in Π_4.

@ If all 5 on a line: must use that line plus all three ∞_i's.

@ Four on line l, a fifth p is not on l: can't use a line, hyperoval,
or subplane; so it must be a symmetric difference, and we readily see
that the symm.diff. of l and the line through p and the missing point of l
works and is unique.

@ No four of a line, but three on each of two lines l and l':
must be the Baer subplane they span (and whichever ∞_i goes with it).

@ No four on a line, and exactly one l containing three of them: must be
the symmetric difference of l and the line through the other 2.  Finally,

@ No three on a line: it's a Type I oval, so can't be a line or
symm.diff. or subplane, but it's contained in a unique hyperoval,
so it's that hyperoval (and whichever two of the ∞_i go with it).

QED!

Since D_24 exists and is uniquely determined by its design property
there is an unambiguous M_24 = Aut(D_24).  This is the largest of the
Mathieu groups.  We next give some of its properties (but do not prove
the simplicity -- that will be next week, together with the other M_n).

First we finally make explicit the relation among

@ PΓL_3(F_4),
@ its index-2 normal subgroup PGL_3(F_4), and
@ the index-3 normal subgroup PSL_3(F_4) of PGL_3(F_4).

Namely:

Lemma: (i) SL_3(F_4) is a normal subgroup of ΓL_3(F_4),
and PSL_3(F_4) is a normal subgroup of PΓL_3(F_4).
(ii) In each case, the quotient group is isomorphic to S_3, and
(iii) The three objects permuted can be identified with the
three equivalence classes of B's (or of O's).

of V = F_4^3.  Choosing coordinates, we have the linear maps, which form
the group GL_3(F_4) of invertible 3*3 matrices, and the conjugation
c = c^{-1}, taking x = (x0,x1,x2) to (x0^2, x1^2, x2^2).  These generate
ΓL_3(F_4) as follows.  The group consists of invertible matrices A
and transformations cA where A is invertible.  It's clear how to compose
A B and cA B.  As for A cB and cA cB, the key fact is that for any
vector x we have Ac(x) = c(cAc)x = cA'x where A' is the matrix obtained
by applying c to (i.e. squaring) each entry of A.   So Ac = cA', whence
A cB = Ac B = c A'B  and similarly cA cB = A'B.  This gives us the
group law in ΓL_3(F_4).  Since 1'=1 the group still has 1 for an
identity element (as it must -- it's still the identity transformation
of V), and the inverse of cA is (A^{-1})c = cA'^{-1} = c(A^{-1})'.
This is basically the special case H = {id,'} of the construction of the
semidirect product of a group G with a group H of automorphisms of G.

Now det(A') = det(A)' = 1/det(A).  So the transformations A and cA
with det(A)=1 form a subgroup, called (what else?) ΣL_3(F_4).
It contains SL_3(F_4) with index 2, so SL_3 is normal in ΣL_3(F_4).
But ΣL_3(F_4) is not normal in ΓL_3(F_4).  Indeed let A, B in GL_3(F_4)
have det(A)=1 but det(B) not 1.  Then cA is in ΣL_3(F_4).  I claim that
its conjugate by B is not:

B^{-1} cA B = c (B^{-1})' A B

and so it's of the form cM with det(M) = det(B)/det(B)' which is not 1.
On the other hand SL_3(F_4) *is* normal.  Suppose det(A)=1 and B in GL_3(F_4).
Then certainly det(B^{-1} A B) = 1 -- that's just the fact that SL_3 is
normal in GL_3 -- and the new issue is whether (cB)^{-1} A cB also has
determinant 1.  But the inverse of cB is c(B')^{-1}, so the conjugate is

c(B')^{-1} A cB = c(B')^{-1} c A' B = B^{-1} A' B

(note that we undid the ' when conjugating (B')^{-1} by c) which has
determinant det(A') = det(A)' = 1.

Now that SL_3(F_4) is normal in ΓL_3(F_4), what's the quotient?
I claim it's S_3, arising as ΓL_1(F_4), a.k.a. the semidirect
product of the 3-element group F_4^* with '.  Indeed I have a map from
ΓL_3(F_4) to this group, which is the determinant on GL_3(F_4),
and takes cA to c*det(A).  This is a surjective homomorphism and its
kernel is SL_3(F_4), so it identifies ΓL_3(F_4) / SL_3(F_4) with
the target group S_3 as claimed.

[Note that this used very little about F_4 and nothing special to
3*3 matrices, and can be adapted directly GL_n(k) for any field k with
an involution, and with a bit of change to GL_n(k) for any field k
with some automorphism group, which is inherited by GL_n in the
same way.  Even the next paragraph works more generally though
in one place we exploit the fact that all the invertible scalar
3*3 matrices over F_4 happen to have determinant 1.]

So this does (i) and (ii) for SL_3.  What about PSL_3?  That was the
quotient of SL_3 by the scalar matrices.  These matrices still form a
normal subgroup of PΓL3(F_4): the conjugate-linear transformations
don't commute with scalars but do conjugate each scalar to its inverse.
(This also means that PΓL_3 respects the construction of Π_4 as
the quotient of V-{0} by scalars.)  So the normality and quotient are
inherited by the projective groups PSL_3 and PΓL_3.  This proves
(i) and (ii) for PSL_3(F_4).

For (iii), it is now enough to produce two inequivalent ovals
switched by c, and that's easy: use the two ovals that contain
(1:0:0), (0:1:0), and (0:0:1) but do not contain (1:1:1).  QED

We'll also need the action of PGL_2(F_4) and PΓL_2(F_4) on P^1(F_4).
Since |P^1(F_4)| = 4+1 = 5  these map to S_5.

Lemma: The map is an isomorphism from PGL_2(F_4) to A_5
and from PΓL_2(F_4) to A_5.

(So we have another of our exceptional isomorphisms -- one of the
easier ones.)

Proof: We have seen that the action of PGL_2(k) on P^1(k) is faithful
and simply 3-transitive.  For |k|=4 this forces its image to be A_5.
Since PΓL_2(F_4) contains a simple transposition, its image must be S_5.

Remark: Likewise PGL_2(k) is S_3 and S_4 for |k|=2 and |k|=3 respectively,
and we saw that for |k|=5 it's the transitive S_5 in S_6.

Finally, we know already that PGL_3(F_4) acts 2-transitively on Π_4 --
indeed PGL_n(k) acts 2-transitively on projective (n-1)-space over k
for any field k and any n>1.  We'll need the fact that PSL_n(k) acts
2-transitively as well.  This is easy: 2-transitivity of PGL_n amounts
to the fact that for any distinct 1-dimensional subspaces l1,l2 of k^n
we can find a basis v1,v2,...,vn such that v1 is in l1 and v2 in l2;
for PSL_n we need only show that this basis can be chosen so that
det(v1,v2,...,vn) = 1 -- but this is easy because we can replace
v1 by c*v1 for any nonzero scalar c, and that lets us adjust the
determinant.

We can now prove:

Theorem:

M_24 acts 5-transitively on the points of the design, and
transitively on the 759 blocks.  The 5-point stabilizer has order 48.
The block stabilizer acts on the block as A_8, and on the complement
as affine GL_4(F_2).  The stabilizer of a block and a point outside it
acts as A_8 on the block and as GL_4 on the remaining

24 - 8 - 1 = 15 = |P^3(F_2)|

points.

Proof: We already have 3-transitivity, and the 3-point stabilizer is
PSL_3(F_4).  Since that's 2-transitive, we have 5-transitivity, and
the 5-point stabilizer in M_24 is the 2-point stabilizer in PSL_3(24),
which has order #PSL_3(24) / (21*20) = 16*9/3 = 48.

Transitivity on blocks follows from 5-transitivity, because any 5 points
determine a unique block.  Since all blocks are equivalent, we can study
the block stabilizer by choosing any convenient block B, and we choose
B = ∞_1 + ∞_2 + ∞_3 + the line at infinity l0 of Π_4.
Let G be the stabilizer of B, and H the image of G in the group
S_8 of permutations of B (i.e. H is the group of permutations of B that
come from G).  Then H is 5-transitive, and the stabilizer of
{∞_1, ∞_2, ∞_3} is the stabilizer of l0 in PSL_3(4),
which is PGL_2(k), which is A_5.  So H is A_8 as claimed.

Now the action of G on the remaining 24-8=16 points must preserve
the set of blocks disjoint from B.  The intersection triangle
promises us 30 such blocks.  Our description of blocks disjoint from
{∞_1, ∞_2, ∞_3} tells us they are the symmetric differences
of lines that meet in l0.  But the complement of l0 is an affine plane over F_4,
and lines of Π_4 that meet in l0 are parallel in the affine plane.
Well affine F_4^2 is also a 4-dimensional affine space over F_2, and
a little thought and/or counting shows that our 30 blocks are just the
3-design of affine hyperplanes in F_2^4.  So the action of G must preserve
that structure, whence its image is contained in AGL_4(F_2) (the semidirect
product of GL_4(F_2) with F_2^4).  But G can barely fit in AGL_4(F_2):

|G| = |M_24| / 759 = 16 |GL_4(F_2)| = |AGL_4(F_2)|

and G must act faithfully on those 16 points (proof: if g is a
non-identity element of G that fixes all of them then g(x)=y for
some distinct x,y in B; choose some other z in B, and a block B'
meeting B only in {x,z} to conclude that g takes B' to B' which is
a contradiction).  So G = AGL_4(F_2), and the block+point stabilizer is
GL_4(F_2), QED.

Corollary: GL_4(F_2) ≅ A_8 !  Cf. the outer isomorphism S_6 → S_6 we
found in M_12.  (We'll also find an outer automorphism of M_12 in M_24.)
Note that PSL_3(F_4) also happens to have 20160 = 8!/2 elements, but
its turns out that PSL_3(F_4) is not isomorphic with A_8.  For example,
A_8 has elements of order 15, but PSL_3(F_4) does not.

[March 32: Simiplicity of M_11 and M_23 using Robin J. Chapman's article
in the June-July 1995 issue of the American Mathematical Monthly]

April 4: Simplicity of M_12 and M_24;
start on simplicity of PSL_n(F_q) [n>2 or n=2 and q>3]

Theorem: M_12 and M_24 are simple.

Proof: In fact we'll show that if G is a 5-transitive permutation group
of any finite set X (of size at least 5), and the point stabilizer G_x
is simple, then so is G.  Note that by transitivity all G_x are conjugate
in G and thus isomorphic.  For M_12 and M_24 the point stabilizers are
M_11 and M_23, whose simplicity we proved last time.  Alas it is now known
(by CT = the classification theorem of finite simple groups) that the only
further examples are A_n with n ≥ 7, which we've already proved simple
in a different (though related) way.

Let H be a normal subgroup of G other than {1} and G itself.
We claim H is transitive.  Let h be any non-identity element,
so h(x_0)=y_0 for some distinct x_0,y_0.  We show that for any
distinct x,y there is a conjugate h' of h in G such that h(x)=y.
Indeed since G is 2-transitive we can find g such that g(x)=x_0
and g(y) = y_0.  Then g^{-1} h g(x) = g^{-1} h(x_0) = g^{-1} y_0 = y.

Now the intersection of H with G_x is normal (because the
intersection of H with any subgroup G' is normal in G').
So by hypothesis this intersection is either {1} or all of G_x.
Since H is transitive, the latter case implies |H| = |X| |G_x| = |G|
so |H|=|G|.  So we're in the former case and |H|=|X|, whence H acts
simply transitively.  (Note that so far we've used only 2-transitivity
of G_x, and there are still examples such as the 4-group in S_4.)

But (as we did with A_n for n>6) we can use high-order transitivity
to find a nontrivial commutator in H that has a fixed point,
and that will give our desired contradiction.  Let h be a
non-identity element of H, so h(x_0)=y_0 as before, and choose
x_1 distinct from x_0 and h^{-1}(x_0) (which is possible since |X|>2).
Then y_1 := h(x_1) is distinct from x_0, x_1, and y_0 (from x_0 by
assumption, from y_0 because h is injective, and from x_1 because
H is simply transitive).  But by 4-transitivity we can then find,
for all distinct x,y,x',y' in X, a G-conjugate of h that takes
x to y and x' to y'.  Once |X|>4 we can obtain a contradiction by
setting (x,y,x')=(x0,y0,x1) but y' not equal y1: then multiplying our
conjugate by h^{-1} yields a non-identity element of H (a commutator,
as promised) that has a fixed point.  This completes the proof.

Corollary: All the permutations in M_12 and M_24 are even.

Proof: Else the even permutations would form a normal subgroup
of index 2 (think about it -- it's Exercise 3.22 in Rotman),
contradicting the fact that M_12 and M_24 are simple
(and not of order 2).

Remark: The same is thus true also of the subgroups M_11 and M_23,
and also M_22 and M_21 = PSL_3(F_4).  It is not true of Aut(D_22)
and Aut(Π_4), because those groups contain automorphisms that
come from elements of M_24 that switch ∞_1 and ∞_2.

So how do we prove M_22 simple?  That contains the 2-transitive simple
subgroup M_21 = PSL_3(F_4).  As Rotman shows (Theorem 9.24), 2-transitivity
suffices to prove that a nontrivial normal subgroup H must be an
"elementary abelian" group, i.e. (Z/lZ)^r for some prime l and
integer r>0 -- and this will be enough because 21 is not a prime power.
However we must first prove that PSL_3(F_4) is simple -- and that's
the first case to which Chapman's criterion does not apply, again
because 21 is not a prime.  We next turn to the simplicity of PSL_n(F_q)
(except PSL_2(F_2) and PSL_2(F_3) which are the non-simple groups
S_3 and A_4).

April 11: Introduction to subgroups of PSL_2(F) and Galois' theorem;
the identification of PSL_2(F_7) with Aut(Π_2) = GL_3(F_2)

For the groups PSL_2(F) we can describe not only normal subgroups
(as we saw the only nontrivial examples are for |F|=2 or 3)
but all subgroups, and this is part of a beautiful story that
starts with Galois -- or arguably with Greek mathematics, via
the connection with finite subgroups of SO_3(R) -- and turns up
in a remarkable variety of mathematical contexts.

We introduce this topic via Galois' theorem on large subgroups H of
G = PSL_2(F_p), i.e. subgroups whose index [G:H] is small (without
being as small as 1).  For any group G there is a close connection
between transitive actions of G and small-index subgroups, and Galois'
theorem can be stated that way too.

The general picture is as follows.  Let G be a finite group acting
transitively on a set X.  Then the point stabilizer is a subgroup of G
with index |X|.  Conversely let H be a subgroup of G.  Then H has
[G:H] cosets gH, and those cosets are permuted transitively (from
the left) by G (and indeed if we started with a transitive action of
G on X and took H to be a point stabilizer then the cosets could be
identified bijectively with elements of X).  Let K be the kernel of this
permutation representation of G on the cosets.  Thus K consists of all
k in G such that kgH=gH for all g in G.  This condition is equivalent to
g^{-1}kg H = H, i.e. K consists of all k all of whose conjugates are in H.
So K is the intersection of all the G-conjugates gHg^{-1} of H.
This intersection is a normal subgroup of G (this can be checked
directly from the definition, and also follows from the fact that K is
the kernel of a group homomorphism); and K is strictly smaller than G
iff H is.  In particular, if G is simple then K is trivial and G acts
faithfully on the [G:H] cosets, i.e. is a subgroup of the symmetric
group of permutations of [G:H] objects.

When  G = PSL_2(F_q), the action on P^1(F_q) has a point stabilizer
of index q+1 (and we saw last week it's the "a^2 x + b  group").

Theorem (Galois): If q is prime then any proper subgroup of PSL_2(F_q)
has index at least q+1 except when q is 2, 3, 5, 7, or 11, when the
minimal index is q.

Remark: We shall see that the theorem holds even with q is a prime power
with the single further exception that if q=9 then the minimal index is 6.

Today we shall show for prime q that one cannot get below q (that's easy)
and describe the five exceptional cases, all of which have to do with
designs or groups that we've seen already.

Let G=PSL_2(F_q).  For q=2 and q=3 we know already that G is isomorphic
with S_3 and A_4 respectively.  For S_3, clearly the minimal index of
a proper subgroup is 2, attained uniquely by the normal subgroup A_3.
For A_4, there is no subgroup of index 2 (such a group would be normal
but the conjugacy class sizes 1,3,4,4 contain no subsum of 6) and we
know one of index 3 (the normal 4-group), which is unique (its complement
consists entirely of elements of order 3, so cannot be contained in
a group of order 4).

Assume q>3.  Then G is simple.  It follows that if G has a normal
subgroup of index n then G is a subgroup of the symmetric group S_n
of order n.  But then the order of G is a factor of |S_n| = n!.
But if q is an odd prime then |G| = (q^3-q)/2 is a multiple of q,
whence n is at least q.  Since we already know a subgroup of index q+1
the only remaining question is whether there is one of index q.

(Note that this argument fails when q is a nontrivial prime power.
For q=9 the formula (q^3-q)/2 still works, and gives 360=6!/2, so once
we find an index-6 subgroup it will follow immediately that PSL_2(F_9)
is isomorphic with A_6 !)

For q=5,7,11 the index-q subgroups are isomorphic with A_4, S_4, A_5
respectively -- check that their sizes 12, 24, 60 are the correct
(q^2-1)/2 in each case.  The fact that these are the rotation groups
of the regular solids is no accident, and the fact that there's no such
solid beyond the icosahedron and dodecahedron is closely related with
our approach to Galois' result that there are no examples past q=11.

As with q=2 and q=3 the maps G → S_q for q=5,7,11 tell us something
special about each of these groups G.  Since G is simple, the image
must be in A_q.  For q=5, the size of A_q is barely enough to
accommodate G, and we recover the isomorphism between G and A_5.
For q=7 and q=11 we get G as a proper subgroup of A_q, but one that
respects an interesting structure.  For q=7 this is a Π_2 structure,
and gives us the identification of G with Aut(Π_2) -- yes, we're
identifying the group structures of 3x3 matrices mod 2 with certain
2x2 matrices mod 7 (mod {1,-1})!  For q=11 it turns out to be the
Paley biplane -- you've already seen it has 660 symmetries with
point stabilizer A_5.  In each of the q=7 and q=11 cases there are
actually two conjugacy classes of index-q subgroups that are switched by
elements of PGL_2(F_q) not in PSL_2(F_q).  This gives rise to the
duality of the corresponding square 2-design.  We'll also see the
q=11 case (PSL_2(F_11) as a transitive subgroup of A_11), and also
q=9 (PSL_2(F_9) = A_6), in our description of how M_12 fits in M_24.

For the rest of today's lecture we give a design-theoretic proof of
the identification of PSL_2(F_7) with GL_3(F_2), using Π_2 and the
3-(8,4,1) design (and the simplicity of PSL_2(F_7), proved last week).

April 13: A_4, S_4, A_5, and the other finite triangle groups
via Cayley graphs

The classification of subgroups of PSL_2(F) includes as special cases
the three groups A_4, S_4, A_5.  As noted already last time, these are
the groups of rotations (= orientation-preserving isometries) of the
regular solids in three dimensions:

A_4 : tetrahedron (even permutations of the four vertices, or equivalently
of the four faces which are in natural bijection with the vertices)

S_4 : cube = hexahedron (any permutation of the four long diagonals);
dually, octahedron (any permutation of the four pairs of opposite faces)

A_5 : dodecahedron (any even permutation of the five inscribed cubes)
dually, icosahedron (can you find a natural set of five objects?)

Each is a subgroup of SO_3(R), which is contained in the group PSL_2(C)
of automorphisms of the Riemann sphere (C is algebraically closed so
every number is a square and PSL_2 = PGL_2); once we choose some
explicit matrices with algebraic coefficients we can find these groups
in PSL_2(F) for suitable finite F by reducing modulo a prime.

But that requires algebraic number theory and it's not easy to make sure
we have all the examples.  Instead we'll use an approach which combines
two familiar ideas but ones that get little actual use: Cayley graphs,
which are usually introduced only as a visualization device; and
generators and relations, which are usually very hard to apply in
any nontrivial space.  Specifically we'll show that each of A_4, S_4, A_5
is generated by elements b,c,d subject (only) to the relations

b^2 = c^3 = d^n = bcd = 1

with n=3,4,5 respectively.  Explicitly:

A_4: n=3, b = (1 2) (3 4), c = (1 2 3), d = (2 3 4)
[any double transposition and 3-cycle will do]

S_4: n=4, b = (1 2), c = (2 3 4), bc = (1 2 3 4)

A_5: n=5, b = (1 5) (2 3), c = (1 3 4), bc = (1 2 3 4 5)

We can use bcd=1 to solve for any one of the generators, e.g. d=(bc)^{-1},
and get a presentation with only two generators -- the minimum for a
non-cyclic group [though this is not so remarkable because lots of
interesting groups can be generated by only two generators, including all
simple groups].  For instance c and d can be taken to be rotations about
a face and a vertex respectively of the regular solids with 4, 8, and 20
triangles (with the face containing the vertex).  Thus the reason we
stop at 5 is ultimately that 1/2 + 1/3 + 1/n must exceed 1
(which is also why one of the other exponents must be 2).

The Cayley graph, call it C(G,S), of a group G with respect to a set
S of generators is a directed graph with vertices G and edges of |S| kinds
("colors"), one for each element s of S: there's an s edge going from g to h
iff h=sg.  The graph is connected because S generates G.  It is not
necessarily connected as a directed graph (i.e. there may be g,h for which
there's no directed path from g to h -- example: (G,S)=(Z,{1}) and h<g).
But if G is finite then any element has finite order, so each s^{-1} is
s^{k-1} for some k and we can even get from g to h by a directed path.
Each vertex has in- and out-degree |S|, and G acts simply transitively
on the vertices by multiplication from the right.

We do not let S contain 1 (since 1 doesn't help generate anything),
so C(G,S) has no loops.  There's an s edge g→h iff there's an s^{-1} edge
h→g; there's usually no reason to let S include both s and its inverse,
because once we have s we're already allowed to use s^{-1} to generate G --
unless s its own inverse.  We assume that S does not contain both s and s^{-1}
unless s = s^{-1}, that is, s^2 = 1.  In that case there's an s edge from
g to h iff there's one from h to g, so we simplify by combining them to
a single undirected edge.  There are then no g,h for which both g→h
and h→g are directed edges in C(G,S) (because if h=sg then g=s^{-1}h).

Examples:

i) nontrivial cyclic groups, (G,S) = (Z/nZ, {1}) or (Z, {1}).  If n=2
we get a complete graph on two vertices; for n>2 it's a directed
cycle of length n; for (Z,{1}), an infinite directed path.
We may use redundant generators (except for s^{-1}), e.g.
Z/6Z, {1,3}) or (Z/6Z, {1,2}).  What happens for (Z/6Z, {2,3})?
Note that this actually doesn't have any redundancy even though
1 generator suffices.

ii) Dihedral groups.  Example: S_3 with generators the 2-cycles
(1 2) and (1 3) gives

id -- (1 2) == (1 2 3) -- (2 3) == (1 3 2) -- (1 3) == id

so we have our benzene ring again (and indeed S_3 is the group of
automorphisms that take each "bond" to another bond of the same type).
More generally the dihedral group of 2n elements has two generators
g1, g2 satisfying g1^2 = g2^2 = (g1 g2)^n = 1 and then C(D_n, {g1,g2})
is a cycle of 2n edges of alternating types.

Generators and relations:

Here a group (usually finitely-generated, but not necessarily finite)
is specified by generators g_1,...,g_n and relations that may be
written either r_1,...,r_N [N needn't equal n] or r_j=1 or r'_j=r"_j,
to the same effect (the last being equivalent to r'_j (r"_j)^{-1} = 1).
For example, Z is just <g | >, a cyclic group of order k is <g | g^k>,
a free group on n generators is <g_1,...,g_n | >, a free abelian group
on the same generators is <g_1, ..., g_n | g_i g_j = g_j g_i>,
and any finite abelian group can be obtained from that by adding
some more relations -- in fact it's a standard theorem that we can
choose the generators so that the extra relations are g_i^k_i with
k_1 | k_2 | ... | k_n, and the group determines the k_i except for
prepending some 1's at the beginning.

Such presentations are useful because to construct a map  G → H
it is necessary and sufficient to find h_i in H satisfying the same
relations.  That is how we'll use it, at any rate.

The problem is that for a large finite but non-abelian group it can be
very hard to find such a presentation with few generators and relations.
We can certainly do it with |G| generators and |G|^2 relations; and we
can usually find very few generators g_1,...,g_n (n may be as small as 2)
and some relations r_1,...,r_N; but how do we know it's enough that
the map from the resulting group, call it Γ, to G is an isomorphism?
That's generally too hard; for general g_i and r_j it is known to be
literally undecidable even whether Γ is finite, or trivial!

Happily in our setting we can use the lowly Cayley graph to answer
the question.  Remember we're looking at

b^2 = c^3 = d^n = bcd = 1

consider first the simpler

b^2 = c^2 = d^n = bcd = 1

that is,

b^2 = c^2 = (bc)^n = 1

and use b and c as Cayley generators.  Before imposing the condition
(bc)^n=1 it's an infinite group, and the Cayley graph is just an infinite
path of alternating b's and c's.  That's the infinite dihedral group,
a.k.a. the Ax+B group over Z -- since A must be invertible it's either
1 or -1, so we have the semidirect product, generated by x → x+1 and
x ↔ -x, and we can take b: x ↔ -x and c: x ↔ 1-x.  Then bc is
x ↔ -(1-x) = x-1 which is of infinite order.  Forcing it to have
order n makes the Cayley graph close up after n+n steps and we get
the dihedral group of order 2n.

Now to be sure this case wasn't hard: any group element must be
a finite word of alternating b's and c's, so (bc)^n=1 reduces to
2n possible words and since D_{2n} satisfies all those conditions
we've found our group.  But for b^2 = c^3 = (bc)^n = 1  it's trickier.
We can use either c or d=bc as the second generator, and we'll go with d
(which makes no difference for n=3 but gives rise to more familiar
pictures for n=4 and n=5).  Now we have n-gons of d's linked by b bonds
subject to (bd)^3 = (bbc)^3 = c^3 = 1, so a graph with two bdbdbd hexagons
and one n-gon of d's at each vertex -- and this exactly gives us the
12, 24, or 60 vertices of the truncated tetra-, octa-, and icosahedron!

So: to find a copy of S_4 in H, just find non-identity b,c,d in H with
b^2 = c^3 = d^4 = bcd = 1.  This gives a map S_4 → H.  The kernel is
normal, and the only nontrivial possibility is V, but then the image is
S_3, making d^2 = 1.  So if d^2 = 1 we get the 6-element dihedral group
(as we already knew), and if d is actually of order 4 we have S_4.
Likewise for A_4 and A_5, changing d^4=1 to d^3=1 or d^5=1 respectively
(and without any annoying normal subgroup to worry about -- e.g. the
kernel from A_4 couldn't be V because then the image would be Z/3 and
b would be the identity).  We'll use this to describe when these groups
can be found in PSL_2(F_q).  The outcome is:

A_4: always unless q is an odd power of 2

S_4: iff q is odd and 24 | #PSL_2(F_q)

and nicest of all:

A_5: iff 60 | #PSL_2(F_q)

we'll also give the explicit congruence criteria on q mod 8 or 5
that check those divisibility conditions.

April 15: When does PSL_2(F) contain the exceptional triangle groups
A_4, S_4, A_5?

As announced last time:

Proposition. Let F be a finite field of q elements.
Then PSL_2(F) contains...

...A_4 (n=3), iff q is not an odd power of 2;

...S_4 (n=4), iff q is odd and congruent to 1 or -1 mod 8;

...A_5 (n=5), iff q is congruent to 0, 1, or -1 mod 5.

We saw last time that each of these groups is contained in an arbitrary
group H iff H contains elements b, c, d of order 2, 3, and n with bcd=1,
with n in {3,4,5} as above.  So we begin by describing the elements
of order 2, 3, 4, 5 in PSL_2(F).

Lemma: Let F be any field, and let g be a non-identity element of
PSL_2(F) of trace ±t.  Then g is:

i) A transvection iff t = ±2;
ii) of order 2 iff t = 0;
iii) of order 3 iff t = ±1;
iv) of order 4 iff t is nonzero and t^2 = 2;
v) of order 5 iff t^2+t-1=0 or t^2-t-1=0.

[Note: since g is a 2x2 matrix A defined up to A ↔ -A, its trace
t=tr(A) is also defined only up to t ↔ -t.  Comparison of (i) with
(ii), (iii), (v) shows that in characteristic p=2, 3, 5 respectively
g is a transvection iff it is of order p; this is true in general.
Comparing (ii) with (iv) yields the corollary that in characteristic 2
there are no elements of order 4; likewise in general PSL_2(F) has
no elements of order p^2 if F has characteristic p.]

Proof: Recall (or check directly) the 2*2 case of the Cayley-Hamilton
theorem: any 2*2 matrix A satisfies its characteristic equation

A^2 - tr(A)*A + det(A)*I = 0.

If A is in SL_2(F) has trace t then this simplifies to

(*) A^2 - t A + I = 0

(i) We show that a 2*2 matrix A is a transvection iff tr(A)=2 and
A ≠ I, from which the claim will follow.  Recall that A is a
transvection iff A-I has rank 1 and (A-I)^2 = 0.  For a 2x2 matrix,
(A-I)^2 = 0 already implies A-I has rank 0 or 1, and 0 is impossible
unless A=I.  So A is a transvection iff it satisfies (*) with t=2,
and of course (*) can hold for only one value of t (else A=0 but then
(*) does not hold at all), so A is a transvection iff tr(A)=2 as claimed.

(ii) If t=0 then (*) becomes A^2 = -I, so the image of A in PSL_2(F)
is a square root of the identity.  Conversely, if A is not a multiple of
the identity but A^2 = tA - I is then t=0.  (Note that this also means
that, except in characteristic 2, the matrix -I is the only order-2
element of SL_2(F): any other matrix with A^2=cI has c=-1.)

(iii) We compute A^3 using (*), which tells us A^2 = tA - I so

A^3 = A (tA-I) = tA^2 - A = t(tA-I) - A = (t^2-1) A - tI.

Thus if A is not a multiple of identity then A^3 is a multiple of I
iff t^2 = 1.  [The "if" part can also be seen directly from the familiar
factorizations  A^3+I = (A+I)(A^2-A+I),  A^3-I = (A-I)(A^2+A+I).]

(iv) Likewise computing A^4 yields (t^3-2t) A + (1-t^2) I
so A^4 is a multiple of I iff t^3-2t=0, and we already know that
t=0 would entail A^2=I.

(v) And A^5 = (t^4-3t^2+1)A + (2t-t^3), with the leading coefficient
factoring as (t^2-1)^2-t = (t^2+t+1)(t^2-t-1).  QED

Remark: this can be interpreted in terms of the eigenvalues of A,
which in this course we've been calling r and s.  So rs=1 and
r^n = s^n = 1 or -1.  For n=2 this means {r,s} = {i,-i} so r+s=0.
For n=3, r and s are conjugate 3rd or 6th roots of unity, which have
real part cos(2π/3)=-1/2 or cos(π/3)=1/2 so r+s=-1 or 1.  For n=4
they're 8th (but not 4th) roots of unity so we get 2*cos(π/4) or
2*cos(3π/4), i.e. sqrt(2) or -sqrt(2).  For n=5 we recover
the identities relating cos(π/5) and cos(2π/5) to the Golden Ratio.
Similarly for n=6 we'll get cos(π/6) and cos(5π/6), i.e. t^2=3, from

Using this Lemma we already see that some of the conditions claimed
in our Proposition are at any rate necessary: for S_4, the field must
have odd characteristic and contain a square root of 2, so q is
1 or -1 mod 8; and for A_5 the field must contain a solution of t^2-t-1,
so either q=2^f with f even or q is odd and F contains a square root of 5,
which in either case is equivalent to q = 1 or -1 mod 5 (or 0 mod 5).
(I used Quadratic Reciprocity here; in this case we can use the
root-of-unity interpretation to prove the necessary cases of QR,
and indeed one can use roots of unity to prove QR in general starting
with these examples.)

As noted last time, these conditions are also almost equivalent to
the Lagrange condition that #(PSL_2(F)) be a multiple of the size of
A_4, S_4, or A_5 respectively.  Indeed #PSL_2(F) is q^3-q if q=2^f
and (q^3-q)/2 otherwise.  This is always a multiple of 3 [Fermat],
and is a multiple of 5 iff q is 0, 1, or -1 mod 5.  This leaves only
divisibility by 4 or 8.  For q odd, divisibility by 4 is automatic
because q^2 is always 1 mod 8, and (q^3-q)/2 is a multiple of 8
iff (q-1)(q+1) is a multiple of 16 -- and since both factors are even
and only one is a multiple of 4, this happens iff q-1 or q+1 is
a multiple of 8.

We can now dispose of the case q=2^f of the Proposition.  We've already
taken care of S_4.  For A_5 the congruence condition on q means
f must be even.  But then F contains F_4, so PSL_2(F) contains
PSL_2(F_4), and we already saw that this group is isomorphic with A_5.
This also gives us A_4 when 2|f: it's the point stabilizer.  Putting
the stable point at infinity we see that the normal subgroup V of A_4
consists of 1 and the three transvections x → x+1, x+a, x+b
that commute with each other and multiply to 1.

Conversely: two transvections commute iff they fix the same point.
("If" is easy.  "Only if": else put the fixed point at 0 and infinity
and calculate that the two products of [1,a;0,1] and [1,0;b,1] differ by
diag(ab,-ab) so the cannot be equal. (*))  Put that point at infinity.
The other elements of A_4 must fix infinity as well because they
conjugate each of the three transvections to another one.  So it's
in the ax+b group with a^3=1, whence 3|2^f-1 and 2|f as claimed.

(*) To save space we'll abbreviate the 2x2 matrix

[a11 a12]
[a21 a22]

by [a11,a12; a21,a22].  (This is the format I'm used to from the
computer algebra package gp; other such packages [Maxima, Mathematica,
Magma, etc.] have similar conventions.)  Also [a,0; 0,b] is denoted diag(a,b).

To go further we must also address the condition bcd=1.  If bcd=1 then
also b'c'd'=1 where b',c',d' are the conjugates of b,c,d by the same
element of PSL_2(F).  It turns out all candidate c elements are
conjugate in PSL_2(F) [see below] so we may as well take c=[1,-1;1,0].
We show that over a finite field any of our values of t can be attained
as trace(bc) for some b of order 2.  By part (ii) of our opening lemma,
b ranges over 2*2 matrices of the form [x,y;z,-x] for some x,y,z, and then
det(b)=1 implies z=-(x^2+1)/y.  The trace of bc is then (x^2+xy+y^2+1)/y,
so we must solve

x^2 + xy + y^2 + 1 = ty

with y nonzero.  Since we're in odd characteristic we may complete
the square: let x = x1 - y/2 to get

x1^2 = -3*y^2/4 + t*y - 1.

In characteristic 3 this simplifies to x1^2 = t*y-1 so there's
always a solution (e.g. take x1=0 and y=-1/t; NB t can't be zero
in this case).  Otherwise we complete the square on the right-hand
side too, getting

x1^2 = -(3/4)*y1^2 - 1 + t^2/3.

So there's always a solution because t^2 isn't 3 (if it were
we'd have n=6), by a trick we've already seen: as x1 and y1 vary
over a finite field of odd order q, the LHS and RHS take (q+1)/2
different values, so there must be overlap.  The only issue is
that y might vanish, but then y1 doesn't, so changing y1 to -y1
finally yields the desired solution.

This completes the proof of the Proposition.

P.S. As promised:

Lemma: Let F be a finite field.  If A,A' in SL_2(F) are conjugate then
tr(A)=tr(A').  This necessary condition is also sufficient unless
tr(A) = 2 or -2.

Remarks:
1) Neither condition (F finite, trace not 2 or -2) can be dropped.
Over R, clockwise and counterclockwise rotations by the same angle
are conjugate in GL_2(R) but not in SL_2(R).  Even over a finite field,
we certainly must exclude tr(A) = 2 or -2 because a transvection T and
the identity I both have trace 2 and are not conjugate, and likewise
-T and -I have trace -2; and as we've seen it is also possible for two
transvections not to be conjugate.

2) It follows immediately from this criterion that two elements g,g'
of PSL_2(F) that are not of trace ±2 (i.e. neither the identity nor
a transvection) are conjugate iff tr(g) = ±tr(g').

Proof: The condition is certainly necessary by general linear algebra.
The converse is also standard linear algebra if we require only
conjugacy in GL_2(F) because the eigenvalues are distinct.
So we can conjugate A' to A with a matrix of some nonzero determinant D,
and then we need only show that there's a matrix of the same determinant
that conjugates A to itself, i.e. that commutes with A.  We'll use
matrices of the form Ax+Iy, whose determinant is x^2+txy+y^2.
So we must solve x^2+txy+y^2=D in F.  (Again we see how this can fail
if F=R; it can also fail in a finite field of odd characteristic
if t is 2 or -2.)  To do this in characteristic 2, just let x=0
and make y the square root of D.  Otherwise complete the square to get
x^2+rz^2=D where r=(t/2)^2-1 is nonzero because t is not 2 or -2.
Then use the fact that each of x^2 and rz^2 takes (q+1)/2 distinct values
so the sets of possible x^2 and D-rz^2 must overlap and we're done.

April 18: Dickson's theorem: the list of finite subgroups of SL_2(K)

Dickson proved:

Theorem: Let K be a finite field of characteristic p,
and G a subgroup of SL_2(K).  Then:

If |G| is relatively prime to p then it is either cyclic, the
dihedral group , or the preimage in SL_2(K)
of one of the groups A_4, S_4, A_5 described last week.

If |G| is a multiple of p then either
G is contained in a point stabilizer (in the action on P^1(K)), or
p=2 and G is dihedral of order 2n with n odd, or
p=3 and G is a preimage in SL_2(K) of A_5, or
G is a subgroup SL_2(k) for some subfield k of K, or
p>2 and G contains SL_2(k) with index 2 and is generated by SL_2(k) and
diag(c,1/c) where c is a square root of a generator of k*.

We follow the final section of Suzuki's Group Theory I
(Grundlehren der mathematischen Wissenschaften 247; Springer 1982;
chapter 3, section 6, p.392 ff.).  The theorem is stated as 6.17
(and describes all finite groups of SL_2(K) for K algebraically closed).
It is due to Dickson, appearing in his 1901 Linear Groups with an
Exposition of the Galois Field Theory (reprinted by Dover 1958).
Since a subgroup of PSL_2(K) is precisely a subgroup of SL_2(K)
containing -I, this also gives the complete list of subgroups of PSL_2(K).

For Galois' theorem we need only the first part: if |K|=p and
|G| is a multiple of p then the image of G in PSL_2(K) has index
prime to p, and thus equal at least p+1.  We may also assume
G contains -I (if p>2), because if not then  G ∪ -G  is a subgroup
that contains -I and has the same image in PSL_2.  See Suzuki p.396 for
the brief argument that shows that the list of subgroups containing -I
also suffices to determine all subgroups of SL_2.

Dickson's theorem is 6.17 in Suzuki.  6.25 answers our question
directly: the subgroups of SL_2(K) are

dihedral of order 2(q+1)/d and 2(q-1)/d and their subgroups, where
d = gcd(2,q-1);

an  a^2 x + b  group of order (q^2-q)/d and its subgroups,

A_4, S_4, A_5  as described Friday,  and

PSL_2(k) or PGL_2(k) where k is a subfield (and [K:k] is even
in the case of PGL_2).

Recall that to save space we'll abbreviate the 2x2 matrix

[a11 a12]
[a21 a22]

by [a11,a12; a21,a22], and further abbreviate the diagonal matrix
[a,0; 0,b] by diag(a,b), which is in SL_2(F) iff ab=1.
(F is an algebraically closed field containing K.)

6.2, 6.3  We shall repeatedly use the following elements of SL_2(K).
For c in F*, let d_c = diag(c,1/c).  These form a subgroup D of SL_2(K)
isomorphic with F*.  In particular d_{-1} = -I, which for p>2 is
the unique nontrivial element of the center Z of SL_2(K), and
(as we saw Friday) the unique element of order 2 [while for p=2
the center is trivial and the order-2 elements are the conjugates of t_1]).
Likewise let  t_λ = [1,0;λ,1] ;  these form a subgroup T, isomorphic with (F,+),
consisting of I and the q-1 transvections that fix 0 in P^1.  [I don't know
why Suzuki chose to fix 0 rather than infinity.]  Then  H = D T  is the
stabilizer of 0, and contains T as a normal subgroup.  Finally let
w = [0,1; -1,0].  Then  w^2 = -1  and w^{-1} d_c w = d_{1/c} for all c.

It is a basic fact of linear algebra that over an algebraically
closed field any 2*2 matrix is conjugate with some d_c or with ±t_1
(Jordan canonical form).  Thus the only elements of SL_2(K) with order
divisible by p are the conjugates of ±t_1, with order p for t_1 and
2p for -t_1 when p>2.

6.4-6.5 The centralizer of t_1 is T, and the centralizer of d_c is D
for c not in {1,-1}.  Hence if g in SL_2(K) is not in the center Z then
its centralizer is abelian (because it's the intersection of SL_2(K)
with the centralizer in SL_2(K') where K' is an algebraic closure,
and even that centralizer is abelian).  Moreover the normalizer of d_c
with c not in {1,-1} is <D,w>.

6.6-6.7 reviews what we know about 2- and 3-transitivity and fixed
points of the action on P^1(K) [in particular any nontrivial element of
PSL_2 has 0, 1, or 2 fixed points, with 1 iff it is a transvection.]

6.8 Let K be a field of characteristic p.  Let G a finite subgroup of
SL_2(K) containing Z, and M its set of maximal abelian subgroups
(each of which automatically contains Z).  Then:

(i) If x is in G but not in Z then its centralizer C_G(x) is in M.

(ii) The intersection of any distinct A and B in M is Z.

(iii) Any A in M is either cyclic of order prime to p or of the form
Q x Z where Q is a p-Sylow subgroup of G.

(iv) In the former case, the normalizer N_G(A) contains A with index 1 or 2.
In the latter case N_G(A) contains y such that conjugation by y
takes any x in A to x^{-1}.

(v) In the latter case, if Q is not {1} then G has a cyclic subgroup K
whose normalizer in Q is QK, and if |K| > |Z| then K is in M.

[NB this argument works in SL_2, not PSL_2.  For example, in PSL_2
(iii) would fail because there are subgroups isomorphic to V_4 even for p>2,
e.g. {I, w, d_i, w d_i} if i^2 = -1.  But such a group doesn't lift to an
*abelian* group in SL_2.]

Proof: (i) C_G(x) is certainly abelian because the centralizer in SL_2(K)
is abelian.  So it is contained in some A of M, and that A is abelian,
and contains x because C_G does, so A = C_G(x).

(ii) If x is in the intersection then its centralizer contains
both A and B, so A and B don't commute else AB would be abelian and
(as they're distinct) larger than both.  So x has non-abelian
centralizer, whence x is in Z.  Thus the intersection is Z as claimed.

(iii) If G has an element x not in Z then <x,Z> is abelian, so M
cannot contain Z once G > Z.  In this case A = C_G(x) for any
x in A - Z.  If x is not a transvection then its centralizer is
isomorphic with a finite multiplicative subgroup of the algebraic
closure, so is cyclic.  If it is a transvection then the centralizer
is Z x p-group.  [See Suzuki for this case, which we don't need
for Galois.]

(iv) Over the algebraic closure A is a finite cyclic subgroup of D,
so its normalizer even in SL_2(F) is <D,w> which contains D with
index 2, and conjugation by w inverts D.

(v) [Again see Suzuki because in this case |G| is a multiple of p.]

We now enumerate G in two different ways (double-counting is useful
not just for combinatorial designs!).  Let |Z| = e = 1 or 2, and
|G| = e g  so  g is the order of the corresponding subgroup of PSL_2.
[Suzuki lets q be the size of a p-Sylow Q of G, and |N_G(Q):Q| = ek.
For us q=1.  NB we're not using q=|K| here.]  Consider the G-conjugacy
classes in M [excluding those for which |A| is a multiple of p].  Let
C_1, C_2, ..., C_s be the classes for which the normalizer index is 1,
and C_{s+1}, ..., C_{s+t} be those for which the normalizer index is 2.
Let  e g_i = |A_i| with A_i in C_i.

Now every element of G-Z is contained in some maximal abelian, which is
unique by 6.8ii.  The contribution of C_i to the count e(g-1) of
non-central elements is  e(g_i-1)g/g_i  if i=1,2,...,s, and
e(g_i-1)g/2g_i  if i>s.  So,  1 - 1/g  is the sum of s terms
(g_i-1)/g_i and t terms (g_i-1)/2g_i.  But each (g_i-1)/g_i
is at least 1/2, so

1 > s/2 + t/4

[and the same is true if q>1 because then the C_i account for even
fewer elements of G].  Hence (s,t) is one of:

I     II   III    IV    V    VI
(1,0) (1,1) (0,0) (0,1) (0,2) (0,3)

In case I, (g_i-1)/g_i = 1-1/g  g_i = g  and   G/Z is cyclic.
[If q>1 then G is the normalizer of its p-Sylow.]
In case II, 1/g_1 + 1/2g_2 = 1/2 + 1/g > 1/2.  So g_1 = 2 or g_1=3.
If g_1=2 then 2g_2 = g so G=N_G(A_2) and G contains a cyclic group
with index 2.  If g_1=3 then g_2=2 and g=12 and we soon get G/Z=A_4
(G itself is isomorphic with SL_2(Z/3Z)).  [q>1 cannot arise.]
Case III requires q>1 and G = QxZ.
Case IV gives 1/2 + 1/2g_1 = 1/g + (q-1)/qk so again q>1
[and either p=2 and G is dihedral of singly even order or
p=3 and G is again SL_2(Z/3).  (An integer k is called "singly even"
if k is a multiple of 2 but not 4, "doubly even" if 4|k.)]
Case V gives us 1/2g_1 + 1/2g_2 = 1/g which is impossible
since the LHS exceeds 1/2g+1/2g = 1/g.  [With q>1 this gives rise to
a subgroup SL_2(F_q) or GL_2(F_q).]
Case VI gives 1/2g_1 + 1/2g_2 + 1/2g_3 = 1/2 + 1/g [and q=1].
Hence g_1,g_2,g_3 are some permutation of 2,2,g_3, 2,3,3, 2,3,4, 2,3,5.
These produce the corresponding triangle groups, except that 2,3,3
ends up in case II because the two generators of order 3 generate
conjugate subgroups of G.

April 20: S_4 and A_5 twins in PSL_2(F); the twin A_5's in PSL_2(F_11)
and the Paley biplane (= square 2-(11,5,2) design)

We saw last week that PSL_2(F_q) contains S_4 if q is ±1 mod 8, and
A_5 if q is not ±2 mod 5.  Now we just finished (at least in sketch)
showing where these fit in the list of all subgroups of PSL_2(F_q),
and indeed of finite subgroups of PSL_2(F) for any F.  The groups
that appear for F=C, namely cyclic, dihedral, A_4, S_4, and A_5,
are particularly important: versions of this list appear in contexts
ranging from graph theory (the text's chapter on "graphs with least
eigenvalue -2" is part of this story, and the -2 turns out to be there
for a good reason) to reflection groups, quadratic forms, simple
Lie groups (and thus finite simple groups like PSL_n and Sp_n
coming from matrix groups), singularities of algebraic varieties,
and at least a few other places whose names might not have much
resonance with the undergraduate curriculum (tame quivers anyone?).

For our main concern in 155r, we're particularly interested in the
first cases where these groups A_4, S_4, A_5 first occur as proper
subgroups of PSL_2(F_q).  That's q=5 for A_4=PSL_2(F_3),
q=7 and q=9 for S_4=PGL_2(F_3), and q=9 and q=11 for
A_5 = PSL_2(F_5) = (P)SL_2(F_4).  The action of PSL_2(F_7)
on the 7 cosets of S_4 identifies it with Aut(Π_2);
I already noted that the index-6 subgroup A_5 of PSL_2(F_9)
identifies PSL_2(F_9) with A_6, and so gives another route to
the outer automorphism of S_6, which we shall encounter again
in connection with M_12.  Today we'll show that the action of
PSL_2(F_11) on the 11 cosets of A_5 identifies it with the
Paley biplane, a square 2-(11,5,2) design we've constructed
as a special case of a 2-(q,(q-1)/2,(q-3)/4) design for any
prime power q == 3 mod 4 that usually has only the a^2 x + b
group but here has 12 times as many automorphisms.  (This happens
also for the q=7 version of this design, which is just Π_2 again;
it is known that for q>11 there are no automorphisms beyond the
a^2 x + b  group and Aut(F), forming a semidirect product as usual.)

You already showed that the Paley biplane is uniquely determined by
its parameters up to isomorphism, and its automorphism group -- call it
G for now -- has size 660 with point stabilizer A_5 acting transitively
on the other 10 points.  Since this design is square, the same is true
of the dual design, which gives us another A_5 in G, namely the block
stabilizer, which is not conjugate in G to the point stabilizer.
This should look familiar: again the same is true for Aut(Π_2),
which has two conjugacy orbits of index-7 subgroups isomorphic with S_4.
So there are two S_4's in PSL_2(F_7), and since we expect G to be
SL_2(F_11) there should be two A_5's in PSL_2(F_11).  This is true,
and we'll use it to reconstruct the biplane.

We can try to do this for any F_q with q odd, as follows: PSL_2(F_q) is
an index-2 subgroup of PGL_2(F_q); given a copy of H=S_4 or A_5 in
PSL_2(F_q) we can conjugate it by an element of PGL_2(F_q) to get
a new copy H' of H in PSL_2(F_q).  Can it be conjugate even in PSL_2,
not just PGL_2?  If so then by composing the two conjugations we find
g in PGL_2(F_q) that is not in PSL_2(F_q) but for which conjugation by g
takes H to itself.  So this conjugation is an automorphism of H.

Now for H=S_4 we know all automorphisms are inner.  So, composing with
conjugation by some element of H we get g', still in PGL_2(F_q) but not
PSL_2(F_q), that centralizes H.  But that's impossible because H is
its own centralizer in PGL_2.  (As we did last time we can check that
the centralizer of any element of GL_2 is abelian, so so the centralizer
in PGL_2 has an abelian subgroup of index at most 2, which H doesn't.)
So H and H' are not conjugate in PSL_2.

For H=A_5 the same argument almost works.  It is not quite true that
every automorphism of A_5 is inner.  But an outer automorphism is
conjugation by an odd element of S_5, and such an automorphism
switches the two A_5-conjugacy classes of 5-cycles.  But conjugation
in GL_2 preserves the trace, so this can happen only if our two
t-values coincide.  Remember these are the roots of t^2 ± t = 1.
This can happen, but only in characteristic 5 (and 2, but there
PGL_2=PSL_2 so there's no g to conjugate by).  Indeed in char.5
there's only one conjugacy class of A_5 because we already know that
A_5 is all of PSL_2(F_5) (and Aut(A_5)=S_5 is PGL_2(F_5)).  But
otherwise we again conclude that H and H' are not conjugate in PSL_2(F_q).

[In class I give a simple argument, using Dickson's theorem:
if g in PGL_2(F_q) conjugates H to itself then the union of H and gH
is a subgroup of PGL_2(F_q) containing H with index 2.  But PGL_2(F_q)
is contained in PSL_2(F_{q^2}); but by Dickson S_4 is a maximal subgroup
also of PSL_2(F_{q^2}), and likewise for A_5 unless q is a power of 5.]

In each case it follows that PSL_2(F_q) has two inequivalent actions
on sets of size  |PSL_2(F_q)| / |H|.  For instance, Aut(Π_2) acts
on points and lines of Π_2, and the point stabilizers are different
(though interchanged by an automorphism of Aut(Π_2), which comes from
PGL_2(F_7) or the inverse-transpose automorphism for 3x3 matrices mod 2).
[Aside: remarkably the cycle structures are the same: 1^7, 22111, 331,
421, 7 for elements of orders 1,2,3,4,7 respectively.  This not only
disproves a natural conjecture that the cycle structures determine
the group up to conjugacy, but can also be exploited in related
constructions of isospectral drums and non-isomorphic number fields
of degree 7 with the same zeta function.]  For PSL_2(F_11)
we get two inequivalent actions on 11 points, the coset spaces for
H and H'.  Well H' is the point stabilizer on its 11-point set;
how does it act on that of H?  It cannot be transitive because
11 is not a factor of 60.  Nor does it have a fixed point, else
it would be contained in a conjugate of H, but then it would be
a conjugate of H because |H'|=|H|, and we just showed that
this doesn't happen.  So it must have at least two orbits,
each of which is the index of a proper subgroup of H.  The
only possibility is 5+6.  So we get simultaneously the defining
permutation representation of H=A_5 on 5 items, and the action on
the 6 points of P^1(F_5) that we first saw in Aut(S_6).

This also lets us prove that PSL_2(F_11) acts doubly transitively
on each of our 11-point sets (as it should if it's going to be
Aut(Paley)).  That is, we show that the point stabilizer's
orbit sizes are 1+10.  The only other possibility is 1+5+5.  But then
a 3-cycle in H would have 5 fixed points, whereas one in H' has only 2
(since it has cycle structure 113 and 33 as a 5- and 6-point
permutation respectively).  But there's only one PSL_2(F_11)-conjugacy
class of 3-cycles (they have trace ±1, which is not ±2, so we
can apply the result proved at the end of the April 9 notes).
So it can be placed in both H and H' and that's a contradiction.

Doing this for each conjugate of H' gives us 11 five-point sets in
the coset space  PSL_2(F_11) / H, one for each of the 11 cosets in
PSL_2(F_11) / H'.  These are the blocks of a 2-design because
H permutes them and acts doubly transitively.  The design has
the parameters of a Paley biplane, and we already showed that
this determines the design uniquely, with a group G of 660 automorphisms.
We now see that PSL_2(F_11) is contained in G, and by comparing sizes
we conclude that G = PSL_2(F_11) as claimed.

[BTW the two 11-point representations also have identical cycle
structures, and can be put to the same uses as the two 7-point
representations of the simple group of order 168.]

April 22: The Hadamard matrix of order 12 and its automorphism group
(which we'll identify with M_12, or rather 2.M_12)

Recall (Feb.5): A Hadamard matrix of order n is an nxn matrix H
each of whose entries is +1 or -1 and whose rows are pairwise orthogonal
(whence its columns are too, since the condition is equivalent to
H\T H = n I).  Permuting rows or columns, and multiplying row
or column by -1, produces and isomorphic design.  Such an operation is
H --> P H Q for signed permutation matrices P (for rows), Q (for columns).
If  P H Q = H then (P,Q) is an automorphism of H; the map (P,Q) --> P
from Aut(H) to signed permutation matrices is an isomorphism because
P determines Q (and because acting by (P,Q) and then (P',Q') yields
P'P H QQ').  Ignoring the signs then gives a homomorphism to S_n
whose kernel is {I,-I} (if P,Q are diagonal then PHQ=H implies
P = Q = ±I).  Diagonal matrices (changing signs without permuting)
let us put H in a standard form where the first row and column consists
entirely of +1's.  Then the remaining n-1 columns of each row but
the first contain n/2 entries of -1 and (n/2)-1 entries of +1
(by orthogonality with the first row), and the collection of n-1 subsets
of size (n/2)-1 form a square  2-(n-1, (n/2)-1, (n/4)-1)  design D_H
(using the orthogonality of the other rows and columns).  Hence
4|n once n>2 (how does the n=2 Hadamard matrix [1,1;1,-1] escape?).
It is conjectured that this necessary condition on n is also sufficient.

For n=4 the Hadamard design has trivial parameters 2-(3,1,0),
so it's certainly unique up to isomorphism, whence H is as well.
For n=8 the Hadamard design has parameters 2-(7,3,1), so we now know
it is the unique Π_2.  Thus there's a unique 8*8 Hadamard matrix.
Its automorphisms are the same group AGL_3(F_2) of its 3-(8,4,1)
Steiner system.

Today we consider n=12.  Then D_H has parameters 2-(11,5,2),
so is again unique.  Hence so is H.  We just saw that D_H has
automorphisms by PSL_2(F_11), so Aut(H) contains that group in
its row and column stabilizer.  It is also transitive on ordered
pairs (row, column), because moving any row and column to (1,1)
and normalizing it to +1's yields an isomorphic Hadamard matrix.
Moreover, Aut(D_H) is 2-transitive so Aut(H) is at least 3-transitive.
We show that in fact it is sharply 5-transitive!

Start from scratch:

1) Normalize the first row to +1's (but not the first column).

2) Now any further row has six +1's and six -1's.  Permute the columns
of row 2 so that the six +1's are columns 1-6.

3) Now suppose row 3 has x of its +1's in columns 1-6 and the remaining
6-x in 7-12.  Then  (row 2) . (row 3) = x - (6-x) - (6-x) + x = 4x-12.
Since that's zero, x=3.  (That's of course the simple proof that
4|n once n>2.)  Permute the columns to put each triplet of +1's
as early as possible.  So far our matrix is:

+ + +  + + +  + + +  + + +
+ + +  + + +  - - -  - - -
+ + +  - - -  + + +  - - -

Note that the triplets are equivalent: any double transposition
yields the same matrix with row 2 and/or 3 multiplied by -1.

4) Any further row has a,b,c,d +1's among columns 1-3, 4-6, 7-9, 10-12
satisfying a+b+c+d = 6, a+b+(3-c)+(3-d)=6, a+(3-b)+c+(3-d)=6.
The last two equations give a+b=c+d and a+c=b+d, which means a=d, b=c.
So one of (3,0,0,3), (2,1,1,2), (1,2,2,1), (0,3,3,0).  I claim that
the first and last are impossible.  Indeed if we had (3,0,0,3)
then it would have to be

+ + +  + + +  + + +  + + +
+ + +  + + +  - - -  - - -
+ + +  - - -  + + +  - - -
+ + +  - - -  - - -  + + +

and (0,3,3,0) would be the same thing after multiplying row 4 by -1.
But then the fifth and further rows would also have to satisfy
a+d=b+c, and then a=b=c=d=3/2 which is impossible.  (The analogous
behavior *does* happen for n=8.)  So we're left with (2,1,1,2)
and (1,2,2,1).  We multiply each row by -1 if necessary to
assure it is (1,2,2,1).  It is then determined by the choice of

the +1 in columns 1-3,
the -1 in columns 4-6,
the -1 in columns 7-9, and
the +1 in columns 10-12.

This is still consistent with the triplet equivalence described
at the end of (3), with the proviso that the double transpositions
(12)(34) and (13)(24) change (1,2,2,1) to (2,1,1,2) and so require
multiplying the row by -1.

5) Consider now the requirement that any two of rows 4-12 must be
orthogonal.  Each triplet of columns contributes +3 if its
odd-sign-out columns match for the two rows, and -1 if not.
To get zero we must then have 3-1-1-1.  So any two rows match once
and mismatch the other three times.  Applying triplet equivalence,
we assume rows 4 and 5 match in the first triplet; then applying
equivalence of the three columns within each triplet we assume
they match in column 1, while the odd-signs-out in the remaining
columns are 4,7,10 and 5,8,12.  So far we have:

+ + +  + + +  + + +  + + +
+ + +  + + +  - - -  - - -
+ + +  - - -  + + +  - - -
+      -      -      +
+        -      -      +

and all the columns have been fixed except that we can still switch
columns 2 and 3.

6) We now show this determines the remaining 7 rows uniquely
up to unique permutation.   For starters, there's never
a match in both columns 1-3 and columns 4-6.  But there are
only 3*3=9 choices and 12-3=9 rows, so each choice must occur.
Permuting rows 5-12 to put them in lexicographic order gives

+ + +  + + +  + + +  + + +
+ + +  + + +  - - -  - - -
+ + +  - - -  + + +  - - -
+      -      -      +
+        -      -      +
+          -  ? ? ?  ? ? ?
+    -      ? ? ?  ? ? ?
+      -    ? ? ?  ? ? ?
+        -  ? ? ?  ? ? ?
+  -      ? ? ?  ? ? ?
+    -    ? ? ?  ? ? ?
+      -  ? ? ?  ? ? ?

Of course the same is true of any other pair of columns.  That
immediately determines row 6:

+ + +  + + +  + + +  + + +
+ + +  + + +  - - -  - - -
+ + +  - - -  + + +  - - -
+      -      -      +
+        -      -      +
+          -      -      +
+    -      ? ? ?  ? ? ?
+      -    ? ? ?  ? ? ?
+        -  ? ? ?  ? ? ?
+  -      ? ? ?  ? ? ?
+    -    ? ? ?  ? ? ?
+      -  ? ? ?  ? ? ?

Consider rows 4, 7, and 10.  They match in triplet 2, so they must
differ in each other column.  Switching them if necessary (by
switching columns 2 and 3) we may assume that row 7 singles out
column 8, and row 10 singles out column 9:

+ + +  + + +  + + +  + + +
+ + +  + + +  - - -  - - -
+ + +  - - -  + + +  - - -
+      -      -      +
+        -      -      +
+          -      -      +
+    -        -    ? ? ?
+      -    ? ? ?  ? ? ?
+        -  ? ? ?  ? ? ?
+  -          -  ? ? ?
+    -    ? ? ?  ? ? ?
+      -  ? ? ?  ? ? ?

Now rows 5 and 7 use column 8.  There must be one more such row,
which matches neither 5 nor 7 elsewhere.  So it must be row 12.
Likewise column 9, used by rows 6 and 10, must also be used in 8;
and the remaining rows (9,11) must use column 7:

+ + +  + + +  + + +  + + +
+ + +  + + +  - - -  - - -
+ + +  - - -  + + +  - - -
+      -      -      +
+        -      -      +
+          -      -      +
+    -        -    ? ? ?
+      -        -  ? ? ?
+        -  -      ? ? ?
+  -          -  ? ? ?
+    -    -      ? ? ?
+      -    -    ? ? ?

Finally: the row triplets {4, 8, 12},  {5, 9, 10},  {6, 7, 11}
have not been matched yet, so they must each share columns 10-12:

+ + +  + + +  + + +  + + +
+ + +  + + +  - - -  - - -
+ + +  - - -  + + +  - - -
+      -      -      +
+        -      -      +
+          -      -      +
+    -        -        +
+      -        -  +
+        -  -        +
+  -          -    +
+    -    -          +
+      -    -    +

which spelled out in full is

+ + +  + + +  + + +  + + +
+ + +  + + +  - - -  - - -
+ + +  - - -  + + +  - - -
+ - -  - + +  - + +  + - -
+ - -  + - +  + - +  - + -
+ - -  + + -  + + -  - - +
- + -  - + +  + - +  - - +
- + -  + - +  + + -  + - -
- + -  + + -  - + +  - + -
- - +  - + +  + + -  - + -
- - +  + - +  - + +  - - +
- - +  + + -  + - +  + - -

Since we could start with any five rows, put them at 1,2,3,4,5 in order,
and then choose the columns uniquely to get this matrix, it follows that
the image of Aut(H) in S_{12} is simply 5-transitive as promised
(and hence simple as shown in late March, since that depended only on
5-transitivity and the group size).

We're guessing that this group is M_{12}.  We could prove it by
finding a construction of 2*66 hexads from H that is intrinsic and thus
automatically invariant under the group; then these will automatically
be the blocks of a 5-design, which will thus be the unique (5,6,12)
Steiner system.  Can you find such a construction?

April 25: The Hadamard matrix H_12 and the Mathieu group M_12
(and the action of PSL_2(F_q) on Paley's Hadamard matrices in general)

Theorem: the image in S_12 of the automorphism group of an order-12
Hadamard matrix H_12 is isomorphic with the Mathieu group M_12.

Proof: For each of Bin(12,2)=66 pairs of columns there are six
rows where they match and six where they do not.  So we get
a design D of 2*66 = 132 six-element subsets of the 12 rows --
or rather a structure because we haven't yet shown there are
no repeats.  We claim it is a Steiner (5,6,12) design with
an action of Aut(H_12).  Any automorphism (P,Q) in Aut(H_12)
preserves D -- even the columns that are multiplied by -1 yield
the same 6+6 split, only with the matched and unmatched rows switched.
Since Aut(H_12) acts 5-transitively on rows, D is a 5-structure.
But then its lambda is  132 / (Bin(12,5)/Bin(6,5)) = 1  so there
can be no repeated blocks and we have a 5-(12,6,1) design as desired.

We've already shown that this design is unique up to isomorphism
and the automorphism group is M_12.  We now see that Aut(H_12)
acts on D, and of course -1 acts trivially and the action of
Aut(H_12)/{1,-1} is faithful.  Hence Aut(H_12)/{1,-1} is contained
in M_12, and since both groups are simply 5-transitive -- and thus
have the same size -- the map Aut(H_12) → M_12 is an isomorphism, QED.

It follows too that transposition of H_12 gives an outer automorphism
of M_12: as with S_6 it has two inequivalent actions, with the
point stabilizer in one acting transitively on the other.  This is
because our construction of D from H_12 shows that the column-pair
stabilizer M_10.2 acts on the rows as the block-pair stabilizer S6.2 .
Hence M_10 has no fixed rows -- at worst its row orbits would have
size 6 (though in fact it's transitive, as we'll explain) -- so
the column M_11 acts transitively on rows.  Proof: M_11 is simple,
whence any orbit of a permutation representation must have size
at least 11 if it is not a singleton (we already proved this for
any simple group: the size of any nontrivial orbit is at least as large
as the biggest prime factor of the group order).  But even the M_10
orbits have size at least 6, so M_11 must be transitive as claimed.

The simultaneous stabilizer in Aut(H_12) of both a row and a column
is Aut(Paley) which we already identified with PSL_2(F_11) in its
11-point permutation representation.  Remarkably there's also
a copy of PSL_2(F_11) in M_12 that acts on both the rows and columns
by its standard action on the 12 points of P^1(F_11).  We already
observed the analogous fact for PSL_2(F_7) and H_8 in the course of
our proof of the isomorphism between PSL_2(F_7) and (P)SL_3(F_2).
In fact this (unlike the 7- and 11-point representations) works for
the Paley Hadamard matrix of order q+1 for any prime power q
congruent to 3 mod 4.

Recall that Paley makes such a matrix by labeling both the rows and
the columns by P^1(F_q) and setting the (x,y) entry equal 1 iff either
(a) x or y is infinity or (b) both are finite and x-y is a nonzero square.
By construction the  a^2 x + b  group acts on both rows and columns
without even resorting to sign changes.  This is the point stabilizer
of the action of PSL_2(F_q) on P^1; you shouldn't be too surprised
then that allowing sign changes lets us extend the action to PSL_2(F_q),
still acting the same way on rows and columns.

Now the  a^2 x + b  group is a maximal proper subgroup of  PSL_2(F_q),
so once we show that any element of PSL_2(F_q) acts it will prove that
all of PSL_2(F_q) does.  For instance we can use  z ↔ -1/z,
corresponding to the matrix [0, -1; 1, 0].  (It's also not too hard
to show directly that this involution together with PSL_2(F_q)
generates PSL_2(F_q), and even possible to show in the same way that
any element of PSL_2(F_q) acts on the Paley matrix.)  So permute both
the rows and columns according to z ↔ -1/z.   We proceed to give
a computational construction of an action of this permutation;
a more conceptual one, without so much case analysis, appears later.

The involution z ↔ -1/z puts the 0 row and 0 column at infinity,
and we must then multiply some rows and columns by -1 to make the
new infinity row and column all +1.  The (0,0) entry was -1, so

(i) multiply row (inf,*) by -1.

Now (inf,0) is what used to be (0,inf) so was +1, but we multiplied it
by -1, so we must

(ii) multiply column (*,0) by -1.

For nonzero y in F_q, entry (inf,y) is what used to be (0,-1/y) so it's
now +1 unless 0-(-1/y) = 1/y is a nonzero square, i.e. y is a square.  So
(iii) multiply column (*,y) by -1 if y is a nonzero square.

Similarly for the finite-index rows, except that we didn't multiply
(*,inf) by -1 to start, so (0,inf) is still +1 and we don't negate
row (0,*).  For nonzero x, entry (x,inf) used to be (-1/x,0) so was
-1 unless -1/x was a square, i.e. x was a non-square (remember
-1 is not a square because q is 3 mod 4).  So,

(iv) multiply row (x,*) by -1 if x is a nonzero square.

So we've now made the (x,y) entries +1 if x or y is infinite.
For the remaining cases:

If x=y=0, the (x,y) entry was at (inf,inf) so was +1.  We multiplied
it by -1 in (ii) so it is now -1, as desired.

If x=0 but y is not zero, the (x,y) entry was at (inf,-1/y) so +1.
We multiplied it by -1 in (iii) iff y was a square.  Hence it's
-1 iff y is square, i.e. iff -y = x-y is not a square, again as desired.

If y=0 but x is not zero, the (x,y) entry was at (-1/x,inf) so +1.
We multiplied it by -1 in (ii), and again in (iv) if x is a square.
So it's +1 iff x is square, again as desired because x = x-y here.

If x=y and is not zero, the same is true for (-1/x,-1/y) so that entry
was -1.  It has been multiplied by -1 either twice, in (iii) and (iv),
or not at all.  So it's still -1 as desired.

Finally suppose x and y are distinct nonzero elements of F.  Then
the x,y entry was 1 or -1 according as

(-1/x) - (-1/y) = (x-y)/xy

was a square or not.  In (iii) and (iv) we multiplied it by -1 once
iff just one of x or y is a square, i.e. iff xy is not a square.
Hence it is now 1 iff x-y is a square, again as desired.  QED!

Corollary: M_12 contains PSL_2(11), acting transitively on rows
and transitively on columns.

Proof: Since H_12 is unique, the Paley matrix is isomorphic with it,
so PSL_2(11) is contained in Aut(H_12) / {1,-1} = M_12.

If you feel the derivation of the PSL_2(F_q) action has way too many
cases -- that there should be a simpler explanation -- then you
should prefer the following approach.  Let k=F_q.  Since we want to
think of the all-(+1) row and column as having index infinity,
we should really index the rows and columns by representatives of
the lines in k^2.  Use (x,1) and (y,1) for the finite-index rows
and columns, and (a,0) and (b,0) for the ones at infinity, with
a,b to be chosen soon.  Observe that x-y is the determinant
(x,1) ∧ (y,1).  So we aim to use the quadratic character of
that determinant should to re-define the matrix.  Since
(a,0) ∧ (y,1) = a  while  (x,1) ∧ (b,0) = -b, we'll choose
a=1 and b=-1 to maintain the pattern.  Finally on the diagonal
we have two representatives of the same line; let c be their ratio,
and make the diagonal entry +1 iff c is not a square.  Note that
this works for our choice of row and column representatives:
(x,1) = (x,1)  [c=1, a square], while (1,0) = - (-1,0)
[c=-1, not a square].  Now observe that if we change a row or
column representative v to some other representative bv then
that row or column is multiplied by +1 or -1 according as
b is a square or not (on the diagonal as well as off it).
Any SL_2(F_q) transformation, applied simultaneously to rows
and columns, then yields an equivalent matrix because it
preserves the determinant and permutes the q+1 lines: we need only
multiply some rows and columns by -1 (namely those whose representative
has been multiplied by a non-square) to recover the same matrix.
We can also check directly that this construction yields a
Hadamard matrix: given any two columns, we may as well choose
coordinates so they're indexed by (1,0) and (0,1) and scale the
corresponding rows to the same vectors, so their pattern is

+ +
+ -

and then any other row (x,y) matches iff -y/x is a square which
happens for exactly (q-1)/2 ratios y/x, so indeed we have the same
number (q-1)/2 + 1 = (q+1)/2 of matches and mismatches.  This approach
also shows that before forming the quotient by {I,-I} we already
had an action of PSL_2(F_q), because -1 acts by multiplying all the
row and column representatives by -1 and that yields the same matrix.

April 29, part 1:  More on M_12, the (5,6,12) Steiner system, and the
affine (2,3,9) and inversive (3,4,10) planes

We should develop a few more facts about the (5,6,12) Steiner design and
its automorphism group before proceeding to finding Aut(M_12) in M_24.

For starters, we found Aut(S_6) by the action of S_6 on totals.
Here's a somewhat analogous construction of the outer automorphism of M_12.
There are 66 block pairs.  Any two divide the 12 points into either
four triplets or 4+2+2+4.  Form a graph whose vertices are the
block pairs, with two adjacent iff they give four triplets.
I claim that this is the triangular graph T(12), and that the
outer automorphism can be constructed from the action of M_12
on the 12 maximal cliques (sometimes called the "totals" of the design
for the sake of analogy with the S_6 situation).  This is easy from
our description last week of any three or four rows of H_12:
block pairs of rows biject with pairs of columns, and two column pairs
intersect iff their block pairs yield four triplets.  (We can also
check using the intersection triangle that our graph has degree
2 * Bin(6,3) = 20.)  This also gives a bijection from triples of
columns to partitions of the 12 points into four triples
any two of which constitute a block; on T(12) those are
triangles that are not contained in a maximal clique.
(Check: there are Bin(12,3)=220 triples, and also 20*66/Bin(4,2)=220
such sets of four triples.)  See below for what remains of M_12
after we specify a triple (a.k.a. M_9).

What do the Bin(12,4)=495 quadruples of columns correspond to?
in the other direction: given 4 rows, we can confirm using
our analysis of the previous lecture that their product has
either four +1's and eight -1's or vice versa; choose the four,
and check (using "one match, three mismatches") that applying the same
construction to those four columns retrieves the same four rows.

MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM

Just as with the (5,8,24) Steiner system, the first three derived
designs of the (5,6,12) are unique, and the design can be reconstructed
from three kinds of ovals in the third derived design.  The uniqueness
is easy for the first derived design with parameters (4,5,11),
because from such a design we immediately reconstruct a (5,6,12) design
using the 66 block complements and the 66 hexads consisting of a
block plus the 12th point -- and we already know the (5,6,12) design
is unique and has a transitive automorphism group.  On the other side,
a (2,3,9) design is an affine plane of order 3, so it's unique too:
we know that any affine plane is the complement of a line in a
projective plane, and Π_3 is unique with an automorphism group
acting transitively on lines.  That leaves the (3,4,10), which
we've already constructed in several ways (e.g. as the algebraic
inversive plane of order 3), but I don't think we've quite proved
its uniqueness, let alone given the construction of the (5,6,12) design
analogous to what we did for the (5,8,24) starting with Π_4.

At the end of the day several aspects of the picture are quite similar.
Let P be the affine plane of order 3, which has 9 points and 12 lines.
(An equivalent description: points are a 2-dimensional space over F_3;
lines are sets of three distinct points that sum to zero.)
An "oval" of P is a set of 4 points of P that meet each line in
at most 2 points.  These are exactly the ovals in Π_3 that do not
meet the line at infinity.  We readily count them as we did for Π_3:
(9 * 8 * 6 * 3) / 4! = 54.  They naturally partition into three
classes each of size 18; we'll use one class to get the (3,4,10)
design and will need all three to reach the (5,6,12).  The group
Aut(P) has a normal subgroup of index 6 with quotient group S_3
permuting the three classes.  But naturally the details are different;
e.g. here the subgroup arises from the normal subgroup V_4 of S_4,
here occurring as PGL_2(F_3) !

which has 30 blocks, and derive a (2,3,9), which is P by deleting
a point I.  Since any three points of D determine a unique block,
we must add 18 4-element subsets of P that meet no line in more than
2 points -- that is, ovals -- and also meet each other in at most 2 points.
The key fact is that any oval O in P can be written uniquely as
the symmetric difference of two non-parallel lines l,m.  Indeed
such a symmetric difference is readily seen to be an oval;
each oval that arises this way does so uniquely -- its four points
can be split in two other ways but both yield parallel lines;
and this accounts for 9 * Bin(4,2) = 54 ovals, i.e. all of them.
Let p be the intersection of l and l'.  There are two other lines
l',m' through p, which give rise to another oval O' disjoint from P.
We claim that if O appears in D then so is O'.  Indeed by the
intersection triangle for a (3,4,10) Steiner system
there are 3 blocks disjoint from O.  These are either I+line
or another oval.  But there are only two lines disjoint from O,
namely l' and m', and only one other oval, namely O' (else one
would contain p and this one of l' and m').  So we must use O'.
[On further thought, this last step was not needed: the next step
doesn't use it, showing instead that we must use the four translates
of O' by nonzero vectors in l' or m'; repeating this argument
a few times finds all 18 translates of O and O'.]

We claim that we must also the other 16 ovals that are translates
of O or O'.  [Recall that Aut(P) = AGL_2(F_3) = Ax+b group =
stabilizer in Aut(Π_3) of the line at infinity; the map (A,b) → A
is a homomorphism onto GL_2(F_3) with kernel F_3^2 = translations.]
Proof: let q,r be points on l and m, and let s = q+r-p (this makes sense
even in the affine plane, not just in F_3^2).  So in coordinates we may take
p=(0,0), l and m to be x=0 and y=0, and q,r,s to be (0,1), (1,0), (1,1).
What's the fourth point in the unique block containing q,r,s?
Not I because they're not collinear; nor (0,2) or (2,0) because
then that block would meet O in 3 points; nor (1,2) or (2,1)
because that's collinear with r,s or q,s respectively.  So it's p
and we get O' translated by (-1,-1).  This argument gives four
translates each of O and O', and applying it again to those translates
gives the remaining ones.

So we've determined all 18 ovals.  We can either verify that the
resulting 30 blocks constitute a (3,4,10) system, or as usual
we can use the fact that we've already constructed one to show that
once we've reduced to one possibility it automatically works.
Likewise for the (5,6,12) design we join

I + II + III  to the 12 lines;
each pair from I, II, III to one of the three classes of 18 ovals;
each of I, II, III to the oval complements from the class
associated with the complementary pair; and
the empty subset of {I,II,III} with the 12 line complements on P.

This gives a total of 12 + 54 + 54 + 12 = 132 blocks as expected.

To see the connection with V_4 in S_4: the action of GL_2(F_3)
on the line at infinity is via PGL_2(F_3) = S_4; all the translates of O
come from pairs of lines through the same two points at infinity,
and those of O' come from pairs through the other two points,
so a choice of 18 ovals means a choice of even split of the
4 points at infinity as 2+2 -- and we know that S_4 acts on them
(and thus on the extra three points I, II, III) via the map
S_4 → S_3 with kernel V_4.

There's yet more to be said about the humble P [not to be confused with
humble Π, I suppose]: P can be found in any algebraic projective plane
over a field where x^2+x+1 has roots, e.g. Π_4.  Indeed we've done this
in characteristic 3 where x^2+x+1=(x-1)^2, and otherwise there are three
distinct roots of unity 1,r,s and we can use

0,  1, -1;
-1,  0,  1;
1, -1,  0;
0,  r, -1;
-1,  0,  r;
r, -1,  0;
0,  s, -1;
-1,  0,  s;
s, -1,  0.

(These all satisfy x^3+y^3+z^3=0, and are the inflection points of this
plane cubic, and indeed of x^3+y^3+z^3=cxyz for any c.)  Note that we
still can't do it over R, for instance because of Sylvester's theorem
(in a finite configuration of points in RP^2, not all collinear,
there are two points not collinear with any other) -- and indeed R
has no sqrt(-3) or cube roots of 1.  But it does work over C,
so Sylvester fails over C.

April 29, part 2:
A bit on the Golay [24,12,8] code; the 2576 "umbral" dodecads
and M_12.2 inside M_24

As we started with linear algebra, we end with it.  I mentioned a bit
of this already in outlining the topics; I'll say just a little more,
because several people are doing this in much more detail for their
final project and I don't want to step on their toes...

Let k be the 2-element field Z/2Z, and D=(X,B) the Steiner (5,8,24)
design.   Let V be the 24-dimensional space of functions X → k.
This can be identified naturally with the set of subsets of X:
v is identified with {x: v_x = 1}, and in the other direction
a subset is identified with its characteristic function.
The weight of a vector v is its number of 1's (= the size of
the subset of X corresponding to v).  Vector addition corresponds
to the symmetric difference, and the standard bilinear pairing
<u,u'> = sum_x u_x u'_x  takes the characteristic functions of
sets S,S' to the parity of the size of their intersection.

The extended binary Golay code G is a subspace of V generated by
(the characteristic functions of) blocks.  We'll show:

i) <v,w> = 0 for all v,w in G.

ii) G contains the all-1's vector.

iii) Every vector in G has a doubly even weight.
["n is doubly even" means 4|n.]

iv) G has dimension 12.

v) G has no vectors of weight 4 (and thus none of weight 20).

vi) G has  1, 759, 2576, 759, 1  vectors of weight  0, 8, 12, 16, 24
respectively.  (Check: these counts sum to  4096 = 2^12.)
[The 2576 size-twelve subsets of X are often known as the "dodecads" or
"umbral dodecads" of the design, as the blocks themselves are known as

vii) Given a 12-element subset S corresponding to a vector in G,
there are 132 blocks B whose intersection with B has size 6.
These intersections are distinct and constitute a (5,6,12) design.

viii) M_24 = Aut(G).  This group acts transitively on weight-twelve
vectors of G.  The subgroup of M_24 that fixes such a vector S acts as
M_12 on S, and -- via an outer automorphism of M_24 -- on its complement S'.

Proof:

(i) [given already] By bilinearity it is enough to check this
on generators.  We already know that any two blocks meet in
8, 4, 2, or 0 points.  Each of these numbers is even, QED.

(ii) Choose any block B, and recall that the complement contains
30 blocks forming the design of affine hyperplanes in k^4.
Taking any two complementary hyperplanes B' and B" gives us
three generators of V whose sum is the all-1's vector.

(iii) [not using (ii)] We use the following variant of
inclusion-exclusion that counts symmetric differences.
If {A_i : i in I} is a finite collection of subsets of some set,
for each nonempty subset J of I we denote by A_J the intersection of
the sets A_j with j in J.  Then the size of the symmetric difference
Delta of the A_i is the sum over J of  (-2)^(|J|-1) #(A_J).
[Recall that using -1 instead of -2 would give ordinary inclusion-exclusion.]
Proof: a point that appears in exactly k of the sets is counted
(1 - (1+(-2))^k)/2  times, and that's 1 or 0 according as k is
odd or even.

In particular, if each #(A_i) is 0 mod 4, and any two A_i and A_j
intersect in a set of even size, then |Delta| is a multiple of 4.
This hypothesis is satisfied whenever the A_i are blocks of D, QED.

(iv) For now we show that G has dimension at most 12.  Proof:
by (iii) G is contained in its annihilator w.r.t. the pairing <.,.>.
But this pairing is nondegenerate, so  dim(G) ≤ dim(V) - dim(G)
whence  dim(G) ≤ dim(V) / 2 = 12.

(v) M_24 acts on D and thus on G.  Since M_24 acts transitively on X,
if G contained one word of weight 4 it would contain them all.
But that's absurd (it contradicts all of (i), (ii), and the part of
(iv) we've already proved).  Since G contains the all-1's word,
it thus cannot contain a word of length 24 either.

(vi) So far we know that G contains at most 2^12 = 4096 words in total,
and at least 1 + 759 + 759 + 1 = 1520 of weight 0, 8, 16, or 24
respectively.  Hence it contains at most 2576 words of weight 12.
At any rate it contains at least one, because we can certainly find
blocks B, B' whose intersection has size 2, and form their symmetric
difference.

(vii) By (ii), G also contains the complement S'.
By (i), any block B intersects S in a set of even size.
We claim this size is one of 2, 4, or 6.  Indeed if it were 8
then B would be contained in S and the symmetric difference B+S
would have weight 4, contradicting (v).  Likewise if it were 0
then B+S' would have weight 4.  Moreover, (*) if B intersects S in
a set of size 6 then no other B' block intersects S in the same set
(else B+B' would be a nonzero vector of weight at most 4,

We can now prove that the six-element subsets of S that arise as
intersections with a block constitute a (5,6,12) Steiner system.
Indeed let Y be any 5-element subset of S.  Since D is a (5,8,24)
Steiner system, there exists a unique block B containing Y.
Then B intersects S in at least 5 points, hence exactly 6, as desired.

(viii) By (vii) there are 132 such choices of B.  Each intersects S'
in a set of size 2.  If two blocks B1,B2 meet S' in the same pair
then B1+B2 is supported on S, so it is either 0 or S itself,
and the latter occurs iff the intersections of B1 and B2 with S
are complementary blocks of the (5,6,12) design.  This gives us
an injection from (5,6,12) block-pairs to pairs in S'.  But there are
only  Bin(12,2) = 66  pairs available, so the map is a bijection.

We claim that, after possibly permuting S', this is the same bijection
we obtained earlier in the week using Hadamard matrices.  Indeed
two distinct pairs in S' intersect iff the corresponding blocks in S
have odd intersection (and thus necessarily intersection of size 3).
So we have an isomorphism from the triangle graph T(12) of pairs of S',
and the odd-intersection graph of block pairs in S -- and we already know
(using maximal cliques, since 12>4) any such isomorphism is induced from
a permutation of the underlying 12-element set.

We could now also complete the proofs of (iv) and (vi) in various ways;
for instance we could compute dim(G) using the map from G to the
12-dimensional space of functions on S' (by forgetting the S
coordinates), and checking that its image has dimension 11
(it contains all vectors of weight 2) and its kernel is the
1-dimensional space {0,S}.  Perhaps the most amusing way to finish off
all that remains of (iv), (vi), and (viii) is as follows.  Let H be
the stabilizer in Aut(G) of S.  Then H acts on the (5,6,12) design,
so maps to M_12; and the map is injective because if an element of M_24
acts trivially on S then it acts trivially on T(12) and thus also on S'.
Thus the orbit of H has size at least |M_24| / |M_12|.  But that's
(24*23*22*21*20*48) / (12*11*10*9*8) = 2^4 7 23 = 2576.  Yet we saw
in (vi) that G has at most 2576 such vectors, and that's only if
dim(G)=12, there are no vectors of weight 8 and 16 other than
blocks and block complements, Aut(G) is no larger than M_24, and
all of M_12 acts on G (so is in M_24).  Hence all that must happen.
In particular there are elements of M_24 that take S to S', so
the group that stabilizes the partition  X = S + S'  contains M_12
with index 2.  Since we've seen that the  block-pair / S'-pair
structure of the Hadamard matrix reappears here, the actions of
M_12 on S and S' must be related by the same outer automorphism.

QED!

Final(?) remark: the 759 blocks consist of

132 blocks meeting S in 6 points and S' in 2 points, and
132 blocks meeting S in 2 points and S' in 6 points, and thus
495 blocks meeting S in 4 points and S' in 4 points

because  759 - (132+132) = 495.   But there are Bin(12,4)=495
quadruples in either S or S', and none arises twice (why?).
Hence each arises, and we get a bijection between the quadruples
in S and S'.  It will not surprise you (and perhaps you can
convince yourself) that this bijection is the same one we
constructed last time using the Hadamard matrix.