Guest Post: Ordinals and Hydras, by Brian Lawrence

A few days ago, the hydra game came up in discussion with Brian Lawrence; we'd both run into a somewhat mystifying analysis of the game via ordinals.  Brian and I came up with a much more down-to-earth analysis (which experts will recognize as the same as the usual one); Brian decided to write it up, and I asked him if I could post his write-up here.

Guest post by Brian Lawrence:

I've read in a number of places about the Hydra game, which one plays by applying a certain type of move to a rooted tree.  The theorem (surprising at first) is that the Hydra game always terminates: there is no infinite sequence of legal moves.  The theorem is used in mathematical logic as an example of a result that can only be proven in a sufficiently strong proof system.  I'm not an expert in logic, so I won't say any more about that.  Instead, I'd like to give a straightforward proof of the Hydra theorem.

The game is based on the mythical Hydra, a many-headed monster.  In myth, every time you cut off one of the Hydra's heads, it would grow two more.

Our Hydra is a (finite) rooted tree.  One performs a sequence of moves; each move is as follows.  Choose a leaf \(x_0\) of the rooted tree, and remove it.  Then, let \(x_1\) be the parent node of \(x_0,\) and \(x_2\) the parent node of \(x_1\) (assuming it exists).  Let \(S\) be the subtree consisting of \(x_1\) and all its descendants, after \(x_0\) has been removed.  When you remove \(x_0,\) the Hydra chooses a \(N > 0,\) and attaches \(N\) new copies of \(S\) as children of \(x_2.\)  If \(x_0\) doesn't have a grandparent, the Hydra doesn't get to add anything to the tree.  (Note that the Hydra can choose a different \(N\) at every step.)  The game ends when only a single vertex (the root) remains.

One move of the hydra game.

There's a link to a Java version at the bottom of this page.  (The in-website applet doesn't work for me, but the downloadable version does.)

Here's the remarkable fact: no matter how we choose which leaves to cut off, and no matter how large the Hydra chooses its positive integers \(N,\) the game is guaranteed to terminate after finitely many steps.

Let's define the "depth" of a rooted tree to be the length of the longest path from root to leaf.  If a tree has only one vertex, it has depth \(0;\) if all leaves are adjacent to the root, the tree has depth \(1;\) and so forth.  I'll prove the theorem by induction on the depth, starting with the case of depth \(2.\)  (The theorem is trivial for trees of depth \(0\) and \(1.\))

For any rooted tree, call an "immediate subtree" the subtree consisting of \(x\) and all its descendants, where \(x\) is a neighbor of the root.

Now suppose we have a tree \(T\) of depth at most \(2.\)  Its immediate subtrees have depth at most \(1.\)  For every integer \(r \geq 0,\) call \(T_r\) the tree of depth at most \(1\) where the root has \(r\) children. 

The tree \(T_4\)

(The depth of \(T_0\) is \(0.\))  To describe our tree \(T,\) we just have to describe its immediate subtrees; in fact, if \(a_r\) is the number of immediate subtrees isomorphic to \(T_r,\) then our tree \(T\) is determined completely by the tuple \((a_0, a_1, a_2, \ldots).\)

The tree corresponding to the tuple \((0, 2, 1, 1, 0, 0, \ldots)\).

Let \(\mathcal{I}_2\) be the set of tuples \((a_0, a_1, a_2, \ldots)\) of nonnegative integers, such that all but finitely many \(a_r\) are nonzero.  We impose the lexicographic total ordering on \(\mathcal{I}_2:\) given \(a = (a_0, a_1, a_2, \ldots)\) and \(b =(b_0, b_1, b_2, \ldots)\) in \(\mathcal{I}_2,\) take the largest \(r\) such that \(a_r \neq b_r,\) and define \(a > b\) if and only if \(a_r > b_r.\)

We have described a bijection between trees of depth at most \(2\) and tuples in \(\mathcal{I}_2.\)  The theorem (for depth \(2\)) now follows from the following two lemmas, which are left as an exercise to the reader.

Lemma. Any legal move changes a tuple \(a \in \mathcal{I}_2\) to another tuple \(b\) such that \(b < a\) for our total ordering.

Lemma. The set \(\mathcal{I}_2\) is well-ordered: it contains no infinite descending chain.

We will construct recursively a well-ordered set \(\mathcal{I}_d\) which classifies the isomorphism types of trees of depth at most \(d.\)  Specifically, we will construct \(\mathcal{I}_{d+1}\) as a set of tuples of nonnegative integers, where instead of indexing by \(\mathbb{N}\) we index by \(\mathcal{I}_d.\)  Precisely, \(\mathcal{I}_{d+1}\) is the set of functions \(a: \mathcal{I}_d \rightarrow \mathbb{N}\) (we notate such a function \(i \mapsto a_i\)) such that \(a_i = 0\) for all but finitely many \(i \in \mathcal{I}_d.\)

Again, we impose on \(\mathcal{I}_{d+1}\) the lexicographic total ordering: given any \(a, b \in \mathcal{I}_{d+1},\) choose \(i \in \mathcal{I}_d\) maximal such that \(a(i) \neq b(i);\) and take \(a > b\) if and only if \(a(i) > b(i).\)  (Only finitely many \(i\) satisfy \(a(i) \neq b(i),\) so there is no difficulty in choosing the maximal one.)

Just like before, there is a bijection between elements of \(\mathcal{I}_{d+1}\) and isomorphism types of trees of depth at most \(d+1:\) given a tree \(T,\) produce the tuple which records for each \(i \in \mathcal{I}_d\) the number of copies of \(T_i\) that are immediate subtrees of \(T.\)

The result now follows from the following two lemmas, the first of which we again leave as an exercise.

Lemma. Any legal move changes a tuple \(a \in \mathcal{I}_{d+1}\) to another tuple \(b\) such that \(b < a\) for our total ordering.

Lemma. The set \(\mathcal{I}_{d+1}\) is well-ordered: it contains no infinite descending chain.

Proof. Suppose \(a^{(1)} \geq a^{(2)} \geq a^{(3)} \geq \cdots\) is an infinite descending chain in \(\mathcal{I}_{d+1}.\)  We want to show that eventually the chain stabilizes: there exists \(N\) such that \(a^{(n)} = a^{(N)}\) for \(n \geq N.\)

\((\)A remark about notation: \(a^{(N)} \in \mathcal{I}_{d+1}\) is a tuple in our infinite descending chain of tuples.  For \(i \in \mathcal{I}_d,\) the \(i\)-th entry of this tuple is \(a^{(N)}_i.)\)

Let \(j \in \mathcal{I}_d\) be an index; we say that the sequence \((a^{(n)})\) stabilizes to the right of index \(j\) if there exists a positive integer \(N\) such that
$$a^{(n)}_i = a^{(N)}_i$$
whenever \(i \geq j\) and \(n > N.\)

Since \(\mathcal{I}_d\) is well-ordered, we can choose \(j\) minimal such that our sequence stabilizes to the right of \(j;\) also choose \(N\) as above, so that \((a^{(n)})_i = (a^{(N)})_i\) whenever \(i \geq j\) and \(n > N.\)

The tuple \(a^{(N)}\) has only finitely many nonzero entries.  If \(a^{(N)}_k = 0\) for all \(k < j\), then the sequence stabilizes at \(a^{(N)}\) and we are done.  Otherwise, choose \(k\) maximal such that \(k < j\) and \(a^{(N)}_k > 0.\)   (There are only finitely many \(k\) with \(a^{(N)}_k > 0,\) so again there is no harm in choosing the maximum.)

I claim that in fact the sequence stabilizes to the right of index \(k\), contradicting our choice of \(j.\)  Consider an index \(i \geq k.\)  If \(i \geq j\) then we know \((a^{(n)})_i = (a^{(N)})_i\) for \(n > N.\)  If \(k < i < j\) then \(a^{(N)}_i = 0;\) and by the lexicographic ordering on \(\mathcal{I}_{d+1}\) we know that \(a^{(n)}_i = 0\) for \(n > N\) as well.  So we only need to worry about \(i = k.\)  Again by lexicographic ordering, the sequence
$$a^{(N)}_k, a^{(N+1)}_k, a^{(N+2)}_k, \ldots$$
is a nonincreasing sequence of natural numbers.  Any such sequence in \(\mathbb{N}\) eventually stabilizes.  So our sequence \((a^{(n)})\) stabilizes to the right of \(k\) as well, contradicting our choice of \(j.\) \(\blacksquare\)

In closing I'll just mention that the sets \(\mathcal{I}_d\) are examples of ordinals; interpreting them as such gives rise to the proofs linked to at the very beginning of this post.

The parity of zero, the primality of two, and other mysteries

As we know,
There are known knowns.
There are things we know we know.
We also know
There are known unknowns.
That is to say
We know there are some things
We do not know.
But there are also unknown unknowns,
The ones we don’t know
We don’t know.
— Donald Rumsfeld, 2/12/2002

From time to time, I try to speak or write about mathematics for general (non-mathematical) audiences.  If you've done this, you know it's pretty hard -- in large part because it's hard to know what people know, despite my best attempts to find out.

Enter Google Surveys.  For a pretty reasonable fee, it turns out anyone can run a survey through Google; the respondents are randomly selected and reweighted by demographics (age, gender, location).  So I decided to find out:  What percentage of Americans over the age of 18 know what a prime number is?  What about an even number?  I also tried to design the questions so they tested a bit more than basic knowledge; for example, I wanted to know whether the respondents knew that zero is even (a surprisingly controversial topic).

Methodology

Here are the two surveys I ran, as they would appear to respondents.  Each survey received about 250 responses from randomly selected Americans over the age of 18.  (And cost me a well-spent $25.)

Even numbers:

You'd think this one would be pretty easy...

I included 0 because I suspected it would be the most "difficult" number to identify as even; I included 774 to check that people know how to deal with large-ish numbers.  17 and 99 were supposed to be easy, whereas 257 was aimed at checking if people were simply looking for an even digit.

Prime numbers:

Maybe a little tougher.

Here 57 is included in honor of Grothendieck.

The order of the answers was reversed for a random half of the respondents.  As I understand it, Google shows these questions on sites with some premium content -- users can take the survey in lieu of paying.

Data

You can download the raw data for the survey on even numbers here, and the survey on prime numbers here.  The data includes the type of website on which the survey was taken (news, arts and entertainment, reference, etc.), the gender of the respondent, their approximate age, region within the US, whether they are browsing from a rural, suburban, or urban area, their approximate income, and the amount of time it took them to respond to the question.  Google infers much of this data from the browsing habits of the user, though, so I don't know how reliable it is.

Analysis

So, what percentage of American know that 2 is a prime number?  That zero is even?

Well 8 is pretty even, I guess.

The percentages indicate how many survey-takers thought the number in question was even.  So about 75.7% of people think 8 is even (not bad!) but 774 is much harder.  I don't know what was going on with the 0.8% of people who thought that 17 was even, but maybe this is an example of the Lizardman constant.

Yeesh.

(Note that the histogram above says that there were 199 respondents.  In fact, there were 250, but because of the reweighing, the survey only had the power of the survey with 199 truly randomly-chosen respondents.)  The good news is that more than 40% of survey-takers knew that 13 is prime; on the other hand, 17% thought that 9 is prime.  That said, I founded it heartening that the top three answers were indeed the three primes.  That's the wisdom of crowds for you.

How did the respondents do overall?  Below are graphs indicating what percentage of respondents got 0,1, ..., 6 answers correct on each survey.

Even numbers:

Not bad!

In particular, more than half of the survey-takers were able to get 5 or 6 answers correct.  Not too shabby!  To get a perfect score, one had to identify zero as even, which only 24% of the respondents were able to do, so I think this is a pretty good result.  Interestingly, about 2/3 of the people who correctly identified zero as even got perfect scores.  The median number of correct answers was 5 out of 6; the mean was about 4.5.

Prime numbers:

Pretty tough.

Identifying primes was evidently much harder.  The median number of correct answers was 3 out of 6 (no better than chance), and the mean was about 3.6.

I did do some more detailed analysis (e.g. breaking the results down by demographics, looking at the response time, etc.) but didn't find anything particularly interesting.  But for your edification, here is a plot of median response time (in milliseconds) against the number of correct answers to the survey on even numbers.

Hmm.

There seems to be a weak relationship between time spent and the number of correct answers (though the people who answered almost everything wrong did so pretty slowly), but maybe this isn't surprising.

Conclusions

I actually found these results pretty heartening!  My biggest worry is that I'm now addicted to polls, at $25 a pop.

I was a bit surprised how few respondents knew that 0 is even.  Parity is a concept which actually comes up in daily life -- for example, when one wants to know which side of the street a given address is on, or in certain regulatory questions.  I was also a bit surprised that it was so difficult to identify 2 as a prime.

Of course there are some problems with these polls.  The biggest, in my opinion, is that they don't let people indicate how sure they are -- one worry I have is that if people weren't sure if, say, 2 was prime, they'd just leave it blank.  So, for the sake of symmetry, I should really run another survey, asking people to identify non-prime numbers.  I suspect far fewer than 70% of respondents would say 2 is composite.  If I decide to run another survey, I'll post about it here, of course.

Please let me know if you download and do anything with the data!  (Data on even numbers, data on primes.)

 

Uniformization over finite fields

I just asked this question on MathOverflow.  It's basically idle curiosity, but I've now been idly curious about this for several years, so I figured I might as well ask it publicly.  Please let me know if you have any thoughts!

In short, the question is:  Do every two smooth projective curves over \(\overline{\mathbb{F}_q}\) of genus at least \(2\) have a finite etale cover in common?

See the question for more details and remarks.

Constructive Criticism

Here is a classical example of a non-constructive proof.

Thm 1. There exist irrational numbers \(x,y\) such that \(x^y\) is rational.

Proof.  If \(\sqrt{2}^{\sqrt{2}}\) is rational, then we may take \(x=y=\sqrt{2}\).  Otherwise, we may take \(x=\sqrt{2}^{\sqrt{2}}, y=\sqrt{2}\); then \(x^y=2\), which is certainly rational. \(\blacksquare\)

This proof is non-constructive in that it doesn't actually give a (proven) example of a pair \(x,y\) with the desired property; it gives two possibilities (namely \((\sqrt{2}, \sqrt{2})\) or \((\sqrt{2}^{\sqrt{2}}, \sqrt{2})\)).  Of course one can give more constructive proofs; for example, one can take $$x=\sqrt{2}, y=2\log_2(3),$$ and it's relatively easy to check that \(x,y\) are irrational.

What if we ask that \(x,y\) are transcendental, instead of irrational?

Thm 2. There exist transcendental numbers \(x,y\) such that \(x^y\) is rational.

Proof 1. Take \(x=e, y=\ln 2. \blacksquare\)

Of course it is well-known that \(e, \ln 2\) are transcendental, but the proofs (especially in the latter case), are pretty non-trivial.  I recently ran across the following non-constructive proof of Thm 2, which is much easier.  Before I give the proof, I need a definition:

Definition. A real number is incomputable if there is no computer program (Turing machine) which prints out its decimal expansion.

There are of course lots of incomputable numbers, since there are only countably many computer programs.  Of course, any incomputable number is transcendental, since computer programs can approximate roots of polynomials with rational coefficients arbitrarily well.

Proof 2.  Let \(x\) be any (positive) incomputable number.  Let \(y=\log_x(2)\).  Then \(y\) is incomputable, as otherwise \(x\) would be computable; hence \(x,y\) are transcendental, and by definition \(x^y=2\). \(\blacksquare\).

Proof 2 is even worse than our original proof of Theorem 1 -- not only is it non-constructive, the examples it produces cannot in principle be constructed!

The purpose of this post is to observe that with some mild modification, we can make a constructive version of Proof 2.  Instead of using the fact that algebraic numbers are computable, we can use the fact that they are efficiently computable.  In particular, if a number is algebraic, then the \(n\)-th digit of its decimal expansion may be computed in polynomial time in the number of digits of \(n\).

Proof 3.  By the (proof of) the Time Hierarchy Theorem, there exists an(explicit) number \(x\) so that the \(n\)-th digit of \(x\) is not computable in polynomial time, but so that \(x\) is computable.  (In particular, \(x\) is computable but transcendental.)  Set \(y=\log_x(2)\).  Then \(x^y=2\), but \(y\) is not computable in polynomial time, as otherwise \(x\) would be as well, (as one may solve the equation \(x^y=2\) efficiently, since logarithms and exponents may be computed efficiently).\(\blacksquare\)

This is the cheapest constructive proof of Theorem 2 that I know, though of course the \(x, y\) which are produced can only be written down slowly.

Sawin on Severi's Conjecture

One of my favorite questions is: for which \(g, n, p\) is the moduli space of \(n\)-pointed genus \(g\) curves \(\mathscr{M}_{g,n, \mathbb{F}_p}\) unirational/uniruled?  Will Sawin has just posted a beautiful paper on the ArXiv answering this question in most cases, for \(g=1\).  Indeed, he shows that for \(n\geq p\geq 11, \mathscr{M}_{1, n, \mathbb{F}_p}\) is not uniruled... (more below the fold)

Read More

C.S. Lewis on Commutative Algebra

I believe that in all men’s lives at certain periods, and in many men’s lives at all periods between infancy and extreme old age, one of the most dominant elements is the desire to be inside the local Ring and the terror of being left outside.
— C.S. Lewis, "The Inner Ring"

Read the whole essay here.  Sorry for the pause in blogging -- I should start up again soon.

A "minimal" proof of the fundamental theorem of algebra

When I was in graduate school, I came up with what I think is a nice proof of the fundamental theorem of algebra.  At the time, I wrote it up here somewhat formally; I thought it might make a nice blog post, since the formal write-up obscures the very simple underlying ideas.  The goal was to use the minimal amount of technology possible -- in the end I use just a little algebra and some elementary point-set topology, as well as the implicit function theorem...

Read More

Are Shimura Varieties \(K(\pi, 1)\)'s?

Let \(\mathscr{A}_g\) be the moduli space of principally polarized Abelian varieties of dimension \(g\).  The complex-analytic space (stack) associated to \(\mathscr{A}_g\) is a \(K(\pi,1)\); that is, its only non-vanishing homotopy group is \(\pi_1\), which is \(Sp_{2g}(\mathbb{Z})\).  In particular, the cohomology of \(\mathscr{A}_g\) is the same as the cohomlogy of \(Sp_{2g}(\mathbb{Z})\).

Jesse Silliman, a graduate student at Stanford, has told me an argument that shows that this is in some sense maximally untrue when one considers the "etale homotopy type" of \(\mathscr{A}_g\). 

Read More

Weapons of Math Destruction

I've just finished reading Cathy O'Neil's book Weapons of Math Destruction, which I highly recommend.  (One notable feature of the book is that the skull and cross-bones on the cover is the second known example of mathematical piracy.)  

Wea-puns of mass destruction?

The book is, as you might guess from the title, quite negative about the use of big data and mathematical models in government and the corporate world.  This is a point of view that I felt some knee-jerk disagreement with; that said, Cathy is quite clear that her intent is only to discuss the negative features of big data and the use of mathematics in social and business planning:

Big data has plenty of evangelists, but I’m not one of them. This book will focus sharply in the other direction, on the damage inflicted by WMDs [weapons of math destruction] and the injustices they perpetuate.

And I think one should read the book with that comment in mind; of course the models the book complains about have some redeeming features and effects.  But that complaint (which is prevalent in the Amazon reviews) misses the point -- to decide whether, on balance, they are a good thing, one has to have a careful accounting of their evils.  This book is that accounting -- it makes no pretense at even-handedness and does not try to weigh the good against the bad, except in the most minimal way.  I don't view that as a strike against it.  

I do have some mild quibbles with the book -- I think that in some cases, the book is uncharitable to the users of the algorithms it objects to.  For example, on page 110, Cathy discusses the use of personality tests in job applications.  Certain answers on these tests reveal that the test-taker has "high narcissism."  "Who wants a workforce peopled with narcissists?," the text asks.  This section is at best misleading -- as the author probably knows, the narcissism these tests discuss is not necessarily pathological.  Rather, narcissism in this setting is a technical term, which may in fact be healthy.  And throughout, the book offers the potential for abuse of algorithms as a strike against them (or offers anecdotal cases of abuse). For example, in the discussion of car insurance companies' use of opt-in technology that tracks one's driving habits, Cathy suggests that soon this technology will be opt-out, at a significant cost.  I'm not necessarily skeptical that this will happen, but I'd argue that we should wait for the abuse to occur before objecting to the technology.

In any case, I think this is an important (and excellent) book, and a necessary counterweight to the techno-utopianism to which I, and many in government (in particular in the current administration), business, and academia are often prone.  I doubt the book will cure me completely of my faith in technocracy.  But I think its real goal is likely to temper that faith with some skepticism. At bottom, the book advocates for rigor in modeling, and for internalizing negative externalities -- who can argue with that?

Morita Theory, Tannaka Duality, and Approximate Tannaka Duality

Let \(R\) be a ring -- it is well-known that the category \(R\text{-mod}\) of (left) \(R\)-modules does not determine \(R\).  For example, the functor $$-\otimes R^n: R\text{-mod}\to \text{Mat}_{n\times n}(R)\text{-mod}$$ is an equivalence of categories.

That said, there is one additional piece of data that lets us recover the ring \(R\) from the category \(R\text{-mod}\). 

Read More

Integral House

I'm teaching Calculus III at Columbia this semester, and am kind of amazed at the exploitative prices charged for Stewart's Calculus, 8th Edition, "Early Transcendentals".  (Not that anyone knows what "Early Transcendentals" are).  On the other hand, I now know how Stewart was able to build the famous "integral house":

Who says that math doesn't pay?

The man loved calculus...

Stewart's concert hall, in which he could, for a brief moment, forget the cries of the millions of calculus students whose textbook purchases paid for his mansion habit

More pictures at Sotheby's and HuffPost.

TAAAG

Still at UGA, at the very enjoyable but weirdly named conference TAAAG.  The conference featured a very interesting talk by Padmavathi Srinivasan, as well as mini-courses by Michel Raibaut, Ben Williams, and Arnav Tripathy (whose website I can't find).

Arnav's great talk on the integral Hodge and Tate conjecture, with lots of gestures

There were also many short talks, by many of the other participants.  

Padma gave a great talk about her work comparing

  1. The discriminant of a hyperelliptic curve, with
  2. The conductor of the curve.

As I understand it, the conductor only depends on the family of curves (it is more or less the difference in Euler characteristic between the general fiber and the special fiber of a minimal regular model), whereas the discriminant only depends on the covering of the base given by the family of Weierstrass points.  In the case of a minimal regular model, these numbers are just the Euler characteristics of the vanishing cycles sheaf on the family of curves (resp. the family of Weierstrass points).  It would be nice to enhance her inequality to a map of sheaves...

 

Krashen the party

I'm at UGA for the week, in between SWAG and TAAAG.  Today Danny Krashen gave a great talk on this paper of Auel, First, and Williams.  The paper is one of the latest in a long tradition of papers which construct counterexamples by making a topological computation and then approximating the relevant topological spaces by algebraic varieties.  (To my knowledge, this technique began with Totaro, but it has been exploited to great effect by Antieau, Williams, and most recently, fellow Ravi student Arnav Tripathy.) ...

Read More