GBB Logo         
GBB Services
GBB Services : Mathematics : Prime numbers

There are infinitely many primes

by Walter Vannini

Introduction
Starting Point
Euclid's counter example
Kummer's counter example
Other counter examples
Euler's analytic proof
Thue's counting proof
Chaitin's algorithmic complexity proof
Erdos' counting proof
Concluding words
Feedback

Introduction

The first few prime numbers are

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, …

They're defined as the positive integers that have exactly two positive integer factors (which are themselves and one, of course).

One of the oldest significant mathematical results is that there's an infinite number of primes. It was first proved by Euclid in about 300 BC, in his Elements (book IX, proposition 20).

Over the years, many different proofs of the result have been found. Six of them can be found in Chapter 1 of the third edition of "Proofs from THE BOOK". Three can be found in Chapter 2 of Hardy and Wright's "An Introduction to the Theory of Numbers" (in sections 2.1, 2.4, and 2.6). Three can be found in Chapter 2 of Chaitin's "Meta Math!". Several can be found in the wikipedia primes entry.

For quite a while, I knew of only one proof of the "infinitude of primes", namely Euclid's proof. Over the years, I've come across others, and I've noticed that several of them are really relying on deriving a contradiction from a particular consequence of assuming that there are finitely many primes. That consequence, and various ways of disproving it, are the subject of this essay. I might deal with some of the other proofs in the future.

Starting point

Assume there are only finitely many primes. Call them

2, 3, 5, … pk

Since every positive integer, other than one, is either itself prime, or a product of primes, we can conclude that every positive integer N can be written in the form

N = 2n1 3n2 5n3 … pknk

where each of n1, n2, n3, … nk is a non-negative integer.

All of the proofs discussed here will show, in one way or another, that this is impossible.

Euclid's counter example

Euclid's proof that not every integer is of the form

2n1 3n2 5n3 … pknk

is very direct. He simply pointed out that the number

N = 2 • 3 • 5 • … • pk + 1

can't be of the required form.

Kummer's counter example

In 1878, Kummer pointed out that the number

N = 2 • 3 • 5 • … • pk − 1

can't be of the required form either.

Other counter examples

And, of course, there's an infinite number of related counter examples. All numbers of the form

N = n • 2 • 3 • 5 • … • pk ± 1

where n is any positive integer, provide a counter example also.

Unlike Euclid's (and Kummer's) proof, all the remaining proofs won't produce an explicit counter example.

Euler's analytic proof

What I really like about the following proof is that it uses techniques from another area of mathematics, real analysis, to give a short and sweet proof involving infinite sums and convergence. It is true that the proof could be reformulated in terms of inequalities involving finite sums of rational numbers (and from there could be restated purely in terms of finite operations on integers) but it would lose its conceptual simplicity in the process.

Euler's great insight, in my opinion, was to realize that if every number is of the given form, then it would follow that a finite product of convergent geometric series would be equal to the harmonic series. This is impossible because the harmonic series diverges.

Here's some details. By considering all terms resulting from expanding the following product, we have:

(1 + 1/2 + 1/22 + 1/23 + … )
(1 + 1/3 + 1/32 + 1/33 + … )
(1 + 1/5 + 1/52 + 1/53 + … )

(1 + 1/pk + 1/pk2 + 1/pk3 + … )

≥ (1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + 1/8 + … )

We only need inequality, which is just as well, since proving equality would require proving that prime factorization is unique, which is something we don't need for any of the proofs.

Now, since every prime is ≥ 2, we have that every geometric series in the above product is bounded above by

(1 + 1/2 + 1/22 + 1/23 + … ) = 2

This immediately implies the inequality

2k ≥ (1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + 1/8 + … )

Since (1 + 1/2 + 1/3 + 1/4 + 1/5 + 1/6 + 1/7 + 1/8 + … ) = ∞ , we're done.

Another way to think of the result is that Euler showed that the sum of the reciprocals of all numbers of the form

2n1 3n2 5n3 … pknk

is finite, while the sum of the reciprocals of all positive integers is infinite. This can be interpreted to mean that for any collection of primes, "most" positive integers are not products of only those primes.

Thue's counting proof

Like Euler's proof, this proof shows that there must be positive integers that are not of the required form. But, we don't leave the realm of the integers.

This proof starts by looking at the set of integers ranging from 1 up to 2n. The cardinality of this set is 2n.

On the other hand, by the finiteness assumption, every integer from 1 to 2n can be written in the form

2n1 3n2 5n3 … pknk

where each of n1, n2, n3, … nk ranges from 0 to n.

The cardinality of this set is (n+1)k.

This means that we must have the inequality

2n ≤ (n+1)k for all n.

But this is known to be false for sufficiently large n.

This proof shows that the infinitude of primes follows from the fact that exponential growth out paces polynomial growth.

Chaitin's algorithmic complexity proof

The informal version of Gregory Chaitin's proof can be found in his book "Meta Math!". In short, it takes log N bits of information to represent a "random" integer N, but it only takes k log log N bits to represent N if it's of the form

2n1 3n2 5n3 … pknk

Since log N grows faster than log log N we're done.

When I read this, I thought it would be amusing to restate it as a simple counting argument, and quickly did so: just write N as 2n, so that log N is log(2n), and k log log N is log(nk), and follow your nose. You'll get Thue's proof.

After that, I thought I'd look at Chaitin's original published proof. It appeared as "Appendix 2: An Information-Theoretic Proof That There Are Infinitely Many Primes" in his 1979 paper Toward a mathematical definition of "life". In that paper, Chaitin defines algorithmic entropy H of a positive integer N, and uses various properties, such as "subadditivity of algorithmic entropy" to show that

H(N) ≤ Sum( H(nj) ) + O(1), from which it follows that

log N + O(log log N) ≤ k[log log N + O(log log log N)];

which gives the the contradiction log N = O(log log N).

The final paragraph of the paper reads as follows "This proof is merely a formalization of the observation that if there were only finitely many primes, the prime factorization of a number would usually be a much more compact representation for it than its base-two numeral, which is absurd. This proof appears, formulated as a counting argument, in Section 2.6 of the 1938 edition of Hardy and Wright [20]; we believe that it is also quite natural to present it in an information-theoretic setting." (italics mine).

I looked in Hardy and Wright, expecting to see Thue's proof, and was pleasantly surprised to find that Chaitin referenced a different counting argument, due to Erdos.

Erdos' counting proof

Look at the set of integers ranging from 1 up to 22n. The cardinality of this set is 22n.

By the finiteness assumption, every integer from 1 to 22n can be written in the form

2n1 3n2 5n3 … pknk

Erdos used the fact that ni=2mi+ei where ei is 0 or 1 (so that pini=(pimi)2piei) to conclude that every integer from 1 to 22n can be written as

M2 2e1 3e2 5e3 … pkek

where M ranges from 1 to 2n, and each of e1, e2, e3, … ek ranges from 0 to 1.

The cardinality of this set is 2n 2k.

This means that we must have the inequality 22n ≤ 2n2k, ie

2n ≤ 2k for all n.

But this is clearly false: just take n=k+1.

Incidentally, there's nothing special about the 2 in either the base or exponent of 22n: any integers greater that 1 would also work. Here's how the argument would go with a base of 42 and with a factor of 1729 in the exponent.

Look at the set of integers ranging from 1 up to 421729n. The cardinality of this set is 421729n.

By the finiteness assumption, every integer from 1 to 421729n can be written in the form

2n1 3n2 5n3 … pknk

Use the fact that ni=1729mi+ei where ei is 0 or 1 or ... or 1728 (so that pini=(pimi)1729piei) to conclude that every integer from 1 to 421729n can be written as

M1729 2e1 3e2 5e3 … pkek

where M ranges from 1 to 42n, and each of e1, e2, e3, … ek ranges from 0 to 1728.

The cardinality of this set is 42n 1729k.

This means that we must have the inequality 421729n ≤ 42n1729k, ie

421728n ≤ 1729k for all n.

But this is clearly false: just take n large enough.

On the other hand, the 2 in the base of 2n in Thue's proof has to be 2. We're relying on the fact that every prime is greater than or equal to 2.

Concluding words

We've looked at only one approach to proving that there are infinitely many primes. There are several others, including yet another proof due to Euler, in which he showed that the sum of the reciprocals of the primes is infinite.

It turns out that the idea behind Erdos' counting proof can be used to give an alternative proof of Euler's sum of the reciprocals of primes result, and this is just what Erdos did. If you're curious, you can read about it in either Proofs from THE BOOK or in Hardy and Wright's classic number theory book.

Feedback

John Stillwell sent me an email asking "Have you looked at Ribenboim's Little Book of Bigger Primes ?". Err, no, I hadn't. I checked it out, and found out that Thue's name is associated with the counting version of Chaitin's proof, and that Kummer's name was associated with a variation of Euclid's proof. No mention of Erdos' counting argument though. I've modified this essay accordingly.


If you have corrections, additions, modifications, etc please let me know mailto:walterv@gbbservices.com

July 22 2007 Posted
September 3 2007 Last Updated

Back to top of page