 


At one time or another, we’ve all probably heard the claim that a million monkeys typing for a million years would write all of the works of Shakespeare or write all of the books in the British Museum – or something similar. Is this true?

Before we start, let's think about the context in which the statement is made. It's usually an implicit, sometimes explicit, suggestion that if you randomly pound away on a keyboard long enough, you'll write something – purely by accident – that makes sense. How could this happen?

Instead of saying all the works of Shakespeare I could just as easily say all the dialog from every Hitchcock movie. Or I could say all the elements of the IRS tax code. I could also say the written descriptions for the Saturn 5 Rocket. Another possibility is all of the State of the Union addresses given by every past U.S. President. And why couldn't we include, in the last category, the State of the Union addresses given by a number of future U.S. Presidents? After all, there is no way the monkeys could know (or care) whether the stuff they write is from the past or the future.

And why would what they write be restricted to the actual works of Shakespeare, or the actual blueprints for the Saturn 5? If, in a million years, they could write those things, they could (and would) write Shakespeare's plays with different endings, or Saturn 5 descriptions with one or more errors. And they would also write Shakespeare's plays with Hitchcock movie dialog interspersed, or State of the Union addresses combined with elements of the U.S. tax code. They would write Shakespeare's plays with Acts three and one swapped, and Shakespeare's plays with all the dialog backwards – instead of "to be or not to be that is the question", the dialog would be "question the is that be to not or be to". And they would also write a version with "noitseuq eht si taht eb ot ton ro eb ot". There would be a version with "to be or not to be noitseuq eht si taht".
They would write a version where every sentence is written twice, and another where every sentence is written three times, and another where every word is written twice – "to to be be or or not not to to be be that that is is the the question question". There would be a version with every word followed by its twin written backward: "to ot be eb or ro not ton to ot be eb that taht is si the eht question noitseuq". And there would be a version where every word is delineated by the letters xx, as in "xxtoxx xxbexx xxorxx xxnotxx xxtoxx xxbexx xxthatxx xxisxx xxthexx xxquestionxx". The possibilities are (literally!) endless.

After all this, a few things should be clear. Suppose we pick out a 100-word paragraph from Shakespeare's plays. Or we could pick 100 words from a Tom Clancy novel. Or 100 words from the Gettysburg Address. It doesn't matter which 100 words we pick. If we are to be certain that, in a million years, the million monkeys would type 100 words from one of Shakespeare's plays, they would have to type every possible set of 100 words. So this gives us a way of calculating some answers.

Let's do a simpler problem first. Start by assuming the monkeys use typewriters with only 28 keys – 26 letters of the alphabet, a space, and a period. Ignore upper and lower cases; ignore numbers; ignore all other punctuation (parentheses, dashes, commas, semicolons). And rather than try to determine if they can type all of Shakespeare, let’s just determine whether they will type a recognizable sentence: "To be or not to be, that is the question." Once the comma is dropped, this sentence has forty characters, including spaces and the period. Now we don't have to use this sentence – any forty-character sentence will do. In fact, any forty characters will do. The only way we can be certain the monkeys will type a particular forty characters is if they type every possible set of forty characters. So let's see how long it would take for them to do this.
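The character count for the target sentence is easy to check mechanically. The following is a minimal sketch, assuming the 28-key typewriter described above; the names `ALPHABET`, `normalize`, and `target` are illustrative and not part of the original article.

```python
import string

# The 28 keys assumed in the text: 26 letters, a space, and a period.
ALPHABET = string.ascii_lowercase + " ."


def normalize(sentence: str) -> str:
    """Lower-case the sentence and drop every character not on the 28 keys."""
    return "".join(ch for ch in sentence.lower() if ch in ALPHABET)


target = normalize("To be or not to be, that is the question.")

print(len(ALPHABET))   # number of keys on the typewriter
print(repr(target))    # the sentence as the monkeys would have to type it
print(len(target))     # its length in characters
```

Dropping the comma and the capital letter is what brings the sentence down to forty characters.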

 
Before we go to the next step, let me explain scientific notation. I can write 

10 = 10^{1}
100 = 10^{2}
1000 = 10^{3}
10,000 = 10^{4}

where the expressions are pronounced "ten to the first power", "ten to the second power", "ten to the third power", and "ten to the fourth power". Since 100 = 10 x 10, 1000 = 10 x 10 x 10, and 10,000 = 10 x 10 x 10 x 10, the superscript (called an exponent) is just the number of factors of ten required to obtain the number. It’s also the number of zeroes in the number. The beauty of this notation is that, when we multiply numbers, we just add exponents:
100 x 1000 = 10^{2} x 10^{3} = 10^{2 + 3} = 10^{5} = 100,000
1000 x 10,000 = 10^{3} x 10^{4} = 10^{3 + 4} = 10^{7} = 10,000,000
In general:
10^{N} x 10^{M} = 10^{N + M}
And it's easy to show
P^{N} x P^{M} = P^{N + M}
for any number P, not just 10. The reason we need to use this notation is that we're going to be looking at some really big numbers. An example where the numbers are not exact powers of ten (it is the number of seconds in a year):
60 x 60 x 24 x 365 = (6 x 10^{1}) x (6 x 10^{1}) x (2.4 x 10^{1}) x (3.65 x 10^{2}) = (6 x 6 x 2.4 x 3.65) x 10^{1 + 1 + 1 + 2} = (6 x 2.4) x (6 x 3.65) x 10^{5}

Now 6 x 2.4 is approximately 6 x 2.5, which is 15; 6 x 3.65 is roughly 20. So the answer is approximately  
15 x 20 x 10^{5} = 300 x 10^{5} = 3 x 10^{7}  
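As a quick check on the rounding, the exact product can be compared with the estimate of 3 x 10^{7}. This is only a sketch; the variable names are illustrative.

```python
# Exact number of seconds in a 365-day year.
seconds_per_year = 60 * 60 * 24 * 365

# The rounded estimate used in the text.
estimate = 3 * 10**7

# Relative error of the estimate compared with the exact value.
relative_error = abs(seconds_per_year - estimate) / seconds_per_year

print(seconds_per_year)          # exact integer value
print(f"{seconds_per_year:.2e}") # the same value in scientific notation
print(f"{relative_error:.1%}")   # how far off the estimate is
```

The estimate is within about five percent of the exact value, which is more than good enough for the order-of-magnitude argument that follows.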
There's one more thing we need to look at. Suppose we do the following:  
[10^{2} ] ^{3} = 10^{2} x 10^{2} x 10^{2} = 10^{2 + 2 + 2} = 10^{6} = 10^{ 2x3}  
A more general way to look at this is:  
[10^{N} ] ^{M} = 10^{N} x 10^{N} x 10^{N} . . .  for M factors of 10^{N}  
= 10^{N + N + N . . . }  for M of the Ns in the exponent  = 10^{N x M }  
So we multiply exponents when raising to a power:  [P^{N} ] ^{M} = P^{ N x M}  
I have used P instead of 10 because, although I haven't proven it, this rule holds for all numbers, not just 10.  
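The two exponent rules can be spot-checked with exact integer arithmetic. The sketch below is not a proof, only a test of a few cases; the function name `check_exponent_rules` is hypothetical.

```python
def check_exponent_rules(p: int, n: int, m: int) -> bool:
    """Return True if both exponent rules hold for these particular integers."""
    product_rule = p**n * p**m == p**(n + m)   # P^N x P^M = P^(N + M)
    power_rule = (p**n)**m == p**(n * m)       # [P^N]^M = P^(N x M)
    return product_rule and power_rule


# The worked examples from the text, plus the base used later (28).
print(check_exponent_rules(10, 2, 3))
print(check_exponent_rules(10, 3, 4))
print(check_exponent_rules(28, 4, 10))
```

Python integers have arbitrary precision, so the comparisons are exact even for numbers as large as 28^{40}.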
 
The sentence we're considering is "To be or not to be, that is the question." It has forty characters, including spaces and the period (we're going to ignore the comma and upper cases). We're actually going to calculate how many characters the monkeys must type to produce all possible forty-character sentences, because we know that if they do that, they will certainly type this one.

Start with one character. There are obviously 28 possible phrases which use only one character: a, b, c, d, e, f, g, . . . , x, y, z, ., and _. Here I have denoted a space by an underscore, so the last two "phrases" are a period and a space.

There are 28 x 28 = 784 possible two-character phrases: aa, ab, ac, ad, ae, . . . , ax, ay, az, a., and a_ are the first 28 (the last two are "a" followed by a period and "a" followed by a space). The next 28 are ba, bb, bc, bd, be, . . . bx, by, bz, b., b_ (the last two of these are "b" followed by a period and "b" followed by a space); the next 28 are ca, cb, cc, cd, . . . cx, cy, cz, c., c_, and so on. The last 84 of the 784 two-character phrases are za, zb, zc, zd, . . . zz, z., z_; followed by .a, .b, .c, .d, .e, and so on up to .x, .y, .z, .., ._; finally followed by _a, _b, _c, _d, _e, and so on up to _x, _y, _z, _., and __. The last two are, respectively, a space followed by a period and two consecutive spaces.

The number of three-character phrases will be 28 x 28 x 28 = 21,952. They will start with aaa, aab, aac; they will finish with __x, __y, __z, __., ___.

The minimum number of characters the monkeys must type to get all possible one-character phrases is 28. The minimum number of characters the monkeys must type to get all possible two-character phrases is 2 x 784 = 2 x 28^{2} (the number of possible phrases multiplied by the number of characters in each).
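The enumeration described above can be reproduced directly for short phrases. This is a minimal sketch using the same ordering as the text (letters first, then the period, then the underscore standing in for a space); `ALPHABET` and `all_phrases` are illustrative names.

```python
from itertools import product
import string

# Same key order as in the text: a..z, then period, then "_" for a space.
ALPHABET = string.ascii_lowercase + "._"


def all_phrases(length: int) -> list[str]:
    """List every phrase of the given length, in the order used in the text."""
    return ["".join(chars) for chars in product(ALPHABET, repeat=length)]


for n in (1, 2, 3):
    phrases = all_phrases(n)
    # Phrase count is 28^n; the character count is n times that.
    print(n, len(phrases), phrases[0], phrases[-1], n * len(phrases))
```

Listing the phrases explicitly is only practical for very small lengths; beyond that, only the counts 28^{N} and N x 28^{N} are useful.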
I have said "minimum number of characters" because there is the possibility some of the phrases the monkeys type will be duplicates – a monkey might type xrwe jxco itr and later type the same thing; or another monkey might type it as well. I'm trying to make the problem easier by ignoring this possibility.

The minimum number of characters the monkeys must type to get all possible three-character phrases is 3 x 28^{3}. Note that every one-character phrase appears inside some two-character phrase, every two-character phrase appears inside some three-character phrase, and so on. So the minimum number of characters the monkeys must type to get all possible N-character phrases is N x 28^{N}. For the case we're examining, 40 characters, the minimum number of characters the monkeys must type is 40 x 28^{40}.

We can use the results of Scientific Notation above to rewrite: 28^{40} = [28^{4} ] ^{10} and 28^{4} = (2 x 14)^{4} = 16 x (14 x 14) x (14 x 14), which is approximately 16 x 200 x 200 = 16 x 2 x 2 x 10^{4} = 6.4 x 10^{5}. This means 28^{40} is approximately [6.4 x 10^{5}] ^{10} = 6.4^{10} x 10^{50}. Now 6.4^{10} = (6.4 x 6.4)^{5}, which is approximately 40^{5} = 4^{5} x 10^{5} = 1024 x 10^{5}; this is roughly 10^{8}. So 28^{40} is roughly 10^{8} x 10^{50} = 10^{58}, and the minimum number of characters the monkeys must type to get all possible forty-character phrases is approximately

40 x 28^{40} = 40 x 10^{58} = 4 x 10^{59}

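The hand estimate can be compared against the exact value, since Python integers are not limited in size. A sketch under the same assumptions as the text (28 keys, forty characters); the variable names are illustrative.

```python
KEYS = 28      # keys on the simplified typewriter
LENGTH = 40    # characters in the target sentence

# Exact value of 40 x 28^40 as an arbitrary-precision integer.
exact = LENGTH * KEYS**LENGTH

# The rounded hand estimate of 4 x 10^59.
hand_estimate = 4 * 10**59

print(exact)                    # every digit of the exact value
print(f"{exact:.2e}")           # the exact value in scientific notation
print(hand_estimate / exact)    # ratio of the hand estimate to the exact value
```

The hand estimate lands within a factor of about 1.3 of the exact figure, which is close to 3 x 10^{59}.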
 
So how many monkeys do we need? First, let’s not use a million monkeys. Let’s use a hundred billion – over ten times the number of people on Earth. Let’s also assume that they’ve been typing away for ten billion years – even though the Earth has been around for only about 4.5 billion years. Let’s also assume that every star in the Milky Way galaxy (there are about 200 billion stars) has a planet with this many monkeys, who have been typing for ten billion years. How many monkeys is this? Using Scientific Notation, it's easy to calculate the number of monkeys in the Milky Way galaxy (a hundred billion is 100 x 10^{9}, and 200 billion is 200 x 10^{9}): 

100 x 10^{9} x 200 x 10^{9} = 10^{2} x 2 x 10^{2} x 10^{18} = 2 x 10^{22}  
Let's further assume that each monkey has been typing 120 words per minute for ten billion years. There are five characters in a word, so 120 words per minute is 600 characters per minute, or ten characters per second. Now the number of seconds in a year is 60 seconds/minute x 60 minutes/hour x 24 hours/day x 365 days/year. We already calculated this; it's 3 x 10^{7} seconds. We multiply this by ten characters per second and by ten billion years:  
10 x 3 x 10^{7} x 10^{10} = 3 x 10^{18}  
This is the number of characters each monkey types in ten billion years; multiply by the number of monkeys:  
3 x 10^{18} x 2 x 10^{22} = 6 x 10^{40}  
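The whole chain of multiplications can be carried out without rounding. The sketch below uses the figures assumed in the text (a hundred billion monkeys per planet, 200 billion stars, 120 words per minute, five characters per word, ten billion years); the variable names are illustrative.

```python
monkeys_per_planet = 100 * 10**9    # a hundred billion
stars_in_galaxy = 200 * 10**9       # one monkey planet per star
years = 10 * 10**9                  # ten billion years

# 120 words/minute x 5 characters/word, divided by 60 seconds/minute.
chars_per_second = 120 * 5 // 60
seconds_per_year = 60 * 60 * 24 * 365

total_monkeys = monkeys_per_planet * stars_in_galaxy
chars_per_monkey = chars_per_second * seconds_per_year * years
total_chars = chars_per_monkey * total_monkeys

print(f"{total_monkeys:.1e}")     # monkeys in the galaxy
print(f"{chars_per_monkey:.1e}")  # characters typed by one monkey
print(f"{total_chars:.1e}")       # characters typed by all of them
```

Without rounding the seconds in a year down to 3 x 10^{7}, the total comes out slightly above 6 x 10^{40}, which does not change any conclusion.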
This is the number of characters typed by all the monkeys in ten billion years of typing. So, have the monkeys typed all possible sentences in this time? If they have, they would have written all the works of Shakespeare (and anyone else). Now we know the minimum number of characters the monkeys have to type to produce all possible three-character phrases – it's 3 x 28^{3}, which is about 66,000. To type all possible five-character phrases requires a minimum of 5 x 28^{5} characters; this is about 86 million. To type all possible ten-character phrases, the monkeys would have to type a minimum of 3 x 10^{15} characters. For the minimum number of characters required for different phrase lengths, see the table below.  
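The figures quoted here follow directly from the formula N x 28^{N}. The sketch below tabulates it for a handful of phrase lengths; the lengths chosen are arbitrary examples, and `min_chars` is a hypothetical helper name.

```python
def min_chars(n: int, keys: int = 28) -> int:
    """Minimum characters needed to type every n-character phrase once."""
    return n * keys**n


# Phrase lengths picked only for illustration.
for n in (1, 2, 3, 5, 10, 20, 30, 40):
    print(f"{n:>3}  {min_chars(n):.1e}")
```

Each additional character multiplies the requirement by a little more than 28, so the values climb very quickly.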
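
[Table: minimum number of characters, N x 28^{N}, that the monkeys must type to produce all possible N-character phrases]
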
 
To type all possible 40-character phrases would require that the monkeys type a minimum of 3 x 10^{59} characters, as we saw above. (Earlier we estimated the value as 4 x 10^{59}; the value in the Table is derived from a spreadsheet and is more precise.) But we also saw that all these monkeys, in ten billion years, would type only 6 x 10^{40} characters. So despite having all these monkeys, and all this time, the monkeys could not type all possible 40-character phrases. And I ignored the possibility of duplicate phrases. From the table, we see that – at best – the monkeys would type all possible 27-character phrases.

If we were to consider filling every known galaxy in the universe with monkeys, that would only increase the number of characters they type by a factor of 10^{12}. This would make it possible for the monkeys to type all possible 35-character phrases – not nearly enough. As far as typing all of the works of Shakespeare or typing all of the books in the British Museum – it's clear from the Table that the number of monkeys and the number of years – as astronomical as they are – are woefully inadequate.

As the number of characters in the phrase grows, the minimum number of characters the monkeys must type grows geometrically. That minimum number, to get all possible N-character phrases, is N x 28^{N}. For phrases of length N + 1 the number is (N + 1) x 28^{N + 1}. The ratio of these numbers is 28 x (1 + 1/N), so for large N the minimum number of characters the monkeys must type grows by a factor of about 28 for each additional character. The number grows by roughly 28^{10} = 3 x 10^{14} for each additional ten characters, so that by the time we reach 100-character phrases, the minimum number of characters the monkeys must type exceeds 10^{146}. Now a single page can easily contain many hundreds of characters. So there is no way any set of monkeys in the observable universe could ever type every possible single page of text, let alone all the works of Shakespeare. 
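The phrase lengths quoted above can be recovered by searching for the largest N whose requirement N x 28^{N} fits within the number of characters typed. This is a sketch under the text's rounded budgets of 6 x 10^{40} (one galaxy) and 10^{12} times that (every known galaxy); `min_chars` and `longest_covered` are hypothetical helper names.

```python
def min_chars(n: int, keys: int = 28) -> int:
    """Minimum characters needed to type every n-character phrase once."""
    return n * keys**n


def longest_covered(budget: int, keys: int = 28) -> int:
    """Largest n for which min_chars(n) does not exceed the budget."""
    n = 0
    while min_chars(n + 1, keys) <= budget:
        n += 1
    return n


galaxy_budget = 6 * 10**40                  # characters typed in one galaxy
universe_budget = galaxy_budget * 10**12    # every known galaxy filled

print(longest_covered(galaxy_budget))       # prints 27
print(longest_covered(universe_budget))     # prints 35
print(min_chars(100) > 10**146)             # prints True
```

Multiplying the number of monkeys by a trillion buys only eight more characters, which is the geometric growth described above at work.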

Even if we covered a planet with monkeys in every solar system in the galaxy, and in every galaxy in the known universe, and allowed them to type away for ten billion years, it couldn't be guaranteed that they would type even the single sentence: TO BE OR NOT TO BE THAT IS THE QUESTION. 
