## The Probability on the Chanukah Page

On the Chanukah page, the following claim appears:

"The probability of all these words appearing with minimal skip in this one section of text by mere chance is less than 0.0000000000000000000000001 !!"

This number is correct, but it is so absurdly low that no reader should see it without asking "What's the catch?" This page explains what the catch is.

The first catch, albeit a minor one, is the phrase "in this one section of text". Why is this section of text more significant than any other? As far as we know, it is not. It just happens to be the section of text containing these words, and any other section of the same length would have done just as well. The right question is really "what is the probability that these words appear in a section of text as short as this?" The answer (whose difficult calculation we will omit) is that this probability is 100-200 times larger than the one quoted.

In this context, note that we could conceivably read this section of War and Peace and then concoct a story about how it is parallel to the Chanukah story. The idea would be to try to recover the factor of 100-200 by convincing the reader to believe this particular section is the "right" section in which to find Chanukah codes. Arguments like this are almost invariably inventions made after the fact.
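The difference between "this section" and "some section of the same length" can be illustrated with a toy model. The sketch below is not the omitted calculation and is not meant to reproduce the factor of 100-200. It assumes that k word positions are independent and uniformly distributed over a text of unit length, and that the section has relative width w; the values of k and w are hypothetical. The second function uses the standard formula for the range of k uniform variates.

```python
def prob_fixed_window(k, w):
    """P(all k uniform points land in one window of width w chosen in advance)."""
    return w ** k


def prob_any_window(k, w):
    """P(all k uniform points fit in SOME window of width w), i.e. P(max - min <= w)."""
    return k * w ** (k - 1) - (k - 1) * w ** k


# Hypothetical illustration values, not taken from the Chanukah page.
k, w = 5, 0.01
ratio = prob_any_window(k, w) / prob_fixed_window(k, w)
print(f"fixed window: {prob_fixed_window(k, w):.3g}")
print(f"any window:   {prob_any_window(k, w):.3g}")
print(f"ratio:        {ratio:.0f}")
```

In this toy model the ratio simplifies to k/w - (k-1), so the correction grows as the section becomes a smaller fraction of the text. The real calculation depends on details this sketch ignores.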

The second catch, by far the most important, is the phrase "all these words". Did we work by listing just these words and then miraculously finding them all close together? NO, and you can be sure that the vast majority of clusters shown in the Torah were not found that way either. In fact, we had a collection of 100-150 words and tried them all. After we had the words that worked, we devised a story involving famous quotations in order to convince the reader that the words were chosen objectively. That is how it is done; we did not invent the method.

In summary, we began with a list of 100-150 words. Only after finding the minimum-skip ELS (equidistant letter sequence) of each one did we choose which section of the text to present. For that section of the text, we presented all the minimum ELSs we could find. After completing this process, we computed the probability of these words being in this place. The computation is correct, but the answer does not reflect the difficulty of finding a cluster which is just as good.
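The effect of choosing the words and the section after the search can be sketched with a small simulation. This is a toy model, not the procedure or the computation used for the Chanukah page. It assumes each candidate word's minimum ELS sits at an independent, uniformly random position, and the text length, window length and trial count are hypothetical. Only the number of candidate words (120) is taken from the 100-150 range above.

```python
import random


def best_cluster(positions, window):
    """Largest number of positions that fit inside any interval of length `window`."""
    pts = sorted(positions)
    best, left = 0, 0
    for right in range(len(pts)):
        # Shrink from the left until the span fits in the window.
        while pts[right] - pts[left] > window:
            left += 1
        best = max(best, right - left + 1)
    return best


# Hypothetical parameters (assumptions for illustration only).
N_WORDS = 120        # size of the candidate list, within the 100-150 range
TEXT_LEN = 300_000   # letters in the text
WINDOW = 2_000       # letters in the presented section
TRIALS = 500

rng = random.Random(0)
sizes = []
for _ in range(TRIALS):
    positions = [rng.randrange(TEXT_LEN) for _ in range(N_WORDS)]
    sizes.append(best_cluster(positions, WINDOW))

# Median size of the best cluster that pure chance produces.
typical = sorted(sizes)[len(sizes) // 2]
# Naive figure: chance that `typical` pre-chosen words all land in one pre-chosen window.
naive_probability = (WINDOW / TEXT_LEN) ** typical
share = sum(s >= typical for s in sizes) / TRIALS

print(f"typical best cluster size: {typical}")
print(f"naive probability for that cluster: {naive_probability:.3g}")
print(f"share of random trials with a cluster at least that large: {share:.2f}")
```

By construction, at least half of the random trials contain a cluster as large as the typical one, yet the naive probability attached to that cluster is tiny. The naive figure is arithmetically correct for pre-chosen words in a pre-chosen window; it just answers a different question from the one the search actually poses.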

Creator: Brendan McKay, bdm@cs.anu.edu.au.