Article of Witztum

This is a copy of an article that Doron Witztum published as a PDF file. The Hebrew has been transliterated using the Michigan-Claremont encoding scheme. No claims of accuracy are made.

A reply to this article is available.


This is a response to an article "Astounding Discoveries in War and
Peace", published on the internet in May '97.

Did They Really Find Codes in War and Peace?
FULL VERSION 
Doron Witztum, '97

SUMMARY
Recently on Channel One Television in Israel, on the show "Popolitika",
as well as in other places, psychologist Professor Maya Bar Hillel
claimed that she and her colleagues found "hidden codes" in the Hebrew
translation of Tolstoy's War and Peace. They claim that this finding
is the same type of code that is found by code researchers in the
Torah. The first example they brought was on the subject of Chanukah.

There is a fundamental difference between the serious scientific
research that has been done on codes in the Torah, and the "War and
Peace" example presented by this group. It is possible to distinguish
between them like distinguishing counterfeit money from the real thing.

INTRODUCTION
Scientific research into ELSs (Equidistant Letter Sequences) in the
book of Genesis has been proceeding for some twelve years. The
researchers have focused on two phenomena:

  1. The close proximity of one minimal ELS to another, where there is
     a conceptual relationship between them (e.g., an ELS of "hammer"
     near an ELS of "anvil").
  2. The close proximity of a minimal ELS to a conceptually related
     expression in the text, as it is read consecutively (e.g., an
     ELS of "hammer" near the appearance of "anvil" in the text as a
     string of consecutive letters).

Please note that the researchers do not measure the probability for the
appearance of any individual expression as an ELS. Rather, given two
expressions appearing as ELSs, we measure whether their proximity is
closer than may be expected to occur by chance.
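
To make the definition concrete, here is a minimal sketch in Python of
an ELS search. The function names find_els and minimal_els, the toy
text, and the restriction to positive skips are assumptions made for
illustration only; the papers cited below define "minimal" and
"proximity" more carefully than this sketch does.

```python
def find_els(text, word, max_skip):
    """Return (start, skip) pairs where `word` appears in `text`
    as an equidistant letter sequence with 1 <= skip <= max_skip."""
    hits = []
    n, k = len(text), len(word)
    for skip in range(1, max_skip + 1):
        span = (k - 1) * skip  # distance from first to last letter
        for start in range(n - span):
            if all(text[start + i * skip] == word[i] for i in range(k)):
                hits.append((start, skip))
    return hits


def minimal_els(text, word, max_skip):
    """Return the hit with the smallest skip, or None if there is none."""
    hits = find_els(text, word, max_skip)
    return min(hits, key=lambda h: h[1]) if hits else None


# Toy example (illustrative only, not a real text):
toy = "xhxaxmxmxexrx"
print(find_els(toy, "hammer", 3))     # [(1, 2)]
print(minimal_els(toy, "hammer", 3))  # (1, 2)
```

A skip of 1 corresponds to the word appearing as consecutive letters
in the text. Measuring whether two such appearances are "close" is a
separate step that this sketch does not attempt.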

METHODOLOGY
After defining the phenomena to be investigated, the researchers then
determined a rigorous method to measure the level of success:
determining what was to be considered "in close proximity" and what
was not.  After establishing this methodology, the researchers were
now able to conduct experiments to investigate the phenomena; to check
if the proximities of such ELSs occur much more frequently in the text
of Genesis than would be expected to by chance.

An "experiment" in this instance means examining the proximities
between paired words of a specified set. For every "topic" there is
a pool of expressions which are conceptually related. These can be
represented schematically in the form of a pyramid:  At the top of
the pyramid are just a relatively small number of the most important
words and expressions: These comprise the top of the hierarchy. As
one descends the hierarchy, there are many more expressions related
to the topic.

A method must be found to select appropriate pairs of expressions upon
which to carry out the experiments. This must be done in such a way
that the set of expressions will be an objectively "closed sample."
This can be accomplished in two ways:

A. If the expressions chosen are those at the top of the hierarchy.
This is feasible only with regard to those topics or subtopics where
it is clear which expressions deserve to be at the top of the
hierarchy.

B. If the expressions are not those at the top of the hierarchy,
then one can make use of an objective outside source (even, for
example, an encyclopedia or an impartial expert) to determine the
sample. Of course, this could be done for option A as well.

EXPERIMENTS CONDUCTED
Ten samples of word pairs, selected according to the methodology
outlined above, have been described thus far in scientific papers,
and have been used as the basis for experiments:

Two of these samples are described in the paper, "Equidistant Letter
Sequences in the Book of Genesis," by Doron Witztum, Eliyahu Rips
and Yoav Rosenberg, published in Statistical Science (1994).

A third sample is described in a paper by Harold Gans: "Coincidence
of Equidistant Letter Sequence Pairs in the Book of Genesis."
(preprint)

Three further samples are described in another article by Doron
Witztum, Eliyahu Rips and Yoav Rosenberg: "Equidistant Letter
Sequences in the Book of Genesis II: The Relationship to the Text",
(regarding proximities of ELSs with expressions found in the
consecutive text).

Four additional samples are described by these same authors in the
article, "A Hidden Code in the Book of Genesis: The Statistical
Significance of the Phenomenon" (Hebrew), which was presented as a
guest lecture before the Israeli National Academy of the Sciences
on Mar. 19, '96.

In 8 out of 10 of these experiments the results were statistically
significant in the extreme. Of the remaining two, one gave results
which were statistically significant, and one was recorded as a
failure.

Additional samples, prepared under the same guidelines, served as the
basis for other similar experiments, the details of which will be
described in full in Doron Witztum's forthcoming book. A few partial
examples from this collection are presented publicly in lectures
given by, among others, "Arachim" and Aish HaTorah.

The results of these experiments indicate that the convergences
between the ELSs themselves, as well as between the ELSs and the text
are not chance occurrences.

A CASE STUDY: THE HOLIDAY OF CHANUKAH
I will describe here a brief experiment that was carried out on the
topic of "Chanukah."  This topic can be spelled in one of two ways
in Hebrew: "XNKH" or "XNWKH".  Furthermore, each of these spellings
can be written with or without the prefix "H" (the definite article):
"HXNKH" and "HXNWKH".

The source for spelling "XNWKH" appears, for example, in the legal
work which is considered the final authority in our day, the Mishna
Brurah, (670:1):

 WYMYM )RW HW HNQR)YM XNWKH, RWCH LWMR: XNW K"H, $BYWM K"H XNW M)WYBYHM

"And these days are called Chanukah, that is they rested ("Chanu")
on the 25th ("K"H"), since on the 25th (of Kislev) they rested from
their enemies."

I used precisely the language which appears in this source. In the
first stage I searched for proximities of XNW (they rested) and
M)WYBYHM (from their enemies).  The proximity of these ELSs was
significant. I then marked the minimal ELS of the word XNW which
participated in this convergence, and checked whether it was also
convergent with the phrase $BYWM K"H (since on the 25th).  I also
checked whether it was proximal to the words XNWKH and HXNWKH. 
Investigation showed that each of these four convergences was
significant, and the combined probability of occurring by chance is
about one in a thousand.  It also became clear from this experiment
that the form HXNWKH was by far the most successful, so I chose to
use it for the continuation of the experiment.

In the second stage I decided to use an expression taken from the
very top of the hierarchy of expressions related to Chanukah. The
first expression which came to mind was the name of Judah the
Maccabee, of the Hasmonean family.  For example, in the
Encyclopedia Hebraica, under the entry for "Chanukah", he is the
only member of the Jewish forces mentioned as the hero (as YHWDH
HMKBY and as YHWDH HX$MWN)Y; Judah the Maccabee and Judah the
Hasmonean).  He is mentioned there not only as the chief military
leader, leading the victory over the Greeks, but also as the one
responsible for establishing the holiday of Chanukah.  Not only are
his name and appellations mentioned in the encyclopedia, they are
on the lips of every Israeli toddler.

Using his name and appellations from this entry in the Encyclopedia
Hebraica, I listed all of the possibilities for his appellations:

1. Judah the Hasmonean - YHWDH HX$MWN)Y 
2. Hasmonean -           X$MWN)Y 
3. The Hasmonean -       HX$MWN)Y
4. Judah the Maccabee -  YHWDH HMKBY
5. Maccabee -            MKBY
6. The Maccabee -        HMKBY
7. Judah -               YHWDH

Of this list, only the following occur as ELSs in the text of Genesis:

1. Hasmonean -     X$MWN)Y 
2. Maccabee -      MKBY
3. The Maccabee -  HMKBY 
4. Judah -         YHWDH

The name YHWDH is exceptional in this sample in that it is a common
Jewish first name.  If you were to mention the name "Yehudah" on a
Jewish street, no one would have the faintest idea that you were
referring to the star of the Chanukah story.  This is not the case
with the other appellations.

We decided therefore to investigate the three word pairs:

1. HXNWKH - X$MWN)Y
2. HXNWKH - MKBY
3. HXNWKH - HMKBY

THE RESULTS
An experiment was performed searching for ELSs of these pairs.
The combined probability for these three pairs (as calculated
according to the procedures described in the above articles) is
1/700,000.

It is important to emphasize that the decision whether or not to
utilize the word YHWDH was not a difficult one. There are, after all,
only two choices. The critical reader is at liberty to double the
probability to 1 in 350,000.
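
The doubling can be checked with exact arithmetic. The sketch below
treats it as multiplying the reported probability by the number of
choices that were available; describing this as a Bonferroni-style
bound is an assumption of the sketch, not terminology used in the
article.

```python
from fractions import Fraction

p_reported = Fraction(1, 700_000)  # combined probability reported above
n_choices = 2                      # use YHWDH, or leave it out

# Multiplying by the number of choices gives a conservative upper bound
# (capped at 1, since a probability cannot exceed 1).
p_adjusted = min(Fraction(1), n_choices * p_reported)
print(p_adjusted)  # 1/350000
```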

"CHANUKAH CODES" IN WAR AND PEACE
Research into hidden codes in Genesis has attracted the attention of
two different groups of people, who have consequently become involved
in the field. There are those who want to make money and/or to promote
their own interests through the codes.  This group is uninterested in
the credibility or the scientific basis of the phenomenon.

There is a second group, who would like to undermine the credibility
of the phenomenon.  This group is opposed to it for reasons of their
own, not the least of which are ideological.

The common denominator between the two groups is that they make the
exact same mistakes, using flawed and misleading examples when it
suits their purposes.

This is similar to counterfeiting-- some people counterfeit money
simply to make money.  Others do it (e.g. in times of war) to
undermine confidence in the enemy's currency.

For example: In May 1997 it was publicized through the Internet
that amazing codes had been "discovered" in the book War and Peace.
The authors of the document, who signed their names as codes in the
segment of War and Peace which accompanied their report, were
Professor Brendan McKay of the Australian National University,
Dr. Dror Bar Natan from the Hebrew University of Jerusalem, Alec
Gindis and Aryeh Levitan of Jerusalem. This is the group that
Professor Maya Bar Hillel works with as well.

These authors report that they examined a text of 78,064 letters
(the same length as Genesis), taken from the Hebrew translation of
War and Peace (from the beginning of the book). They investigated
how the topic of Chanukah appears in this text. By their account,
they discovered that in one section of the text no less than 59
expressions related to Chanukah appear, each one of which is the
minimal appearance of the word in the entire text, or else it
appears in consecutive letters. The authors claim that they
calculated the probability of this occurring by chance and
discovered that it came to less than 0.0000000000000000000000001.

In the document which they publicized, the authors try to create the
impression that what they are presenting is similar to the codes which
have been discovered in the book of Genesis. (See the internet site at
http://www.math.gatech.edu/~jkatz/religions/numerics/chanukah.html).

The truth is that there is an enormous difference between this attempt
and the genuine discoveries which have been made. Let us consider the
following analogy:

Imagine that a certain bomb manufacturer, the "Alpha Company," is
trying to persuade a Congressional subcommittee that the U.S. army
should be supplied with bombs of their making.

The company's scientists claim that they have developed "smart" bombs,
which are incomparably more accurate than any other bomb. However, they
themselves admit that not every "smart" bomb strikes exactly on target:
Mainly because the guidance system still does not take into
consideration every single factor affecting the trajectory (because the
factors are numerous and complicated). Yet, the developers of the bomb
assert that tests have shown that the accuracy of these "smart" bombs
is much greater than would be expected without a guidance system.

They provide the members of the subcommittee with photographs and
diagrams documenting their tests. On these photographs and diagrams
one sees impressive strikes on or near the target, and all within a
10 metre radius of the target.

The chairperson of the subcommittee consults with the neutral experts
whom he has invited to attend the meeting. They agree that the strikes
seem to have been extremely accurate. However, they note that one
important piece of data is lacking: How many bombs struck outside the
target area?

"That's a very good question," comments the chairman to the scientists
of "Alpha," "What do you have to answer?"

They explain that the bombs were manufactured under the supervision of
the governmental agency in charge of military testing, an objective
institution, who can verify just how many bombs were supplied for each
test, how many of them struck within 10 metres of the target, and how
many landed outside the target area. And indeed, the documentation
which has been made available is sufficient for the experts to make
their own calculations and to see for themselves just how impressive
the test results are.

At this point, lobbyists from the other bomb manufacturers try to
intervene. They claim that ordinary bombs can achieve just as good
results. To prove their claim they present a document reporting tests
which were run using ordinary bombs. They lay before the members of the
subcommittee a large photograph with an accompanying diagram on which many
strikes have been indicated. "Here!" say the lobbyists for the ordinary
bomb makers, "You see that the ordinary bombs were no less accurate
than the "smart" bombs! There are no less than 59 bombs that landed
near the target!"

The members of the subcommittee examine the photographs and the
accompanying diagram, and compare them with the documents presented by
the manufacturers of the "smart" bomb. They notice immediately that
there is a big difference between the two presentations: In the tests
done by "Alpha" the strikes were within a 10 meter radius of the
target. On the other hand, the documents of the other lobby cover a
much larger area, with a radius of 1000 meters, in which there are
a number of individual targets, but the strikes are not particularly
close to these targets.

The members of the subcommittee request one additional piece of
critical information: How many bombs struck outside of the target area?

The lobbyists hem and haw, and reply that they do not have that
information.

"If so," says the chairman of the subcommittee, "What claim do you
have at all? - Of course when you have such a large target area some
of the bombs are going to fall inside of it! The comparison you have
made is nothing but a fraud!"

As we have seen above, the researchers working on the hidden codes in
the book of Genesis have proceeded like the scientists of the "Alpha
Company."  The number of expressions in the experiment corresponds
to the number of "bombs" in our analogy.  The parallel to having the
bombs manufactured under objective supervision consists of choosing a
set of word pairs which forms an objectively closed sample.  We have
already discussed the ways in which this can be accomplished.

We now proceed to the regular "bombs" manufactured by the competition.

A. HUGE AREA OF THE TARGET
Dr. McKay et al. proceeded in the same manner as the "lobbyists" of our
analogy. In contrast to the compactness of the proximities which we
presented relating to the topic of "Chanukah", McKay et al. used an
enormous array consisting of 14,719 letters, which comprises about 20%
of the entire text!

Of course, it is only to be expected that about 20% of the expressions
tested for should land within such an enormous area.
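
The figure of "about 20%" follows directly from the two letter counts
quoted above. The sketch below recomputes it; the helper expected_hits
and the pool of 100 candidates are hypothetical, and the assumption
that each candidate lands uniformly at random is a simplification.

```python
# Figures quoted in the text above.
SEGMENT_LETTERS = 14_719   # size of the array used by McKay et al.
TEXT_LETTERS = 78_064      # length of the War and Peace excerpt

# Share of the text covered by the segment.
segment_share = SEGMENT_LETTERS / TEXT_LETTERS
print(f"{segment_share:.1%}")  # 18.9%


def expected_hits(n_candidates, share=segment_share):
    """Expected number of candidates landing in the segment by chance,
    assuming each lands uniformly at random (a simplifying assumption)."""
    return n_candidates * share


# Hypothetical pool of 100 candidate appearances, for illustration only.
print(round(expected_hits(100), 1))  # 18.9
```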

B. LAXNESS IN THE LIST OF EXPRESSIONS
A close examination of the list of expressions which these researchers
report having found reveals certain peculiarities.  (We will use the
following convention: We will rank the ELSs of a particular expression
according to the size of the skips between letters.  The ELS with the
shortest skip in the entire text will be termed the "most minimum."
The next smallest ELS will be called the "second most minimum" and the
one after - the "third most minimum," etc.):

1. McKay et al. claim to have found 41 different expressions, each at
its most minimal appearance or as consecutive letters.  Some of these
appear more than once, yielding 59 total hits.

2. The word list they supply is in English.  When one tries to deduce what
words they correspond to in the Hebrew text, one runs into difficulties,
as we discovered when we tried to replicate their experiment. For
example:

* They list the expression "Greek army." The Hebrew equivalent,
  CB) YWNY, does not appear in the target segment.  They seem to have
  had in mind the expression CB) YWN, which does appear in its most
  minimum form in the segment.

* The expression "Hashmonai," which is written in Hebrew X$MWN)Y,
  does not appear in the segment at all.

* "Modiin" is written MWDY(YN in Hebrew. This word does not appear
  in the segment.  We thought perhaps they had used the nonstandard
  spelling, MWDY(YM, but this form also does not appear in the segment.

* "Pure oil" (as it relates to the topic of "Chanukah") translates
  as $MN +HWR. This expression does not appear in the segment.

* "Maccabees" is written in Hebrew MKBYM. The minimum of MKBYM does
  not appear here. This expression appears in the segment only in its
  eighth most minimum form!  After additional investigation, we guessed
  that they were referring to the expression HMKBYM (the Maccabees),
  with the definite article.  And indeed, the minimum form of this word
  does appear.

* "Praise" is the translation of the word HLL. This short word appears
  in its most minimum form 76 times in the text, 14 of which are in
  this segment, just as one would expect to occur by chance. Therefore
  we thought that perhaps they had in mind the expression HHLL (the
  praise) with the definite article.  However, this form appears 6 times
  in the text, only one of which is in this segment.

* Of course, the most fundamental expression related to this topic is
  the word "Chanukah" itself.  The minimum of XNKH does not appear here.
  We looked for the word XNKH in the segment, and it appears only in
  its seventh most minimum form.  By now our experience had taught us
  to try adding the definite article, but written this way we found the
  word only in its 13th most minimum form!  It then occurred to us that
  perhaps they were referring to an appearance of the word XNKH as
  consecutive letters in the text.  And indeed, XNKH does appear in its
  consecutive form.

  If we spell the word "Chanukah" in its alternative form, XNWKH, we
  find that it does appear as a minimum.  (The spelling XNWKH appears
  in the segment in its fifth most minimum form).

  "Chanukiya" is written in Hebrew either XNKYH or XNWKYH.  Since on
  the table which they supplied it appears that they used both forms,
  we looked for both forms.  XNWKYH appears only in its second most
  minimum form, as does the spelling XNKYH. With the addition of the
  definite article, HXNWKYH does indeed appear in its most minimum
  form, but HXNKYH only appears in its third most minimum form.

* "Torah" appears in this segment only in its 15th most minimum form.
  As consecutive letters in the text only one of its three appearances
  landed in the target segment. With the definite article it appears
  only in its 13th most minimum form.

* "Miracles" translates into NSYM in Hebrew.  We were only able to
  find it in its seventh most minimum form.  We thought that here too
  they had used the definite article, but HNSYM only appears in its
  ninth most minimum form.  We thought of looking for the word in its
  consecutive form, and indeed we did find one such appearance, but
  unfortunately it was outside of the target segment.  Only afterwards,
  when we consulted their own table did we discover that they had
  spelled the word - "NYSYM" (the addition of the first "Y" is a
  convention to facilitate reading without diacritical marks)!

* A "spinning top" is a SBYBWN in Hebrew.  This word does not appear
  at all in the segment.  It was later discovered that they had used
  the word "dreidel", which is not even Hebrew; it is Yiddish!

Let us summarize what we have learned in the preceding section:

McKay et al. must have investigated at least the following
possibilities for each of their 59 words:

* The word as it is written in its "full" form, i.e. with the
  inclusion of the vowels "W" and "Y"
* The word as it is written using its vowelized spelling form
  ('ktiv dikduki').
* With the definite article ("H")
* Without the definite article ("H")
* Non-Hebrew forms of the word (e.g. "dreidel").

Each of these possibilities was searched for twice: once as an ELS and
then again as consecutive letters in the text.  In all, 10 different
possibilities for each word.

Even if their list really represented an a priori list of expressions
relating to the topic of Chanukah (which we will soon see it does not),
and even if there were only five different possibilities for each word
instead of 10, we would still expect to receive results similar to the
ones they observed purely as a result of chance, because the
probability of hitting within such a large target area, comprising a
fifth of the text, is of course one in five.
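
The effect of trying several variants of each word can be illustrated
with a short calculation. The sketch below assumes that each variant
lands in the target segment independently with probability one in
five; the function name p_at_least_one and the independence
assumption are illustrative simplifications, not part of the
published methodology.

```python
def p_at_least_one(p_single, n_variants):
    """Probability that at least one of n_variants lands in the target,
    assuming the variants land independently (a simplifying assumption)."""
    return 1 - (1 - p_single) ** n_variants


P_TARGET = 1 / 5  # the segment covers about a fifth of the text

for n in (1, 5, 10):
    print(n, round(p_at_least_one(P_TARGET, n), 3))
# 1 0.2
# 5 0.672
# 10 0.893
```

Under these assumptions, a word with five admissible variants has
roughly a two-in-three chance of producing at least one "hit" in the
segment, and a word with ten variants close to nine in ten.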

Moreover, we have learned that the report which was publicized
contained inaccuracies, and that some of the words reported are not to
be found at all!

THE REAL RESULTS
In order to demonstrate just how critical the size of the target
segment is, and just how critical it is to use a precise methodology
for measuring proximities, we ran the following test.  Our measurements
were made using the methodology outlined in the above-mentioned papers.
McKay et al. are familiar with it, and they know how to use it:

We mentioned in section B that within the enormous area of the table
used by the researchers there were two possible "targets" - the two
occurrences of the main topic, "Chanukah."  The spelling XNWKH appears
as an ELS, and the spelling XNKH appears as consecutive letters in the
text.  If the associations are genuine, then the related words should
converge around these two targets.

We carried out two test "bombings," where the arsenal of "bombs"
consisted of exactly the words the researchers had marked off on their
table.  In one test we "bombed" the word "XNWKH."  The combined results
were just what one would expect to happen by chance.  In the second test
we "bombed" the word "XNKH" with the same bombs.  Once again the results
were totally random.

The proximities they show around the word XNKH or XNWKH are expected
to occur by chance on every other page of your local newspaper.

C. FURTHER ERRORS
By all rights, we could end our critique here.  But if we examine the
way they chose their expressions, we can see that the situation is even
more grave:

McKay et al. quoted translations from (primarily) three Hebrew sources.
They indicated with bold face type the words which they considered to
be the most important ones.

One can see right away that they chose to mark off only a small
selection of words.  Among those which they have ignored are some of
the most central ones related to Chanukah.  For example, the name
Mattathias (MTTYHW).

Even more surprising is the fact that they did not even use all of 
the expressions which they themselves marked off!  They seem to have
"forgotten" the words: "priests", "king", "Greek kings", "High Priest",
and others.  On the other hand, among the list of words which they
marked off in the segment of the text, there are more than 10 which
are not marked in the sources.  The explanation is simple: the
expressions which they "forgot" are the ones which failed to show up
in the target segment!

But that is not all: The innocent reader is led to believe that the
expressions which they did find in the segment are the same as those
which appear in the original Hebrew sources from which the word list
was compiled.  This is in fact not true.  They retranslated the words
back into Hebrew arbitrarily to fit the words found in the target
segment!

In one instance, they went even one step further.  They actually
changed an original text of the Talmud.  Instead of the phrase "..and
only found but one vessel of oil", they changed the source to read
"..and they found but one small pure vessel of oil."  Why the change?
Very simple--they needed the expression 'pure vessel', because they
knew that it appeared in their segment of the text!

Naturally, we would like to believe that all of this happened as the
result of innocent errors, but the document produced by McKay et al. should
at least serve as warning to all to be wary of charlatans.  What is
clear from our analysis is that one has to check any claimed results
carefully: scientific analysis allows us to distinguish between what
is real and what is counterfeit.