
The Voynich transcription
A scribe imagined, transcribing lines from Dante's "La Divina Commedia".
Here is an example of how I imagine the transcription of a medieval document, which resulted in the book that we now know as the Voynich manuscript.
Let us suppose, for the sake of argument, that:
- a scribe receives Canto 1 of Dante Alighieri's La Divina Commedia, with instructions to transcribe it to the symbols that we now call Voynich glyphs;
- the principal glyphs are defined as Glen Claston will define them six hundred years later, with the exception that the symbol that Claston will call {4o} is not two glyphs but one;
- the Italian words are written in full, without the abbreviations and concatenations of the Foligno edition;
- the producer has prescribed a one-to-one mapping of Latin letters to glyphs, either not knowing or not caring that this mapping will preserve the frequencies of the Latin letters;
- it follows, with some confidence, that each Latin letter maps approximately to the glyph of the same frequency rank: the most frequent letter to the most frequent glyph, and so on down the list; for example e to {o}, a to {9}, i to {a}.
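The rank-for-rank mapping in the last assumption can be sketched in Python. This is only a minimal illustration, not the producer's actual table: the three pairings e to {o}, a to {9}, i to {a} are the ones given above, every other pairing is left unknown, and the function names are hypothetical. Glyphs are held as strings without braces, in a list, so that {4o} can count as a single glyph.

```python
from collections import Counter

def rank_letters(text):
    """Return the letters of `text`, most frequent first (ties alphabetical)."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    return [ch for ch, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

def build_mapping(letter_ranking, glyph_ranking):
    """Pair the n-th ranked letter with the n-th ranked glyph."""
    return dict(zip(letter_ranking, glyph_ranking))

def transcribe(word, mapping):
    """Map a word letter by letter; letters with no known glyph become '?'."""
    return [mapping.get(ch, "?") for ch in word.lower()]

# Only the three pairings stated in the text; the rest of both rankings
# is unknown here and deliberately not guessed.
mapping = build_mapping(["e", "a", "i"], ["o", "9", "a"])
print(transcribe("uita", mapping))  # ['?', 'a', '?', '9']
```

In a real run, `rank_letters` would be applied to the whole of Canto 1 (or a larger corpus) and the glyph ranking would come from a full transliteration of the manuscript.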
In this scenario, the scribe examines the first line of Canto 1, which consists of seven words:
>nel mezo del camin di nostra uita
and following the mapping that the producer has laid down, he writes the rough transcription as shown below:
He then refers to the producer’s “slot alphabet” for the correct order in which the glyphs must be written. We do not know whether this alphabet was simple or complex; nor whether it was rigid or flexible. We might guess that it embodied rules of the following nature:
- If the "word" contains the glyph {4o}, write that glyph in the leftmost position.
- If the "word" contains the glyph {m}, write that glyph in the rightmost position.
- If the "word" contains the glyph {9}, write that glyph in the rightmost position.
- The glyphs {c} and {C} can be to the right of, but not to the left of, the glyphs {h} and {k}.
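Rules of this kind can be expressed as a stable sort on slot numbers. The sketch below is one possible reading, under assumptions of my own: the slot numbers are hypothetical, {h} and {k} are placed ahead of all unlisted glyphs and {c} and {C} after them (stricter than the rule requires), and where a "word" contains both {m} and {9}, {9} is placed last, since the rules above do not say which wins.

```python
# Hypothetical slot numbers; lower slots are written further left.
SLOTS = {
    "4o": 0,          # always leftmost
    "h": 1, "k": 1,   # before {c} and {C}
    "c": 3, "C": 3,   # never to the left of {h} or {k}
    "m": 4,           # rightmost...
    "9": 5,           # ...but after {m} if both occur (assumption)
}
DEFAULT_SLOT = 2      # every glyph not named in the rules

def reorder(glyphs):
    """Return the glyphs sorted by slot; glyphs sharing a slot keep their order."""
    # sorted() is stable, so ties preserve the rough transcription's order.
    return sorted(glyphs, key=lambda g: SLOTS.get(g, DEFAULT_SLOT))

print(reorder(["9", "c", "h", "4o", "a"]))  # ['4o', 'h', 'a', 'c', '9']
```

A flexible slot alphabet, as opposed to this rigid one, would need something more than a single sort key, for example a set of permitted orderings per glyph pair.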
The scribe's clean copy, which he writes on the vellum, is as shown below:
Five of these seven “words” are real “words” in the Voynich manuscript; and the other two "words" differ by only one glyph from real Voynich "words".
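A check of this kind is easy to automate. The sketch below assumes that "differ by only one glyph" means a single substitution, insertion or deletion, which is my interpretation; the function names are hypothetical, and the small vocabulary is a set of toy placeholders, not attested Voynich "words".

```python
def within_one_glyph(a, b):
    """True if tuples a and b differ by exactly one substituted, added or dropped glyph."""
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    shorter, longer = (a, b) if len(a) < len(b) else (b, a)
    # Drop each glyph of the longer word in turn and compare.
    return any(longer[:i] + longer[i + 1:] == shorter for i in range(len(longer)))

def classify(word, vocabulary):
    """Label a candidate as 'exact', 'near' (one glyph away) or 'none'."""
    word = tuple(word)
    if word in vocabulary:
        return "exact"
    if any(within_one_glyph(word, v) for v in vocabulary):
        return "near"
    return "none"

# Toy placeholder vocabulary; a real run would load every distinct "word"
# from a chosen transliteration of the manuscript.
vocabulary = {("4o", "k", "a", "m"), ("o", "h", "9")}
print(classify(["o", "h", "9"], vocabulary))  # exact
print(classify(["o", "k", "9"], vocabulary))  # near
```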
In practice, we have no reason to believe that the source documents included La Divina Commedia, or even that they were in medieval Italian. My working assumption is that they were in languages that were spoken and written in Europe in the fifteenth century.
However, this exercise demonstrates that a one-to-one mapping of letters to glyphs, coupled with some kind of re-ordering process, can replicate real Voynich "words".
I think that the way forward is to try many candidate languages, and many alternative transliterations of the Voynich manuscript, with all the permutations that this will involve. This is not a manual task; it will necessarily require a massive computational effort. It was just such an approach that cracked the Zodiac Killer's 340-character cipher in 2020.