What Did Proto-Indo-European Sound Like?—And How Can We Know?

[This post was originally published in October 2013]

Archaeology magazine recently published an article entitled “Telling Tales in Proto-Indo-European”, which included a recording of a short text in Proto-Indo-European (PIE), the ancestor of modern languages in Europe and parts of Asia. This recording, made by Dr. Andrew Byrd of the University of Kentucky, a student of UCLA’s Indo-European expert H. Craig Melchert, drew considerable attention in the media (see here, here, and here). The text read by Byrd is a short parable called “The Sheep and the Horses”, which was originally written by the German philologist August Schleicher in 1868 as a way to experiment with the reconstructed PIE vocabulary. Here is the English translation of the story (which may sound familiar to people who watched the movie Prometheus):

A sheep that had no wool saw horses, one of them pulling a heavy wagon, one carrying a big load, and one carrying a man quickly. The sheep said to the horses: “My heart pains me, seeing a man driving horses.” The horses said: “Listen, sheep, our hearts pain us when we see this: a man, the master, makes the wool of the sheep into a warm garment for himself. And the sheep has no wool.” Having heard this, the sheep fled into the plain.

Note that the vocabulary used in the parable (particularly the words for ‘wool’, ‘horse’, ‘wagon’, and ‘carrying’) places PIE firmly in the period of the “secondary products revolution”, that is, not much earlier than 4000 BCE (or 6,000 years ago), contrary to the Anatolian theory recently advocated by Russell Gray, Quentin Atkinson and their co-authors (Bouckaert et al. 2012).

Although a number of media reports presented Byrd’s recording as the first attempt at reconstructing what PIE sounded like, debates about the nature of the PIE sound system have been raging since the mid-1800s. As Byrd himself notes, his own recent recording is “a very educated approximation” based on the Glottalic theory of PIE, accepted by some but not all Indo-European scholars. He further admits, “since there is a considerable disagreement among scholars about PIE, no one version can be considered definitive”; the only way to know for sure what PIE sounded like, he quips, is to have a time machine.

The PIE recording produced for Archaeology magazine is by no means the only attempt to reconstruct the pronunciation of historical dialects or even of long-dead languages. But how can we know how anything was pronounced in periods before sound recording? Unfortunately, there are only three reliable ways to know how a language (any language!) sounds: to listen to a native speaker pronouncing it; to listen to a recording of a native speaker making the same sounds; or to know the position of the tongue and mouth used when speaking the language. Obviously, when it comes to languages of the past, the first two methods cannot be applied and the third is rarely applicable.

Linguists, however, have developed several strategies for triangulating historical pronunciation, although it ultimately remains something of a guessing game. The first source of evidence is contemporary descriptions, however imprecise, vague, and non-technical, of how a given sound or word was pronounced. One example is a passage in Quintilian (Education of an Orator 1.7.7), where he says that the word urbs ‘city’ is pronounced with a [p] before s, [urps]. This is a case of voicing assimilation: the consonant written “b” is pronounced as voiceless [p] because it assimilates to the following voiceless consonant /s/.
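The voicing assimilation that Quintilian describes can be written down as a simple rule. The following is only a toy sketch in Python, not a model of Latin phonology: it treats letters as if they were sounds, and the small tables of voiced stops and voiceless consonants are simplifying assumptions made for the illustration.

```python
# Toy tables (assumptions for this sketch, not a full Latin inventory).
DEVOICE = {"b": "p", "d": "t", "g": "k"}   # voiced stop -> voiceless counterpart
VOICELESS = set("ptksf")                    # consonants that trigger devoicing


def assimilate_voicing(segments: str) -> str:
    """Devoice a voiced stop that stands immediately before a voiceless consonant."""
    out = list(segments)
    # Work from right to left so that a whole cluster assimilates to its last member.
    for i in range(len(out) - 2, -1, -1):
        if out[i] in DEVOICE and out[i + 1] in VOICELESS:
            out[i] = DEVOICE[out[i]]
    return "".join(out)


print(assimilate_voicing("urbs"))   # urps
print(assimilate_voicing("urbem"))  # urbem (b stands before a vowel, so nothing changes)
```

The second call shows why the spelling “b” makes sense in the first place: in other forms of the same word, such as urbem, the consonant is not followed by a voiceless sound and keeps its voicing.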

Another example of a contemporary description of Latin pronunciation is a poem by Catullus that pokes fun at a man who, in his eagerness to speak correctly and not to omit his “h”s, overshoots the mark and adds “h”s where they do not belong (the technical term for this is “hypercorrection”). One example of this unnecessary “h” is the word for ‘ambush’, which the man reportedly pronounced as [hinsidias] instead of [insidias] (the English insidious derives from the same root). From this we can learn that by the Classical Latin period, in the middle of the 1st century BCE, the h had already been lost in ordinary speech, leaving speakers unsure which words were supposed to have it. Since phonological changes are typically generalized over whole classes of sounds, it is most probable that the h-sound was lost in word-initial position across the vocabulary. This conclusion is further supported by evidence from other sources, discussed below.

Note that the “h”-troubles continue in present-day English, chiefly because of the large number of h-words borrowed from Latin and its descendants (especially French). English words starting with “h” that were borrowed from French have a silent initial “h”, as in hour, honor, honest, and heir. In contrast, the initial “h” in native Germanic words is pronounced in almost all English dialects (Cockney, North Devon, and Rossendale being rare exceptions), as is the case with happy and hot. However, this distinction is not as clear-cut as it may appear: several English words that were borrowed from French still retain the “h” sound because of the influence of spelling (the linguistic term for this phenomenon is “spelling pronunciation”). Some examples include hostel, hotel, and haste. The “h”-situation has proved so confusing that some words are pronounced differently by different speakers, with or without [h]: herb, human, humor, and humble. The pronunciation of “herb” in particular helps differentiate American English, in which the “h” is not pronounced, from British English, in which it is pronounced.

The “weak” nature of “h” in Latin and its tendency to drop out are further confirmed by another source of evidence about historical pronunciation: inscriptions. An inscription is any text that is not transmitted by manuscript, which would have been subject to repeated copying and corrections, as well as occasional overwriting of the original text. Inscriptions include tombstones, milestones, laws, decrees, dedications and the like, engraved on a durable material such as stone that allows them to be preserved intact. Another feature of inscriptions that makes them particularly useful for determining historical pronunciation is that they are often written by people who are not professional scribes. As a result, inscriptions often contain spelling errors that indicate pronunciation at the time, although such pronunciation may be informal or uneducated. For instance, several Latin inscriptions contain the word onorem, spelled without the initial h- found in Classical Latin spelling (honorem), indicating that this sound was not pronounced, thus confirming our earlier conclusion based on Catullus’s poem.
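The reasoning from misspellings can be made explicit with a small check that compares an attested spelling to the standard one. This is a minimal sketch in Python; the function name `dropped_initial_h` is made up for the illustration, and the second pair in the list is a control added here, not an attested inscription.

```python
def dropped_initial_h(attested: str, standard: str) -> bool:
    """True if the attested spelling equals the standard one minus its initial h."""
    return standard.startswith("h") and attested == standard[1:]


# (attested spelling, standard Classical spelling)
pairs = [
    ("onorem", "honorem"),   # the inscriptional spelling mentioned above
    ("honorem", "honorem"),  # control: standard spelling, nothing dropped
]

for attested, standard in pairs:
    print(attested, standard, dropped_initial_h(attested, standard))
```

A single match proves little on its own; it is the recurrence of such spellings across many independent inscriptions that makes them evidence for pronunciation.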

Yet another way to access historical pronunciation is borrowings into and from the language in question, along with renditions of personal and place names found in other languages. For example, to understand how names were pronounced in Latin, we can examine how they were rendered in Ancient Greek. Take the name Cicero: in Greek texts of the period it was always spelled with two kappas, a letter which we know to have always been pronounced [k], never [s] or [ts]. Thus, we can infer that in Latin Cicero was pronounced as [Kikero], with two [k]-sounds.

The confusion about the pronunciation of the letter “c” carried into Modern English as well, again because of French: it is pronounced [k] in carry and comic, but as [s] in mice and grace. (I am leaving aside the cases when “c” is followed by “h”, where it can be pronounced either [tsh] in cherry, or [sh] in chaperone, or even [k] in maraschino cherry.) In the Old English period, the letter “c” could be pronounced either as [k], as in cyssan ‘kiss’ and cneow ‘knee’, or as [tsh], as in cild ‘child’ and ceap ‘cheap’. In the latter set of words, it was later consistently replaced by the spelling “ch”. In the former set of words, the letter “c” was replaced by “k” by the Middle English period (roughly 1100-1450). Crucially, in the Old English period the letter “c” was never pronounced [s]. This use of the “c” came to England with Norman scribes, who—being trained to write in French—used the French spelling conventions even when writing English. Hence, the spellings mice and grace.

Borrowings are especially helpful in determining historical pronunciation when the same word was borrowed twice at different periods, which has happened to many French words that entered English twice. Take, for example, castle and château. Both words derive from the same French root, but they were acquired in different periods. The difference in pronunciation reflects that: the initial [k] in castle versus the [sh] in château illustrates a phonological change that happened in French in the period between the two borrowings. Another phonological change is the disappearance of the medial [s], still present in castle but absent in château. This process of s-deletion in French was accompanied by so-called compensatory lengthening, a process that made the preceding vowel /a/ longer (in French, this is encoded in spelling via a diacritic, the circumflex accent, over the letter “a”).

But borrowings and transliterations across languages can also be misleading in regard to historical pronunciation: for example, neither the Russian pronunciation [garvard] ‘Harvard’, nor the English pronunciation [‘moskou] for the Russian [mas’kva] is indicative of how the place name is pronounced in its original language.

Finally, one other strategy for figuring out pronunciation of the past that has proven useful to historical linguists is rhyming poetry. Of course, the usefulness of this technique is limited to languages and historical periods when rhyming poetry was popular enough to have left a substantial corpus of work. For example, Beowulf contains alliterative rather than rhyming poetry: the beginnings rather than the ends of words match. Still, rhyming poetry has been extremely helpful in figuring out how words were pronounced in the times of such great poets as Chaucer (late 14th century) and Shakespeare (late 16th-early 17th century). For example, Chaucer rhymed breeth and heeth, while our breath and heath do not rhyme. Based on Chaucer’s intended rhyme, we can infer that the two words were pronounced with the same vowel in his day. In a similar fashion, Shakespeare rhymed tongue and wrong, which indicates that one (or both) of these words has changed its pronunciation sometime in the past 400 years. As it turns out, the pronunciation of these words varies greatly from one English dialect to another, and in some British dialects the two words still rhyme. So following “with open eye” the rhyming poetry’s “melodye” (two words that also rhymed for Chaucer, as can be seen from the famous Prologue from The Canterbury Tales) can shed some light on historical pronunciation. And as linguist David Crystal and his son, actor Ben Crystal, have discovered, reconstructing the original Elizabethan pronunciation of Shakespeare’s plays, and performing them in the original accent, can help us determine the Bard’s intended meaning as well as add to our enjoyment of his plays and poems.
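The inference from rhymes follows a simple pattern: take pairs of words that a poet rhymed, and flag those whose present-day pronunciations no longer match. The sketch below, in Python, illustrates that pattern under stated assumptions. The `MODERN_RHYME` table is a rough, hand-written approximation of one standard British pronunciation, not a pronouncing dictionary, and the pair day/may is an illustrative control added here.

```python
# Rough word endings (stressed vowel onward) in one present-day British
# pronunciation. These transcriptions are assumptions for the illustration.
MODERN_RHYME = {
    "breath": "ɛθ",
    "heath": "iːθ",
    "tongue": "ʌŋ",
    "wrong": "ɒŋ",
    "day": "eɪ",
    "may": "eɪ",
}


def changed_pairs(poet_rhymes, modern):
    """Return the rhymed pairs whose modern endings no longer match."""
    return [(a, b) for a, b in poet_rhymes if modern[a] != modern[b]]


# Pairs rhymed by the poets (given in modern spelling), plus one control pair.
rhymes = [("breath", "heath"), ("tongue", "wrong"), ("day", "may")]

print(changed_pairs(rhymes, MODERN_RHYME))
```

A flagged pair shows only that something changed, not which word changed or in what direction; as the tongue/wrong example shows, the answer also depends on which dialect supplies the modern pronunciations.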

