Disentangling “The Tangled Roots of English”

Feb 28, 2015 by

[I am deeply grateful to Martin W. Lewis for the inspiring discussions of, and extensive collaboration on, the issues examined here.]

Several articles by historical linguists, geneticists, and archeologists have been published in recent weeks on the issue of Indo-European origins, and they have drawn renewed public attention to the topic through reports in the popular media (see here and here). Three of them deserve a special mention. The first is “The Indo-European Homeland from Linguistic and Archaeological Perspectives”, written by archeologist David W. Anthony and historical linguist Don Ringe and published in Annual Review of Linguistics. The second is “Ancestry-Constrained Phylogenetic Analysis Supports the Indo-European Steppe Hypothesis”, written by a team led by historical linguist Andrew Garrett; it is to be published in Language, the flagship journal of the Linguistic Society of America (the preprint is available online). The third is “Massive migration from the steppe is a source for Indo-European languages in Europe”, written by a large team of geneticists and archeologists (including David W. Anthony, one of the strongest advocates of the Steppe theory) and posted online on bioRxiv. Notably, all three articles support the Steppe theory, which links Proto-Indo-European (PIE), the ancestor of all Indo-European languages, to the pastoralist inhabitants of the Pontic Steppes in southern Russia some 5,500 years ago. This has not been a good month for the Steppe theory’s main competitor, the Anatolian theory, proposed by archeologist Colin Renfrew and most recently supported by Russell D. Gray, Quentin D. Atkinson and their colleagues (cf. Bouckaert et al. 2012). According to their view, PIE was spoken by Neolithic farmers in Anatolia (present-day Turkey) about 8,500 years ago.

As discussed in the forthcoming book by Martin W. Lewis and myself, The Indo-European Controversy: Facts and Fallacies in Historical Linguistics, historical linguists typically side with the Steppe theory (and, more generally, advocate a more recent and more northerly PIE homeland). But if you read the New York Times, particularly its publications by senior science journalist Nicholas Wade, you would not know that. Although journalists should be unbiased and evenhanded, as well as somewhat knowledgeable in the area they cover, Wade clearly picked a side in the Indo-European debate years ago and keeps providing coverage based on his own prejudices rather than facts. On August 23, 2012, his front-section article’s headline declared that “Family Tree of Languages Has Roots In Anatolia”, and the first sentence of the article itself proclaimed that “Biologists using tools developed for drawing evolutionary family trees … have solved a longstanding problem in archaeology: the origin of the Indo-European family of languages”. (Why the origin of a language family is dubbed a “problem in archeology” is as bewildering as the rest of the article.) Wade’s most recent piece, titled “The Tangled Roots of English” (hence the title of this post) and published in the NYT on February 23, 2015, states that “a surprisingly sudden resolution of this longstanding issue may be at hand”, in reference to the new pieces of evidence in support of the Steppe theory. Of course, there is nothing surprising or sudden about the Steppe theory, originally proposed by Marija Gimbutas in the 1950s. Only someone who is either completely ignorant of, or purposefully ignoring, the subsequent debate could be startled by the recent avalanche of evidence in favor of the Steppe theory.

Wade’s ignorance of even the most basic notions in historical linguistics and his favoritism are both evident in the latest article, where he pays only lip service to the advocates of the Steppe theory and rushes to support the Anatolian theory without either providing arguments in favor of it or challenging the mounting evidence against it. As one illustrative example of this, consider Wade’s treatment of the “wheel” argument, one of the strongest arguments against the Anatolian theory. Wade writes:

“Linguists objected that proto-Indo-European could not have fragmented so early because the wheel wasn’t invented 8,000 years ago, yet many Indo-European languages have related words for wheel that must be derived from a common parent. But Dr. Renfrew argued that, long after their dispersal, these languages could all have borrowed the word for wheel along with the invention itself.”

The subsequent paragraph of his article turns to an overview of Atkinson & Gray’s 2003 article in Nature; thus, Renfrew’s position is presented as unchallenged. However, Wade follows Renfrew in committing the fallacy of thinking that just because something might have happened, it indeed did happen. Words for wheels and wheeled vehicles could have been borrowed; more generally, words for various technological inventions—from ‘shoulder yoke’ to ‘iPad’—frequently are borrowed from one language into another, typically together with the spread of the invention itself. But the fact is that the words for ‘wheels’ in the Tocharian, Indo-Iranian, Greek, and Germanic branches of Indo-European are all inherited from the same ancestral root (cited as “kwekwlos” in Wade’s article), as shown in the image on the left, from Anthony & Ringe (2015: 204). If these words were borrowed, they would show some signs of “foreign sound signatures” from the source language rather than the sound changes that occurred within the target language itself. The detailed argument is presented in Ringe (2006), and I will not go into it here. Suffice it to say that Ringe’s argument is based on a close examination of the relevant linguistic forms, whose formation is unique, involving reduplication, a zero-grade root, and a thematic vowel. According to Ringe (2006: 4), “the probability that it could have been formed independently more than once is virtually nil”. Thus, the only sensible conclusion, which Wade omits from his presentation, is that the “wheeled” vocabulary is inherited from PIE, not borrowed by some of its much-later descendants from their “cousins” in other Indo-European branches. As I have concluded in my earlier post on the subject, “the ‘wheel’ vocabulary originated in PIE prior to its split into daughter languages, which thus must have happened some time after 4000 BCE” (i.e. the time when evidence of wheel use appears in the archeological record). 
In the 2012 article, Wade waves such difficulties away without rebuttal. In the most recent piece he does not mention them at all.

Besides ignoring any inconvenient facts, Wade gets “entangled” in the facts (and in their analysis) that he does discuss in the article, particularly with respect to the history of English. Here his statements are simply contradictory. To understand why this is the case, we need to take a closer look at Wade’s presentation of Chang et al.’s (2015) work. The main innovation of this paper and the key departure from the model used in Bouckaert et al. (2012) is that Chang et al. implement ancestry constraints which restrict the relationships between eight ancient and medieval languages and thirty-nine modern languages in their data set, where the later languages are well-known to be the descendants of the earlier languages. Thus, Chang et al. assume the following ancestor-descendant relationships: Vedic Sanskrit—Indo-Aryan languages, Ancient Greek—Modern Greek, Latin—Romance languages, Classical Armenian—Modern Armenian dialects, Old Irish—modern Irish and Scots Gaelic, Old West Norse—West Scandinavian (Faroese, Icelandic, and Norwegian), Old High German—modern High German varieties (German, Swiss German, and Luxembourgish), and Old English—modern English. As Chang et al. explain, a “logical possible alternative” to a model with ancestry constraints, such as theirs, is that the later languages are descended not from the putative ancestor, but from another language spoken at the same time; “for example, perhaps Modern Irish and Scots Gaelic are descended not from Old Irish, but from an undocumented variety that had already significantly diverged from it” (p. 205). Chang et al. further explain that this alternative is “the only interpretation of the phylogenetic trees given by Bouckaert and colleagues …, whose analyses include ancestral and descendant languages but do not constrain their relationship except that each ancestral language forms a clade with its descendants” (p. 206). 
In Bouckaert et al.’s tree, ancient and medieval languages are crucially not ancestral to the later languages, contrary to what most historical linguists believe (see Chang et al. for extensive references): Latin is not ancestral to Romance languages, Vedic Sanskrit is not ancestral to Indo-Aryan languages, Old Persian is not ancestral to modern Persian, and so on. By the same token, Old Norse is not ancestral to modern Icelandic and Faroese, let alone other Scandinavian languages. Likewise, Bouckaert et al.’s tree “shows Old Irish evolving for over 500 years after its common ancestor with the other Goidelic languages” (Chang et al. 2015: 206). As for English, Bouckaert et al.’s tree shows it to descend not from Old English but from another variety spoken at the same time.
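
For readers who think more easily in code, the difference between the two kinds of tree can be sketched with a toy example. This is only an illustration under my own assumptions: the node names, the tree shapes, and the helper is_ancestor are hypothetical and are not taken from the output of either paper. Each tree is stored as a mapping from a language to its parent.

```python
# Toy trees stored as child -> parent mappings.
# Node names and shapes are illustrative assumptions, not either paper's output.

# Unconstrained tree: Old English is a tip that merely shares an
# undocumented parent with the lineage leading to Modern English.
unconstrained = {
    "Old English": "undocumented variety",
    "Modern English": "undocumented variety",
    "undocumented variety": "West Germanic",
}

# Ancestry-constrained tree: Old English lies on the direct line
# from West Germanic to Modern English.
constrained = {
    "Modern English": "Old English",
    "Old English": "West Germanic",
}


def is_ancestor(tree, ancestor, descendant):
    """Return True if `ancestor` lies on the path from `descendant` to the root."""
    node = tree.get(descendant)
    while node is not None:
        if node == ancestor:
            return True
        node = tree.get(node)
    return False


print(is_ancestor(constrained, "Old English", "Modern English"))
print(is_ancestor(unconstrained, "Old English", "Modern English"))
```

In the constrained tree Old English is found on the path from Modern English to the root; in the unconstrained tree it is not, even though the two still form a clade under their shared parent.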

On this issue, as on the Indo-European problem as a whole, Wade sides with Bouckaert et al.; he writes: “But the case is not yet closed. … Dr. Garrett’s correction of the Bouckaert tree … may not be as conclusive as [it] seem[s]”. Furthermore, he cites Atkinson as saying: “The Garrett and Chang model is overzealous in forcing ancient languages to be directly ancestral – the data don’t support this”. It is not clear what data that might be, as the historical linguistic evidence for the ancestral relationships assumed by Garrett and his team is very solid. When it comes to modern English, some scholars have proposed that it descends not from Old English but from Old Norse—a theory that I have challenged in previous posts (see here, here, and here). Note, however, that Bouckaert et al.’s tree does not show that modern English is a descendant of Old Norse, or even of Frisian for that matter. Thus, under Bouckaert et al.’s scenario it must derive from some undocumented West Germanic variety spoken at the same time as Old English (5th-10th century).

Naturally, one wonders where this alleged ancestor (the “true Old English”, as it were!) must have been spoken. Wade’s explanation is that this “true Old English” was spoken at the same time and in the same place as what we normally label as “Old English”, which existed as a parallel written variant: “living languages are likely to be descended from a spoken language that diverged from the written version”, he writes. While it is generally true that people tend to write in a somewhat different way from how they speak, it does not mean that they always write and speak parallel yet distinct languages. On this issue, Wade quotes Paul Heggarty, a historical linguist at the Max Planck Institute for Evolutionary Anthropology, as saying that “written languages tend to be fossilized”. But Wade fails to understand the significance of that quote: it means that written languages are likely to be more representative of the spoken language of an earlier period, but not of a different language altogether. Thus, in English we write knight, as in Old English, but pronounce it [najt], as in (late) Middle English. Moreover, such fossilized elements are typically found in two areas of language, pronunciation and syntax, and neither is particularly relevant to Bouckaert et al.’s model. For example, the pronunciation of dog (with a rounded or unrounded vowel) is not important for them; what matters is the fact that it replaced the earlier hund as the “everyday equivalent” of the meaning ‘dog’. And since Bouckaert et al.’s entire approach is predicated on examining only vocabulary, the syntactic fossilization and stylization typical of ancient and medieval texts, especially of translations of religious texts, which often copied syntactic structures of the original, is also irrelevant.

To make matters worse for Wade, not only does he uncritically side with the view that ancestral relationships, well established in historical linguistics, do not hold, but he also contradicts his own conclusion elsewhere in the article. In the discussion of the “wheel argument”, Wade states that “hweohl in Old English [is] itself the ancestor of wheel in modern English”. Yet, under his favored scenario, namely that Old English is not the ancestor of modern English but rather its “great-aunt”, modern English could not possibly have inherited its word wheel (or any other word, for that matter) from Old English. Words are like hair color and not like a coin collection: you can only inherit them from direct ancestors, not from side branches of the family. Thus, Wade’s article is not internally consistent, in addition to being overly dramatic, highly biased, and rather lacking in understanding of the subject matter at hand, which is not at all what one would expect from a leading science journalist of a major news outlet. In following posts, I will look at Wade’s other articles on language-related issues, and a bigger picture of peddling misinformation and perpetuating ignorance will emerge.



Anthony, David W. and Don Ringe (2015) The Indo-European Homeland from Linguistic and Archaeological Perspectives. Annual Review of Linguistics 1: 199-219.

Bouckaert, Remco; Philippe Lemey; Michael Dunn; Simon J. Greenhill; Alexander V. Alekseyenko; Alexei J. Drummond; Russell D. Gray; Marc A. Suchard; and Quentin D. Atkinson (2012) Mapping the Origins and Expansion of the Indo-European Language Family. Science 337: 957-960.

Chang, Will; Chundra Cathcart; David Hall; and Andrew Garrett (2015) Ancestry-Constrained Phylogenetic Analysis Supports the Indo-European Steppe Hypothesis. To appear in: Language.

Haak, Wolfgang et al. (2015) Massive migration from the steppe is a source for Indo-European languages in Europe. bioRxiv preprint.

Ringe, Don (2006) Proto-Indo-European wheeled vehicle terminology. Unpublished Ms., University of Pennsylvania.


