On the Basques, Their Genes, Their Language, and What They Mean for the Indo-European Debate

Sep 18, 2015

On September 7, 2015, Michael Balter published an article in Science Magazine discussing recent research into the genetic heritage and language of the Basque people, particularly the study by a team of geneticists led by Mattias Jakobsson of Uppsala University, Sweden (published in PNAS Online in the summer of 2015; referred to below as Günther et al.). As in my earlier post on the Slavs and their languages, I will not discuss methodological aspects of the Günther et al. article, but only the authors’ conclusions and what they mean for the Indo-European controversy, namely the debate on where and when the ancestral language of the Indo-European family was spoken. (For a more detailed discussion of the controversy, the reader is referred to a recent book by Martin Lewis and myself, The Indo-European Controversy: Facts and Fallacies in Historical Linguistics, and to my earlier posts on this blog.) But before proceeding with the issue of Basque genes and language, let me make some clarifying remarks about the characterization of the Basque language in Balter’s Science Magazine piece.


First of all, in the title and throughout the article Balter describes Basque as an “unusual” and even “unique” language. While it is not entirely clear what he means by that, from the context it appears that what makes Basque “unique” for him is that it does not belong to any larger language family. Basque is indeed an isolate, that is, the only member of its language family, with no known surviving relatives. It has been hypothesized that the ancestor of Basque was a member of a bushier Vasconic family that included a number of other languages spoken throughout Atlantic Europe (see Wikipedia map reproduced on the left). Twists of fate, and not any inherent peculiarity, made Basque the only surviving language among its Vasconic brethren. But Basque is not unique in being an isolate. There are hundreds of such languages around the world, including Burushaski (northern Pakistan), Kusunda (Nepal), Ainu (northern Japan), Ket (Western Siberia), Nivkh (Russian Far East), Waorani (Ecuador), Kutenai (British Columbia, Canada), Shasta (Northern California), Massep (Eastern Papua, Indonesia), and many others.

As discussed in my earlier posts on Burushaski and Kusunda, being an isolate does not necessarily make a language “weird”, “unusual”, or “unique” in the typological sense. Basque is no exception: while some of its grammatical properties may seem odd from the Anglophone perspective, most of its features are fairly unremarkable in a broader typological perspective. For example, Basque consonant and vowel inventories, though somewhat smaller than in English, are characterized by WALS as “average”. The lack of grammatical gender, even in 3rd person singular pronouns, might surprise an English speaker (English distinguishes he, she, it, but has no grammatical gender for nouns), but many languages (including Finnish, Turkish, Hungarian, Malagasy, and Ket), in fact 67% of languages in the WALS sample, do not make this distinction. Word order in Basque may also seem peculiar from the English perspective. For example, in Basque adjectives are placed after the noun (in English, adjectives normally precede nouns, as in big houses). But the same noun-adjective order is also found in Hebrew, Irish, Sardinian, Abkhaz, Malagasy, Maori, and in 64% of languages in the WALS sample. The order of verbs and their arguments in Basque clauses is also unlike that of English: Subject-Object-Verb rather than Subject-Verb-Object. But approximately 40% of the world’s languages use SOV as the basic order, including Japanese, Turkish, Georgian, Chechen, Lezgin, and Hindi. To ask yes/no questions, Basque uses special question particles, unlike English, which reverses the order of the subject and the auxiliary verb (e.g. Has John left?). But question particles are also found in a broad range of languages (61% in the WALS sample), including Arabic, Japanese, Russian, Turkish, Yiddish, and Zulu. The list goes on. Thus, Basque is hardly “unique” as far as its linguistic profile is concerned.

Equally imprecise is Balter’s characterization of Basque as a “relic language”, at least under the dictionary definition of relic as “an object surviving from an earlier time, especially one of historical or sentimental interest”. As much as the Basque language is “of historical interest” (as we shall see below), it did not survive from the prehistoric era unchanged. Rather, like any other human language, Basque keeps changing. It is through those changes, which proceed differently in different areas, that dialects of Basque have emerged (see this Wikipedia map). The history of the Basque language is discussed in detail in Trask (1996). The difference between Basque, which is often seen as having “survived from the prehistoric era”, and English, which is virtually never seen in the same way (the title of a BBC article, “English language ‘originated in Turkey’”, is a rare exception), is that we attach the same label to the Basque of today and its ancestor of several millennia ago, but different labels to English and its prehistoric ancestor (i.e. Proto-Indo-European). However, the actual relationship between today’s language and its prehistoric ancestor is pretty much the same in both cases: the great-great…-grandfather of today’s Basque has the same relationship to the modern language as Proto-Indo-European has to English. Our traditional labels stem from convention more than fact.

One last warning concerns a potential confusion between the geopolitical unit of the Basque Country, the ethnic group of the Basque people, and the Basque language. Ethnic Basques also live outside the Basque Country, and not all ethnic Basques speak the Basque language (in fact, only a minority of them do). However, unlike some other languages that have been adopted outside their ethnic group (e.g. English or Russian), Basque is rarely spoken as a mother tongue by non-Basques.

With those clarifications in mind, let’s now turn to the Günther et al. article. According to the authors, they report the first ever “genome-wide sequence data from eight individuals associated with archaeological remains from farming cultures in the El Portalón cave (Atapuerca, Spain)”. These pre‑historic individuals “emerged from the same group of people as other Early European farmers”. The advancing agriculturalists mixed with, and eventually acculturated, local hunter-gatherers. Besides showing how agriculture must have spread through southwestern Europe, which Günther et al. argue was mostly through migration rather than cultural transmission, the genome data from the El Portalón skeletons sheds new light on the origins of the Basque people: because “the El Portalón individuals showed the greatest genetic affinity to Basques”, Günther et al. conclude “that Basques and their language may be linked with the spread of agriculture across Europe”.

In other words, three population waves can be distinguished in the pre-historic peopling of Europe: Paleolithic hunter-gatherers, Near-Eastern agriculturalists, and Steppe pastoralists. (Some scholars deny the existence/importance of the third wave, but recent genetic evidence, discussed in my earlier post, strongly supports migration from the steppes.) Each new wave mixed with, acculturated, and in some cases subsumed the pre-existing population. While this broad-strokes picture is largely agreed on, the issue of which contemporary groups show how much genetic and/or linguistic connection to which pre‑historic population is a more controversial one. Thus, Basques have been commonly assumed to be descendants of the first population wave, the hunter-gatherers. Gradually pushed into the mountainous “refuge zone” in the Pyrenees, they maintained their genetic uniqueness (for instance, earlier genetic studies found them to have “a higher-than-normal frequency of Rh‑negative blood types”, as pointed out by Balter), as well as their language. Or so the story went. If Günther et al. are correct, Basques are descendants not of the hunter-gatherers but rather of the agriculturalists who spread through southern Europe and into Iberia, ultimately from the Near East.


This conclusion has important consequences for the Indo-European debate. If the distinctiveness of the Basques is a result of their being descendants of an earlier wave, surrounded by a sea of advancing Indo-European-speaking groups (primarily Celtic- and later Latin-speaking), and that earlier wave was the farming population, it follows that the advancing Indo-Europeans must be the third population wave, the steppe pastoralists (who eventually adopted agriculture, as more suitable to the geographical conditions of their new habitat). In other words, the finding that links Basques to agriculturalists rather than hunter-gatherers provides a strong argument in favor of the Steppe theory of Indo-European origins (as schematized on the left). According to the Anatolian alternative, the original Indo-Europeans were the Near Eastern agriculturalists, who later spread into Europe. For this to be compatible with the Günther et al. findings, we need to assume that the Basques and the Indo-Europeans were two very different waves of agriculturalists that presumably came from different places and did not mix much. There is little evidence, as far as I am aware, to support such a scenario.

But as mentioned above, we should be careful about distinguishing “peoples” and their languages. As Balter points out, we “cannot entirely rule out the possibility that Basque still has its origins in a hunter-gatherer language that was retained and carried along as farming spread throughout Iberia”. Is this possibility, however remote, a way out for the advocates of the Anatolian theory? I remain skeptical about this scenario, however, as it would involve hunter-gatherers contributing the Basque (or more broadly Vasconic) language, the agriculturalists contributing the distinctive Basque genetic make-up and the Indo-European language, with the steppe pastoralists bringing in the characteristic “Indo-European” DNA signature but making no major impact on the language. This scenario seems quite outlandish to me.

In fact, the possibility that the Basque language descends from the hunter-gatherers’ tongue while the Basques’ genetic heritage comes mostly from the agriculturalists seems very remote to me. In general, it is not impossible for a group to show a linguistic linkage to one population and a genetic linkage (primarily) to another population. For example, Hungarians are genetically close to other (Central) Europeans, whereas their language is closely related to those of the Khanty and Mansi, two groups of (until recently) reindeer pastoralists inhabiting Western Siberia. But when a group of hunter-gatherers encounters a group of agriculturalists, the flow of genes and/or language typically goes from the agriculturalists to the hunter-gatherers rather than the other way around. For instance, the interaction of farming Bantu peoples with the Hadza, a hunter-gatherer group in north-central Tanzania, resulted in the Hadza exhibiting a significant genetic admixture from the Bantu peoples. (The Hadza kept their language, which used to be classified as a member of the Khoisan family, but is now thought to be an isolate; cf. König 2008.) Similarly, the Bantu peoples had a linguistic influence on the Pygmies, whose original languages have been completely replaced by Bantu ones, though they retained their peculiar genetic profile. Neither the Hadza nor the Pygmies had a significant genetic or linguistic impact on the Bantus. In fact, I cannot think of any known case where “farmers” overran hunter-gatherers but ended up speaking the hunter-gatherers’ language. (If the readers know of such examples, I welcome comments in the Disqus section below.)

Thus, a reasonably coherent picture emerges as to the peopling of Iberia (and of Europe, more generally), with the Günther et al. study contributing an important piece to the puzzle: if correct, their conclusions weigh heavily in support of the Steppe theory of Indo-European origins and against the Anatolian alternative, which could be maintained only by making implausible assumptions not supported by the genetic, linguistic, or archeological record.



Günther, Torsten et al. (2015) Ancient genomes link early farmers from Atapuerca in Spain to modern-day Basques. PNAS Online.

König, Christa (2008) Khoisan Languages. Language and Linguistics Compass 2(5): 996–1012.

Trask, R.L. (1996) History of Basque. New York/London: Routledge.



