What is Phonemic Diversity? —And Does It Prove the Out-of-Africa Theory?

May 24, 2014 by


The article by Bouckaert et al. “Mapping the Origins and Expansion of the Indo-European Language Family” in Science is not the first foray of (some of) the authors into the realm of historical linguistics and language evolution. In an earlier article “Phonemic Diversity Supports a Serial Founder Effect Model of Language Expansion from Africa”, also published by Science and lavishly praised by Nicholas Wade of the New York Times, Quentin Atkinson—this time the sole author—claims that by applying mathematical methods used in genetics to linguistic data from 504 living languages around the world, one can trace the origin of human language to West Africa (see map on the left). This result is intriguing, especially in light of the fact that most researchers place the origin of human language in East Africa. Yet a number of responses published in Science by linguists, cognitive scientists, and statisticians, including a Technical Comment I co-authored with Rory van Tuyl, identify serious methodological and substantive flaws in Atkinson’s research. Here, I focus on the errors apparent merely in the abstract of the article, several of which show a lack of understanding of even the most basic linguistic concepts, taught in introductory classes.

Here is the abstract of Atkinson’s article:

“Human genetic and phenotypic diversity declines with distance from Africa, as predicted by a serial founder effect in which successive population bottlenecks during range expansion progressively reduce diversity, underpinning support for an African origin of modern humans. Recent work suggests that a similar founder effect may operate on human culture and language. Here I show that the number of phonemes used in a global sample of 504 languages is also clinal and fits a serial founder–effect model of expansion from an inferred origin in Africa. This result, which is not explained by more recent demographic history, local language diversity, or statistical non-independence within language families, points to parallel mechanisms shaping genetic and linguistic diversity and supports an African origin of modern human languages.”


The decline in genetic diversity as one moves farther from the putative origin of modern humans in Africa is well documented and easily explainable in terms of successive population bottlenecks: as only a subset of the original population survives such a bottleneck, the amount of genetic variation in the resulting population decreases. According to Atkinson, this pattern is observable, and for the same reasons, in regard to linguistic diversity. Here, the first conceptual issue arises: the term “linguistic diversity” is typically used in linguistics to signify the number of languages/dialects in a given area, country, continent, or population group. In other words, “linguistic diversity” refers to the number of languages, not to any properties internal to these languages. It is thus parallel to the concept of “biodiversity” in life sciences, not to “genetic diversity”. It has been observed that the spatial distributions of linguistic diversity and biodiversity largely overlap, perhaps indicating that similar mechanisms underpin both types of diversity. I concur with Gorenflo et al. (2012) that “although different processes may have given rise to the diversification of languages, cultures, and species in different areas, similar forces currently appear to be driving biological extinctions and cultural/linguistic homogenization”.


Maps of linguistic diversity, such as the one reproduced on the left, mark each language by a dot; the more dots cluster in any given area, the higher the degree of its linguistic diversity. As can be seen from this and similar maps, linguistic diversity does not decline as one moves away from West Africa. While it is true that West Africa itself (especially the area around the border of Nigeria and Cameroon) is highly linguistically diverse, other areas of comparable if not higher linguistic diversity can be found in the Caucasus, Nepal, and especially in Papua New Guinea, arguably the most linguistically diverse place on Earth. Particularly damaging to Atkinson’s claim is the area of pronounced linguistic diversity in Mesoamerica, since this area is quite remote in terms of human migration out of Africa.

However, from the rest of the abstract and the body of the article, it becomes clear that Atkinson used the term “linguistic diversity” to mean something quite different, essentially “phonemic diversity”. Atkinson purports to “show that the number of phonemes used in … 504 languages is … clinal” (italics mine). However, Atkinson demonstrates nothing of the sort, as he does not actually count the phonemes used in any given language. Instead of adding like to like, he adds “apples and oranges”… and tomatoes! A proper calculation of the number of phonemes in any given language would add the number of consonant phonemes to the number of vowel phonemes. Instead, Atkinson adds together consonant phonemes, vowel qualities (typically defined by such features as height, backness, and roundedness), and tones (suprasegmental features). An equivalent of this calculation in physics could be the sum of certain molecules, atoms, and electrons: a meaningless mélange of disparate entities added together. The number of vowel qualities in a given language comprises only a subset of vowel phonemes in that language, as a language may employ, for example, a binary length as a meaningful (i.e. phonemic) distinction, which effectively doubles the number of vowel phonemes. For instance, Latin had five vowel qualities (i, e, a, o, and u) and a two-way length distinction, which gives a total of ten vowel phonemes rather than the five listed by Atkinson, who counts only the vowel qualities. Similarly, Finnish has eight vowel qualities (i, y, e, ø, æ, a, o, and u) and a two-way phonemic length distinction (e.g. il ‘day’ vs. i:l ‘work’, tuleen ‘into fire’ vs. tuuleen ‘into wind’); this adds up to a total of 16 vowel phonemes rather than eight. Tones, like length, are not themselves phonemes, but are rather superimposed over (vowel) phonemes, and hence are known as suprasegmental features.
A two-way tone distinction in a language with five vowel qualities would result in ten vowel phonemes, rather than seven phonemes à la Atkinson (i.e. five vowel phonemes + two tones). Similarly, a three-way tone distinction in a language with the same five vowel qualities would result in 15 vowel phonemes, not eight; and a five-way tone system in a five-vowel-qualities language would produce 25 vowel phonemes rather than ten.
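The difference between the two ways of counting can be made concrete with a few lines of Python. This is only an illustrative sketch: the function names `vowel_phonemes` and `additive_tally` are invented here, and the multiplicative count assumes, as in the idealized examples above, that every vowel quality combines with every length and tone value.

```python
def vowel_phonemes(qualities: int, length_levels: int = 1, tone_levels: int = 1) -> int:
    """Multiplicative count: each quality combines with each length and tone value.

    Assumes a fully symmetric system (an idealization used in the text's examples).
    """
    return qualities * length_levels * tone_levels


def additive_tally(qualities: int, tone_levels: int = 0) -> int:
    """Additive tally criticized in the text: qualities plus tones, length ignored."""
    return qualities + tone_levels


# (label, qualities, length levels, tone levels); figures taken from the text above
examples = [
    ("Latin (two-way length)", 5, 2, 0),
    ("Finnish (two-way length)", 8, 2, 0),
    ("5 qualities, 2 tones", 5, 1, 2),
    ("5 qualities, 3 tones", 5, 1, 3),
    ("5 qualities, 5 tones", 5, 1, 5),
]

for label, qualities, lengths, tones in examples:
    # A language without tone has a single (trivial) tone level for multiplication
    combined = vowel_phonemes(qualities, lengths, max(tones, 1))
    added = additive_tally(qualities, tones)
    print(f"{label}: {combined} vowel phonemes vs. additive tally of {added}")
```

The gap between the two figures grows with every additional contrast, which is why the additive tally cannot stand in for a phoneme count.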


Would Atkinson’s clinal pattern hold if the true number of phonemes is calculated for each language, adding up consonant and vowel phonemes (the latter number taking into account meaningful length and tone distinctions, where applicable)? The languages that best fit the pattern that Atkinson supposedly found include the click languages of Africa (some of which have more than 100 phonemes) and Hawaiian, toward the far end of the human migration route out of Africa, which has only 13. (English is roughly in the middle, with about 45 phonemes, depending on the dialect.)


However, a quick examination of the WALS map of consonant inventory size, reproduced on the left, reveals that languages with rich (consonant) phoneme inventories—marked by dark red dots—include not only African click languages like !Xóõ and Ju|’hoan, but also non-click languages of the Caucasus (e.g., Lezgin and Kabardian), some Papuan languages, and even some languages in South America, at the farthest end of the human migration route out of Africa (e.g., Jaqaru and Araona), in direct contradiction to Atkinson. Also, quite a few languages with very small (consonant) phoneme inventories are located in western Africa, as revealed by the dark blue dots on this map.



A study conducted by Keith Hunley, Claire Bowern, and Meghan Healy (HB&H) and published in February 2012 in the Proceedings of the Royal Society also contradicts Atkinson’s findings. Unlike Atkinson, they used full consonant and vowel inventory figures from 725 languages, which together contain 908 distinct phonemes. HB&H show that there is no negative correlation between the number of phonemes and the distance from the putative area of origin in West Africa. Moreover, they test three other predictions of Atkinson’s Serial Founder Effects (SFE) theory and show that they hold for genetic variation but not for phonemic variation. The first prediction—based on the idea that founder effects along the out-of-Africa migration routes reduce variation—is that Africans will possess more unique alleles and phonemes than the indigenous peoples of other regions. This prediction is borne out for alleles but not for phonemes: though Africa has more private phonemes (i.e. phonemes unique to this region) than any other region (as expected), Oceania has a relative deficit of private phonemes, compared with private alleles, and the Americas have a relative excess. The second prediction is that following each founder event, the new daughter group will carry only a subset of the variation of its parental group, so that a negative correlation must exist between within-group variation and geographical distance from the African origin. Again, while this pattern holds for genetic variation, it fails in the case of phonemic variation, as the number of phonemes is highest on average in Eurasia, not in Africa.


The third prediction is that the pattern of among-group variation will be tree-like, and the tree will be rooted in Africa. This prediction too is borne out for genetic variation but not for phonemic variation: a midpoint-rooted phoneme tree, produced using a Bayesian approach, exhibits some regional clustering, but it contains considerably less geographical structure than the microsatellite neighbor-joining (NJ) tree (see image reproduced on the left). Thus, HB&H effectively disprove Atkinson’s SFE hypothesis, showing instead that phoneme inventories provide information about recent contacts between languages, but fail to illustrate more ancient evolutionary processes, in direct contradiction to Atkinson’s claim (in the above-cited abstract) that this pattern “is not explained by more recent demographic history”.


A more general issue, however, is why it is the number of phonemes that should be expected to exhibit a clinal pattern in the first place. Why not the number of basic color terms? Or the number of grammatical genders? Or any other feature quantitatively describing language? It appears from Atkinson’s writing that he takes phonemic variation to run parallel to genetic variation, thus revealing an egregious lack of understanding of what a phoneme is, and perhaps of what genetic variation is too. The term “genetic variation” is actually misleading, as the variation is not among genes as such, but among alleles, that is, alternative forms of those genes. Following a bottleneck, the surviving population will carry only a subset of the alleles (not of the genes) of its parental group. But phonemes (by definition, linguistic sounds that are used to discriminate meaning) are parallel to genes, not to alleles. Alternative forms of phonemes, which can be seen as parallel to alleles, are called allophones. For example, /d/ and /t/ are distinct phonemes in English, as they contrast in words like dent and tent, or write and ride. Both /d/ and /t/ have a range of allophones that are conditioned by location within the word as well as regional dialect and social class (see, for example, Labov 2001). For instance, /t/ in top is pronounced as an aspirated voiceless stop, whereas in stop it is unaspirated; in pot it is typically pronounced as an unreleased stop. Cockney speakers, as well as those of certain Scottish accents (e.g. in Edinburgh and Buckie), pronounce the intervocalic /t/ in better as a glottal stop, while most speakers of American English (as well as those from Belfast, New Zealand, Singapore, and the younger speakers from North Devon) pronounce it as an r‑like flap (this is also the way /t/ is pronounced in writer, which makes it sound exactly like rider for these speakers).
A bottleneck in the population of English speakers may eliminate, say, the glottal stop from the range of allophones of the phoneme /t/, thus reducing allophonic variation; however, the phonemic inventory of English would remain the same. In fact, “variation in allophones is found in all languages and is a major driver of language change. In contrast, the level of phonemic variation within a language is small” (HB&H, 2012, p. 6).
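The gene/allele analogy can be pictured as a simple data structure: each phoneme maps to a set of allophones, much as a gene maps to a set of alleles. The toy inventory below is a hypothetical simplification made up for illustration (the allophone labels are informal, and /d/ is given a single variant only to keep the example short); it is not drawn from any database.

```python
# Toy model: phoneme -> set of allophones (hypothetical, simplified labels)
inventory = {
    "/t/": {
        "aspirated [tʰ]",     # as in "top"
        "unaspirated [t]",    # as in "stop"
        "unreleased [t̚]",    # as in "pot"
        "glottal stop [ʔ]",   # Cockney "better"
        "flap [ɾ]",           # American English "better"
    },
    "/d/": {"[d]"},
}


def bottleneck(inv: dict, lost_allophones: set) -> dict:
    """Return a copy of the inventory with the given allophones no longer in use."""
    return {phoneme: allos - lost_allophones for phoneme, allos in inv.items()}


def phoneme_count(inv: dict) -> int:
    """A phoneme survives as long as at least one of its allophones remains."""
    return sum(1 for allos in inv.values() if allos)


def allophone_count(inv: dict) -> int:
    """Total number of allophonic variants across all phonemes."""
    return sum(len(allos) for allos in inv.values())


after = bottleneck(inventory, {"glottal stop [ʔ]"})

print("phonemes before/after:", phoneme_count(inventory), phoneme_count(after))
print("allophones before/after:", allophone_count(inventory), allophone_count(after))
```

In this sketch the bottleneck lowers the allophone total while the phoneme count stays the same, which is the distinction the founder-effect analogy overlooks.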

If one is to draw a parallel between sound change and genetic evolution, the SFE model may turn out to be applicable to allophonic variation: “a daughter population would contain a subset of the allophonic diversity found in the parent, and the daughter would then be subject to processes of allophonic change, drift and selection that lead to sound change. Crucially, such changes are largely neutral with respect to phoneme inventory size” (HB&H, 2012, p. 6). Unfortunately, this hypothesis cannot be tested currently, as no databases of the allophonic variation exist so far.

Moreover, there is no bias towards decreasing the size of phonemic inventory over time as human populations moved out of Africa, as phonemes may be added as well as eliminated. An example of the former process is the addition of the phoneme /v/ to English as it passed from the Old English stage to Middle English. In Old English, [v] was an allophone of the phoneme /f/ used in intervocalic position (i.e. between vowels); there were no minimal pairs like few/view, where /f/ and /v/ would contrast. However, the borrowing of numerous words from Norman French where [v] appeared in non-intervocalic position, such as virgin, veil, and veal, led to a reanalysis of /v/ as a separate phoneme.

Conversely, the voiceless labiovelar approximant /ʍ/, which contrasted phonemically with /w/, as in which/witch, has been lost in all but a few varieties of English; those that retain it include Hiberno-English, Scottish English, and some Southern American dialects.

Both increases and decreases in phonemic inventory size may characterize populations that migrate and those that stay. The pattern observed for recent migrations in historical times, however, is exactly the opposite of what Atkinson hypothesizes for prehistoric migrations: “émigré” languages tend to be more conservative than their “home country” counterparts. This is true of their lexicons (e.g. Québécois French retains rue barrée for a street closed off from traffic, which has been replaced in France by rue fermée), as well as grammars (e.g. Judeo-Spanish preserved the feminine gender for nouns in -or such as calor ‘heat’, color ‘color’, and favor ‘favor’, which were feminine in the Middle Ages but are now commonly masculine in Standard Spanish). The same is true of phonemic inventories. For example, Judeo-Spanish preserves the phonemic distinction between /ʃ/, /ʒ/ and /dʒ/, as in deshar ‘to leave’ (Modern Spanish dejar), hijo ‘son’, and gente ‘people’, respectively. This distinction existed in Castilian Spanish in the 15th and 16th centuries, but later disappeared from Modern Spanish, with all three sounds being replaced by /x/. Similarly, Yiddish preserved a phonemic contrast between word-initial /s/ and /z/; witness such minimal pairs as sok ‘syrup, sap’ (from Slavic) and zok ‘sock’ (from Germanic). Examples like these cannot be ignored, as they show directly that languages that split off and relocate to new areas often maintain phonemic contrasts that are lost in the language of the population that stays in place.


More generally, analyzing phoneme inventory size rather than composition ignores the fact that languages may have identical inventory sizes yet very little overlap. For example, Arabic and Georgian both happen to have 28 consonant phonemes; however, only 13 of them are found in both languages. The consonants of Arabic not found in Georgian include pharyngeal and pharyngealized consonants, whereas those found in Georgian but not in Arabic are aspirated stops and affricates, as well as ejective sounds. According to HB&H (2012, p. 6), in the evolution from PIE to Proto-Balto-Slavic, the consonant phoneme inventory shrank from 25 to 19 members, though only 15 of those consonants were present in PIE. In fact, languages often simultaneously lose and gain phonemes, which obscures any direct relationship between inventory size and language change. An excellent example of this process from the history of English is the Great Vowel Shift, which took place between 1350 and 1700. This shift resulted in the loss of long vowel phonemes like /e:/ and /ɛ:/, as in see and speak (which merged into the same phoneme /i:/), as well as in the acquisition of diphthongs such as /ej/ in name and day. However, simply stating that the 15 vowel phonemes of Middle English were reduced to 11 in Early Modern English misses all the complexities involved in this sound change.
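The figures quoted above allow a quick back-of-the-envelope check of how much a bare inventory size conceals. The sketch below uses only the counts given in the text; the helper names `jaccard_from_counts` and `turnover` are invented for this illustration, and the Jaccard index is simply one convenient overlap measure, not one used in the studies under discussion.

```python
def jaccard_from_counts(size_a: int, size_b: int, shared: int) -> float:
    """Jaccard overlap |A ∩ B| / |A ∪ B| computed from set sizes alone."""
    union = size_a + size_b - shared
    return shared / union


def turnover(old_size: int, new_size: int, retained: int) -> tuple:
    """Return (lost, gained, net change) between a parent and a daughter inventory."""
    lost = old_size - retained
    gained = new_size - retained
    return lost, gained, new_size - old_size


# Arabic vs. Georgian consonants: same size (28 each), only 13 shared
overlap = jaccard_from_counts(28, 28, 13)
print("Arabic/Georgian Jaccard overlap:", round(overlap, 3))

# PIE -> Proto-Balto-Slavic consonants: 25 -> 19, with 15 retained from PIE
lost, gained, net = turnover(25, 19, 15)
print(f"PIE to Proto-Balto-Slavic: lost {lost}, gained {gained}, net {net}")
```

Two inventories of identical size share less than a third of their combined consonants, and a net change of six phonemes turns out to reflect fourteen separate losses and gains.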

All in all, the Science article by Atkinson on phonemic diversity seems to be yet another example of shoddy work in which mathematical methods are applied in a simplistic fashion, without any understanding of the concepts and phenomena under consideration. Such works produce results that contradict well-known facts about the nature of human languages, as well as plain common sense.



Gorenflo, L.J.; Suzanne Romaine, Russell A. Mittermeier, & Kristen Walker-Painemilla (2012) “Co-occurrence of linguistic and biological diversity in biodiversity hotspots and high biodiversity wilderness areas”. PNAS online.

Hunley, Keith; Claire Bowern; & Meghan Healy (2012) “Rejection of a serial founder effects model of genetic and linguistic coevolution”. Proceedings of the Royal Society. pp. 1-9.

Labov, W. (2001) Principles of linguistic change: social factors. Oxford, UK: Blackwell.


