Why Are There Different Languages?—Response to Vyvyan Evans (Part 4, Conclusion)
[Many thanks to Martin W. Lewis and Johanna Nichols for insightful discussions of some of the issues considered in this post.]
As discussed in the previous post, Vyvyan Evans advocates an alternative to the so-called saltationist understanding of language evolution that follows from the Universal Grammar (UG) theory; according to Evans’s gradualist view, language emerged in an incremental process of improvement in cooperative behavior. However, as I pointed out at the end of the previous post, this gradualist theory does not explain why there are so many mutually incomprehensible variations on the theme of human language. If language “evolved for the purpose of communication”, as Evans indicates, why don’t we all have the same language that would allow us all to communicate with each other?
One popular way to explain linguistic diversity is to tie it to cultural variability—and perhaps even to variation in cognitive abilities. The proponents of this approach think that languages are different because they are expressions of cultures and ways of thinking that vary widely from place to place. There are two problems with this idea. First, it moves the issue of diversity to another level without solving it: if languages vary because they express varied cultures, why do cultures themselves differ so dramatically (in ways expressible by language diversity)? The second and more fundamental problem concerns linguistic patterns that cannot be taken as expressions of culture.
To be sure, languages have many words that reflect the local environment or customs: Norwegian gravlax ‘cured salmon’ and sludd ‘mix of rain and snow’ are unimaginable in the tropics, and it is hardly surprising that Norwegian words for ‘banana’ and ‘coconut’ are themselves “imports”, just as these fruits are. Even words for something so seemingly universal as numerical concepts can be culturally loaded. For example, in Russian the words pervoje ‘first’, vtoroje ‘second’, and tretje ‘third’ mean not only the order in a progression, like their English counterparts, but have also become associated with specific types of dishes based on customary meal patterns: the ‘first’ is necessarily a liquid, soupy dish, while the ‘second’ is some sort of “solid” food, and the ‘third’ is a sweet course. These culture-specific meanings can override the purely numerical ones: thus, one can skip the soup and start the meal with “the second” (vtoroje), or eat “the first” (pervoje) as a second course, after an appetizer or salad of some sort. In a multi-course dinner, “the third” (tretje) may come as a fifth or seventh course.
However, the key elements of language, most importantly grammar, do not correlate with environment or customs. Take, for instance, patterns of word order across languages. As Evans reminds us, languages “also differ over the word order used for subject, verb and object, with all possibilities being attested. English uses a fairly common pattern – subject (S) verb (V) object (O): The dog (S) bit (V) the postman (O). But other languages do things very differently.” For example, Japanese places the object before rather than after the verb. According to the World Atlas of Language Structures (WALS), 47% of the languages in their sample exhibit the OV pattern as in Japanese, and 46% of the languages exhibit the VO pattern as in English (the remaining languages do not have a dominant order). But what do languages in either group have in common in terms of environment or customs? The answer, as it turns out, is “nothing”. For example, languages that place their objects before verbs are found in diverse physical environments ranging from the steppes of Central Asia to the mountainous Andes of South America and from the temperate Basque Country to the tropical jungles of Papua New Guinea (see the WALS-based map on the left). Moreover, languages in neighboring, climatically identical locales may differ radically with respect to their linguistic patterns: for instance, in coastal Papua New Guinea, one finds languages with Subject-Object-Verb, Subject-Verb-Object, and Verb-Subject-Object orders. Patterns of lifestyle, social organization, and technological advancement are equally haphazardly distributed with respect to such linguistic typology. For example, languages with the Subject-Object-Verb order include Japanese, whose speakers live in a high-tech world, Evenki, spoken by reindeer pastoralists in Siberia, and Guugu Yimidhirr, a language of Australian aboriginal hunter-gatherers.
Rarer word order patterns, such as Verb-Subject-Object, are found in languages spoken by groups that share nothing in terms of environment or culture, such as salmon-fishing speakers of Halkomelem on the Pacific Coast of British Columbia and cow-herding speakers of Maasai in Eastern Africa. Many additional examples could be adduced to illustrate this argument. Such varying word orders not only carry no meaning in themselves but also fail to correlate with “culture”, however loosely defined. Why then do they exist?
Even more difficult to substantiate is the Whorfian thesis, named after Benjamin Lee Whorf, its most ardent proponent. (This thesis is also known as the Sapir-Whorf, or alternatively Whorf-Sapir, hypothesis, after Edward Sapir, who was Whorf’s mentor but a less vocal advocate of these ideas.) According to this theory, different peoples have different ways of thinking about the world that are reflected in, or perhaps even shaped by, their languages. Whorf’s views were based on his understanding of Hopi, a Native American language that he claimed to lack tense markers such as English -ed or will or words like afterward. Driven by his agenda to show that peoples commonly dismissed even by the educated folk of his day as “savages” are as mentally developed as Westerners are, Whorf claimed that this gap in the Hopi language was connected with the circular sense of time in Hopi cosmology, and that as a result the language was not worse, not poorer, and not simpler than languages with tense markers, just different.
Subsequent studies have shown that Whorf’s conception of the Hopi language as lacking temporal expressions is plain wrong, as it actually does have tense markers and words like later and already. Similarly, linguists have discredited other claims about language made by advocates of the Whorfian thesis. Yet, the idea that the specific language one speaks affects, or perhaps even determines, one’s ways of perceiving and thinking about the world remains widely popular in non-linguistic circles. In recent years, a number of studies by psychologists and sociologists have sought to revive the Sapir-Whorf thesis. In one study, Yale economist Keith Chen argued that whether or not a language has future tense affects how people perceive the future, which in turn translates into such socio-cultural outcomes as tendencies to save or spend money, or propensities for health-improving practices such as exercising and not smoking. Other studies examined whether a language that has grammatical gender based on biological sex (that is, whether it divides nouns into “masculine” and “feminine”) shapes speakers’ perception of objects denoted by these nouns as more “male” or “female” and influences how their societies treat women. In previous posts, I have argued that such studies are so problem-ridden that they cannot be taken seriously. As John McWhorter writes in The Language Hoax, despite the Whorfian attempts to link people’s cognition to their language, “all you get is false leads and just-so stories” (pp. 36-37).
McWhorter suggests instead that languages are “shambolically magnificent accretion[s] of random habits”. There is a great deal of truth to his words: many bewildering puzzles in any language are merely petrified relics of earlier regular patterns. One such puzzle particularly bewilders my students of Russian: why does the vowel in lev ‘lion’ disappear in l’va ‘of lion’, but the vowel in a similar-sounding les ‘forest’ does not, so that ‘of forest’ is lesa, not *l’sa? To solve this mystery, one has to look beyond modern Russian. Historically, the two vowels, which sound exactly the same now, developed from two different sounds: the vowel in ‘lion’ was pronounced like the i in the English pit, whereas the vowel in ‘forest’ used to be the same as in the English pat. Moreover, only the former vowel was subject to a rule that deleted it if it was followed by another syllable in the same word. Hence, when the ending -a (meaning, roughly, ‘of’) was added, the vowel in ‘lion’ dropped out, while the vowel in ‘forest’ remained unaffected. Over time, two independent sound changes converged, causing the two vowels to be pronounced exactly the same in standard Russian. (In some rural Russian dialects, however, the two vowels are still pronounced differently: ‘lion’ is pronounced [lejv] and ‘forest’ sounds as [ljes].) As spelling often lags behind pronunciation changes, for a long time Russians continued to spell the two vowels differently, even after the pronunciations merged. The spelling reform of 1918 finally abolished the letter that had been used for the vowel in ‘forest’ (called “yat”). Now the two vowels are pronounced the same and spelled the same, yet only one of them still drops out when an ending is added. A “shambolically magnificent accretion of random habits” indeed.
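The lev/les puzzle above can be restated as a small sketch in Python. This is only an illustration of the ordering of events described in the text, not a reconstruction of Old Russian phonology: the symbols `I` (the pit-like vowel) and `A` (the pat-like vowel), the stems `lIv` and `lAs`, and all function names are hypothetical placeholders, and the softening of the consonant in l’va is ignored.

```python
# Hypothetical symbols: "I" = the old pit-like vowel (deletable),
# "A" = the old pat-like vowel (stable), "a" = the ending meaning roughly 'of'.
VOWELS = set("IAa")

def delete_weak_vowel(word: str) -> str:
    """Drop "I" whenever another syllable (i.e. another vowel) follows it."""
    kept = []
    for i, ch in enumerate(word):
        followed_by_syllable = any(c in VOWELS for c in word[i + 1:])
        if ch == "I" and followed_by_syllable:
            continue  # the deletion rule applies only to "I"
        kept.append(ch)
    return "".join(kept)

def merge_vowels(word: str) -> str:
    """Later change: the two historical vowels come to sound the same ("e")."""
    return word.replace("I", "e").replace("A", "e")

def modern_form(stem: str, ending: str = "") -> str:
    """Apply the deletion rule first, then the merger, as in the account above."""
    return merge_vowels(delete_weak_vowel(stem + ending))

for stem in ("lIv", "lAs"):  # placeholder stems for 'lion' and 'forest'
    print(modern_form(stem), modern_form(stem, "a"))
```

Because deletion applies before the merger, the two stems end up with the same vowel in the bare form but behave differently once the ending is added, which is exactly the "petrified relic" effect.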
Various grammatical distinctions that some languages make and others do not, such as gender, source of evidence, or a five-way past tense distinction (e.g. in Yagua, a small tribal language spoken in northeastern Peru), are also seen by McWhorter as “random habits”. Because his main concern is why such grammatical complexities are retained over time rather than how they emerged in the first place, McWhorter can avoid the latter issue. According to him, languages “complicate as a natural result of millennia of habits developed by people using them quickly and unconsciously” (What Language Is, p. 59). In essence, he argues that languages stay complex because they can. To an outside (and crucially, adult!) observer, languages like Ket (spoken in Western Siberia) or Archi (spoken in the northeastern Caucasus) seem “ingrown” and “disheveled”. Take, for example, Ket: as described by McWhorter, Ket uses different prefixes to mark the subject on the verb, so di- in diksivɛs ‘I come’ and bɔ- in bɔɣatn ‘I go’ both mean ‘I’. This is not unlike -o in Spanish vengo ‘I come’, except that in Ket, ‘I’ is attached to the beginning rather than the end of the verb. Moreover, there is no “logic” to this division into the di-class and bɔ-class: if for some random reason you decide to learn Ket, you would have to memorize which verbs appear with di- and which ones with bɔ-. To make matters worse for a foreign learner, both di- and bɔ- change depending on what sounds they end up next to, so di- may drop the i, bɔ- may become ba-, and so on. In another seemingly random twist, some Ket verbs in both classes (again, you can only memorize which ones) can “take two pronoun prefixes meaning the exact same thing” (McWhorter, p. 58). Thus, “digdabatsaq means that I [not we!] go to the river and come right back a little later… digdaddaq … is how to say I go to the river and stay for the season… and … digdaksaq means that I go to the river and stay for some days or weeks” (p. 58). Go figure!
But since babies and toddlers have no problem picking up languages like Ket despite their complexity, their grammatical complications are perpetuated—unless adult learning intervenes. In the case of Ket, very few outsiders care to learn it. But when many grown-ups have to pick up a language, they tend to “shave off” such layers of grammatical complexity, argues McWhorter. The result is a language that has fewer grammatical distinctions and fewer exceptions to regular patterns—at least for a while, as over time the normal process of accretion kicks in again. To McWhorter, “languages are like bathtub rings. For whatever it’s worth, viewed under a microscope, bathtub rings are complex, teeming slices of biology. But that doesn’t mean they serve a purpose” (p. 59).
But while we understand how bathtub rings form, McWhorter remains rather vague as to why grammatical distinctions arise in the first place: how does the quick and unconscious use of language lead to the emergence of the distinction between masculine and feminine nouns, or between events witnessed directly and those inferred, or between events that happened hours, weeks, or years ago? Why do they emerge only in some languages but not in others? And if it is true that at least some parts of language are innate, as is claimed by the Universal Grammar theory, why is so much of the substance of language left open to chance? Why are there so many blanks in the “how-to” manual that is Universal Grammar? Why, for example, can languages “decide” whether to put objects before or after the verbs, or whether to drop meaningless “it” in “weather”-sentences? Why aren’t languages all the same? After all, if they were, communication—and learning foreign languages, and designing machine translation tools, and governing multilingual countries—would be so much simpler. Why, then, are languages so mind-bogglingly different?
Below, I sketch an outline of the answer, as proposed by Mark Baker (2003). To understand the significance of variation in language, we have to challenge the popular wisdom that language, as Evans puts it, “reflects human pro-social inclinations for inter-subjective communication” (The Language Myth, p. 3), or, as Mark Amidon once quipped, serves as “the means of getting an idea from my brain into yours without surgery”. Communication is not the only function of language. Crows, after all, have pro-social inclinations and extensively employ “inter-subjective communication”, but as sophisticated as their messaging may be, it does not come close to matching human language.
To probe for the answer to the “why” of linguistic diversity, let’s take as a concrete example two languages spoken in neighboring areas of the north-central Caucasus: Ossetian and Chechen. You may have heard of the two groups because of their roles in the ongoing geopolitical tensions in the region, yet their languages are largely unfamiliar to non-linguists outside the region. Sure enough, due to a centuries-long sojourn in the same region, the two languages swapped words and even sounds. (In particular, Ossetian, a relative newcomer to the area, picked up many words and sounds from its neighbors.) Yet, the grammars of the two languages differ radically, especially when it comes to marking subjects and objects. Ossetian, which happens to be a distant “cousin” of English, does exactly what English does with subjects and objects that are pronouns: subjects always appear in the same form, regardless of whether there is an object in the sentence, while object pronouns have a different form. Thus, the subject is the same form he in He kissed a girl and in He left, but the object is him, not he: The girl kissed him. In Ossetian, this same pattern is found: subjects are subjects, and objects are different, except that in Ossetian not just pronouns but nouns too change their form, adding a suffix -ɨ. (To pronounce this sound, try to say the “oo” of book, but spreading the lips as if for “ea” in beak.) Thus, ‘The boy kissed the girl’ is “boy girl-ɨ kissed” (verbs are typically placed at the ends of sentences in Ossetian, as in Japanese). To say ‘The girl kissed the boy’, swap the two nouns and put the -ɨ on ‘the boy’: “girl boy-ɨ kissed”. And ‘The boy left’ is just “boy left”, with no -ɨ. Neighboring Chechen, however, is entirely different on this score: here a special suffix -s is added to subjects rather than objects, and only if there is also an object in the same sentence.
Thus, ‘The boy kissed the girl’ is “boy-s girl kissed” (as in Ossetian, verbs typically come at the end in Chechen), and ‘The girl kissed the boy’ is “girl-s boy kissed”, but if the sentence lacks an object, as in ‘The boy left’, the subject no longer has the -s: “boy left”. A system like that in Chechen is called ergative alignment, and the more familiar pattern of Ossetian (and English) is known as the nominative-accusative alignment, after the nominative and accusative cases known from Latin grammar. The two different types of alignment are also reflected in what the verbs agree with and many other phenomena. The two languages, in short, have a very different “flavor” to them. But why?
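Before turning to that question, the two marking patterns just described can be summarized in a short Python sketch. It is a toy model that reproduces only the schematic glosses used above (“boy girl-ɨ kissed”, “boy-s girl kissed”); the function names are hypothetical, English words stand in for the actual Ossetian and Chechen vocabulary, and verb agreement and all other grammatical detail are left out.

```python
from typing import Optional

def accusative_clause(subject: str, verb: str, obj: Optional[str] = None) -> str:
    """Ossetian-style pattern: the subject is never marked; the object takes -ɨ."""
    if obj is None:
        return f"{subject} {verb}"        # intransitive: "boy left"
    return f"{subject} {obj}-ɨ {verb}"    # transitive: the object is marked

def ergative_clause(subject: str, verb: str, obj: Optional[str] = None) -> str:
    """Chechen-style pattern: the subject takes -s only if an object is present."""
    if obj is None:
        return f"{subject} {verb}"        # intransitive: the subject stays bare
    return f"{subject}-s {obj} {verb}"    # transitive: the subject is marked

for clause in (accusative_clause, ergative_clause):
    print(clause("boy", "kissed", "girl"))
    print(clause("girl", "kissed", "boy"))
    print(clause("boy", "left"))
```

The two functions differ only in which noun carries the suffix in a transitive sentence; the intransitive sentence comes out identical in both, which is why the contrast only becomes visible once an object is present.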
One could explain this contrast, as McWhorter does, through the “accretion of random habits”, but such accounts simply push the question further into the past. A more interesting answer emerges if we consider other factors at play here. Recent genetic studies have shown that despite the long-term co-existence of the Ossetians and the Chechens in the same general area, their gene pools remained rather distinct. This is particularly true in regard to Y-DNA, which is passed on along the male line: women seem to have intermarried more often across the ethnic divides, but men mostly stayed within the group. Yet, many elements of Chechen and Ossetian culture, including music, dances, costumes, cuisine, are strikingly similar. How, then, have these groups kept track of their ethnic affiliations? The answer is language, which serves as an obvious marker of who is “us” and who is “them”. This idea is developed by Mark Baker (2003), who likewise challenges the idea, all too often taken for granted, that “the evolutionary purpose of language is to provide a way of communicating complex propositional information to kin and collaborators”, comparing language to codes and cyphers. As Baker puts it: “Our language faculty could have the purpose of communicating complex propositional information to members of our group while concealing it from members of other groups” (p. 351).
If language serves to create and perpetuate ethnic groupings, what then is the evolutionary purpose of such ethnic divisions? Wouldn’t the world be a better and more peaceful place without such rifts? This has been precisely the thinking behind the creation and spread of Esperanto, intended to be the language of all humankind, even a path to lasting peace. Yet despite its relative success in becoming a means of interlinguistic communication and acquiring a community of first and second language speakers, Esperanto has never come close to becoming “everybody’s language” nor has it led to peace and harmony, as its creator Ludwik Zamenhof had hoped. Instead, scholars such as Mark Fettes propose viewing the Esperanto-speaking community as a quasi-ethnic minority in its own right (cf. Fettes 1996). Although Esperanto is reasonably popular in many parts of the world, especially in Europe, North America, and East Asia, and although its speakers share no ethnic, geopolitical, or historical background, most Esperantists do share a certain set of cultural and ideological views. A number of psychological and sociological studies reveal a typical Esperanto speaker to be “anti-conformist, well-educated, […] often declar[ing] contrarian or left-leaning values such as ‘internationalism’, ‘humanitarianism’, ‘green politics’, and ha[ving] sympathy for issues such as minority and regional language rights” (Gledhill 2014, p. 323). Thus, ironically, instead of breaking down inter-group barriers, Esperanto has produced a new division into “us” and “them”. So why does humanity hold on to these divides?
A detailed answer to this question goes beyond the scope of this post, but psychologists suggest that there are many benefits to our species’ “groupishness”. For example, it has been proposed that group identity gives one a chance of improving one’s self-image: “if I belong to such a smart/handsome/brave group, I must be smart/handsome/brave”. Similarly, Jonathan Haidt, the author of The Righteous Mind, wrote: “Studies of groupishness have generally found that groups increase in-group love far more than they increase out-group hostility”. Although humans find themselves as members of all sorts of groups, from clans to sports teams, genetically-bound groups play an important role in human social organization. Genes, however, are invisible, and if the ideas outlined here are on the right track, language may have been “designed” to vary in a constrained way in order to create an overt, easily recognizable marker of “our people”. Recent findings that languages match genes better than geography does offer some support to this idea that language helps demarcate gene pools.
Additional Sources:
Baker, Mark C. (2003) Linguistic differences and language design. TRENDS in Cognitive Sciences 7(8): 349-353.
Fettes, Mark (1996) The Esperanto Community: A Quasi-Ethnic Linguistic Minority? Language Problems & Language Planning 20(1): 53-59.
Gledhill, Christopher (2014) Phraseology as a Measure of Emergent Norm: The Case of Esperanto. In: José Carlos Herreras (ed.) Politiques linguistiques et langues autochtones d’enseignement dans l’Europe des vingt-sept. Valenciennes: Presses universitaires de Valenciennes. Pp. 317-348.