Peoples, languages and genes in the Caucasus: An Introduction

May 15, 2014


The Caucasus region, dominated by the imposing Great Caucasus mountain range and stretching between the Black Sea and the Caspian Sea, has long been known as one of the world’s most ethnically and linguistically diverse areas. According to the Roman historian Pliny, when the Romans came to the Caucasus, they needed 134 interpreters to deal with the jumble of languages they found. The 10th-century Arab geographer and historian al-Azizi referred to the area as the “mountain of languages”. Today, this relatively small area (about the size of New England) is home not only to over 100 languages, but also to four distinct language families that are indigenous and unique to the region: the Northwest Caucasian family, the Northeast Caucasian family, the Nakh family and the South Caucasian (or Kartvelian) family. In addition, several languages from families common elsewhere – Indo-European and Turkic – are spoken by various groups in the Caucasus region. The ethnic situation is as complex a mosaic as the linguistic one, because ethnicity correlates closely, though not perfectly, with language, as we shall see below (see map on the left).

The southern part of the region – Transcaucasus – consists of three former Soviet republics, now the independent countries of Georgia, Armenia and Azerbaijan. Georgia is home to ethno-linguistic groups speaking South Caucasian (Kartvelian) languages: Georgian (4 million speakers), Svan (15,000 speakers in the northern Georgian region of Svanetia; see picture on the left), and Mingrelian (500,000 speakers). (The fourth Kartvelian language, Laz, is spoken by 30,000 people in Turkey.) Most speakers of Kartvelian languages are Christian (Georgian Orthodox), but there are smaller groups of Kartvelian speakers in southern Georgia who are Muslim. In addition to its Kartvelian-speaking majority, Georgia has a number of other groups: Armenians (5.7%), Azeris (6.7%), Ossetians (0.9%), Russians (1.5%), Greeks (0.3%), Ukrainians (0.2%) and others (figures are from the 2002 census).

The other two majority languages in the Transcaucasus region belong to language families found outside the area: Armenian is an Indo-European language (it forms an isolated branch, not closely related to any other Indo-European language), and Azeri is a Turkic language, closely related to Turkish, as well as to Turkmen (spoken in Turkmenistan, Uzbekistan and Afghanistan) and Gagauz (a minority language in Moldova). There are also significant Armenian and Azeri minority groups in other countries south of the Caucasus (Turkey, Iran, etc.). Armenians are predominantly Christian (Armenian Orthodox), whereas Azeris are mostly Shi’a Muslims.


Although both Armenians and Azeris are linguistically related to populations outside the Caucasus (Indo-European and Turkic, respectively), genetic studies indicate that both groups are more closely related genetically to their geographic neighbors in the Caucasus than to their linguistic relatives elsewhere (e.g. Nasidze and Stoneking 2001; Nasidze et al. 2001). Like those of other Caucasian populations, the gene pools of both Armenians and Azeris are intermediate between those of Europeans and those of Near Eastern populations of the northern Fertile Crescent: the Turks and the Kurds, as well as Iraqi and North African Jews (see chart). These genetic results indicate that both Armenians and Azeris are descendants of Neolithic migrants from the Near East who later adopted a different language via the process that Colin Renfrew called “elite dominance”, whereby the language of a small invading group is adopted by the larger resident population, either because it is imposed by force or because it is considered socially desirable to speak the language of the invaders. The origins of the Armenian language are obscure, but the Azeri language was probably introduced in the 11th century CE by Central Asian nomads (Johanson 1998).

The situation in Armenia and Azerbaijan is further complicated by the fact that both countries used to have significant minority populations associated with the other country. For example, Armenians constituted 11.6% of the population in Azerbaijan in 1886, and Azeris constituted 34.2% of Armenia’s population in 1897. Today, these minority populations have dwindled considerably: approximately 120,000 Armenians still live in Azerbaijan, but Azeris have been driven out of Armenia and no longer appear as a category in any 21st-century census data. The most recent Azerbaijani census (2009) lists the following minority groups: Kurds, Tat and Talysh (all three Iranian-speaking groups); Lezgin, Udi, Avar, Tsakhur, Kryz, Khinalug (all speaking Northeast Caucasian languages); Georgians (Kartvelian-speaking); Russians and Ukrainians (Slavic-speaking); Turks and Tatars (Turkic-speaking); and Jews (speaking different languages).* Today’s population of Armenia is much more homogeneous and overwhelmingly ethnic Armenian (97.9% in the 2001 census), with tiny minority groups of Yezidis (1.3%) and Russians (0.5%).


Moving over the crest of the Caucasus mountain range, we find an even more complicated ethno-linguistic picture in the North Caucasus. Geopolitically, all of this area is part of the Russian Federation, and a number of its internal republics constitute the North Caucasus belt (from west to east): Adyghea, Karachai-Cherkessia, Kabardino-Balkaria, North Ossetia, Ingushetia, Chechnya, and Dagestan. Ethno-linguistically, the North Caucasus is home to five distinct groups (also from west to east): Northwest Caucasian groups, including Adygeis, Circassians and Kabardians, as well as Abkhazians in neighboring Georgia; Turkic-speaking Karachays and Balkars; Iranian-speaking Ossetians (four of the forthcoming GeoCurrents posts will be dedicated specifically to them); Nakh-speaking Ingush and Chechens; and groups speaking Northeast Caucasian (or Dagestanian) languages, such as Agul, Avar, Dargin, Lak, Rutul, Tabasaran, Tsakhur, and many others. Genetic studies (e.g. Balanovsky et al. 2011) likewise confirm the distinctiveness of these groups, as each one appears to be associated with its own predominant genetic signature (see map below; Turkic-speaking groups are not shown).

The Ossetians and the Turkic-speaking groups are relative newcomers to the area. Ossetians are linguistic descendants of the Iranian-speaking groups who arrived from the steppes to the north(east) some time between 1000 BCE and 500 CE, followed by Turkic speakers around 1000-1500 CE. According to Balanovsky et al. (2011: 2906), “the new migrants forced the indigenous Caucasian population to relocate from the foothills into the high mountains”, thus repeating the centuries-old pattern described by Johanna Nichols of UC Berkeley: newly arrived groups push earlier inhabitants up the slopes or impose their language on them.


Consider, for example, the Tsezic languages, including Bezhta (#8 on the map), Hunzib (#16), Dido (#13), Hinukh (#15) and Khvarshi (#21). According to Nichols, this is the branch that split off from the rest of the Dagestanian family the earliest, around 2,000 years ago; these languages are now spoken at the highest altitudes along the crest of the mountain range. Andic languages, including Akhvakh (#2), Andi (#3), Bagvalal (#6), Botlikh (#9), Chamalal (#10), Ghodoberi (#14), Karata (#20) and Tindi (#29), whose split from the rest of the family tree is more recent, occupy the medium-altitude belt, whereas the relatively new (and structurally simplified) Avar (#5) is spoken at the lowest altitudes.


The vertical dimension is also important because of the widespread pattern of so-called “vertical bilingualism”: residents of highland villages – typically, men – take part in seasonal migrations to lowland regions that offer markets and winter pastures, and they generally know the language of a lower village, but not vice versa. For example, speakers of Tsezic languages often speak some Andic language, whereas Andic speakers may be quite fluent in Avar (the latter also serves as a lingua franca in other areas of Dagestan, as marked on the map).





*As we can see, most ethnic groups listed in the census data are associated with a language. However, censuses in this region still list Jews as an ethnic (rather than religious) group, following the common practice of Soviet times.





Balanovsky O, Dibirova K, Dybo A, Mudrak O, Frolova S, Pocheshkhova E, Haber M, Platt D, Schurr T, Haak W, Kuznetsova M, Radzhabov M, Balaganskaya O, Romanov A, Zakharova T, Soria Hernanz DF, Zalloua P, Koshel S, Ruhlen M, Renfrew C, Wells RS, Tyler-Smith C, Balanovska E, The Genographic Consortium (2011) Parallel evolution of genes and languages in the Caucasus region. Molecular Biology and Evolution 28(10):2905–2920.


Johanson L (1998) The history of Turkic. In: Johanson L, Csato E (eds) The Turkic languages. Routledge, London, pp 81–83.


Nasidze I, Stoneking M (2001) Mitochondrial DNA variation and language replacements in the Caucasus. Proc R Soc Lond 268:1197–1206.


Nasidze I, Risch GM, Robichaux M, Sherry ST, Batzer MA, Stoneking M (2001) Alu insertion polymorphisms and the genetic structure of human populations from the Caucasus. Eur J Hum Genet 9:267–272.
