The Tale of Two Ukraines, the “Missing” Five Million Ukrainians, and Surzhyk

Jun 25, 2014 by


[This post was originally published on GeoCurrents in March 2014]

In recent weeks, a number of mainstream news media outlets, including CNN and The New York Times, have attempted to explain the current crisis in Ukraine in terms of a division between the more “Russian” eastern (and southern) Ukraine and the more “Ukrainian” western (and central) parts of the country. At the height of the violent anti-government protests, on January 24, 2014, the Washington Post’s blogger Max Fisher posted what he considers to be “the one map you need to understand Ukraine’s crisis”. The importance of this map, according to Fisher, is that it shows the correlation between the geography of the Euro-Maidan protests and the ethno-linguistic divide:

“Ukraine’s ethno-linguistic political division is sort of like the United States’ “red America” and “blue America” divide, but in many ways much deeper – imagine if red and blue America literally spoke different languages.”


The correlation between ethnicity, (native) language, religion, and voting patterns, and the consequent split into “two Ukraines”, was established over a decade ago. (In fact, economic factors, such as the region’s contribution to GNP, salary levels, and industrial production, can be added to the mix, as they too correlate with the eastern/western Ukraine divide.) However, the easy slippage between ethnic and linguistic terms is problematic in the case of Ukraine because the ethnic and linguistic categories are not coextensive, although they do overlap to a significant degree. Moreover, speaking of the Russian- and Ukrainian-speaking populations as forming two different language communities is somewhat misleading, particularly for an American audience not used to the high degree of bilingualism found in Ukraine.

Let’s begin by scrutinizing the published numbers concerning the ethnic and linguistic composition of Ukraine. As the most comprehensive set of data comes from the 2001 population census, the rest of this post relies on these figures even though the numbers have changed in the intervening thirteen years. According to both the Ethnologue and the Wikipedia article on Ukraine, the total population of the country is 48,457,000 (here and below, I round large numbers to the thousands). However, if the Ethnologue figures for native speakers of all 25 languages listed for Ukraine are added up, the total number is only 43,248,000, approximately 5.2 million people short of the total population. One minor reason for this discrepancy is the fact that the figures for some of the minority languages (particularly, Crimean Tatar, Czech, Gagauz, Jakati, Karaim, Krimchak, Rusyn, Urum, and Eastern Yiddish) come from different years, ranging from the 1970 census for Czech to various 2007 sources for Gagauz, Karaim, and Krimchak. However, these languages are relatively small; the largest of them, Eastern Yiddish, counted 634,000 speakers in 1991, but this number was much reduced by 2001, after most Ukrainian Jews had moved to Israel. Moreover, by 1991 Yiddish was the mother tongue of mostly older Jews, many of whom died in the subsequent 13-year period and have not been replaced by a younger generation of Yiddish speakers. Many of the abovementioned minority languages, moreover, are currently somewhere on the endangerment scale. Therefore, even figures adjusted for these minor languages would hardly account for the “missing” 5.2 million Ukrainian citizens. Short of assuming that millions of people do not speak any language natively at all, we must conclude that the Ethnologue tabulation for Ukraine is significantly incorrect.
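The size of the gap can be checked with a few lines of arithmetic. The sketch below uses only the two rounded totals quoted above; the variable and function names are my own and purely illustrative.

```python
# Totals quoted in the text (rounded to thousands).
TOTAL_POPULATION = 48_457_000
# Sum of native speakers over all 25 languages listed for Ukraine, as reported above.
NATIVE_SPEAKERS_SUM = 43_248_000


def unaccounted_population(total: int, speakers_sum: int) -> int:
    """Return the number of people not covered by the native-speaker tallies."""
    return total - speakers_sum


missing = unaccounted_population(TOTAL_POPULATION, NATIVE_SPEAKERS_SUM)
missing_share = missing / TOTAL_POPULATION

print(f"{missing:,} people ({missing_share:.1%} of the population)")
# prints: 5,209,000 people (10.7% of the population)
```

In other words, roughly one person in ten is left without a native language in the tabulation, far more than the minority-language adjustments could explain.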

In addition to enumerating linguistic groups, the 2001 census also addressed ethnic classifications. According to the figures published in Wikipedia, 77.8% of Ukrainian citizens are ethnic Ukrainians, 17.3% are ethnic Russians, and 4.9% fall into the “others/unspecified” category. These percentages translate to 37,700,000 ethnic Ukrainians and 8,383,000 ethnic Russians. The Ethnologue publishes a comparable figure of 37,500,000 ethnic Ukrainians, noting that only 32,000,000 of them are native speakers of Ukrainian. What do the remaining 5.5 million ethnic Ukrainians speak? Note that this number is conspicuously close to the 5.2 million people whose native language is unidentified. It seems likely that the majority of these people are ethnic Ukrainians who speak Russian as their first language. Yet, the Ethnologue figure of 8,330,000 native speakers of Russian is a close match for the 8,383,000 ethnic Russians. However, it should be noted that some people from the “other” ethnic category, including many Jews, are native speakers of Russian. It would thus appear that at least some 50,000-60,000 ethnic Russians speak some language other than Russian, most likely Ukrainian, as their mother tongue. If, as I suggested above, 5.2-5.5 million ethnic Ukrainians are native speakers of Russian, the figure of 8.4 million Russian native speakers simply cannot be correct. Whichever figure is actually incorrect, it is clear that a significant number of ethnic Russians speak Ukrainian as their mother tongue and an even larger number of ethnic Ukrainians speak Russian natively.
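For readers who want to retrace these numbers, here is a minimal sketch of the arithmetic. It assumes nothing beyond the percentages and totals quoted in this post; the variable names are illustrative only.

```python
TOTAL_POPULATION = 48_457_000

# Percentages from the 2001 census as quoted above, stored as tenths of a percent
# so that the calculation stays in exact integer arithmetic.
ETHNIC_UKRAINIAN_PERMILLE = 778  # 77.8%
ETHNIC_RUSSIAN_PERMILLE = 173    # 17.3%

# Convert the percentages to head counts and round to the nearest thousand.
ethnic_ukrainians = round(TOTAL_POPULATION * ETHNIC_UKRAINIAN_PERMILLE // 1000, -3)
ethnic_russians = round(TOTAL_POPULATION * ETHNIC_RUSSIAN_PERMILLE // 1000, -3)

# Ethnologue figures quoted above.
ETHNOLOGUE_ETHNIC_UKRAINIANS = 37_500_000
UKRAINIAN_NATIVE_SPEAKERS = 32_000_000
RUSSIAN_NATIVE_SPEAKERS = 8_330_000

# Ethnic Ukrainians who are not native speakers of Ukrainian.
ukrainians_without_ukrainian = ETHNOLOGUE_ETHNIC_UKRAINIANS - UKRAINIAN_NATIVE_SPEAKERS
# Lower bound on ethnic Russians who are not native speakers of Russian.
russians_without_russian = ethnic_russians - RUSSIAN_NATIVE_SPEAKERS

print(f"ethnic Ukrainians: {ethnic_ukrainians:,}")
print(f"ethnic Russians: {ethnic_russians:,}")
print(f"ethnic Ukrainians not speaking Ukrainian natively: {ukrainians_without_ukrainian:,}")
print(f"ethnic Russians not speaking Russian natively (at least): {russians_without_russian:,}")
```

The last figure is only a lower bound because, as noted above, some native speakers of Russian belong to the “other” ethnic category.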


This discrepancy between ethnic and linguistic groups also has a geographic reflex. As shown in the map on the left (also from The Washington Post), while a significant ethnic Russian population is found only in four regions of eastern Ukraine (Kharkiv, Luhans’k, Donets’k, and Zaporizhzhya regions) and in Crimea, the Russian language predominates in other areas of southern Ukraine as well.




As can be seen from comparing the next two maps, in each of the southern/eastern Ukrainian regions, the percentage of native speakers of Russian is nearly twice the percentage of ethnic Russians. For example, in the Kharkiv region 44.3% speak Russian as their native language, but only 25.6% are ethnic Russians; in the Zaporizhzhya region, 48.2% are native Russian speakers but only 24.7% are ethnic Russians; in the Mykolayiv region, 29.3% speak Russian natively but only 14.1% are ethnic Russians; and so on. Such figures affirm the idea that significant numbers of ethnic Ukrainians are native speakers of Russian.
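The “nearly twice” claim can be made concrete for the three regions cited. The following sketch simply divides one quoted percentage by the other; the dictionary layout and names are my own choices, not part of any published dataset.

```python
# (percent native Russian speakers, percent ethnic Russians), as quoted in the text.
REGIONS = {
    "Kharkiv": (44.3, 25.6),
    "Zaporizhzhya": (48.2, 24.7),
    "Mykolayiv": (29.3, 14.1),
}

# Ratio of native Russian speakers to ethnic Russians in each region.
ratios = {
    name: native / ethnic
    for name, (native, ethnic) in REGIONS.items()
}

for name, ratio in ratios.items():
    native, ethnic = REGIONS[name]
    # The surplus is the minimum share of the region that speaks Russian natively
    # without being ethnically Russian.
    surplus = round(native - ethnic, 1)
    print(f"{name}: ratio {ratio:.2f}, surplus {surplus} percentage points")
```

The ratios come out at roughly 1.7 to 2.1, so in each of these regions somewhere between 15 and 24 percent of the population must be native Russian speakers who are not ethnic Russians.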



However, dividing Ukraine into “Ukrainian-speaking” and “Russian-speaking” areas based on native speakers is rather misleading, due to widespread bilingualism in the two languages. As can be seen from the map (based on the 2003 survey), especially if compared to the map of Russian-as-a-native-language above, Russian is used in everyday life not only by native Russian speakers but also by a significant number of native speakers of other languages, particularly Ukrainian. In Central Ukraine 25.6% use Russian, but the proportion of native Russian speakers is only 4-7.5%, varying from place to place. In East-Central Ukraine, nearly 60% use Russian, while native speakers of Russian constitute only 10-15% of the population. In Eastern Ukraine, over 90% use Russian, but native speakers constitute only 45-75%.


A more detailed map, based on the same survey and published by The Guardian, underscores my point that a sharp dichotomy between Russian and Ukrainian is misleading, as a third linguistic variety called Surzhyk, an intermediate language/dialect between Ukrainian and Russian, must be considered as well. As can be seen from this map, Surzhyk is widely spoken throughout Ukraine, with the exception of the westernmost regions, where Ukrainian predominates. In the central, eastern, and southern regions, Surzhyk is used by 9-22% of the population, with the highest proportion found in the east-central area (Chernihiv, Sumy, Poltava, and Dnipropetrovs’k), with over 20% Surzhyk usage (more on Surzhyk below).


In fact, Ukrainian and Russian (and Belarusian as well) constitute a dialect continuum, where “the boundaries which define ‘languages’ as opposed to ‘dialects’ are more political than linguistic”, as Lenore A. Grenoble reminds us (p. 591). Within Ukrainian three major dialectal groups are defined: the northern dialects (shown in blue), being most heavily influenced by Belarusian and less so by Russian; the south-eastern dialects (shown in yellow), being most heavily influenced by Russian (although #4 Middle Dnieprian dialect is the closest to “Standard Ukrainian”); and the south-western dialects (shown in red), being the most heavily influenced by Polish and the least heavily influenced by Russian.

Similarly, Russian spoken in (eastern) Ukraine is very similar to the southern Russian dialects, which share certain linguistic features with Ukrainian rather than with northern or central Russian dialects, or with Standard Russian. One feature that aligns the Russian dialects in Ukraine with other southern Russian dialects and with Ukrainian is the pronunciation of /g/ as a voiced velar fricative. Thus, in the Russian of southern Russia or Ukraine, as well as in Ukrainian, the initial sound in Harvard and Harry (Potter) is pronounced as a more h-like sound than in Standard Russian or northern Russian dialects, where these words sound more like [Garvard] and [Gari Potter]. Considering the dialects of both Russian and Ukrainian, a unified continuum emerges, from the “most Ukrainian” dialects in western Ukraine to the “least Ukrainian” dialects in northern Russia, but on both sides of the Russia-Ukraine border, very similar dialects are found. Whether they are classified as “Ukrainian dialects” or “Russian dialects” is more a geopolitical issue than a purely linguistic one.



This East Slavic dialectal continuum stems from the relatively late emergence of distinct East Slavic languages and the continuing contact between them. The East Slavic linguistic variety, distinct from West and South Slavic, emerged “during the seventh to eighth centuries” (Grenoble, p. 591), but the emergence of discrete East Slavic languages and the end of a mutually intelligible pan-East Slavic is much harder to date. According to Sussex and Cubberley (2006: 80), the East Slavs were “culturally, religiously and linguistically coherent” at least until the sacking of Kiev by Mongol invaders in 1240. Geopolitically, we can differentiate East Slavic principalities, such as Polotsk in what is now Belarus, Volynia in today’s Ukraine, and Rostov-Suzdal’ in what was to become northern Russia. Some of these principalities, however, spanned the modern Russia-Ukraine border: for example, Chernigov Principality was centered around the city of Chernigov in what is now Ukraine but also included lands around Kursk, which are now Russian. The three peoples, Russians, Ukrainians, and Belarusians, emerged as linguistically, culturally, and geopolitically distinct groupings centuries later, chiefly as a result of both the influence of the Golden Horde (Kipchak Khanate) and the gradual absorption of some of the lands of East Slavs (roughly today’s Belarus and Ukraine) into the Grand Duchy of Lithuania and later the Kingdom of Poland, a process that began in 1386. The fifteenth century is thus commonly cited as the period from which we can speak of distinct Ukrainian and Russian languages.

However, since most of Ukraine became part of Russia with the Treaty of Pereyaslav in 1654, there has been intensive ongoing contact between the Russian and Ukrainian languages, which made them even more similar than they would have been otherwise, given the relatively recent date of differentiation. According to Grenoble (p. 592), “in the seventeenth and early eighteenth centuries the language influence was bidirectional and mutual”, with Ukrainian having more impact on Russian than it would have in later periods, both in the form of loanwords and through incorporation of “grammatical and rhetorical traditions”. Ukrainian also served as an intermediary for lexical borrowing from Polish into Russian. An example of a word borrowed from Polish into Russian through Ukrainian is xlopets ‘boy, chap’. Its Polish past is revealed through a comparison to its cognate xolop ‘peasant, serf’, which developed in Russian itself: the Polish-derived word exhibits no pleophony (i.e. no extra -o- before -lo-), the historical change that occurred in East Slavic but not West Slavic languages at a time prior to the seventeenth century.

From the late eighteenth century, Russian had more influence on Ukrainian, including its western dialects in Transcarpathia and Bukovyna, than vice versa. “Russification of Ukraine intensified during the periods of the Russian Empire and the Soviet Union”, writes Grenoble. “By the end of the Soviet era, it is possible to speak of diglossia in Ukraine, with Russian as the High variety used in formal, administrative, and educational domains, and Ukrainian is [sic] less formal, home settings” (p. 592). Although Ukrainian has regained the status of the official language used “in formal, administrative, and educational domains”, diglossia and bilingualism remain common. The majority of Ukrainian citizens know both Russian and Ukrainian, with differing degrees of fluency, and typically mix them in speech as “that is perceived to be the norm for their speech community” (Grenoble, p. 592, see also Bilaniuk 2004).

This frequent mixing of Ukrainian and Russian in speech has led to the emergence of Surzhyk, a “hybrid sociolect”, as Grenoble calls it (see also Taranenko 2007: 125). The etymology of the term is pejorative, as its literal meaning refers to a lower grade of grain, a mix of wheat and rye. The best way to describe Surzhyk is as a Ukrainian matrix with certain Russian features inserted. As is common for other mixed linguistic varieties, including American Russian (or Émigré Russian), which I discussed in detail elsewhere (see Pereltsvaig 2003), such inserted features are mostly lexical. For example, Surzhyk uses a phonologically-adjusted form of the Russian word nakonec ‘finally’ (in Surzhyk, nakonic’) instead of the Ukrainian counterpart narešti ‘finally’. In other cases, Surzhyk employs a Russian derivational morpheme as opposed to a Ukrainian one (as in the Russian/Surzhyk word pokupateli ‘shoppers, buyers’ vs. Ukrainian pokupci). Another Russian word in Surzhyk is stolova ‘dining room’, adapted from the Russian stolovaja (from the root stol ‘table’) by changing the adjective morphology of Russian to that of Ukrainian/Surzhyk (the Ukrainian word for ‘dining room’ is ïdal’nja, from the root ‘to eat’).

Besides lexical borrowings from Russian, Surzhyk also uses Russian patterns of preposition and case use. For example, in Surzhyk ‘a conference on issues…’ is narada po problemam, where the preposition po mirrors the Russian use (soveščanie po problemam). As in Russian, it takes a dative form of the noun in Surzhyk. Ukrainian, in contrast, uses the preposition z with the genitive case of the noun: narada z problem. Similarly, Surzhyk time expressions are modeled on those of Russian rather than Ukrainian. In Surzhyk, as in Russian, ‘at ten o’clock’ is expressed by the preposition v with the accusative case of the numeral and the genitive form of the noun meaning ‘hours’: v desjat’ godyn in Surzhyk, v desjat’ časov in Russian. In Ukrainian, the pattern is different: the preposition used is o, which appears with both the numeral and the noun ‘hours’ in the locative case: o desjatij godyni. Certain morphological features of Russian also appear in Surzhyk, such as the use of nominative for vocative (Ukrainian has a separate vocative case); the genitive singular forms of masculine nouns ending in -a, not in -u as in Ukrainian; and the dative singular forms of masculine animate nouns ending in -u as opposed to the Ukrainian form ending in -ovi.

Although some features of Surzhyk are regular, others are idiosyncratic and vary “depending on several parameters: rural versus urban; level of education; and time period (pre-Soviet, Soviet, post-Soviet)” (Grenoble, p. 593). The differences among dialects of Ukrainian, dialects of Russian, and the use of Surzhyk across Ukraine create a far more complex picture than that of “Russian-speaking Ukraine versus Ukrainian-speaking Ukraine”, as depicted in the mainstream media.




Bilaniuk, Laada (2004) A typology of surzhyk: mixed Ukrainian-Russian language. International Journal of Bilingualism 8(4): 409-425.

Grenoble, Lenore A. (2010) Contact and the Development of the Slavic Languages. In: Raymond Hickey (ed.) The Handbook of Language Contact. Wiley-Blackwell. Pp. 581-597.

Pereltsvaig, Asya (2003) The role of L2 in L1 loss of aspect in Diaspora Russian. Paper presented at the LSA Annual Meeting, Atlanta, January 2003.

Sussex, Roland and Paul Cubberley (2006) The Slavic Languages. Cambridge, UK: Cambridge University Press.

Taranenko, Oleksandr (2007) Ukrainian and Russian in contact: attraction and estrangement. International Journal of the Sociology of Language 183: 119-140.
