Is Esperanto simple and easy?

Aug 8, 2014

In my previous post, I maintained that the notion of “simplicity” (and “complexity”) when applied to language is far from being simple. At least two kinds of simplicity need to be distinguished: simplicity of learning (what I called “applied simplicity” or what may be called “ease of learning”) and descriptive simplicity. As the article by Jouko Lindstedt that I criticized in that post is focused on Esperanto, it is now time to ask with Lindstedt whether Esperanto is indeed “simpla kaj facila”, simple and easy. The answer itself, once again, is rather complicated.

Let’s start with the ease of Esperanto, that is, the ease of learning, presumably by adult speakers of some other language(s). (I’ll note parenthetically that when it comes to the ease of learning by children, there does not seem to be any perceptible difference among languages if each language is considered as a whole. Even polysynthetic languages such as Greenlandic or Ket, whose complexity, in John McWhorter’s sense, boggles the mind, appear to be perfectly easy for children to learn. The “descriptive simplicity” of a particular area of grammar, such as its gender or case system, however, appears to have consequences for the ease of L1 acquisition, but quite in the reverse way of what one might expect: it is languages that have more pervasive and rich systems of gender or case that appear to be easier for children to figure out!) Back to Esperanto. It was designed to be learned (ideally!) by all adults around the world, and consequently Esperanto does have certain features that make it relatively easy to learn, but mostly for speakers of Western Indo-European languages: that is, mostly for Germanic and Romance speakers, less so for speakers of Celtic tongues, and to an even lesser degree for speakers of Slavic languages. Esperanto is written in Latin/Roman script and a great many of its lexical roots (i.e. the roots of nouns, verbs, and adjectives) are taken from Germanic and Romance languages. Esperanto’s sound inventory is not particularly large and is easy to master for speakers of Indo-European languages, as it lacks any “exotic”, typologically rare, and difficult-to-articulate sounds.

Even the grammar of Esperanto is comparatively easy, once again for speakers of Germanic, Romance, or Slavic languages. For an English speaker, Esperanto does have some unexpected twists, two of which Lindstedt mentions in his article: the accusative case and the special marking for (in)transitive verbs. As is well known, English largely lacks case marking. But although the Esperanto accusative case suffix –n is more Turkic than Indo-European in appearance, the notion of marking objects (and objects of prepositions) differently from subjects is not alien to most Indo-European languages. While English (and French, Italian, Spanish) speakers expect such a marking only with pronouns (English: He saw him), speakers of German and Icelandic, as well as of Russian or Lithuanian, would feel perfectly at home with accusative case. Similarly, many English verbs lead double lives as both transitive and intransitive: John broke the computer (transitive) vs. The computer broke (intransitive). But most other Indo-European languages distinguish the two classes of verbs. For example, in Russian intransitive counterparts of transitive verbs are usually marked with –sja, a suffix whose job it is to indicate intransitivity, as in Kompjuter razbilsja (lit. ‘Computer broke-SJA’), Sobaka kusaetsja (lit. ‘Dog bites-SJA’, meaning the dog bites people in general), or Vanja breetsja (lit. ‘Vanya shaves-SJA’, i.e. Vanya shaves himself). In French, sometimes the intransitive counterpart is marked by a special marker se, as in Jean se rase (lit. ‘Jean SE shaves’), but for other verbs it is the transitive form that needs special treatment, as with Le chef a fait fondre le chocolat (lit. ‘The chef has made melt the chocolate’). With this in mind, Esperanto’s insistence that each verb by itself is either transitive or intransitive, and consequently that changes in transitivity need to be marked by a certain suffix (e.g. the transitivizing suffix –ig in Ni boligas la akvon ‘We boil the water’), does not look weird. Thus, in some ways the constructed Esperanto looks more Indo-European than English does.

When it comes to descriptive simplicity, I agree with Lindstedt that Esperanto does not have the look and feel of a “really simple” language, in McWhorter’s sense. It is not very pidgin- or creole-like. (Most linguists do not count true pidgins among natural languages at all, which leaves creoles as “the simplest languages”.) For a creole, Esperanto is too full of bound morphemes: inflectional suffixes on nouns, adjectives, and verbs. The abovementioned accusative suffix –n is not the sort of thing one finds in creoles, nor is the plethora of verbal tense and mood markers found in Esperanto, such as –os (future indicative), –inta (past active participle), and –u (jussive).

Surprisingly perhaps, Esperanto does, however, have certain features that are more typical of pidgins than of natural languages. One such feature is the set of special part-of-speech suffixes –o, –a, –e, and –i, which mark a word as a noun, adjective, adverb, and infinitive verb, respectively, as in vido ‘sight’, vida ‘visual’, vide ‘visually’, and vidi ‘to see’. While the suffix –i can be analyzed as an infinitival marker (the counterpart of the English to), the other suffixes carry no meaning other than that of lexical category. Unlike the suffix –o in the Russian okno ‘window’, which encodes not only the word’s noun-ness, but also its neuter gender, 1st declension, singular number, and nominative case, the suffix –o in Esperanto means only “noun”. Esperanto has neither grammatical gender nor nominal declensions, and the Esperanto –o encodes neither singular or plural number (plurals are expressed by the addition of the suffix –j, and singulars lack this suffix) nor nominative case (nominative case is not marked by a suffix, or as linguists say “is zero-marked”, while its “antipode”, accusative case, is marked by the abovementioned suffix –n). Such part-of-speech suffixes are not found in any natural languages that I know of, but they are found in at least one pidgin, Russenorsk, where nouns (typically) end in –a (e.g. fiska ‘fish’) and verbs (typically) end in –om (e.g. kopom ‘buy’).
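For readers who like to see this sort of thing spelled out mechanically, here is a minimal Python sketch of the layering just described: a root, then a part-of-speech ending, then an optional plural –j, then an optional accusative –n. The names `POS_ENDINGS` and `inflect` are made up for illustration, and limiting –j and –n to nouns and adjectives is a simplification of this sketch, not a full account of Esperanto grammar.

```python
# Part-of-speech endings as listed in the post: each one marks lexical
# category and nothing else.
POS_ENDINGS = {
    "noun": "o",
    "adjective": "a",
    "adverb": "e",
    "infinitive": "i",
}


def inflect(root, pos, plural=False, accusative=False):
    """Build a word form as root + POS ending (+ j for plural) (+ n for accusative).

    Hypothetical helper for illustration only. It handles nothing beyond
    the four endings above and the two suffixes -j and -n.
    """
    if pos not in POS_ENDINGS:
        raise ValueError(f"unknown part of speech: {pos!r}")
    # Simplifying assumption of this sketch: only nouns and adjectives
    # take the plural and accusative suffixes.
    if (plural or accusative) and pos not in ("noun", "adjective"):
        raise ValueError("this sketch only inflects nouns and adjectives")

    form = root + POS_ENDINGS[pos]
    if plural:
        form += "j"  # plural is marked; singular is left unmarked
    if accusative:
        form += "n"  # accusative is marked; nominative is zero-marked
    return form


if __name__ == "__main__":
    for pos in POS_ENDINGS:
        print(pos, inflect("vid", pos))
    print(inflect("vid", "noun", plural=True, accusative=True))
```

Each suffix contributes exactly one piece of information and never changes shape depending on its neighbors, which is why the whole thing fits in a few lines of string concatenation. A comparable function for the Russian okno would need a lookup table keyed on gender, declension, number, and case all at once.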

While it is not very pidgin-like, Esperanto nonetheless shows some signs of McWhorter’s “simplicity”. It lacks some of the grammatical “ingrowth” and “disheveled-ness” found in Indo-European languages of Europe. Unlike Romance, Slavic, and (some) Germanic languages, Esperanto lacks grammatical gender, the feature that makes learning French or Russian a nightmare for a foreigner. Nor does Esperanto have the so-called “particle verbs”, constructions with particles that need to appear immediately after the verb, after a pronoun, or even attach themselves to verbs in the form of prefixes, depending on the language and the construction, and which make the lives of learners of English, Swedish, and German so complicated. Those who have tried to learn Russian or Polish would find Esperanto simple and easy in comparison, as it lacks the complex systems of case and aspect that plague learners of Slavic. Also absent from Esperanto are complex morphophonological rules that change how morphemes (i.e., meaningful bits of words) are pronounced in different contexts. For example, in Russian the 1st person singular present tense of the verb ljubit’ ‘to love’ is ljublju ‘I love’ and not *ljubju, as one would expect by analogy with regular verbs like solit’ ‘to salt’, solju ‘I salt’. Similarly, the instrumental case of vremja ‘time’ is vremenem ‘by time’, not *vremem by analogy with the regular ditem ‘by child’. Even Russian speakers themselves are at a loss as to the 1st person singular present tense of pylesosit’ ‘to vacuum’: pylesosju? pylesoshu?

However, one cannot say that Esperanto is just a simpler, typologically averaged version of Indo-European because some properties of its grammar have a decidedly non-Indo-European feel. Particularly, consider the overall architecture of its morphology. Esperanto is an agglutinative language, like Turkish (Turkic), Hungarian (Finno-Ugric), or Swahili (Bantu). Indo-European languages tend to be fusional, although some (especially English and to some extent Romance languages) have developed elements of isolating morphology. None of the languages that Zamenhof spoke are agglutinative, including Hebrew, which combines elements of fusional morphology with the non-concatenative (or “root-and-pattern”) morphology. I have always wondered why Zamenhof chose to go the agglutinative route and whether it seemed to him to be “easier” for adult learners to wrap their minds around this way of building words. After all, learning an agglutinative language means just memorizing the various morphemes and the template in which they should be attached to the noun, adjective, or verb. Learning a fusional language requires one not only to learn the morphemes and the order in which they are attached, but also the complex morphophonological rules that determine how each morpheme changes when appearing next to certain neighbors. I wonder if there are any experimental studies that address this issue—perhaps some of my readers know?


  • John Cowan

    Tolkien said about Esperanto, “Also I particularly like [it], not least because it is the creation ultimately of one man, not a philologist, and is therefore something like a ‘human language bereft of the inconveniences due to too many successive cooks’ – which is as good a description of the ideal artificial language (in a particular sense) as I can give.”

    Note that although Finnish is agglutinating, it does indeed have complex morphophonemic rules, and even Turkish has vowel harmony. What is fusional about fusional languages is that the endings generally perform multiple roles: in Latin, the verb ending -o combines or fuses the notions “1sg”, “indicative”, “present”, and “active”.

    • Thank you for the quote, John, I love it.

      And thanks for the correction about Finnish. Turkish and Turkic in general (and I am most familiar with Tatar) indeed have vowel harmony, as well as voicing and nasal assimilation. The difference between that and the fusional morphophonemics that I talk about in the post is that vowel harmony applies to all (bound) morphemes in the language, not to a specific one or ones. In an (ideal) agglutinative language, morphemes don’t get a chance to “cast individual votes”, as it were. Only the general rules apply. But there are such rules, indeed.

  • Ivan Derzhanski

    It is true that Esperanto doesn’t have the {tlh} sound of Klingon, or its {Q} … but its _c_ and _ĵ_ aren’t exactly frequent across languages, nor is the presence of _h_ and _ĥ_ at the same time.