More on word order, morphological types and historical change

In a comment to the previous posting, Venelina Dimitrova raised a number of interesting issues, which I thought it would be best to address in a separate posting rather than in the comment section.

1. Is there a correlation between the SOV word order and a synthetic mode of expression? While I don’t know of any studies on this issue, we can easily examine figures in the World Atlas of Language Structures Online (WALS), which allows us not only to see each feature or “linguistic structure” in isolation but also to combine two features and thus look for correlations. Let’s combine feature #81A “Order of Subject, Object and Verb” with feature #20A “Fusion of Selected Inflectional Formatives”.

The latter feature needs a brief comment. While morphological typology is often regarded as a scale from isolating (e.g. Chinese) to agglutinative (e.g. Turkish) to fusional (e.g. Latin) to introflexive (e.g. Modern Standard Arabic),

“recent research has shown that such a scale conflates many different typological variables and incorrectly assumes that these parameters covary universally … Three prominent variables involved in this are phonological fusion, formative exponence, and flexivity (i.e. allomorphy, inflectional classes).”

The feature #20A “Fusion of Selected Inflectional Formatives” refers to the degree to which grammatical markers (aka formatives) are phonologically connected to a host word or stem; there are three basic values: isolating, concatenative, and nonlinear (the latter includes tonal and ablaut). While some languages use exclusively isolating (Vietnamese), concatenative (English) or nonlinear ways of connecting grammatical markers (the latter is extremely rare), other languages use a combination of morphological types: tonal and isolating (Yoruba), tonal and concatenative (Maasai), ablaut and concatenative (Hebrew), isolating and concatenative (Mandarin Chinese).

Do these morphological fusion types correlate with the order of major sentential constituents (subject, object and verb)? Ignoring the rarer morphological types and focusing on languages that are either exclusively concatenative or exclusively isolating, there is some bias but not a strong correlation. Overall, many more languages are exclusively concatenative than exclusively isolating: the former outnumber the latter by almost 8 to 1. This is true regardless of the word order one considers: there are more concatenative than isolating languages among those that are SOV, SVO, VSO, VOS or have no dominant word order (there is no data on object-initial languages). However, SVO languages are the most likely to be isolating: of the 15 languages that are isolating (and whose word order is known), 8 are SVO and only 3 are SOV. Another way to look at this: among SOV languages, concatenative languages outnumber isolating ones by nearly 19 to 1, while among SVO languages the bias towards concatenative is much weaker, with only 3 concatenative languages for each isolating one.

Still, I would say that it is not possible to claim a correlation between SOV and fusional/concatenative morphology.
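The comparison above boils down to cross-tabulating two features per language and then comparing ratios within each word-order group. Here is a minimal Python sketch of that bookkeeping. The function names and the handful of sample records are invented for illustration and are not WALS data, so the printed ratios mean nothing in themselves.

```python
from collections import Counter

def cross_tabulate(records):
    """Count languages for each (word order, fusion type) pair."""
    return Counter((order, fusion) for _name, order, fusion in records)

def concat_per_isolating(table, order):
    """Concatenative-to-isolating ratio within one word-order group.

    Returns None when the group contains no isolating languages.
    """
    isolating = table[(order, "isolating")]
    if isolating == 0:
        return None
    return table[(order, "concatenative")] / isolating

# Hypothetical toy records with placeholder names (NOT actual WALS values).
sample = [
    ("lang_a", "SOV", "concatenative"),
    ("lang_b", "SOV", "concatenative"),
    ("lang_c", "SOV", "concatenative"),
    ("lang_d", "SOV", "concatenative"),
    ("lang_e", "SOV", "isolating"),
    ("lang_f", "SVO", "concatenative"),
    ("lang_g", "SVO", "concatenative"),
    ("lang_h", "SVO", "isolating"),
]

table = cross_tabulate(sample)
for order in ("SOV", "SVO"):
    print(order, concat_per_isolating(table, order))
```

Fed with the real combined feature table, the same two functions would give the "nearly 19 to 1" and "3 to 1" figures quoted above. The share of isolating languages that are SVO is simply 8/15, or about 53%.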

2. Is the change in morphological type unidirectional (from synthetic/fusional to analytical)? The idea that language changes in a unilinear fashion, from “flexional” to “flexionless”, was put forward by Otto Jespersen in his 1891 doctoral dissertation, but even then one of the examiners, Hermann Møller, disagreed, arguing that language history moves in spirals, not along a line of constant “progress”. More recently, it has been proposed that language changes in a cyclic fashion (assuming a rather simplistic model that recognizes three morphological types: fusional, isolating and agglutinative):

“a fusional language can develop into one of the isolating type, an isolating language can become agglutinative, an agglutinative language may move towards a fusional profile, and so on.” (Dixon 1994: 182-183)

This is illustrated with the “clock of morphological type change” below:

Numerous historical developments serve to illustrate the separate stages of this morphological type clock. For instance, while Proto-Indo-European is (typically) reconstructed as a nearly purely fusional language (there is, however, some debate about the accuracy of this), its descendant languages have all moved, albeit at different rates, towards a more isolating profile. One example of a language that has made much progress in this direction is English, which in the course of its development from Old English to Modern English has lost most of its inflectional bound morphology. Another good example is French, which has also lost most of its inflectional bound morphemes (both Modern English and Modern French can be characterized as three o’clock languages). Other, more conservative Indo-European languages, such as Russian and Lithuanian, have preserved more of their bound morphology, remaining closer to one or two o’clock.

In contrast, Chinese illustrates the isolating-to-agglutinative change. More precisely, Early Chinese is now thought to have been a nearly purely isolating language but with (still) some elements of fusion (at about three o’clock), while Classical Chinese was even closer to the isolating ideal (roughly at four o’clock). But the clock of morphological type change keeps ticking, moving Modern Chinese in the direction of an agglutinative language, towards five o’clock. For example, Mandarin Chinese, although still largely isolating, has developed some bound morphemes, such as the perfective suffix -le and the durative suffix -zhe.

Another example of the isolating-to-agglutinative change is the development of Dravidian languages (now spoken mostly in southern India). Proto-Dravidian was on the isolating side of agglutinative (at about seven o’clock), while its descendants, modern Dravidian languages, have moved along to a more agglutinative type at nine o’clock.

Or consider the development of Australian Aboriginal languages. Proto-Australian, the ancestral language of Australian Aboriginal languages, can be placed at about seven o’clock; modern languages from the Pama-Nyungan family have become more agglutinative, at eight or nine o’clock, while the non-Pama-Nyungan languages have moved even more radically, towards ten or eleven o’clock, having developed strong elements of fusional morphology. A similar degree of fusional morphology has been developed by many modern Finno-Ugric languages (also at ten or eleven o’clock), while their ancestral language, Proto-Finno-Ugric, was probably more purely agglutinative, at around nine o’clock.
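The clock positions mentioned in these examples can be treated as points on a 12-hour dial, so that "moving clockwise" becomes a small piece of modular arithmetic. The sketch below is only an illustration: the helper name `clockwise_hours` is made up, and the hour values are the rough estimates given in the text, not measurements.

```python
def clockwise_hours(start, end):
    """Hours travelled clockwise on a 12-hour dial from start to end."""
    return (end - start) % 12

# Approximate positions taken from the examples in the text.
# For modern Finno-Ugric, 10 is the lower end of "ten or eleven o'clock".
shifts = {
    "Early Chinese -> Classical Chinese": (3, 4),
    "Proto-Dravidian -> modern Dravidian": (7, 9),
    "Proto-Finno-Ugric -> modern Finno-Ugric": (9, 10),
}

for label, (start, end) in shifts.items():
    print(f"{label}: {clockwise_hours(start, end)} hour(s) clockwise")
```

The modulo keeps the arithmetic correct when a shift crosses twelve o’clock: going from eleven to one o’clock counts as two hours clockwise rather than ten hours backwards.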

One may ask how long it takes to complete the cycle and whether there are any known examples of languages that have gone successively through all the stages. Estimates of the length of the cycle are difficult to provide: according to Dixon (1994: 185), it can take “under normal conditions of change, probably anything from two or three thousand years to fifty thousand and more”. Given the length of the cycle, the fact that we have written records going back at most five thousand years (and even that only for very few languages), and the fact that we can reliably reconstruct aspects of proto-languages going back only about 6,000 to 8,000 years, it is unsurprising that there are no good examples of complete cycles based on a single language. The best example we have is Egyptian, which has a long recorded history. According to Hodge (1970), Old Egyptian (about 3,000 BCE) had a complex verb structure which included reference to person; most of these affixes were lost by Late Egyptian (about 1,000 BCE), which used periphrastic constructions involving auxiliaries. By the time of Coptic (200 CE on) a new complex verb structure had developed, using quite different forms from those of Old Egyptian.

Let’s now consider briefly what processes may change a language of one morphological type into another. The change from an isolating profile to an agglutinative one, as in the development of (Mandarin) Chinese, happens through the application of such processes as augmentation, whereby distinct words become grammaticalized as affixes. This is, for example, how the two aspectual suffixes of Mandarin Chinese developed. Another example of augmentation involves the fate of the Russian reflexive marker -sja: in Old Russian it was an independent word, whereas in Modern Russian it has turned into a bound morpheme (linguists disagree on whether it is a clitic or an affix, but either way it is a bound morpheme). It is so thoroughly connected to the verb that some verbs cannot appear without it: for example, there are verbs bojat’sja ‘to be afraid’ and smejat’sja ‘to laugh’, but no sja-less (transitive/causative) versions, *bojat’ and *smejat’. Yet another example of augmentation involves the creation of case markers from postpositions, verbs or nouns (for examples and a detailed discussion see Blake 2001: 161-175). The augmentation process happens because markers that start out as independent words frequently appear right next to certain kinds of words: for example, auxiliary verbs frequently appear right next to lexical verbs, and postpositions next to nouns. This frequent juxtaposition leads to independent words being reanalyzed as bound affixes.

The change from an agglutinative profile to a fusional one happens due to inevitable phonological changes, which preserve the same morphological elements but fuse their realizations: here a vowel is omitted, there two adjacent consonants are blended, and before you know it, the boundaries between morphemes are not as clear-cut anymore.

Further phonological change that wears down grammatical morphemes, combined with morphological simplification in which inflectional affixes are dropped (especially if they consist of sounds that are easily worn away, such as unstressed vowels or single consonants), will inevitably move a fusional language in the direction of an isolating one. This is what happened to Proto-Germanic case inflection, which was expressed through special suffixes. Proto-Germanic had word-initial stress (Modern English preserves this stress pattern in words of Germanic origin), which left the case markers unstressed and, as a result, more prone to being lost over time.

Blake, Barry J. (2001) Case. 2nd edition. Cambridge University Press.
Dixon, R.M.W. (1994) Ergativity. Cambridge University Press.

  • Östen Dahl

    Sinnemäki, Kaius
    Word order in zero-marking languages
    Studies in Language, Volume 34, Number 4, 2010, pp. 869-912
    "It has often been argued that languages with no morphological marking of core arguments (referred to here as zero-marking languages) should prefer SVO word order. This correlation is tested here by studying the effects of word order, genealogical relatedness, and areal diffusion on the distribution of zero marking with multiple logistic regression. The possible confounding areal and genealogical factors are studied in multiple ways. The results, based on data from 848 languages, suggest that zero marking (morphological simplicity) correlates with SVO (syntactic simplicity), regardless of its areally and genealogically biased distribution. It is argued that this word order preference is affected by functional motivations and language contact."

  • Asya Pereltsvaig

    @Östen Dahl: Thank you for your comment and the reference. Very interesting study indeed! Still, it seems to be in agreement with what I said: an isolating language is likely to be SVO, but an SOV language is not strongly likely (although somewhat more likely) to be concatenative.

  • Asya Pereltsvaig

    [a comment from Venelina, which for some reason disappeared…]

    Thanks for the detailed response, Asya. True, language typology is a complicated issue, and a correlation between it and word order will be difficult to argue for. Yet the slight correlation is interesting. I also agree that language change cannot be unidirectional: if you examine it over a long period of time, it can go both ways. Still, I am very intrigued by the fact that word order can change from SOV to anything else but not the other way around. What I actually meant in my previous comment, but did not make clear, concerns only the category of the noun: if you take only the S-O relations and the meaning expressed by the Nominative and Accusative cases, you could hypothesize that SVO is more "practical". For example, if you take the Russian example you cited in the previous posting, Mat lyubit doch (excuse my transliteration), the nouns are unmarked and you know the meaning only by word order. In Bulgarian, the situation is the same: majka-ta obicha dashterya-ta (mother-the loves daughter-the), both nouns are marked with the same definite article and you know the meaning only by word order. If you take the same sentence in Hindi, the object must have the accusative marker "ko", something like "mother daughter-ko loves", so you have to use that marker to clarify the expression. So maybe SVO is a marker in itself, and easier to apply than a separate word or morpheme. The comment above provides an interesting article, as does the one posted on Facebook, thanks!

  • Asya Pereltsvaig

    Venelina, I am not sure I understand the concept of "practical" in this connection. After all, more of the world's languages (though not more of the world's speakers) find it more "practical" to do SOV rather than SVO.

    As for Russian (and I suspect Bulgarian is similar), it is underlyingly SVO and with something (like the indeterminacy of case marking) interfering, the other word order permutations become impossible. I think it is also overwhelmingly SVO in embedded clauses, where discourse-driven factors are "neutralized"…

  • John Cowan

    "Isolating" can't be identified with "zero-marking": a language may be quite isolating and yet use case markers in the form of clitics or even separate phonological words. In any case, the association between zero-marking and SVO is only statistical: examples are given of zero-marking languages that are SVO, SOV, VOS, and NDWO.

    I meant to mention earlier that if we look purely at inflectional morphology and exclude derivational morphology, English and Mandarin are on a par: both have about a dozen inflectional morphemes. English of course has far more complex derivational morphology, both conventional prefixes and suffixes and the morphs used in classical and neo-classical compounds (as morph- and -ology in morphology).

  • Asya Pereltsvaig

    @John Cowan: Thank you for your comment! I agree with you on the distinction between "isolating" and "zero-marking" (and I did mean "isolating"). Also good point on the derivational morphology.

  • Steven Lubman

    I wonder if it has been observed whether languages have undergone slower evolution since the spread of universal literacy.

  • Asya Pereltsvaig

    @Steven Lubman: this indeed seems to be the case. One good example is English: Chaucer to Shakespeare is approx. 200 years, Shakespeare to us is about 400 years. Yet Shakespearian English is closer to ours than to Chaucer's…

  • RAGE

    What about Persian? Where can Persian be located on the clock, in your opinion?

    • Modern Persian is a fusional language but less so than Old Persian — so somewhere around 2 o’clock maybe?

      • RAGE

        Thanks for the reply, Asya. I was in Iran recently for 4 days on business. It seems Persian is quite agglutinative, at least partially. Persians put the verb at the end of the sentence (SOV word order) and, as in Turkish, generally do not use personal pronouns (it is a null-subject language). The subject reveals itself as a suffix at the end of the verb, again like Turkish. For example “motoshakker-am” – I thank you, or “memnun-am” – Thanks/I am grateful! If I say them in Turkish, it would be like “motoshakker-im”, “memnun-um”.

        So should Persian be located on the clock at around 9? But then the clock would have to rotate counter-clockwise? This is why I asked for your opinion…

        BTW it seems Persian has no grammatical gender, and uses both fusion and (more heavily) agglutination. Also, the first- and third-person pronouns are similar to the Turkish ones: man (Persian) – ben (Turkish) / men (Azerbaijani Turkish), u (Persian) – o (Turkish). I wish I had had some free time in Iran…

        • Word order (SOV) or the availability of null subjects have nothing to do with whether a language is agglutinative or fusional or isolating. But yes, you’re right and I was wrong: Persian is better described as an agglutinative language, with some elements of fusion perhaps…

          • RAGE

            Thank you, Asya. I know that word order and null subjects have nothing to do with whether a language is agglutinative, fusional or isolating; I was just sharing some observations. I spent just 4 days there with a heavy schedule, so maybe I encountered a local dialect, or my observations are somehow selective, which is why I wanted to learn your opinion. I have tried to find some information, but almost all of it is controversial and superficial, and I don’t have much time to spare for this, even if I wanted to. So when I read your article I decided to ask you. Also, the clockwise rotation of morphological type change does not fit my observations about Persian.

            I am a Turk, and the similarities between Persian and Turkish got my attention during my short stay in Tehran. I was aware before travelling to Iran that around 25%-35% of the population of Iran is Turkic. But seeing and hearing this was different. In Tehran I could handle my business mostly by speaking Turkish (Azerbaijani Turkish, of course, but I am from a city in Southeast Turkey where the local dialect is close to the Iraqi-Syrian Turkmen dialect and to Azerbaijani Turkish). Of course, half of the conversations were in Persian through interpreters, and sometimes in English. I noticed that most of the Iranians understood Turkish, and some of them did not reveal it, to use it as an advantage in business, so I kept my mouth closed there. I wonder whether Persian acquired agglutination later or had it from the beginning, and what the situation is for ancient Persian.

            Another point (of course this is just my opinion and a generalization): Iranian people look like a mixture between the Anatolian population and Northern Indians. Iranians are darker than the Turkish population of Anatolia, and most of them have Northern Indian faces with an Anatolian head, which has a high vault. And many Iranians have dark circles under their eyes, like Indians and Armenians. Every time I saw someone with lighter skin, I learned that they were not Persians/Ajams, etc. In any case, Iran is located on the road from Northern India and the Indus valley to the Middle East and Anatolia, so it should be a mixture of both.

            If I can spare some time, I want to learn Persian.