The evolution of word order and “free word order” languages

Oct 20, 2011

In the previous posting, I outlined the recently proposed theory by Gell-Mann and Ruhlen on the origin and evolution of word order. According to their proposal, the most recent common ancestor of all currently living human languages, Proto-Human, had the SOV order. They reconstruct this by examining word orders in modern languages, as well as in some earlier languages. There are, however, several problematic aspects to Gell-Mann & Ruhlen’s work (henceforth, G-M&R), and although they themselves claim that they “do not think that such corrections will affect [their] conclusions”, I’d like nonetheless to point out some problems with their account.

One issue that is immediately apparent concerns free word order (FWO) languages. G-M&R admit the possibility that a language may have a mixed word order pattern with no single dominant word order. In their sample of 2136 languages, 125 languages (approximately 6%) are listed as “languages with mixed word order” (see Table 1 in G-M&R and their supplemental data). Curiously, G-M&R do not give a definition of what counts as a “language with mixed word order”. Matthew S. Dryer in the World Atlas of Language Structures Online (WALS) lists 189 languages out of 1377 (13.7%) as having “no dominant order”. Clearly, the discrepancy in the proportion of FWO languages according to the two studies cannot be a result of different sampling procedures, so it must be the case that the two studies use different definitions and consequently classify languages differently. Indeed, WALS classifies Adynyamathanha (a Pama-Nyungan language in Australia) as having “no dominant word order”, while G-M&R list it as SOV; WALS lists Alawa (another Australian Aboriginal language) as having “no dominant word order”, while G-M&R list it as SVO; similarly, WALS classifies Amahuaca (a Panoan language spoken in Brazil and Peru) as having “no dominant word order”, but G-M&R again list it as SOV. These discrepancies are disturbing, especially to the extent that G-M&R classify languages as SOV, while other sources classify them otherwise.
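For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the two proportions from the counts quoted above. The names `samples` and `share` are my own illustrative choices, not anything taken from G-M&R or WALS.

```python
# Counts quoted in the post: (languages without a single dominant order, sample size)
samples = {
    "G-M&R, mixed word order": (125, 2136),
    "WALS, no dominant order": (189, 1377),
}

def share(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total

for label, (count, total) in samples.items():
    print(f"{label}: {count}/{total} = {share(count, total):.1f}%")
```

On these counts, the WALS proportion comes out at more than twice the G-M&R one.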

Clearly, allowing different word orders in different constructions is not sufficient for G-M&R to classify a language as an FWO language. For example, both Latin and Russian allow all six imaginable permutations of the Subject, Object and Verb, and yet G-M&R list Latin as SOV and Russian as SVO. There are good reasons to believe that these are indeed the dominant (though not the only possible) word orders in these languages. For example, in Russian if both the subject and the object are such that they do not distinguish nominative and accusative cases, as in Jakobson’s famous example Mat’ ljubit doch’ (literally, ‘mother loves daughter’), such sentences are understood as SVO rather than OVS (e.g. the above example is understood as ‘The mother loves the daughter’, not ‘The daughter loves the mother’). Experimental work by Irina Sekerina of the College of Staten Island confirms as much.
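To make the “six imaginable permutations” concrete, the following small Python sketch enumerates them and sets them against the dominant orders that G-M&R list for Latin and Russian. The variable names are illustrative assumptions of mine, and the sketch encodes only what is stated above.

```python
from itertools import permutations

# All logically possible orders of Subject (S), Object (O) and Verb (V)
orders = ["".join(p) for p in permutations("SOV")]

# Dominant orders as listed by G-M&R, per the discussion above
dominant = {"Latin": "SOV", "Russian": "SVO"}

for language, order in dominant.items():
    others = [o for o in orders if o != order]
    print(f"{language}: dominant {order}; also possible: {', '.join(others)}")
```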

Still, someone considering statistical frequency of various word orders in spoken Russian might easily reach the conclusion that Russian is an SOV language, as this is the most common word order. Conversely, someone examining Russian narratives (ranging from biblical texts to fairy tales or popular jokes) may well conclude that Russian is a VSO language. In fact, G-M&R’s claim that early Slavic was a VSO language is somewhat suspect, because it is based on examinations of (and reconstructions from) surviving manuscripts, most of which are narrative (specifically, biblical) in nature (thanks to David Erschler, Philip Minlos and Pavel Iosad for a discussion of this point).

All of this is not to say that G-M&R’s classification of Russian as SVO is suspect, but their classifications of some other languages (on whose present-day syntax and especially historical development we know precious little) may be problematic.

Furthermore, Dryer distinguishes at least three types of languages with no dominant word order. The first type is languages that can be called non-configurational, i.e. languages with highly flexible word order, in which all or most orders of subject, object, and verb are possible and common; Nunggubuyu (a Gunwinyguan language spoken in northern Australia) is an example of such a language. Another type includes languages that “lack a dominant order only because just the subject or just the object exhibits flexibility with respect to the verb”. An example of such a language is Syrian Arabic, which allows both SVO and VSO orders; there does not seem to be a reason (according to Dryer and the references he cites) to consider one of them dominant. However, only these two orders are common and the order of verb and object is relatively inflexible. The third type of language lacking a dominant order consists of “languages in which different word orders occur but the choice is syntactically determined”, such as German and Dutch, where

“the dominant order is SVO in main clauses lacking an auxiliary and SOV in subordinate clauses and clauses containing an auxiliary. Because this results in both orders being common, neither order is considered dominant here and these two languages are shown on the map as lacking a dominant word order.”

Interestingly, G-M&R’s list of languages with mixed word order seems to include only languages of the second and third type, that is, languages that allow two (but no more than two) alternate word orders. And even the decision to include a language in the list of mixed word order languages is not always clear. For instance, they classify Dutch as mixed SOV/SVO, but German, which exhibits very similar word order facts, as SVO.

When it comes to non-configurational languages, that is, those where any of the six word orders is possible and the dominant word order is particularly difficult to determine, G-M&R claim that

“it is not always easy to determine which order is basic, and indeed for some languages it has been claimed that there is no basic order. Whether this is really true is difficult to determine. In any event, notwithstanding the often free word order, the Australian family is generally regarded as having SOV as its most characteristic type.”

Thus, G-M&R list Nunggubuyu and several other Australian Aboriginal non-configurational languages as SOV. In addition to these non-configurational Australian languages, several polysynthetic languages (including Yupik, Inuit, Greenlandic Eskimo etc.) are listed as SOV by G-M&R. To the extent that there exists a bias toward listing languages with problematic word order facts as SOV, this may very well distort G-M&R’s claims about the origin and evolution of word order.

Finally, languages with two (or more) competing word orders, where the choice is syntactically determined, suggest that the order of Subject, Object and Verb may not be a primitive, as it is considered by G-M&R, but rather an artifact of several factors, perhaps ones that order the verb with respect to the subject and the object separately, as well as ordering the subject and the object with respect to each other, or ordering the verb with respect to other clausal elements (e.g. negation, adverbs, auxiliaries, tense, etc.) not considered by the SOV ordering classification adopted by G-M&R. I will elaborate on this point in the following posting.

  • jkdenne

    Couldn't the sentence "Mat' liubit doch" easily be perceived as "The daughter loves mother" in a specific context or if the first word is stressed? I agree that your interpretation would come first if the sentence is taken out of context. Though it would be a more common usage, language is rarely used in a vacuum, and context is everything. If so, wouldn't Russian qualify as a mixed language?

  • Asya Pereltsvaig

    @jkdenne: Thank you for your comment! You are absolutely right that in languages like Russian multiple word orders are possible but they are pragmatically (informationally) and intonationally different. This is exactly what allows us to determine the dominant word order in Russian: it is the one that can occur in the most neutral contexts and with neutral intonation. For example, as an answer to 'What happened?', if one says Tramvaj obognal trollejbus (Tram overtook trolleybus) it is understood overwhelmingly as 'The tram overtook the trolleybus' (not the other way around). Experimental studies with native speakers confirm as much.

    This also makes Russian (and similar languages) distinct from the truly non-configurational languages, Nunggubuyu-style, where a dominant word order cannot be similarly determined.

    Russian is also different from German-style languages, where different types of clauses have different word orders (e.g. verb-second in main clauses and verb-final in subordinate clauses). By the way, subordinate clauses in Russian are overwhelmingly SVO (which supports the view that it is the dominant order).

    To recap, Russian certainly allows multiple word orders but all the ones other than SVO are said to be derived from SVO, which is the dominant (typical, most neutral, etc.) word order. The fact that one of the orders is the dominant one means that Russian does not qualify as a mixed language according to G-M&R's way of classifying languages (as far as I can tell, since they don't give an explicit definition) or as a "language with no dominant order" as in WALS.

  • Maiki Bodhisattva

    Hi, I just discovered your blog yesterday and I’m still clicking through interesting posts from the past, so sorry if this was already addressed. Having grown up in Poland and studied various languages, I never understood the obsession with word order expressed by linguists based in Anglo-Saxon universities. As you point out above, many languages are relatively, if not fully, free-word-order. Gallia omnis divisa in partes tres est, no? I myself regularly use ‘non-standard’ order in my speech and writing and it rarely raises anybody’s brow. I expect it to be even more so for hypothetical Proto-Human, which probably didn’t have to deal with complex clauses. First language users should be happy to successfully convey the message ‘all spear mammoth much meat’. Only longer sentences require users to settle down with some form of standard order, and even then the necessity is not so strong for synthetic languages as it is for analytical ones (one of which, not so surprisingly, is the language this research was published in). The first proto-civilisations requiring any more sophisticated medium for recording laws and contracts were already thousands of years removed from PH.

    • Glad you find the blog interesting. Check out my other (more current) blog:

      Regarding your points: first of all, it is not clear how complex/simple PH was. The idea that it was just grunts that somehow got more complex and sophisticated does not meet linguistic muster, I am afraid. So it is quite possible that whatever made language (as in “recursive computational system…” etc.) possible, made it as complex as we know it now.

      Second, even in languages with “free” word order such as Russian or Polish the word order is not really “free” and there is an underlying or basic word order. For example, if I ask “What happened?”, you are not equally likely to answer in any of the six logically possible orders (of subject, object and verb) in Polish/Russian. This is very well documented experimentally.

  • In Norwegian, an affirmative sentence can have SVO or OVS order, but SVO is assumed if there is no clue as to which is the object and which the subject. Often the first utterance that establishes what the object is has SVO, while OVS is used thereafter.

    • That’s very true and a common pattern crosslinguistically. Russian is exactly like that too.

