Discovering the linguistic wheel?

Jul 25, 2011

On the one hand, it is great to see so much about linguistics in the popular press and the blogosphere. But on the other hand, many articles (in press and online) that purport to report some newly-made linguistic discovery often make me wonder…

A case in point: a recent report describing the great discoveries of a team of Oxford researchers in the historical morphology of Romance languages.

For starters, the article claims that the study looked at “the hundreds of Romance languages, including French, Spanish, Italian, Portuguese, Romanian and Catalan”. Hundreds, really? Which ones? Notice that they say “hundreds of Romance languages”, not “dialects” or “varieties”. Later on the article corrects itself by saying that the database of the research team includes irregular verb forms in “80 Romance languages and dialects”, a much more reasonable figure. The Ethnologue website, which generally takes a splitter’s approach to linguistic varieties, lists only 41 Romance languages (plus Latin, which is classified as a Latino-Faliscan language, a member of another branch of the Italic grouping within Indo-European). This list of 41 languages includes not only the better known Romance varieties like “French, Spanish, Italian, Portuguese, Romanian and Catalan”, but also Aromanian and Megleno Romanian, Judeo-Italian and Shuadit, Napoletano-Calabrese and Sicilian, Asturian and Extremaduran, Picard and Walloon. I doubt there could be hundreds of other Romance varieties, even dialects, in addition to those. Nor can one truly say that “Romanian or the French spoken on the Atlantic coast of Canada” have been “neglected by mainstream Romance linguistics”.

Another aspect of this article that really perplexed me is what exactly the reported discovery is. That irregular forms “are learned by successive generations despite ‘making no sense’ or, apparently, having any function in the language”? Anybody who has tried to learn Spanish or French or Russian knows that! And linguists have known for a long time not only that irregular forms survive, but also why and how they do so. Irregular forms arise as a result of historical changes in the system, such as phonological changes (the cited examples of the French je meurs ‘I die’ vs. nous mourons ‘we die’ are a case in point). The irregular forms survive because children learn them as such, and come to realize that these are exceptions to the regular rule. For example, English children learn the irregular forms like went and ate, after they go through a stage of overextending the regular form and creating forms like goed and eated. The reason that such irregular forms survive is that they are reinforced by frequency: rarer verbs become regularized with time, more frequent irregularities survive. Sometimes, the irregular patterns are also reinforced by analogical extension: that’s what happens with think-thank-thunk.

So if anyone can explain to me what the great, earth-shattering contributions of this research are, I am listening!

  • John Cowan

    If you click through to the actual database, you'll find that the claims in the report are (as usual) grossly exaggerated. There are 72 language varieties listed, not "hundreds"; there are about 360 verbs listed, however. What the database holds is IPA representations of various irregular verbs in these language varieties: there are about a dozen finite and non-finite forms available, and the finite forms can be obtained for any person/number combination. For example, this page lists all the available forms of [ˈdewɾa], the Algherese Catalan descendant of DEBERE.

    The database is not as complete as my description would suggest: there are huge gaps. Only 17 of the verbs are available for Algherese, and indeed only 66 for Modern Standard French. Furthermore, syncretized verbs are handled separately from unsyncretized forms. For example, Jèrriais [alɛ] is a syncretism of AMBULARE, ESSE(RE), IRE, and UADERE, as shown by the present indicative forms [vɛː], [va], [va], [alõ], [alɛ], [võ], the future and conditional in [ið-], and the past participle [ɛːtɛ]; but it is not easy to compare these forms with Standard Spanish [anˈdaɾ], which descends from AMBULARE alone.

    So what we have here is not groundbreaking research at all, but useful grunt work for the understanding of the development of Latin verbs, almost none of which are irregular, into irregular Romance verbs. That's not to say it isn't useful work, or that interesting things can't be discovered from detailed comparisons.

    On another note, overregularization (I've been reading Gary Marcus's articles over the weekend) isn't nearly as common as often supposed. In anglophone children, it peaks at about 4% of all instances. Admittedly, this is hundreds of times more frequent than adult overregularizations, though I once knew a woman, a native speaker, who regularly said I lied down, perhaps a hypercorrection. But children never actually go through a phase where everything is regularized, a point frequently misunderstood.

  • Asya Pereltsvaig

    @John Cowan: Thank you for your comment and for exploring the database in more detail than I have! You are absolutely right that overregularization is not a 100% situation: children produce overregularized forms alongside correct irregular ones. In fact, they use the (correct) irregular forms before they learn the rule, and hence before they can overregularize at all.

    As for my main point, I never said that the work of collecting forms from various linguistic varieties is not useful or cannot lead to interesting discoveries. It is just that so many reports in the popular press and the blogosphere exaggerate the actual findings. And of course such grunt work can be very useful. Have you read the "Latin Alive" book? What do you think of it?

  • John Cowan

    I haven't read it, unfortunately.

    What would be interesting to find out is how sporadic irregularization occurs. Irregular forms tend to become regular in inverse proportion to their frequency, and sound-changes accompanied by restructuring can cause whole classes of forms to become regular en masse, as in the development of the irregular passive suffixes in Maori. In proto-Maori, active forms ended in a consonant, and the uniform passive ending was -ia, but when final consonants were lost, all the passive forms became irregular at a single blow, as it was no longer possible to predict whether the ending should be -tia, -sia, -nia, etc. (Eventually -tia became the regular ending applied to loanwords, denominal verbs, etc., but it is by no means the most common ending.)

    But I know of no theoretical explanation why English should have shifted from digged, sticked to dug, stuck in the 16th century. Analogy, we say vaguely; but why should analogy operate in exactly those words and not in clicked, licked, picked, pricked, slicked, ticked? For that matter, there are no other English irregular verb forms closely resembling dug; the other verbs with /ɪ/~/ʌ/ alternation end in a nasal or /-k/.

    Have you seen any such explanation?

  • Ran

    @John Cowan, "sound-changes accompanied by restructuring can cause whole classes of forms to become regular en masse": 'Regular' here is just a typo for 'irregular', right? Or am I misunderstanding your Maori example?

  • John Cowan

    Typo, yes.

    Similarly, the collapse of the English strong verb system accounts for about half the irregular verbs, and minor violations of the morphophonemic rules for pronouncing -ed account for most of the rest.

  • Asya Pereltsvaig

    @John Cowan: I don't have any explanation for why these specific verbs became irregular and not others. And I don't know if such an explanation exists at all. I was hoping that this supposedly groundbreaking research would shed new light on this question, but it didn't (as far as I can tell).