The Institute for Molecular Medicine Finland (FIMM) has compiled the Finnish Gene Atlas, which contains genome-wide gene marker data for more than 40,000 Finns. Among the findings are two interesting points: (1) Finns are unique on the genetic map of Europe, differing considerably both from Central Europeans and from their eastern neighbors; and (2) genetically, Finns have more in common with, for example, the Dutch or Russians living in the area of Murom, to the east of Moscow, than with their linguistic relatives, the Hungarians; more generally, genetic closeness tracks geographic distance much more closely than it tracks linguistic distance.

This situation (genetic closeness to the neighbors rather than to the linguistic cousins) is not unique to Finns: the Hadza and Sandawe in Tanzania are genetically Bantu but speak Khoisan-related languages; the inhabitants of Madagascar genetically have roots both in Indonesia and on the East Coast of Africa, but speak an Austronesian language (Malagasy). Even the Finns’ linguistic cousins, the Hungarians, are genetically Central European but linguistically trace their lineage to the Ural mountains.

Discrepancies of this sort between genetic and linguistic roots arise through migration, conquest, massive second language learning and language shift. Typically, a small group of migrant warriors and/or a political elite manages to impose its language on the much larger local population; as a result, the genetic consequences of the conquest are much less significant than the linguistic consequences. For example, a relatively small group of Magyars established rule over a much larger population of Romance-speakers on what is now the Great Hungarian Plain; the estimates are that Magyars constituted no more than 30% of the resulting population. As a result, present-day Hungarians have very little of the Magyar (Uralic) gene pool, but the Hungarian language is clearly a Finno-Ugric (Uralic) language, unlike any of its neighbors (for example, Romanian, Polish and German, all of which are Indo-European languages).

All of this underscores one big difference between how genetic belonging and linguistic classification are determined. From the genetic point of view, what counts is how much of a given group’s gene pool matches which other genetic signature. For example, one can say that Hungarians are 10% Uralic and 90% European (or whatever these figures actually are). From the point of view of language family classification, one does not talk about percentages. For example, we cannot say that Hungarian is 10% or 50% or 90% a Finno-Ugric language. A given language either does or does not belong to a given language family. To determine whether a given language belongs to a given language family, it does not matter how much of its total vocabulary is shared (or cognate) with other languages in the family. What does matter is the source of the CORE vocabulary and of the basic grammatical patterns. The core vocabulary comprises words that denote body parts and physiological processes, kinship terms (‘mother’, ‘father’, ‘brother’, etc.), basic natural phenomena (‘day’, ‘night’, ‘sun’, ‘winter’), low numbers (usually, 1 to 10) and similar words. These are the parts of language that are the least prone to borrowing from other languages.

For instance, take English: only a small proportion of the present-day English vocabulary can be traced back to Old English and through it to the Germanic roots. There are also some later borrowings from other Germanic languages like German and Dutch, but even taking these into account, words of Germanic origin make up only a minority of the English vocabulary. The majority of the English vocabulary comes either from French (including Anglo-Norman French) or from Latin, yet English is definitely NOT a Romance language like French. Its core vocabulary is Germanic, and its basic grammatical patterns are Germanic as well. For example, English morphology closely resembles that of other Germanic languages (for instance, English, like German or Dutch, has past and present tense and marks the past tense in one of two ways, by adding –ed or by changing the vowel of the stem). English even has remnants of the once much more common Verb-Second pattern: in sentences that start with never or only…, the verb (auxiliary, modal or do) must come second, before the grammatical subject. This Verb-Second pattern applies in all main declarative sentences in languages like German or Norwegian.
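To make the contrast between "percentages" and "family membership" concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the word list is a tiny hand-picked sample (not a representative dictionary count), the names `SAMPLE`, `source_shares` and `classify_by_core` are made up for this example, and real classification relies on regular sound correspondences and grammar rather than simple counting.

```python
from collections import Counter

# Hand-picked illustrative sample: (word, etymological source, is_core).
# The proportions here are NOT real statistics about English.
SAMPLE = [
    ("mother", "Germanic", True),
    ("father", "Germanic", True),
    ("brother", "Germanic", True),
    ("day", "Germanic", True),
    ("night", "Germanic", True),
    ("sun", "Germanic", True),
    ("government", "French/Latin", False),
    ("justice", "French/Latin", False),
    ("language", "French/Latin", False),
    ("nation", "French/Latin", False),
    ("parliament", "French/Latin", False),
    ("religion", "French/Latin", False),
    ("science", "French/Latin", False),
    ("literature", "French/Latin", False),
]

def source_shares(words):
    """The 'genetic-style' view: fraction of all words from each source."""
    counts = Counter(source for _, source, _ in words)
    total = sum(counts.values())
    return {source: n / total for source, n in counts.items()}

def classify_by_core(words):
    """The 'family-style' view: one label, taken from the core words only."""
    core_counts = Counter(source for _, source, is_core in words if is_core)
    return core_counts.most_common(1)[0][0]

if __name__ == "__main__":
    print(source_shares(SAMPLE))     # a mixture of sources
    print(classify_by_core(SAMPLE))  # a single family label
```

In this toy sample the overall mixture leans toward French/Latin, while the core-only view returns a single Germanic label, which mirrors the point made about English above.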

Similarly, Hungarian is a Finno-Ugric language: a large proportion of its core vocabulary comes from the common Finno-Ugric stock, and it has agglutinative morphology, an extensive case system, vowel harmony and a vowel length distinction that is used to encode meaning. All of these “basic vocabulary and grammar” considerations outweigh the significant proportion of the Hungarian vocabulary that has been borrowed from its neighboring Central European languages, Latin and the general European word stock. Those borrowings do not make Hungarian any less Finno-Ugric.

When it comes to language, there are no “half-breeds”!

