I hear this question a lot, along with claims that this or that language is “the oldest”. The question has come up again in connection with the research on the birthplace of human language, published recently in Science and discussed earlier in this blog (and also here). The logic of the argument is this: Khoisan languages are the oldest languages in the world, they have click sounds, and therefore the first human language, the so-called Proto-Human, must have had click sounds as well.

This argument is flawed in several ways. Even if it were true that Khoisan languages are “the oldest” (a question to which I return immediately below), there is no way to prove that the first human language had the same properties (e.g., the same phoneme inventory) as modern-day Khoisan languages. While it is true that languages outside of Africa have not added click sounds to their inventories, it is not true that click sounds cannot in principle migrate from one language into another. The evidence comes from Bantu languages that have click sounds. In all likelihood, Proto-Bantu did not have click sounds. The only Bantu languages that have them today, such as Yeyi, Xhosa and Zulu, are those that have been in close proximity or in verifiable contact with Khoisan languages that have clicks. Thus, these Bantu languages must have borrowed click sounds from Khoisan languages. Another piece of evidence to support the borrowing theory comes from the fact that only dental, alveolar and lateral clicks are found in Bantu languages, whereas Khoisan languages may also have bilabial and palatal clicks.

But let’s get back to the question of the oldest language. If we give this question a bit more thought, we will see that it is really meaningless. Indeed, what would it mean to say that a given language is “old” or “ancient”? There is one sensible way to use the expression “ancient language” — for languages that were spoken in antiquity and are no longer spoken today. In this sense, the term “ancient language” applies to Scythian, Tocharian and Sumerian, to give just a few examples. But this is not the sense in which the expressions “the oldest language” or “the most ancient language” are commonly used. For example, Khoisan languages are not “ancient languages” in this sense: they are still spoken today by nearly 100,000 people. Furthermore, we know little to nothing about the history of these languages, what they were like in antiquity etc.

So why exactly is the statement that Khoisan languages are “the oldest” meaningless? Let’s think about this. Each person and each generation of people have parents. We can think of human history as a huge, immensely complicated family tree, going back to the first humans. And all these people spoke a language. Some language. Each group or generation of people anywhere on this planet speaks the language of its parents. Almost. Little, sometimes unnoticeable modifications in how words are pronounced, what they mean or how they are put together may be (and are) introduced from generation to generation but no generation makes up a completely new language, not related to that of their parents. If this were ever to happen, children and parents truly would not be able to communicate (rather than merely pretend that they do not understand each other). But these little changes accumulate over time so that the language of long-ago ancestors can become completely incomprehensible and unrecognizable to a later generation. But historical change is slow, very slow. It takes decades, hundreds if not thousands of years.

There are also historical situations when one language is replaced by another. When a group of invaders comes and subjugates the locals, it may impose its own language on the conquered people. It doesn’t happen in all instances of conquest, however: sometimes it is the conquered who give their language to the resulting mix of the invaders and the locals. For example, English derives from the language imposed by Anglo-Saxon invaders on the local Celtic population, but not from the language of the later Viking or Norman invaders (i.e., English is not a descendant of Old Norse or Norman French, and thus it is neither a Scandinavian nor a Romance language, despite the heavy influences of both). But even in the case of such language replacement, the switch is not instantaneous: it takes several generations for the new language to become the language of the land. So again, the language of the great-grandchildren is not the same as that of the great-grandparents, but contemporary generations speak in ways that differ little.

So new languages do not arise from nothing. Every new language is a development of some branch of the bushy family tree of existing languages. The key to understanding this is that languages constantly change. As I said above, languages change slowly, but the process of change is nonetheless relentless. But people who start out speaking the same language do not necessarily adopt the same modifications to their language. One group may change the pronunciation of one word, and another group — the meaning of another word. And before you know it, different groups, which may be distinguished geographically or socially, speak different dialects. These dialects are still mutually comprehensible (which is what dialects are, by definition), but they are not exactly the same. And the process of language change does not stop there. More and more changes in pronunciation, meaning and grammar accumulate and eventually dialects diverge to such a degree that they are no longer mutually comprehensible, at which point we tend to call them languages rather than mere dialects of the same language. In other words, new languages arise as variations on the theme of already existing languages.

Proto-West-Germanic, Proto-Germanic or even Proto-Indo-European languages. In theory, we can trace an unbroken line of descent with small modifications at each generation from Proto-Human to present-day English.

And the same can be said for each and every language. In fact, each and every language traces its ancestry back to Proto-Human. Therefore, all currently spoken languages are equally old. The difference is just in the labels. We think of a given language as “old” if it has had the same label for a long period of time. For example, the name of the Farsi (or Persian) language, an Indo-Iranian language of Iran, has been used since the 6th century B.C.E., and this fact gives us the sense that Farsi is an ancient language.

In the case of Khoisan languages, the label itself is not very old, and these languages are perceived as old in a somewhat different sense. If we consider a genetic tree of human populations (such as in the picture below), Khoisan people are typically on the branch that splits off the common tree first.

Can we extrapolate from this that the present-day Khoisan languages are “the oldest”? Not really, and for two reasons. One reason is that population trees constructed by geneticists don’t always correlate with a linguistic tree. The fact that Khoisan people belong to the branch that splits off the population tree first does not mean that they speak a language that splits off the human language tree first. The second reason is that even if it were true that the Khoisan languages split off the language tree first, it does not mean that they have not changed since. After all, remember that all languages change, all the time. And the Khoisan languages as we know them today are not like the language that Khoisan ancestors spoke centuries or millennia ago. In theory, Khoisan languages may have “invented” click sounds at some point in this history, or even borrowed them from some other languages (that left no descendants or have lost click sounds since then). There is no way to prove (or disprove) that the proto-Khoisan languages of yesteryear had click sounds, let alone that the first human language, Proto-Human, had them.

  • voraratis

    We think of a given language as "old" if it has changed little and would still be understood by ancestors who lived many centuries ago. Understanding is not measurable and thus not objective, and how they spoke is guesswork, but it is still the primary definition, in my view.

    • So basically, mutual intelligibility is to languages what interbreeding is to species. 🙂

      • Sort of. Except interbreeding is more of a black-and-white issue than mutual intelligibility.

    • “if it has changed little” — the problem is that comparing how much this or that language has changed would be difficult and all languages change anyway.

  • Asya Pereltsvaig

    @voraratis: Thank you for suggesting another way of interpreting "the oldest language". I agree with your intuition, but unfortunately, although very intuitive, this definition is as meaningless as the one I described in my posting. As I mentioned, all languages change, and over long periods of time (relevant for this question), all languages change at about the same rate. While there are periods of more significant "leap forward" changes, they are typically balanced out by periods of more stability. So in this sense — the language that is most like the ancestral language — all languages are the oldest too…

    • Tuli000

      @Asya Pereltsvaig: I agree that the question is naive, but I don’t think the answer is complete.

      “Typically” would mean “on average”. But then there will be a small fraction that are far from the average. These would count as the most conservative.

      If language change really occurs at the same rate everywhere (say modelled on a random walk), these outliers can be considered an uninteresting accident, like the fact that one sand dune is taller than another. A scientist would study the process of sand dune formation, but does not assign a special status to the tallest ones.

      But if the rate of language change is connected to major systematic variables such as the role of writing or the amount of contact, and if these vary in a big way across our finite historical landscape, then there might be some interesting concrete examples of old languages.

      For example, what do you think about claims I have read that, say, Greek (or Lithuanian) has been particularly conservative over the last two thousand years? Compared to other European languages or world languages.

      In any case this certainly won’t get us back to “original” languages, if that is what is meant by oldest. It seems intuitively plausible that, as you suggest, on such a long time scale all languages have changed so much that there are no winners. But more detail would be welcome.

      • Thank you for sharing your thoughts and questions! I agree that the statement I’d made (somewhat too quickly, I admit) that all languages change at the same rate needs fine-tuning. For one thing, it only applies to relatively long time spans. And as you mention, writing appears to slow down the change (but note that writing is a relatively recent phenomenon, maybe 5,000-6,000 years old, so over longer periods of time it wouldn’t be significant at all). As for the claim that Lithuanian is the most conservative of the Indo-European languages, it is often made, and it shows that not all languages change at exactly the same pace. However, such differences in pace may be rather insignificant relative to the overall amount of change. To use your sand dune metaphor, if one dune is an inch taller than another, but both are 6 feet high, we might want to ignore the minor height difference between them. Hope that makes sense.
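The sand dune point can be illustrated with a toy simulation. This is only a sketch under assumptions that are not from the discussion above: each language accumulates a random, non-negative amount of change per generation, drawn uniformly from [0, 1), and the function name `relative_spread` and its parameters are hypothetical. It is not a model of real language change; it only shows how differences between languages can shrink relative to the total amount of change as the time span grows.

```python
import random
import statistics


def relative_spread(n_languages, n_generations, seed=0):
    """Spread of accumulated change across languages, relative to its mean.

    Assumption (illustrative only): every generation adds a change amount
    drawn uniformly from [0, 1), independently for each language.
    """
    rng = random.Random(seed)
    totals = [
        sum(rng.random() for _ in range(n_generations))
        for _ in range(n_languages)
    ]
    # Standard deviation across languages divided by the average total change.
    return statistics.stdev(totals) / statistics.mean(totals)


short_run = relative_spread(n_languages=200, n_generations=10)
long_run = relative_spread(n_languages=200, n_generations=1000)
print(f"relative spread after 10 generations:   {short_run:.3f}")
print(f"relative spread after 1000 generations: {long_run:.3f}")
```

In this toy setting the absolute gap between the most and least changed language keeps growing, but more slowly than the total change itself, so over a long run the "dunes" look nearly the same height. Whether real languages behave this way is exactly what is disputed in the thread.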

  • jazzmoth

    "So new languages do not arise from nothing. Every new language is a development of some branch of the bushy family tree of existing languages."

    I would like to point out that this is misleading – there are languages that are not connected to the "family tree". For instance, Nicaraguan Sign Language arose from homesigns, which are not genealogically connected to any contributing language. I do not know if spoken languages have ever arisen in this manner, which I believe is what you intended.

  • Asya Pereltsvaig

    @jazzmoth: Thank you for your comment! You are absolutely correct about exceptional cases like Nicaraguan Sign Language that was more or less "created from scratch" (I was wondering whether I should mention it in the posting, but I felt it would take me too far afield). This is very rare among spoken languages (and even among sign languages) though it happens with pairs of "wild children" isolated from human society from birth. But such cases are extremely rare ("wild children" are rare as is, and being isolated alone is more typical for them). And I don't know of any such "wild children lingo" taking root as a communal language of any human group.

  • uzza

    Mmmm. Deaf children most often do not speak the language of their parents, and signed languages do not arise from 'pairs of “wild children”'. They arise and take root as a communal language wherever the social conditions allow it. The best documented examples of this happening are ISN (Nicaragua), ABSL (Israel), and probably Martha's Vineyard SL.

    The necessary social conditions were more common in pre-technological populations, but we have no way of knowing if this occurred also with early hominids.

  • Asya Pereltsvaig

    @uzza: yes, as I already mentioned, I was talking about spoken, not signed languages. Yet, even when it comes to sign languages like ASL and BSL and others, they are learned from other speakers, not made up from scratch by every deaf person…

  • uzza

    before you say anything else you should look up ISN and ABSL. They were not learned from other speakers.

  • Asya Pereltsvaig

    @uzza: I am familiar with ISN and ABSL. Once again, though, I am talking about the big picture, whereas you bring up two exceptional cases that are only tangentially relevant to the topic of discussion (spoken languages).

  • John Cowan

    I'm surprised by the claim that all languages change at broadly the same rate, integrated over long periods. It seems clear that Baltic is more conservative than Germanic, Standard German more conservative than English, Spanish more conservative than French, Tolomako more conservative than Sakao, relative to their respective common ancestors. To be sure, there may be features that go against the general trend. Cantonese, for example, has lost the pre-nuclear semivowels of Middle Chinese whereas Mandarin has kept them, and has split tones where Mandarin mostly has not, but its overall conservatism in the rest of its phonology, in syntax, and in lexis seems clear enough.

  • Asya Pereltsvaig

    we are not talking about the same length of time…

  • be_slayed

    One could also talk about the oldest recorded language (which would be Sumerian).

    But in any case, the popular press often irritates me on this point. I recall one article which discussed both Hindi and (Mandarin) Chinese. The writer explained that Chinese is 2000 years old (or something like that) while Hindi is only a couple hundred years old.

  • Asya Pereltsvaig

    @be_slayed: Thank you for your comment. Indeed, the statement that Chinese is 2000 years old and Hindi is only a couple hundred years old is meaningless. The only difference is that old(er) forms of Chinese are still called Chinese, while Old Hindi is called Sanskrit. But your point about the oldest recorded language is well taken. With one little problem: we should be talking about the oldest recorded language *as far as we know*: there might have been a language written down earlier, but maybe none of those writings have been found (or preserved).

  • William

    "Khoisan languages are the oldest languages in the world, they have click sounds and therefore it must be that the first human language, the so-called Proto-Human had click sounds as well."

    I realize you're summarizing Atkinson's argument to later refute it, but I don't think C (the so-called Proto-Human had click sounds as well) necessarily matches the A and B here. My interpretation would be that since the Khoisan languages have the most complex phonemic inventories today, they've changed the most over time, so in fact "Proto-Human" would not have most of the features of the modern Khoisan languages.

    This jibes with your observation elsewhere that "émigré colonies typically retain an older state of their language than the homeland community", the implication being that the homeland community's language changes in the period subsequent to the emigration. Since one way to change a homeland language is to diversify its phonemic inventory by making it more complex, in this scenario the Khoisan or Bantu languages serve as a homeland language to the rest of the world's languages.

    In that sense, the Khoisan languages may be the "oldest" in the world, but since they've changed (phonemically) more than any other language family, in another sense they're also the newest.

  • Asya Pereltsvaig

    @William: Thank you for your insightful comment!

    You are absolutely right that the statement that Proto-Human had clicks does not follow from the fact that Khoisan languages have clicks (even if we assume that in some relevant sense, they are the "oldest languages"). Here, I was just summarizing the argument that was made. So there is a logical flaw to the argument, in addition to one of the preconditions being false.

    Nor is it true that Khoisan languages have "the most complex phonemic inventories". It is true that they have click sounds, and some of them have clicks galore, but there are other types of languages with other types of rare sounds (implosives, ejectives, complex tones, etc.). If phoneme inventory complexity can be measured at all (something I very much doubt, and on which I am hoping to write an academic paper one of these days), it is likely that Niger-Congo languages will end up as the most complex. Still, we shouldn't forget that phoneme inventory richness is just one, arbitrarily chosen factor in language complexity.

    As for whether the most complex languages (whichever languages these are) are the newest or the oldest, it is not possible to make the argument one way or another. Languages change but overall complexity seems to remain about the same (the problem is, of course, that language complexity cannot be measured objectively since different types of complexity are not easily comparable).

    Finally, when it comes to emigre and homeland languages, I was talking about more or less modern times, and emigration perceived as such. It is not clear if gradually migrating bands of early modern humans perceived themselves as emigres at all (at least, in the relevant sense).

  • William

    Here's one way to measure phonemic inventory size, from the Science article:

    "I examine geographic variation in phoneme inventory size using data on vowel, consonant, and tone inventories taken from 504 languages in the World Atlas of Language Structures (WALS)[.]"

    Obviously we may disagree with the usefulness of the method, but it is a quantifiable measure of phoneme inventories.

    Using that method, here are the paper's general findings:

    "[The] largest phoneme inventories [are] in Africa and the smallest in South America and Oceania."

    I apologize for assuming that you had looked into the paper's data to see that the Khoisan languages were the most diverse phonemically. Actually, according to Atkinson's method and data, it's the fourth-most diverse family:

    Diversity*   Family
    1.230475742 Hmong-Mien
    0.939518526 Tai-Kadai
    0.665052419 Niger-Congo
    0.652261428 Khoisan
    0.592516392 Kadugli
    0.590997617 Nilo-Saharan
    0.551371994 Oto-Manguean
    0.542626779 Sino-Tibetan
    0.315508385 Afro-Asiatic
    0.315294145 Austro-Asiatic

    *Mean Total Phoneme Diversity = "WALS contains information on three elements of phonemic diversity – vowel, consonant and tone diversity [….] WALS values for the three items were standardized [and the] standardized scores were then averaged to produce a measure of total phonemic diversity in each language."

    As you can see, five of the top 10 most diverse language families (phonemically, according to Atkinson's method) are African. This, in concert with estimates of speaker counts and distance measurements, and the idea that emigrants tend to conserve a less diverse dialect (either through the serial founder effect or the homeland/émigré effect), implies that human language originated with people in Africa and radiated out from there to the rest of the world.
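The "standardize, then average" step quoted in the footnote above can be sketched in a few lines. This is a minimal illustration, not Atkinson's actual code or data: the input values are made up, the function names are hypothetical, and the choice of the population standard deviation for standardizing is an assumption, since the quoted passage does not say which was used.

```python
import statistics


def zscores(values):
    """Standardize a list of values to mean 0 and standard deviation 1.

    Assumption: population standard deviation; the source does not specify.
    """
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def total_phoneme_diversity(vowels, consonants, tones):
    """Average the standardized vowel, consonant and tone scores per language."""
    columns = [zscores(vowels), zscores(consonants), zscores(tones)]
    return [statistics.mean(row) for row in zip(*columns)]


# Hypothetical inventory scores for four languages (NOT real WALS values).
vowels = [1, 2, 3, 2]
consonants = [2, 3, 5, 1]
tones = [1, 1, 3, 2]

scores = total_phoneme_diversity(vowels, consonants, tones)
for index, score in enumerate(scores):
    print(f"language {index}: {score:+.3f}")
```

Because each component is centered before averaging, the resulting scores are relative to the sample: a positive value means "more diverse than the average language in this data set", which is one reason the choice of sample matters for rankings like the one above.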

    • H Klang

      Why would the same processes that produce all those phonemes in the original languages not continue to operate in the emigre populations, given the very long time they have had to do so?

      In other words, I can understand the serial founder effect as an impoverishing effect that occurs at the time of emigration, but why would the results (fewer phonemes) then persist over many millennia? Particularly when the languages evidently have had time to re-diversify in so many other ways — or are African languages more diverse in other ways as well?

      • Thank you for your comment! I agree that Atkinson’s theory has no response to these questions. If indeed our ancestors were losing phonemes during their trek out of Africa, why weren’t phonemes also added? In fact, we do know that languages add phonemes and even whole classes of phonemes. This is one of the points that my co-author Rory van Tuyl and I make in our response to Atkinson, to appear in Science, so stay tuned!

  • Asya Pereltsvaig

    @William: thank you for the comment regarding Atkinson's work on phoneme inventory vs. distance from point of origin. I am familiar with Atkinson's work and I wrote a short critique here in this blog (http://languages-of-the-world.blogspot.com/2011/04/where-was-human-language-born.html). I am also in the process of writing a co-authored response to Atkinson's article. So I will not get into details here, but the gist of it is that Atkinson's phoneme counting is terribly flawed. There are additional problems with his statistical tools and his conclusions. Given all this, his work isn't worth much, IMHO.

  • Anonymous

    Tamil is the oldest language of all and it is still spoken to this day.

  • Asya Pereltsvaig

    @Anonymous: what if anything do you base your statement on? This is a blog for scientific discussions, not ideological statements!

  • D.Saravana

    How can you explain "sweet" to a person who has never tasted it? He could not understand it. It has to be experienced to be understood, and you cannot explain it by the scientific method either. If you want to know that the Tamil language is the oldest one, you have to read "திருமந்திரம்" ("Thirumandhiram"), written by "திருமூலர்" ("Thirumoolar"). It was written in 2000 BC. He also lived for 3000 years and 13 days; he was actually a sidhar, which means "he knows everything". I was amazed too when I read some pages of that book (it is all in verse and very difficult to understand even if you are a Tamil person, because nowadays we use modern Tamil). He explained more than a doctor or a psychiatrist could. It may be a different world to you. Some things cannot be explained by today's scientists (because Thirumoolar must have been a scientist in his time).
    If you can explain this by the scientific method, please do.

  • Asya Pereltsvaig

    @D.Saravana: The stuff you've posted on the man without food is outside the scope of this blog and will be erased. As for your claim that Tamil is the oldest language, I suspect you haven't read the posting carefully or you would see that you contradict yourself. The Old Tamil is not the same language if it can't be understood by a Modern Tamil speaker. More generally, however, let me remind you that this blog is a space for scientific discussions, not ideological statements!

  • D.Saravana

    OK, but in today's world science must improve. You have to understand why I am saying this. Nowadays we have found that there are 9 planets, but 2000 or 3000 years ago Tamil people had already found that there are 9 planets out there. How did they do it? You can see the 9 planets as gods in all Tamil temples.

  • Asya Pereltsvaig

    @D.Saravana: Thank you for your comment. No doubt science will improve in the future (that's the hope anyway), but it will not improve by rejecting the scientific method and adopting political ideology or religious beliefs instead.

  • Balaji

    @Asya Pereltsvaig:
    Dear Asya,
    As Tamilians, we found a lot of scientific truths and explained everything in our literature, which is now considered mythical. Now science is trying to find proof for it. Though we are talking in terms of religious belief, that belief is what kept us from ruining the world with our scientific discoveries, as we are doing now in the name of technology. Do you disagree?
    We never abused science, because we consider it holy.

    And one more thing: a lot of your scientific claims that had earlier been proved have now been disproved, like atomic theory.

    As for your need for scientific proof of the oldest language:
    Tamil civilization started in Kumari Kandam (Lemuria), which was submerged before 16,000 BC. A lot of texts and literature that would serve as proof were lost there. Do you have any proof, or at least a belief, of any language that existed before this?
    As we don't know the origin of anything we have, we believe that it is given by God or Nature (if you are not an atheist).

  • Asya Pereltsvaig

    @Balaji: Many groups have beliefs that they go back longer than the rest. Even if true, it doesn't mean that the language they speak now is the same that was spoken then. Please read the post closely: it is all explained there. As for language that is as old as 16,000 BC (18,000 BP), humans have had language for about 100,000 years (if not twice that). That makes the oldest human language 82,000 years or so older than Kumari Kandam.

  • Balaji

    @Asya Pereltsvaig:
    My concern here is that ideological statements cannot be avoided completely. Don't look askance at people who make such statements. These people care more deeply than a scientist about what they have in mind, because they think of it as heavenly.

  • Asya Pereltsvaig

    @Balaji: Everybody is entitled to their religious and ideological beliefs; however, this is not a forum for them.

  • stefan

    Dear Asya, thank you for the blog (and your patience…). I love languages, studied Indo-European languages for a long time and have now learned Arabic in order to know a language belonging to a different linguistic family. The entire question of the oldest language is quite meaningless indeed. People raise it, as in your blog, for ideological and nationalistic reasons. I have met Arabs who believe Arabic is the oldest language since they believe that God speaks Arabic. There are Jews who believe the same, etc. Meanwhile, it is clear that both languages are just branches of the Semitic languages, which are the cousins of the Afro-Asiatic languages, and both go back to the same proto-language, etc.
    Since Indo-European has been studied extensively, one can prove in a jiffy that none of the Indo-European languages is the oldest and all go back to Proto-Indo-European (a reconstructed language), probably spoken in the Black Sea region 10,000 to 7,000 years ago. The precise timing is a matter of dispute; I tend to believe that it was spoken more than 7,000 years ago. The first written Indo-European text is Linear B (1500 BC) and it is quite simple to see that it is proto-Greek. So in 3,500 years (1500 BC until now) the changes are in that respect in a sense limited. Go 3,500 more years back in time and you would be at 5000 BC, 7,000 years ago: but the changes from Proto-Indo-European to all the first recorded Indo-European languages (Greek, Hittite, Latin, Gothic etc.) are enormous. So therefore I believe that 8000 BC is probably closer to the period of Proto-Indo-European. Recently several studies were released coming to the same conclusion on several grounds.
    One day the research on the Semitic languages will be as extensive as that on Indo-European, and then it will become even more demonstrable that issues like "is Hebrew older or Arabic?" (and does that say something about which religion is best) are meaningless.
    The real question is how far back we will be able to go with regard to languages. Reconstructions like Proto-Indo-European can, as seen, go back up to 10,000 years. But now genetics gives hope for more insight, like the genome of the Neanderthals, which showed that they, like Cro-Magnon man, had the language gene. The future looks fascinating from that perspective. Thanks again.

  • Asya Pereltsvaig

    @stefan: Thank you for your comment! I am not sure I understand the logic behind your reconstruction of the PIE timing or the connection between the "oldest language" and "which religion is best", though…

  • stefan

    Thanks. Regarding PIE. If PIE was spoken around 5000 BC (and some claim even 4000 BC), then the changes between 5000/4000 BC and 1500 BC were enormous. The first recorded Indo-European texts date back to about 1500 BC, so called linear B, used on Crete, and Hittite in Asia Minor. After Ventris deciphered linear B (which used the same alphabet as linear A, which was however probably a semitic language), it turned out to be a quite recognizable form of Greek in an earlier stage (I read the texts). Of course there have been major, major changes in Greek in the next 3500 years but those changes are to my feeling far more limited than those from PIE to linear B 1500 BC. Any history of the Greek language now starts with linear B: it is in a sense one group that clearly belongs together in spite of all the obvious changes, and I would call that in a sense fair. If you take the first Latin texts (700 BC), it is flabbergasting to see how "limited" (in a sense) the changes were from then to the modern latin-based languages. Again far less change in that period than from PIE to the first recorded PIE-based languages. If you want to think in terms of only 2000 years: especially Augustan Latin and modern Spanish are remarkably close. I wouldn't call these impressions scientific but neither are they baseless. I don't know if it would be possible to put the speed(s) of linguistic change in a model but if one day it would be possible, it might indeed lead to the conclusions that the period between 5000/4000 BC until the first recorded PIE-based languages was too short to explain the big changes. Therefore I have a kind of natural sympathy for those scholars who believe PIE to have been spoken far earlier than 5000 BC like Mario Alinei (who taught at the university I attended but I followed his lectures, a shame). If the "early daters" are right, we now know that even after about 10.000 years it is still possible to recognize languages as part of one family. 
    In that regard it is interesting to note that to date no connection has been established between Asian languages and native American languages (apart from Inuit), though it is clear that the native Americans came from Asia. Most now believe that that migration took place from 30,000 to 15,000 BC. It may be that all these migrants spoke languages not related to the existing Asian languages, but it seems to me more likely that the changes are too big to still find the relationship (and the fact that most of these languages were not written doesn't help).

    As far as religion is concerned: I have spoken with Arabs who believe that Arabic is the oldest language (and very special compared to other languages) and Jews who believe the same about Hebrew, both on the basis of religious bias. They get confused, however, when you present them with the family tree of their languages and the fact that these languages were actually derived from common "ancestors", which were in turn derived from still older "ancestors", which makes both the "special divine origin" and the "oldest language" claims rather meaningless.

  • Asya Pereltsvaig

    @stefan: Thank you for your detailed comment. I understand now what you meant by saying that PIE must have been spoken earlier than the agreed consensus. It's not clear to me, though, how the closeness/differences between languages (or stages thereof) can be measured objectively in order to support this view. A lot of scholars actually believe that the languages of the oldest Ancient Greek, Latin and Sanskrit texts are closer to each other than one would expect given the 7,000 YA date.

    Nor is it a fact that language change happens at an even pace across times. In fact, it is probably not so, but we don't really have a clear picture of that yet.

    As for religion and language, it is a fact that Biblical Hebrew is older than Classical Arabic (as in "was spoken earlier"), and it is also obvious that Judaism predates Islam — but how can one make conclusions about which religion is best based on that, I don't know.

  • Sandesh

    Language is just a weapon to make someone understand! When you are dying you use sign language to ask for water; there your oldest or finest language won't come to help you!

  • Sandesh

    It's common that we people think that what we are following is correct and that it is the best! In fact nothing is. The world, or language, or anything for that matter, wasn't created from scratch; it transformed from one state to another. People argue that their language is older, their culture is perfect, and so on.

    Humanity is my religion and I follow each and every aspect of Humanity. Humanity never tells me to kill animals to get rid of hunger,… WHAT RELIGION DO YOU BELONG TO? Don't see the world in a first person view, see the world in third person view! Then everything looks silly, including yourself. Till you die, say Humanity is my religion and I'm here to help others, taking care of others and for others! YES, LIFE IS A FIGHT, but not with others. It's between you and your thoughts! THINK DIFFERENT, HAVE A BROAD MIND!

  • Sandesh

    Above two comments for those who say that their language is oldest!

  • Asya Pereltsvaig

    @Sandesh: Thank you for your comments! Keep reading!

  • subramaniam

    Kumari Kandam, a million years ago. Now under water. Mayan also same with Tamil Sangam. So my comment is Tamil.

  • Costas92

    The oldest language is Albanian… it comes from the Pelasgians to the Illyrians and now to the Albanian people. We speak Homer’s language, so..

    • Hmmm, none of those things you mention are nearly old enough: the “oldest” human language is 100,000 years old, whereas Homer and Illyrians are much much more recent!!!

      • HTB

        But pelasgians is older than ancient greek language

  • Armando

    As everyone should understand that the language we are speaking now (English) is only 816 years old, and Tamil is 27600 years old i got to say that its really really gorgeous, and the Sanskrit is almost 50000 years old so Sanskrit and Tamil was scientifically proven as the oldest language in the planet….   

    • So what exactly happened in 1196 AD that marks the birth of English?

      It’s a pity you haven’t read the post and the subsequent discussion, as you would see how misplaced your comments are. “Scientifically proven” indeed! [snort]

  • David Cowley

    On the Khoisan (Bushmen): as you say, the human family tree shows them ‘splitting’ off first. Now, the fact that studies show these folk seem to be the most genetically diverse population in the world means that, ecologically speaking (and I am an ecologist, by the way), if you had to bet on any single group of people as most representing the original modern human stock, then the Khoisan are your best bet. That would apply to any other species (plant, animal etc) – the most genetically diverse area is the most likely to be where the species arose. (Meaning it would be more accurate to say everyone else split from them, really.)
    Now, on the language question, of course languages change, so the idea of which living lang is the oldest gets complicated (I note by the way that the Khoisan languages fall into several distinct groups). But, again, if one had to bet on which languages were the most likely to be closer to the earliest human tongues – then for the same reason as the genetics, we should go for the Khoisan. I confess by the way to being something of a bushman ‘fan’ – they are amazing people with a hunter-gathering way of life that has almost gone but is, again, a link to our distant past. 

    • Thank you for sharing your insightful comments, David!

      Re: the genetic diversity of Khoisan, I am not sure they are that diverse within the group. They are very different from the rest, but among themselves? I am not sure.

      As for languages, although there are some groupings within the family, most scholars, as I understand, accept them as a valid language family. Internal subgroupings based on shared descent can be found in any language family. So having distinct subgroupings doesn’t make Khoisan languages an “old language” (or language family). Nor is there any sense in which Khoisan languages are “closer to the earliest human tongues”, as I discuss in the post. There is no way to know what the earliest human tongues were like. But if we had some magical way to find out, I am almost sure that it would be some non-African language that would be the most similar grammatically, because which language that would be is entirely accidental. No contemporary human language is closer to the original one because of not having changed as much. If I had to put a bet on where to find a language most like the original one, I’d say Papua New Guinea, because that is where there are the most languages (per unit of area)…

      • Millie

        Adam and Eve, from the Bible spoke Arabic. And they were the first humans created, as it reads in the bible.
        It all depends if you are religious or not. But that is what I believe.

    • thompst

      A couple of points here. First, to the main article: while geneticists agree that San (Khoisan) genetic groups split off earlier than all others, this does not mean *in any way* that they are “older” genetically; like languages, genes also continue to change. The common phrasing is very unfortunate, and misleading. Modern San are just as contemporary, and have been changing just as long, as everyone else, just on their own track.

      Back to language, the same general principle holds true. There is nothing to support the assertion that the “more original” languages included clicks that were later lost. It could also be that clicks were developed by the Khoisan long after the major split between Khoisan and non-Khoisan languages, perhaps rather recently (the great Bantu expansion was not that long ago). Without a time machine, there is absolutely no way to determine this.

      • I’ve heard geneticists refer to these as “older lineages”, though of course they are not necessarily preserving their genes over time.

        And yes, there’s no way to determine whether the original language had clicks. At least so far.

  • arber

    Webster’s New Twentieth Century Dictionary,
    Unabridged Second Edition, De Luxe Color, William Collins and World
    Publishing Co., Inc., 1975.
    Albanian is the first.

  • Samnort


    Scientific discoveries made by famous scientists argue that Russia and the Russian language are very ancient in the world. Linguists and geneticists proved it.

    • You must be reading Zadornov & Co—pseudo-science at its best! If you speak/read Russian, I would suggest you check out Andrei Zalizniak’s work on the subject. E.g. http://languagesoftheworld.info/site-news/andrei-zalizniaks-lecture.html

    • Firuz

      An old language should be free of exceptions; I mean, language as a product of the brain should follow mathematical rules. Russian, with so many exceptions, couldn’t be an old language. The older languages of the present day are those that have few exceptions in their structure, grammar etc…

      • “an old language should be free of exceptions” — and you base it on what? because if you think logically about this, exceptions arise over time, for example exceptions to morphological patterns of inflection (e.g. irregular verbs like sing-sang-sung) arise as a result of independent phonological changes. And if exceptions arise over time, the older the language the more exceptions you’d expect to find there, no?

  • KKSK

    tamil is the oldest language in the world the civilization which were still present since 10,00 B.C

  • subramaniam

    Tamili.tamil is the father langguage in the world.1 sangam 2sangam 3sangam . The 1sangam born in kumari kandam pandyas kingdom. Now under water .

  • MatthewTanner

    The graphic is illegible.

  • Stephen John

    The LORD said, “Behold, they are one people, and they all have the same language. And this is what they began to do, and now nothing which they purpose to do will be impossible for them. 7 Come, let Us go down and there confuse their language, so that they will not understand one another’s speech.” 8 So the LORD scattered them abroad from there over the face of the whole earth;

  • FP

    Dear discussants,
    if I can suggest try to find a work (theory) of Dr.Cyril Hromnik about languages 🙂

  • Drlove Singh

    I want to say only one thing: Sanskrit is one of the oldest languages in the world and it should be mentioned in the list. It is scientifically proved that Sanskrit is more than 2500 years old, so why is it not in the list of the oldest languages of the world?

    • Why don’t you read the post before you make a fool of yourself? There is no such thing as “the oldest language in the world”!

  • FreeSpeech101

    Koreans speak Tamil words. Cameroonians speak Tamil words. We are not talking about loaned words here and there; in Korean, it is literally the main day-to-day words. It’s quite unbelievable. Even in the eastern corner of Russia the aborigines, and the NZ Maori, speak distorted Tamil. Tamil is the mother of all languages. Sanskrit is a language perfected from Tamil.

  • Shiva Shanker

    As a response to some of the statements raised by the participants: it was claimed elsewhere that Sanskrit is 50000 years old and Tamil is 27000 years old. This is baseless. If you take the oldest available Tamil literature and the oldest Sanskrit literature, Tamil is much older than Sanskrit. Sanskrit is derived from Prakrit and Prakrit also had a precursor language, whereas Tamil already existed when old Prakrit was spoken. Researchers like Alex Collien confirmed this. Listen to this video clip on YouTube: http://www.youtube.com/watch?v=WRghF1pQccA

  • Sarvan

    Tamil language is the largest language

  • Moinullah Khan

    According to SANSKRIT TRACED TO ARABIC, By Sheikh Mohammad Ahmad, B.A.Hons. L.L.B., Advocate Lahore High Court 1982, Arabic is the mother of all languages. He has proved this now but Quran stated this 1435 years ago.

    • The Arabic of the Quran is from the 7th century — Biblical Hebrew predates that by two millennia. But let’s not worry about facts getting in the way of your ideological convictions!

  • Aditya

    I appreciate the amount of research you have done on this very interesting topic. But as far as I can tell, this article mainly discusses the Khoisan languages and doesn’t really answer the question in the title.

    • You seem to have missed the point of the post: that there is no such thing as “the oldest language”.

  • The chart is too small. I would like to see the details.

  • Atadiusti

    Viewed the article for the chart. both the hypertext redirect and thumbnail were broken.
    Sad reader am I.

    • What do you mean? Works for me…

      • Atadiusti

        I mean that if I click on the thumbnail, I get redirected to the thumbnail as an image page, rather than the full-size image that would allow me to see the text. If I click the hypertext, I get redirected to another page of this site with the banner “No Results Found”

        • Try now, I’ve linked to the image and not its thumbnail. If that’s still too small, write to the authors of that blog to see if they have a better image.

  • Bill Wade

    A question that intrigues me is the extent to which the flexibility of a language influences the flexibility of its parent culture, and vice versa. Do English-speaking cultures accept change more readily than those with less flexible languages? If so, is it because the English language has a richer array of linguistic roots to draw from to accommodate change? Or is the language more flexible because the culture is more flexible? Or both?

    • Actually, it’s the possibility you didn’t consider: that there are no languages that are “more flexible” or “less flexible”. All languages accommodate change. English isn’t peculiar that way.

  • FreeSpeech101

    Tamil is oldest language. But i do not believe in lemuria continent fantasy.

    Reason is simple. I find korean speak atleast 1000 tamil words. Tamil any city name end up with oor means village/city. The name is used all over Asia. Cameroon kid speak tamil words. Kinda distorted but it was astonishing. No country except korea has many tamil words outside india. It has almost same culture even though heavy chinese influence.

    My theory is that proto-Tamil came from North Africa and people moved north first and then east to Mongolia, Japan and America. Another branch went through Thailand and Indonesia to Australia. I do not think the people who spoke tamil probably way different to people who spoke tamil in 100bc in tamil nadu.

    Because my theory is that people probably moved 10000-5000bc. Now making any statement of oldest language is just ideology.

    I really disagree click sound is original language or first language. You are by product of your surroundings. If there are lots of birds and sounds people tend to copy them. Language in other areas formed on different sources.

    Also claiming click sound language is old is simply stupid and idiotic.

    Proto Tamil may be oldest language. Now what is proto tamil? No clue. Simply because each region has their own distorted form. Only reason i say tamil even cameroonian kids speak tamil words. engae Erruku. That is impossible to copy even for coincidence. Also ant name erumbu also spoke same way.

    Research more. But some tamil ideologist go over board with lemuria continent fantasy. But it has possibility if we have proof of archeological record of people lived in india 20000 years ago.

  • johninokinawa

    Asya, please don’t overcomplicate the issue. The question is very simple. There is a time track. On this time track, which language that we know of exists at the earliest point in time? Can you answer this? I’d really like to know.

    • I explained why the question is moot. Read the post.

    • Ilya Zlatanov

      According to Herodotus, Pharaoh Psammetichus wanted to determine the oldest nation and establish the world’s original language. For this purpose, he ordered two children to be reared by a shepherd, forbidding him to let them hear a single word, and charging him to report the children’s first utterance. After two years, the shepherd reported that on entering their chamber, the children came up to him, extending their hands, calling bekos. Upon enquiry, the pharaoh discovered that this was the Phrygian word for bread, after which the Egyptians conceded that the Phrygian nation was older than theirs.

      It is simple, right? But not scientific. The same root is found in Albanian bukë ‘bread’ and Greek phōgō ‘to roast’. It is cognate to the English bake from PIE *bheg-/*bhō̆g-. But the Indo-European word itself is cognate to the Altaic *bū́gà and Kartvelian *bug-, which altogether descend from Borean *PVKV. The suppositional Borean languages were spoken during the Upper Paleolithic in the millennia following the Last Glacial Maximum.

    • As I explained in the post, johninokinawa, this question is moot.

  • Siva Rathina Pandiyan

    Why didn’t you say a word about the world’s oldest surviving language, Tamil? Don’t you believe in it…

  • Hugh Cipher

    This is ridiculous, and why would you feel it necessary to even attempt to refute the fact that the Khoisan people speak the oldest language on the planet? Today’s spoken languages have not been around long enough to be overturned & replaced completely by new languages to the point there’s no trace of them left. I’d hoped this was written before the research was compiled to create the documentary “Before Babel”. I mean, at least that would be an excuse for ignoring the fact that Khoisan has the highest number of phonemes & this correlates with the # of genetic mutations in the Khoisan people, who are thus labeled the oldest group of people on earth….but you seem to be quoting that same research in order to refute it, so I guess bollocks for that. And the statement that the question “what’s the oldest language” is meaningless is absurd. It is no more meaningless than knowing whose writing system is the oldest or what human population is genetically the oldest. That kind of stuff is important to science. You also present arguments that bolster your opposition, are moot or don’t address your point, i.e.: “only dental, alveolar and lateral clicks are found in Bantu languages, whereas Khoisan languages may also have bilabial and palatal clicks.”
    Uhhh duh yeah, that’s because Khoisan “is” older and thus has more phonemes in the form of those clicks. Then you talk about English & how it’s primarily Anglo-Saxon with what remained of the original Celtic spoken & Romance additives but not any influence from the later Viking & Norman invaders. I’m like, OK, and? That proves what? You seem to have forgotten the Saxons spent the most time on the isle of the Angles, having 1st invaded before 5ad and continued to invade & migrate until 11ad. The Normans invaded & conquered a people who already had a fully developed language & were few in number compared to the full conquered population. You have to take in these facts before you use them to refute a conclusion that’s been arrived at academically

    • You don’t seem to have understood my point. Also, you’re making some strange and unjustified assumptions, such as that click sounds indicate the age of the language (whatever that is supposed to be). Similarly, if “The Normans invaded & conquered a people who already had a fully developed language & were few in number compared to the full conquered population”, what about Anglo-Saxons vis-a-vis the Celts? The Anglo-Saxons were also “few in number compared to the conquered population” and the Celts had a fully developed language…

      • Hugh Cipher

        Well, please correct me, but your point seems to be that Khoisan is not the oldest spoken language today, nor would it matter anyway. Two points I disagree with. As for the Anglo-Saxons, they were not low in number. They flooded into Britain in several successive waves ( http://mbe.oxfordjournals.org/content/19/7/1008.full ). There have also been theories that the Saxons migrated into Britain and killed off the Celts there, but they more than likely interbred with the population they didn’t kill off ( http://harvardmagazine.com/2009/07/who-killed-the-men-england ). This is probably why very little of the Celtic language remains in England save Welsh and Cornish. This is why the way you’re comparing their influence to that of the Normans is inaccurate. The Anglo-Saxons not only conquered Britain but their peoples migrated en masse. Historically there is no true Norman “migration”. The Normans were a conquering class of noblemen with a horse-equipped army. The Normans who followed William the Conqueror were fewer than 10k. This was less than 1% of the English population at the time. This is why the French-speaking Normans eventually began to speak “Anglo-ish” or, as we say, “English”. The Saxons, on the other hand, invaded en masse and killed off tens of thousands, if not more than a million, Celtic-speaking people; thus we not only speak English, we speak an English with little Celtic influence.

        • Correct: my claim is that “the oldest spoken language” was spoken some 100K years ago and died out shortly thereafter. Of the languages spoken today, all are of the same age because they are all descendants of that original Proto-Human language. The difference is solely in how we label them, and the labeling is fairly random. In my class, I play recordings of Proto-Indo-European and Old English and ask students if either sounds familiar (they’ve not had any prior exposure to either). They say no. That one of them is labeled “something ENGLISH” and the other is not “something ENGLISH” is purely a matter of random labeling conventions, not substance. Both are ancestral to the English we speak today and neither is mutually intelligible with it.

          As for the “Anglo-Saxons killed off all/most of the Celts”, that theory is outdated. Most recent research shows that Celts were more numerous in the resultant population and lived alongside the Anglo-Saxons for centuries after the “invasion”.