How many words in English?
Numerous posts on this blog have discussed attempts by scholars in a variety of fields outside linguistics (biologists, statisticians, anthropologists) to do linguistics. These attempts commonly result in “work” that is laughable, and hence is rarely if ever addressed by professional linguists. This blog is often the only exception. I am now adding a new thematic category, “bad linguistics”, to discuss exactly this sort of quasi-scientific approach to language.
A recent example of such quasi-science appeared, once again, in Science: a paper by a group of social scientists and evolutionary theorists, plus the Google Books team, “gave the best-yet estimate of the true number of words in English”, according to a report in the Wall Street Journal. Their “golden number”: “a million, far more than any dictionary has recorded”. The more the merrier, right?!
However, searching for the number of words in English (or any other language for that matter) is meaningless in more than one sense. According to the Oxford English Dictionary,
“How many words are there in the English language? There is no single sensible answer to this question. It’s impossible to count the number of words in a language, because it’s so hard to decide what actually counts as a word.”
Since new words are coined (estimates range from 8,000 to 25,000 a year, and that’s 22 to 68 words a day!) and old words fall out of use, any number is by necessity a snapshot of the language on a given day. Since there is no accepted authority over the English language, who decides whether any given “word” (or word form, or spelling) counts? And most importantly, what are the boundaries of what we call “English”? How do we decide what is included and what is excluded as archaic, dialectal, regional, obsolete, professional jargon, etc.?
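To make the point concrete, here is a minimal Python sketch. It first redoes the per-day arithmetic from the coinage estimates above, and then shows how the number of “distinct words” in one short sentence changes with two arbitrary decisions: whether capitalization matters and whether a hyphenated compound counts as one word or two. The function names, the sample sentence, and the two criteria are my own illustrative assumptions; this is not how any dictionary or the Science paper counts words.

```python
import re

def words_per_day(per_year, days_in_year=365):
    """Convert a yearly coinage estimate into a daily rate."""
    return per_year / days_in_year

def count_distinct(text, fold_case, split_hyphens):
    """Count distinct 'words' under two illustrative (hypothetical) criteria."""
    # Either treat a hyphenated compound as one token or break it at the hyphen.
    pattern = r"[A-Za-z']+" if split_hyphens else r"[A-Za-z'-]+"
    tokens = re.findall(pattern, text)
    if fold_case:
        # Treat "Editor" and "editor" as the same word.
        tokens = [t.lower() for t in tokens]
    return len(set(tokens))

# Daily rates implied by the 8,000 and 25,000 per-year estimates.
low, high = round(words_per_day(8000)), round(words_per_day(25000))
print(low, high)

# Made-up sample sentence, used only for illustration.
sample = "The well-known editor and the Editor know the word"
for fold_case in (False, True):
    for split_hyphens in (False, True):
        n = count_distinct(sample, fold_case, split_hyphens)
        print(f"fold_case={fold_case}, split_hyphens={split_hyphens}: {n}")
```

The same ten-word sentence yields a different tally under each of the four combinations, and these are only two of the many decisions a lexicographer faces.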
These decisions are made by lexicographers on a case-by-case basis, which is also why different dictionaries list differing numbers of entries. For example, the above-mentioned Oxford English Dictionary (2nd edition) lists over 600,000 definitions, and Webster’s Third New International Dictionary, Unabridged lists 475,000 main headwords, whereas The Global Language Monitor announced that the English language had crossed the 1,000,000-word threshold on June 10, 2009.
Let me conclude once again with my favorite quote from Geoffrey Pullum:
“Precision, richness, and eloquence don’t spring from dictionary page count. They’re a function not of how well you’ve been endowed by lexicographical history but of how well you use what you’ve got. People don’t seem to understand that vocabulary-size counting is to language as penis-length measurement is to sexiness.”
And to the editors of Science: counting does not make science!