[This post was originally published in May 2013]

As explored in my earlier posts (see also here, here, and here), the spatial distribution of words for a given meaning can reveal interesting patterns of both language spread and language contact. While both factors are always at play, language contact is more evident in regard to words for cultural innovations, such as ‘tea’ or ‘computer’. Another interesting case is the geography of words for ‘book’, which many languages borrowed along with the general concept of ‘book’ and more often than not with one particularly important religious text.


As can be seen from the map on the left, several roots for ‘book’ are particularly common in Eurasia and Africa, including those related to the Latin liber shown in red; to the Proto-Germanic *bōks shown in blue; to the Arabic kitāb shown in green; to the Proto-Slavic *kъniga shown in purple; and to the Sanskrit pustaka shown in pink. Some other words, whose etymology will not be considered here in detail, are shown in black.

Let’s start with the Latin root for ‘book’, liber. Several possibilities have been explored for its uncertain etymology. One relates it to the Proto-Indo-European *hlewdh– ‘people’, whose cognates include the Ancient Greek eleutheros, German Leute, Old English lēod, Lithuanian liaudis, Russian ljudi ‘people’. Another etymology derives this word as a cognate of Old Church Slavonic lubŭ ‘bark of a tree’ and Lithuanian lùpti ‘to peel, to shell’. As one would expect, its descendants are attested in Romance languages: French livre, Italian libro, Spanish libro, Portuguese livro, Occitan liure, Catalan llibre, Galician libro, Sicilian libbru. Interestingly, Romanian uses carte, which is related to another Latin root, which gives us the English card and charter (however, Romanian also retained the Latin root liber in a different meaning). But Romance languages are not the only ones with the reflexes of this Latin root. Celtic languages generally have words for ‘book’ that descend from the Latin liber: Welsh has llyfr, Irish and Scottish Gaelic both have leabhar, Breton has levr. These words were transmitted from Latin to the Celtic languages when the Celts were Christianized. This is the first of many examples of the word ‘book’ spreading with religion, more of which we shall see below. Two languages outside the Romance and Celtic families have words that reflect the Latin liber: Albanian and Ilocano. In the case of Ilocano, the main language of northern Luzon in the Philippines, the word libro ‘book’ was borrowed from Spanish. Generally, Ilocano features numerous Spanish loanwords. Other languages of the Philippines such as Tagalog and Cebuano, however, have non-borrowed words for ‘book’, aklat and basahon, respectively.

Unlike the Latin root for ‘onion’, the root liber did not spread into other Indo-European branches such as Germanic or Slavic, nor are Germanic and Slavic roots for ‘book’ related. In Slavic languages, the words for ‘book’ are all very similar: East Slavic languages have kniga (Russian), knyha (Ukrainian), and kniha (Belarusian); South Slavic languages have kniga (Bulgarian and Macedonian) and knjiga (Serbo-Croatian and Slovenian); and West Slavic languages feature such forms as księga, książka (Polish), kniha (Czech and Slovak), knigła (Lower Sorbian), and knéga, knéżka (Kashubian). All these forms derive from the reconstructed Proto-Slavic form *kъniga. Its etymology is rather controversial, with at least three theories proposed about its origin. One theory derives it from Old High German kenning ‘symbol, sign’ or from another Germanic source (cf. Gothic kunnan ‘to know’ and Old Norse kunna ‘to know’). Another hypothesis relates the Proto-Slavic word to the Akkadian kunukku ‘seal-cylinder’ or kanikku ‘sealed object: document, sack bulla, etc.’, as well as to Old Armenian knik’ ‘seal’. A third theory links the Proto-Slavic form to Chinese words (e.g. Old Chinese küen ‘scroll’, Mandarin juǎn, possibly via Turkic küiniŋ). This theory is buttressed by the fact that paper was invented in China ca. the 1st century CE. One way or another, Slavic languages share the root for ‘book’, and one of them in particular, Russian, also “donated” this root to many other languages of Eurasia, as we shall see below.

Germanic languages generally have words that descend from the Proto-Germanic *bōks ‘book’, which in turn derives from a Proto-Indo-European word reconstructed as *bheh2g- (Beekes 1995) to mean ‘beech’. That reconstructed meaning is rather problematic, however, as some of the reflexes of this root in Indo-European languages refer to other tree species: for example, the Greek reflex of the PIE root *bheh2g-, phēgós, also means ‘oak’ (Beekes 1995: 48), while the Russian cognate buzina refers to ‘elder tree, Sambucus’. As we can see, in Germanic this root acquired a different meaning entirely. Modern Germanic languages whose word for book derives from this root include English book, as well as West Frisian boek, Dutch and Afrikaans boek, Limburgish book, German Buch, Yiddish bukh, Alemannic buech, Danish bog, Norwegian and Swedish bok, and Icelandic bók. But just as the Latin root liber has spread outside the Romance family, so did the root bok-/buk-. For example, it has penetrated some Austronesian languages such as Bahasa Malay, Bahasa Indonesia, Balinese, Sundanese, and Javanese, in all of which it is buku (though the word probably was borrowed from English into Bahasa Malay, and from Dutch into the other languages mentioned above). Similarly, another Austronesian language, Malagasy, has boky, probably derived from English. English is likewise the source of the word buk in Tok Pisin, a nativizing pidgin of Papua New Guinea. Also from English are such forms as the Lingala búku, Somali buug, and Shona bhuku. This Germanic root was spread into these various non-European languages by colonialism rather than religion, as several of the areas where buk-languages are spoken are predominantly Muslim. Intriguingly, in the Hausa language of northern Nigeria, the same root yields boko, which literally means ‘alphabet’ but which has come to stand in for ‘Western education’, as can be seen in the name of the militant Islamist group, Boko Haram (“Western Education Is Forbidden”).

Other languages in Africa have either native words for ‘book’ (e.g. Bambara gafɛ and Yoruba ìwé), or a loanword from another language such as Arabic, which donated its word for ‘book’ along with the spread of Islam. The Arabic word for ‘book’ is kitāb; as with other Semitic languages, Arabic has non-concatenative (i.e. root-and-pattern) morphology so that lexical roots typically consist of three consonants, while the vowels indicate mostly grammatical information. In the case of kitāb, the root, whose general meaning concerns writing, is K-T-B. To form the plural of ‘book’, one changes the vowels: kutub means ‘books’. Other Arabic words formed from the same root include kitaba ‘writing’, kātib ‘writer’, maktab ‘desk’, as well as verbal forms kataba ‘he wrote’, kutiba ‘it was written’, kattaba ‘he caused to write’, and many others. The same root is found in Maltese, another Semitic language, whose word for ‘book’ is ktieb. Interestingly, Hebrew has the same root, as in the verbal forms katav ‘he wrote’, hixtiv ‘he dictated’ (i.e. caused to write), hitkatavnu ‘we exchanged letters’ (i.e. wrote to each other), nixtav ‘it is written’, and many others, as well as in nouns such as mixtav ‘letter (to be sent)’, katav ‘writer’, ktiva ‘writing’, and the like. However, a different root, S-P-R, (which means ‘to count’) is the base for the Hebrew word for ‘book’: sefer.
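For readers who like to see such things mechanically, the root-and-pattern idea described above can be sketched in a few lines of Python. This is only a toy illustration, not a model of Arabic morphology: the function name `apply_pattern` and the template notation (digits 1, 2, 3 standing for the three root consonants) are my own assumptions, and the forms are limited to the ones cited in the paragraph above.

```python
def apply_pattern(root, pattern):
    """Fill the slots 1, 2, 3 of `pattern` with the consonants of `root`.

    `root` is written as three hyphen-separated consonants, e.g. "k-t-b".
    Every other character in `pattern` (vowels, the prefix "ma") is copied as is.
    The notation is a hypothetical convenience, not a standard one.
    """
    consonants = root.split("-")
    if len(consonants) != 3:
        raise ValueError("expected a tri-consonantal root such as 'k-t-b'")
    return "".join(
        consonants[int(ch) - 1] if ch in "123" else ch
        for ch in pattern
    )


ROOT = "k-t-b"

# Templates for the words mentioned in the text; a repeated digit marks a
# doubled consonant, as in kattaba.
PATTERNS = {
    "1i2ā3": "book",
    "1u2u3": "books",
    "1ā2i3": "writer",
    "ma12a3": "desk",
    "1a2a3a": "he wrote",
    "1u2i3a": "it was written",
    "1a22a3a": "he caused to write",
}

for pattern, gloss in PATTERNS.items():
    print(apply_pattern(ROOT, pattern), "-", gloss)
```

The consonants stay fixed while the vowels, and the occasional prefix or doubled consonant, change around them. That is the sense in which the root carries the lexical meaning and the pattern carries mostly grammatical information.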

African languages that have borrowed the Arabic word kitāb, which typically belong to the Bantu family, have reanalyzed the word as consisting not of a tri-consonantal root K-T-B and vowels, but of a noun class prefix and a root. For example, the Swahili word for ‘book’ is kitabu, an Arabic loanword which has been reanalyzed as containing the noun class prefix ki- and the root tabu. Noun class systems are found in Bantu languages, but also in Dyirbal and Nunggubuyu, both Aboriginal Australian languages; Ingush, a Northeast Caucasian language; Ju|’hoan, a Khoisan language; and Yimas, a Papuan language. Such systems are not unlike grammatical gender systems in more familiar languages such as Spanish, German, or Russian. Like gender systems, noun class systems divide nouns into groups that usually have some semantic coherence. However, instead of relying on categories pertaining to the biological sex of the individual, other semantically motivated categories come into play in noun class systems, including shapes, sizes, materials, origin (natural vs. man-made objects), animacy (humans and animals vs. other objects), abstractness, etc. Also unlike grammatical genders, noun classes are usually numbered rather than named. In Swahili, noun class 7 denotes man-made objects (among other things, such as languages and diminutives). To make a noun plural, its noun class prefix is switched to another: for instance, ki- is replaced by vi-. The Swahili word for ‘knife’, kisu, thus becomes visu in its plural form. Loanwords are pluralized in the same way: hence, the plural of kitabu is vitabu. The story of the Kinyarwanda word for ‘book’, igitabo, is similar.
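The prefix switch just described is simple enough to sketch in Python. This is a deliberately minimal illustration under stated assumptions: the table `PLURAL_PREFIX` holds only the one ki-/vi- pairing mentioned in the text, the function name `pluralize` is hypothetical, and the sketch assumes that any word beginning with ki- belongs to the relevant class, which is not true of Swahili in general.

```python
# Singular prefix -> plural prefix. Only the pairing discussed in the text is
# listed; Swahili has many more noun classes than this.
PLURAL_PREFIX = {"ki": "vi"}


def pluralize(noun):
    """Swap a known singular noun class prefix for its plural counterpart.

    Assumes the word really begins with a class prefix, which a real analysis
    would have to check against a lexicon.
    """
    for singular, plural in PLURAL_PREFIX.items():
        if noun.startswith(singular):
            return plural + noun[len(singular):]
    raise ValueError(f"no known noun class prefix on {noun!r}")


for word in ("kisu", "kitabu"):
    print(word, "->", pluralize(word))
```

The function treats the native word kisu and the loanword kitabu identically. That is the point of the reanalysis: once kitabu is parsed as ki- plus tabu, it pluralizes like any other noun of its class.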

The Arabic word kitāb penetrated languages of other families as well, most notably Turkic and the Indo-Iranian branch of Indo-European. But not all languages in those families have a kitāb-derived word for ‘book’. Among Turkic languages, one finds reflexes of the Arabic kitāb in Turkish kitap, Azeri kitab, Uzbek kitob, Kazakh kitap, Tatar kitap, and Bashkort kitap. Yet other Turkic languages—for example, Chuvash and Sakha—have words related to the Russian word kniga: kĕneke and kinige, respectively. (It is probably not coincidental that the Chuvash and Sakha people, unlike most other Turkic-speaking peoples, did not convert to Islam.) The Indo-Iranian languages also vary on this issue: some have a word based on the Arabic kitāb, while others do not. Sometimes, even closely related languages have a different word for ‘book’, as is the case in Zazaki, which has kıtabi, and Kurdish, which has pirtûk. Tajik is another Iranian language with an Arabic-derived loanword for ‘book’, kitob. Indo-Aryan languages too split into those that have an Arabic loanword (e.g. Hindi kitāba) and those that do not (e.g. Bengali ba’i).

Another common root among languages of India, whether Indo-Aryan or Dravidian, is linked to the Sanskrit word pustaka. Its etymology is not entirely certain, but some scholars believe it to be borrowed from some Middle Iranian language. It is comparable to the Sogdian pwst’k ‘book, document, sutra’, Parthian pwstg ‘book, parchment’, and Persian pust ‘skin, hide’. Reflexes of this root are found in such Indo-Aryan languages as Assamese puthi, Bengali pustôk, Bhojpuri pōthī, Gujarati pustak, Kashmiri pūthi, Kumaoni pothī, Maithili pothā, pothī, Marathi pustak, Nepali pothi, Oriya pothā, pothi, puthi, Pali potthaka, Punjabi pustak, Sindhi pothu, pothī, Sinhalese pota, and Urdu pustak. Sanskrit-derived forms are found also in several Dravidian languages, where they are Indo-Aryan loanwords. Compare, for instance, Malayalam pustakam, Tamil puttagam, Telugu pustakam, and Kannada pustaka. Another language with a related form is Malay, where pustaka coexists with the abovementioned buku.

One final family to be considered here is Finno-Ugric. Languages in this family have very different roots for ‘book’: compare the Hungarian könyv, Komi nebög, Estonian raamat, Finnish kirja, and Mari knaga. Where do these words come from? The Mari word, like its counterpart in the neighboring Turkic language Chuvash, comes from Russian. The Estonian word raamat (and its cognate in Latvian, grāmata) also derives from Old Russian gramota meaning ‘document, writing’, which derives in turn from Ancient Greek grámmata ‘letters, writing’, which also gives us grammar and, perhaps surprisingly, glamour. In Finnish, the same root gave rise to raamattu meaning ‘Bible’; once again, the connection between ‘a book’ and the Book, the Holy Scriptures, is undeniable. The Finnish word kirja (and its Veps cognate kirj) originally meant ‘carved mark/decoration’; Estonian retains the root in kiri ‘letter (to be sent)’. I am not aware of the etymology of the Komi nebög or the Hungarian könyv.



