On Google Translate, again

Sep 17, 2011 by

In a couple of postings last winter (see here and here), I discussed Google’s machine translation tool, Google Translate (GT), and expressed my skepticism about it. Today, another article on GT, by David Bellos in The Independent, has crossed my desktop. Three points made in that article are worth discussing.

Point number one: as I showed experimentally with my little “gay goose” investigation, and as the article confirms, GT does not really translate directly from any one of its 58 languages to any other. Only a few pairs are subject to direct translation. So in fact GT does not really provide 3,306 separate translation services (58 × 57 ordered language pairs), as advertised. Rather, it uses so-called “pivots”, or intermediary languages. For example, if you ask GT to translate a bit of text from Farsi into Icelandic (or vice versa), the translation will be mediated by English (which is indeed the most common intermediary language, for obvious reasons).
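To make the arithmetic and the pivot idea concrete, here is a minimal sketch in Python. It is not how GT is actually implemented: the word tables, the language codes "src" and "tgt", and the function names are all hypothetical, invented only for illustration. It shows where the figure 3,306 comes from and how routing a word through English can silently swap one sense for another.

```python
def directed_pairs(n_languages: int) -> int:
    """Number of ordered (source, target) pairs among n languages."""
    return n_languages * (n_languages - 1)


# Hypothetical toy word tables (an assumption for illustration only).
# There is deliberately no direct ("src", "tgt") table.
TOY_TABLES = {
    # The English word is meant here in its older sense, "merry".
    ("src", "en"): {"merry-in-src": "gay"},
    # The second step knows only the modern sense of the English word.
    ("en", "tgt"): {"gay": "homosexual-in-tgt"},
}


def translate_word(word, source, target, tables=TOY_TABLES, pivot="en"):
    """Translate one word, falling back to the pivot language
    when no direct table exists for the requested pair."""
    direct = tables.get((source, target))
    if direct is not None:
        return direct.get(word, word)
    # No direct table: go source -> pivot, then pivot -> target.
    via_pivot = tables[(source, pivot)].get(word, word)
    return tables[(pivot, target)].get(via_pivot, via_pivot)


print(directed_pairs(58))                             # 3306
print(translate_word("merry-in-src", "src", "tgt"))   # homosexual-in-tgt
```

The second call loses the intended sense because the only thing carried across the pivot is the English word itself, not the meaning it had in the source language.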

The reason that GT uses intermediary languages is that GT does not really “translate”. Instead, it searches an enormous database of translations already made by human translators for a good match. Believe it or not, a big chunk of that database is mystery novels (I just knew that mystery novels were good for something!). Thus, as the author of the article puts it, “John Grisham makes a bigger contribution to the quality of GT’s Icelandic-Farsi translation device than Rumi or Halldór Laxness ever will”. As any human translator would know, using intermediary languages doesn’t improve the quality of the translation. Many anecdotal stories about such “mediated” translations are out there and you might have heard some. If you still don’t believe it, read what happened to the “happy geese” in my mediated translation exercise.

Point number two: although the author of the article, David Bellos, admits that GT “may also produce nonsense”, he claims that

“the kind of nonsense a translation machine produces is usually less dangerous than human-sourced bloopers. You can usually see instantly when GT has failed to get it right, because the output makes no sense, and so you disregard it. (This is why you should never use GT to translate into a language you do not know very well. Use it only to translate into a language in which you are sure you can recognise nonsense.)”

While it is true that a GT-produced “translation” may be less “dangerous” in the sense that it is typically so outlandish that anyone with some knowledge of the target language will be able to spot it, it is also far less useful in practical terms. Because in most cases — our little “gay geese” experiments aside — people use a translation tool (or turn to a human translator, for that matter) because they need a useful translation. When GT spits out lots of nonsense, what are you going to do with it, even if you recognize it as such? You needed a text translated, you used GT and you are still no closer to having a decent translation than you were before you started. If you need to decipher a sentence or two, you might curse under your breath and ask your Facebook friends for help. If you have a real-life large text, say, a legal document or a technical manual, you will call your local translation company for their estimate. One way or another, you will turn to human translators who may produce “bloopers”, may even produce a bad, inaccurate translation, for all you know (and assuming you don’t know the source language, you might not be able to recognize it), but they give you a chance of producing a translation that you can actually use.

As for the danger of human translators producing errors that can’t be easily recognized by someone who knows only the target language, it is true that it may happen. And it often does, because errare humanum est, as we all know. But that’s exactly why any reputable translation service will use quality control measures, which in the overwhelming majority of cases will catch the errors: each translation done by a reputable firm will be edited by an editor and proofread by yet another set of eyes. Sometimes, a document will even be backtranslated, edited, proofread and compared to the original. For things like International Space Station documentation or clinical trial protocols, the QA is very strict.

In contrast, the assumption with GT is that the machine will either produce a workable translation or such outlandish garbage that it will be thrown away immediately. While this is often the case, more subtle bloopers are not completely excluded either. Once again, I refer you to the “gay geese” story: if you only know the target language, which is not English (and English is used as an intermediary), you might wonder whether a children’s song really talks about some rare breed of birds with an odd sexual orientation (odd for birds, anyway). Or you might just “buy” it. Therefore, using machine translation does not really guarantee that you will be able to catch the errors. But what’s wrong with using the same (human) editors and proofreaders, you might ask? Again, speaking from experience with machine translation in industrial situations, everything! Most of the time, machine translation tools spit out stuff that is such nonsense that in practical terms it is easier to do the human translation from scratch than to use an editor to fix a machine-made translation. Thus, while GT may be a cute tool for games and experiments, it is not a tool that can be used for any large-scale translation projects.

Point three: the author challenges all that is known in current linguistic theory about how people know, produce and process language (is he another “amateurish linguist”? — probably!) by saying that

GT is also a splendidly cheeky response to one of the great myths of modern language studies. It was claimed, and for decades it was barely disputed, that what was so special about a natural language was that its underlying structure allowed an infinite number of different sentences to be generated by a finite set of words and rules.

According to this writer, human translators really do the same thing that GT does: scan some huge database of prior translations in search of a good match. Supposedly, humans do not decompose a sentence into its constituent parts, translate those, and then, understanding the context, recompose them into a meaningful sentence in the target language. They search their memories for bits and pieces of things they’ve already translated and use those.

If you, like David Bellos, think that human translators store sentences they’ve already translated, try this little experiment. When you are in the middle of a conversation or discussion with someone, stop them and ask them to repeat verbatim the sentence they’ve just said. Chances are, they will remember the “pure meaning” of what they said, but not verbatim how they said it. (You might want to wear a wire in order to confirm!) On the off chance that you get a correct response, as rare as that is, next time ask your interlocutor to repeat verbatim the third sentence back from where you stopped them. I’ve tried many times, and always got a negative result (and a stare of incomprehension to go with it!). When you have scared off all your friends with your crazy little experiments, try it on yourself: just stop suddenly and think what your sentence three sentences ago was, verbatim.

What this experiment will convince you of, I am sure, is that, contrary to David Bellos’s beliefs, even if we “encounter the same needs, feel the same fears, desires and sensations at every turn”, we do not “say the same things over and over again”, at least not in exactly the same way. Although when I debate the merits of machine translation with its advocates, it does seem to me that we do.


  • John Cowan

    I would certainly never use GT for technical translation, but I do use it several times a week to read one of my favorite blogs, Martian Spoken Here. Despite the title and the current top story, it is almost entirely in French, with some English and some Mauritian Creole (the title is a pun). My experience is that it reads very well for the most part, but there are some GT oddities that stick out (sometimes I comment on them, as I am encouraged to comment in English).

    Much of the time I mouse over one of the translated sentences to see the original, especially when a lexical contrast is being made that GT obscures (the subject matter of the blog is Mauritian French usage). The result is that I feel as if I am able to read French even though I am not, simply by having most of the vocabulary supplied for me. Of course, both English and French are fairly nuclear members of the European Sprachbund, so this is not really surprising.

    Of course, GT is hopeless with the Creole, so I have to (a) try to figure out the French equivalent, if any; and (b) try to convert that to English with my less than rudimentary French. Or just skip it, in accordance with one of the language acquisition rules for children: "Pretend you know what's going on, even when you don't."

  • pkaustin

    There is quite a bit of literature which supports the notion that a lot of language use is routinised and that we rely heavily on prefabricated bits, not just words and expressions, but whole phrases and clauses (sometimes called "constructions"). The "creativity" aspect of language that you emphasise is rather overdrawn. Note also that your request to "repeat exactly what you just said" may fail because we have a memory flushing mechanism that makes room for upcoming language searches/creations, rather than storing past sequences in immediate processing memory.

  • Ran

    Bellos' thought process seems to be: "I've noticed that Google Translate sometimes outputs nonsense. This really stands out. Therefore, that must be the only sort of mistake it makes! When it outputs something that's not nonsense, that must mean it's correct!"

  • Asya Pereltsvaig

    @Ran: Thank you for your comment! I agree with you that a lot of nuance is lost in GT. And there's a lot of grey area between "nonsense" and "correct".

  • Asya Pereltsvaig

    @pkaustin: Of course, we use lots of "prefabricated bits" in discourse, like "Hello", "How are you?" etc. But these are usually not the things one needs translated. As for "constructions", it is not clear if that's how language is processed or if it is just a convenient theoretical construct. One way or another, constructions are useful exactly because different lexical items can go into them, meaning that the actual chunks are not memorized…

  • Asya Pereltsvaig

    @John Cowan: Thank you for your comment! First of all, good for you for reading another language-related blog! Secondly, I think the experience you relate is very similar to what other people commented on here and in response to other postings on GT: you can get the gist of it, but if this were a human-made translation, you would hardly agree to pay for such poor quality, right? One additional point: as you confirm, translations from French into English are much better than what I described for other language pairs. I expect that this is one of the pairs that is done directly, which confirms my point that using intermediary languages decreases the quality of translations. Besides, as you point out, English and French are closely related and very similar. I wouldn't call them members of a Sprachbund in a technical sense, but they are members of the same Indo-European family (actually, of the same western half of the family, which makes them even closer). Both languages (and their respective sub-families) developed in pretty much the same direction: loss of nominal inflection, loss of (some) verbal inflection, strict word order, etc. Moreover, French (and its parent, Latin) has had a strong influence on English (via the Normans, as well as more recently), and, vice versa, French is the most Germanized of the Romance languages (hence, it is a non-pro-drop language, for example). English too has had an influence on French. So it is not surprising that French-English translation is not the worst in GT…

  • Ran

    I'm sure that the similarities between French and English are relevant, but I also find that Google Translate is much less bad at translating Hebrew to English than vice versa. I could actually imagine someone using its Hebrew-to-English translations as John Cowan describes, to help understand a Hebrew text (provided they already know some Hebrew themselves), but I could not imagine someone doing that with its English-to-Hebrew translations. I think the English-to-Hebrew translations are chiefly useful for entertainment purposes, and for demonstrating the limitations of the current state of machine translation.

  • Asya Pereltsvaig

    @Ran: I guess one can get the gist of the text from a Hebrew-to-English translation but quite often you get badly misled. As an experiment, I translated a random sentence from a Wikipedia Hebrew homepage (פורטל ארכאולוגיה של המזרח הקרוב מוקדש לגילוי תרבויות העבר המפוארות בארץ ישראל, מצרים, סוריה, עיראק, לבנון ואנטוליה, וביניהן שומר, אכד, מצרים העתיקה, אשור, בבל, פרס העתיקה החתים, ממלכת ישראל וממלכת יהודה, תרבויות כנען, אוגרית, צור וצידון.) Even though most of the text is just a list of terms and there is no complex syntax here, GT produced a couple of errors that are significant to the understanding of the text (Travel Archaeology of the Near East is dedicated to the discovery of past cultures magnificent land of Israel, Egypt, Syria, Iraq, Lebanon and Anatolia, including the guard, Akkad, ancient Egypt, Assyria, Babylon, Persia, the ancient Hittites, the Kingdom of Israel and the Kingdom of Judah, cultures of Canaan, Ugarit, Tyre and Sidon). First, it seems to ascribe the property "magnificent" to "land of Israel" (note that in English we'd need a definite article too) rather than to "past cultures" (a more natural way to say it is "cultures of the past"). Also, what's the connection between "cultures" and the geographical terms listed? There should be an "of" somewhere. Second, not only did GT mess up the only syntax it had in the sentence, it messed up the translation of terms as well: "guard", what guard? Note that even if you click for alternative translations, the correct one, Sumer, is not provided. If you know Hebrew, you might backtranslate "guard" and guess at the correct translation, but then why would you need GT at all? My point precisely: it doesn't translate the text in the sense of producing a useable translation. One needs to know both the target and the source language (at least to some degree), in which case it is not clear why one would need to use GT at all.

  • Ran

    And it also mistranslates "portal" as "travel". But I think you're underestimating the potential usefulness of even bad translations, provided the source text is kept conveniently at hand (which is something that Google Translate has a wonderful UI for, usually, though sometimes its offsets are off a bit and it highlights the wrong bit of source text . . .). Someone who knows Hebrew a little, but who reads very slowly or who has a very limited vocabulary, could certainly make use of the Google Translate translation to speed up their reading and increase their reading comprehension. Have you ever had the experience, when you were just learning a language, that you tried to read a text and found that you couldn't get a handle even on what it was about — but that then, when someone told you what it was about, you suddenly found that you could read and understand a lot of it? Google Translate is sometimes — depending on the source and target languages — perfect for that sort of thing, except that it can help even at the level of individual phrases and sentences.

    It's no substitute for skilled human translation, because it's not even minimally competent at the things that human translation is used for, but since it's free and instant, it can still be helpful for things that aren't worth the time and cost of hiring a human translator.

    (And really, all I was saying is that, in my experience, it's not as bad at Hebrew→English as at English→Hebrew. You explained its skill at French→English in terms of the similarities between French and English, and I'm certain you're right about that, I just think there might be other factors in play as well.)

  • Asya Pereltsvaig

    @Ran: I didn't mean to offend you (I am sorry if I did) and indeed the point you made about GT being not as bad at Hebrew→English as at English→Hebrew is interesting. From my experience with other machine translation and text analysis tools, they are typically *worse* at Hebrew→English and I am not sure how GT achieved the opposite results.

    And I am glad you agree with me that GT is "not even minimally competent at the things that human translation is used for". I am not denying that it can be minimally useful for little bits of, say, webpages and such. Still, it is far less than what its proponents — including David Bellos — say it is. Worse still, I don't see how it can be significantly improved to make it produce results that are comparable to what human translators do. And as wary as I am about GT used for translation, I am even more wary about it being used as a language teaching tool, as you suggest (not to replace human teachers, but even to supply summaries and "gist" of texts). To me, it's equivalent to learning a language in an immersion classroom where everybody speaks bad target language: how can you expect to learn to speak it well then? Reading texts of appropriate complexity (to one's level of knowledge of the target language), with a dictionary if needed, is still the best way to learn… Better than GT, for sure.

  • Ran

    Don't worry, I wasn't offended; sorry if I gave that impression.

  • Asya Pereltsvaig

    @Ran: no problem, I am glad we are on the same page. And the point about "portal" vs. "travel" is a good one: I didn't notice this…

  • John Cowan

    Why do you reject the term Sprachbund in this connection? "Both languages (and their respective sub-families) developed in pretty much the same direction […]: [one] has had a strong influence on [the other] […] and vice versa" seems to me the very definition of a Sprachbund. I have just finished reading the book The Changing Languages of Europe by Heine & Kuteva (2006), and it lays out in detail many of the properties of the European Sprachbund, which is by no means coterminous with Indo-European: it excludes the Celtic, Armenian, and Indo-Iranian branches altogether (and marginalizes North Germanic), but includes Hungarian (and marginally Finnish and Estonian).

  • Asya Pereltsvaig

    @John Cowan: I haven't read the book yet, but thank you for the reference! I am not sure what was meant by the European Sprachbund, but the term is typically applied to areas where grammatical features seem to spread across family relationships. For example, in the Balkans, Romance, Slavic and other languages have come to share certain grammatical features not shared by other members of their respective families (or branches within the Indo-European family). Similarly, in the Baltic area, Baltic, Slavic and Finno-Ugric languages have come to share grammatical features not shared by other languages in their families or branches. Depending on how exactly the European Sprachbund is defined we may or may not be able to treat it as a Sprachbund, and from your description I don't see how we can. Even if we include only Germanic and Romance languages, the mutual grammatical influence doesn't extend much beyond English, French and northern Italian dialects. What's the grammatical influence of German on Spanish? Or of Portuguese on Dutch? Etc. Essentially, the area thus defined includes too many languages that have not influenced each other grammatically. That languages within the same family develop in the same direction is not technically evidence of a Sprachbund. I hope that makes sense…

  • John Cowan

    I grant that the European or SAE (Standard Average European) Sprachbund is big and diffuse compared to the Balkan one, but I don't think you can reasonably require that every language affect every other language in order to have a Sprachbund. If we consider SAE at the level of families, we know that the Romance family developed the definite and indefinite articles from the demonstrative pronoun and the word for 'one' in post-Roman times, and that they spread from there into the Germanic languages a few centuries later (Icelandic still has no indefinite article, probably as a consequence of its remoteness). Greek is the only family with an original definite article (though the older the Greek, the fewer definite articles it has) and from there it spread into the Balkan Sprachbund, which is part of SAE. A similar story can be told about the have-perfect (sometimes for just transitive verbs, sometimes for both transitive and intransitive verbs), an SAE innovation.

    A shorter article on the subject is Martin Haspelmath's "How Young is Standard Average European?" (behind a paywall; I can send you a copy if you don't have access to ScienceDirect). He lists as additional features of SAE: the participial passive, the tendency to make inchoatives (anticausatives) from causatives with a reflexive pronoun, the use of nominative case for experiencers, negative indefinite pronouns (with or without negated verbs), particle-based comparatives, the word for 'and' as a proclitic on the second conjunct, postnominal relative clauses beginning with an inflected pronoun (derived from an interrogative) that also serves as a resumptive, and verb fronting in yes/no questions. All these features are rare outside Europe, none are inherited from PIE (and many are found in Hungarian). From other sources we hear of the comitative/instrumental fusion, the semantic merger of perfect and preterite, the use of a word for 'like' in equative comparisons to mark the standard of comparison, and others.

  • Asya Pereltsvaig

    @John Cowan: perhaps you are right, but I would have to read more (thanks for the link!) before I can tell you if I change my mind…

  • Paul Ogden

    My first language is English. I am also quite competent in Hebrew. I use Google Translate frequently for rough translations from Hebrew to English, which, with a few minutes of light editing, are sufficiently accurate to send to a household-name client. That said, when real precision is required I also purchase a "proper" translation for the client.

    I believe the original IBM English-French corpus was based on the Canadian Hansard, the proceedings of the Canadian parliament. I recall reading that this was by far the largest English-French corpus at the time. Given how much politicians like to talk, maybe it still is. 😉

  • swahili translation services

    Great Post. This is amazing blog ever. thanks

  • Asya Pereltsvaig

    @swahili translation service: Thank you!

  • Asya, I scanned your erudite treatment of the subject, but forgive me if I missed this aspect. I’ve used GT for almost five years now (or since it existed, whichever came first) as I had looked for others prior like Dragon Natural Language, Babylon etc. and Babblefish worked best before IMHO as it seemed to respect context (meaning the longer the sentence the better it became, viz. web pages, and it was the only Russian for the longest time). Perhaps I can contribute from the IT perspective, tho I am by no means an expert, just an observer in what I hope to be astute enough to avoid major pitfalls LOL

    Have you addressed how computer systems learn over time? For example I use LinkedIn a lot, and while at first its suggestions for connections appeared pretty random, over time it appeared to find my recent connections in SoCal, then my earlier ones in Texas, and now it’s working all the way back to my earliest possible in Calgary. Likewise you must know Google isn’t an actual or encyclopaedic search, but rather a meta-search or a search of searches – all it indexes are its own, and it does so very well and very fast, give them that, but it is a mandarin system in that it basically reinforces what is known and doesn’t go where no man has gone before (apologies to Star Trek). So IT geeks fare very well unlike amateur medievalists like me, simply because of the sheer number of searches that happen in one compared with the other.

    What I’m trying to say, is that on the plus side, such computer systems are self-learning systems: they get better and better at sifting thru more and more info they store and collect and build rule bases around. That is what I experienced in GT, that Latin and Hungarian were pretty poor 2-3 years ago, but over time they’ve improved and now they’re pretty decent… given the caveats of such media and we manage our expectations around that! I use Latin to remind me of words as I remember grammar and syntax (indeed that’s what helped launch my 2nd career in computers as I could fix code and now I can write scripts) but I forget vocabulary. I use Hungarian to check and if needed upgrade my vocabulary, which I learned from my parents post-1956 uprising emigration and therefore vintage mid-fifties LOL I know there may be other tools better suited but GT is so handy! (and ease of use / accessibility is a whole other topic OMG)

    So what I’m getting at, is how about machine learning as applied to the related topics of search and translations? Servus, Andrew

    • My point in this series of posts on GT is that it doesn’t really translate. What it does can be helpful to some degree in some situations, but even if machine learning is at play, GT may only “learn” to do the same wrong thing better; it will not learn to translate.

      • I understand that, Asya but you hit the point – translate may be a misnomer, but it means many things to many people – I say it’s a Faustian / Promethean thing, in that machines are only as good as their creators: no matter what anyone says, they cannot improve on our reasoning unless we put in the right semiotics. The semantic web Berners Lee et al. worry about, and metadata I worry about in my speciality are small steps toward that, but we have a looong way to go if you think how much it took us to get here – and you describe it admirably in this blog and http://geocurrents.info! In closing I’ll take a risk and say that I wouldn’t put it as “do[ing] the same wrong things better”… that’s a judgemental statement that doesn’t become a smart gal like you! Keep’on bloggin’, A

        • Well, the point of this blog is to express my opinions, not just impart information…
