Will “Google Conversation” really converse?

Last month, Google unveiled its latest innovation, an app for phones that can near-simultaneously translate speech from one language to another. “Google Conversation” is so far only available to translate between Spanish and English, but it already generates excited headlines speculating that a true universal translator — an idea popularized by “Star Trek” — might be just around the corner.

While the potential is obvious, I remain sceptical. How can we have an app translating (or more correctly, “interpreting”) from any language to any language if we are not even sure what languages there are? For example, Ethnologue’s figures for the number of the world’s languages are rather hotly debated. But leaving aside such fantastical perspectives, I am doubtful even about the quality of Google Conversation’s ability to do simple translations from/into some of the world’s most widely spoken languages. After all, it can’t be much better than its written translation counterpart, Google Translate, which I already criticized in an earlier posting.

One important thing to understand is that these new-generation translation tools, Google Translate and Google Conversation alike, do not do what human translators do. They do not deconstruct the text, analyze its grammatical structure (which human translators do, even if subconsciously), figure out the meaning and then reconstruct it in another language. In effect, Google Translate and Google Conversation do not translate. They match. More specifically, they match (bits of) the original text with the best translations, where “best” means most frequently found in a large corpus such as the World Wide Web. For example, when translating Shakespeare’s “To be, or not to be, that is the question”, why bother with understanding the meaning of individual words and how they are put together? There are already hundreds of (human-made) translations available on the web. One can just choose the one that appears the most frequently. That’s the principle behind Google Translate and Google Conversation. It does work, but with a limited field of application and limited success.
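To make the matching idea concrete, here is a minimal sketch in Python. It is not Google's actual method, only a toy version of "pick the translation seen most often". The mini-corpus, the French renderings, their frequencies, and the function names build_table and match are all invented for illustration.

```python
from collections import Counter

# Hypothetical mini-corpus of (source phrase, human translation) pairs.
# The French renderings and how often they occur are invented for illustration.
CORPUS = [
    ("to be or not to be", "être ou ne pas être"),
    ("to be or not to be", "être ou ne pas être"),
    ("to be or not to be", "être ou ne pas être"),
    ("to be or not to be", "exister ou ne pas exister"),
]

def build_table(pairs):
    """Count how often each translation was seen for each source phrase."""
    table = {}
    for source, target in pairs:
        table.setdefault(source, Counter())[target] += 1
    return table

def match(phrase, table):
    """Return the most frequently seen translation, or None if never seen."""
    seen = table.get(phrase)
    return seen.most_common(1)[0][0] if seen else None

table = build_table(CORPUS)
print(match("to be or not to be", table))   # the majority translation wins
print(match("two happy geese", table))      # never seen in the corpus: None
```

Note that nothing in this sketch looks at the meaning of the words. A phrase that is well represented in the corpus gets a plausible answer, and a phrase that is not gets nothing useful, which is the limited field of application mentioned above.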

Here’s another problem with a universal translator, and with its first-attempt imitation — Google Translate. From what I’ve seen, it doesn’t translate from any of the languages in the list to any of the languages in the list, at least not directly. How do I know? Let’s do a simple experiment (thanks to Olga Kagan for inspiring it!): take, for example, a simple text to translate, a popular nursery rhyme known to every Russian child. It goes as follows:

Жили у бабуси два веселых гуся, один белый, другой серый, два веселых гуся.
In transliteration: Zhili u babusi dva veselyx gusja, odin belyj, drugoj seryj, dva veselyx gusja.

A human translator — in this case, me — would give you the following English equivalent:

There lived with a grandma two happy geese, one — white, the other grey, two happy geese.

Yes, it doesn’t scan like the original, but we won’t care about the poetic qualities of the translation, just the meaning. Google Translate is completely stumped by the word babusja, a diminutive form of ‘grandmother’. So for the purposes of this experiment, I replaced it with the more neutral babushka ‘grandmother’. With this modification, Google Translate does fairly well with the Russian-to-English translation, spitting out:

lived with her grandmother two gay geese, one white the other gray two gay goose.

But who is ‘her’ referring to? More importantly, note that the Russian veselyj is translated as gay; this will be important shortly. The Multilex dictionary gives the following English translations for veselyj (in alphabetical order): cheerful, chirpy, debonaire, exhilarated, festive, frolic, gay, genial, glad, gleeful, happy, hilarious, jaunty, jestful, jolly, jovial, light-hearted, lively, merry, mirthful, perky, playful, sprightful, vivacious and a few others. This does not mean that Russian has fewer words than English to express joy and happiness (I can almost hear certain readers suggest that this has to do with the brooding nature of the Russian soul, à la Dostoevsky). In addition to veselyj, Russian has bodryj, neunyvajuschij, radostnyj, zhizneradostnyj, zhivoj, schastlivyj and many others. In fact, some scholars have suggested that Russians distinguish more shades of joy and happiness than English speakers do. I will leave this linguistic relativity issue aside for now and go back to our Google Translate experiment.

Let’s now try to translate the same rhyme from Russian into French. Google Translate comes up with:

a vécu avec sa grand-mère deux oies gay, l’une blanche l’autre gris deux oies gay.

Once again, it is not clear whose grandmother this is: the translation has ‘her grandmother’, but there is no ‘she’ in the rhyme. But more importantly, notice what happened to the joyfulness of the geese! They are now deux oies gay, ‘two gay geese’! As in ‘two homosexual geese’. Yeah, right! But things get “curiouser and curiouser” if we attempt a Russian-to-Hebrew translation of the same rhyme. This time Google Translate will delight us with the following:

חי עם האווזים שלה שני סבתא הומו, אחד לבן והשני אווז אפור שני הומואים.

The literal translation of the above is: ‘lives with the geese her two grandmother homo, one white and the second goose gray two homos’.

While the English gay is ambiguous between ‘cheerful’ and ‘homosexual’, neither the Russian veselyj nor any of its context-appropriate translations in Hebrew is similarly ambiguous. So if Google Translate were indeed translating directly from Russian into Hebrew, this comment on the sexual orientation of grandmother’s geese would not have crept in. It must be that the translation is mediated by an additional step of translating into/from English. The same is true of the Russian-to-French translation: it too must be mediated by English.
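The effect of such a detour through English can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the one-word “dictionaries” and the function names translate and pivot_translate are hypothetical, and the direct Russian-to-French entry is simply what I would expect a direct translation to choose, not something Google Translate is known to contain.

```python
# Hypothetical one-word lookup tables standing in for full translation systems.
RU_TO_EN = {"veselyj": "gay"}      # one of many possible English renderings
EN_TO_FR = {"gay": "gay"}          # the second step sees only the English word
RU_TO_FR = {"veselyj": "joyeux"}   # assumed choice of a direct Russian-French step

def translate(word, table):
    """Look the word up; leave it unchanged if the table has no entry."""
    return table.get(word, word)

def pivot_translate(word, source_to_pivot, pivot_to_target):
    """Translate in two hops, so the target never sees the original word."""
    pivot_word = translate(word, source_to_pivot)
    return translate(pivot_word, pivot_to_target)

direct = translate("veselyj", RU_TO_FR)
pivoted = pivot_translate("veselyj", RU_TO_EN, EN_TO_FR)
print(direct)    # joyeux
print(pivoted)   # gay
```

Once the first hop has committed to the ambiguous English word, the second hop has no access to the Russian original and cannot recover the ‘cheerful’ sense. That is the pattern the French and Hebrew outputs display.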

A quick attempt to translate the geese rhyme into other languages reveals that, with the exception of two of Russian’s closest relatives (Belorussian and Ukrainian, which have a form of veselyj), all other languages I could decipher had either ‘homosexual geese’ (as in the Albanian, German, Dutch, Danish, Swedish, Maltese, Czech, Slovak, even Bulgarian translations) or a form of ‘gay’ (curiously, the Norwegian translation has ‘homophile geese’). Thus, with the not unexpected exceptions of Belorussian and Ukrainian, all other translations from Russian are done through English. It is obvious what the creators of Google Translate were thinking: A computer can make a translation so fast. Why have translation tools for each language pair? It can be done through a chain translation just as fast. Who would notice the difference?! But I did.

There are other, structural problems with the Hebrew translation above: it is not clear who lived with whom (grandmother with the geese or the geese with the grandmother), why there are ‘two grandmother’ (note also the non-agreement in number), or who ‘her’ is, and it even appears that the grandmother is a lesbian too! All this in a children’s rhyme?! I guess without Google Translate’s help we’d never know the true meaning of this rhyme…

In sum, Google Translate spits out translations that are “instructive” or outright hilarious (in Russian, veselyj), but it’s a far cry from replacing human translators even when the aesthetic qualities of the translation are not at issue.

