At the end of its I/O presentation on Wednesday, Google pulled out a “one more thing”-esque surprise. In a short video, Google showed off augmented reality glasses that have only one purpose: displaying audible language translations right in front of your eyes. In the video, Google product manager Max Spear called the capability of this prototype “subtitles for the world,” and we see family members communicating with one another for the first time.
Wait a second. Like many people, we’ve used Google Translate before and largely consider it a very impressive tool that just happens to make a lot of embarrassing mistakes. While we can trust it to give us directions to the bus, that’s not nearly the same as trusting it to correctly interpret and relay our parents’ childhood stories. And hasn’t Google said it’s finally breaking the language barrier?
In 2017, Google released real-time translation as a feature of its original Pixel Buds. Our former colleague Sean O’Kane described the experience as “a laudable idea with a deplorable execution” and reported that some people he tried it with said it sounded like he was a five-year-old. That’s not quite what Google showed in its video.
We also don’t want to gloss over the fact that Google promises this translation will happen inside a pair of AR glasses. Not to poke at a sore spot, but the reality of augmented reality hasn’t quite caught up with Google’s concept video from ten years ago. You know, the one that acted as a precursor to the much-maligned and embarrassing Google Glass?
To be fair, Google’s AR translation glasses seem a lot more focused than what Glass was trying to achieve. From what Google showed, they’re meant to do one thing — display translated text — not act as an ambient computing experience that could replace a smartphone. But even then, making AR glasses isn’t easy. Even a moderate amount of ambient light can make viewing text on translucent screens very difficult. It’s challenging enough to read subtitles on a TV with some sun glare through a window; now imagine that experience strapped to your face (and with the added pressure of holding a conversation with someone you can’t understand on your own).
But hey, technology moves fast — Google may be able to overcome a hurdle that has thwarted its competitors. That wouldn’t change the fact that Google Translate isn’t a panacea for multilingual conversations. If you’ve ever tried to have a real conversation through a translation app, you probably know that you need to speak slowly. And methodically. And clearly. Unless you want to risk a garbled translation. One slip of the tongue, and you might just be done for.
People don’t speak in a vacuum or like machines do. Just as we code-switch when speaking to voice assistants like Alexa, Siri, or the Google Assistant, we know we need to use much simpler phrases when dealing with machine translation. And even when we speak correctly, the translation can still come out awkward and be misinterpreted. Some of our colleagues who speak Korean fluently pointed out that Google’s own pre-roll countdown for I/O displayed an honorific version of “Welcome” in Korean that nobody actually uses.
That mildly embarrassing flub pales in comparison to the fact that, according to tweets from Rami Ismail and Sam Ettinger, Google showed over half a dozen backwards, broken, or otherwise incorrect scripts on a slide during its Translate presentation. (Android Police notes that a Google employee acknowledged the mistake and that it was corrected in the YouTube version of the keynote.) To be clear, it’s not that we expect perfection — but Google is trying to tell us that it’s close to cracking real-time translation, and mistakes like that make that seem incredibly unlikely.
congrats to @Google for getting the Arabic script backwards and disconnected during @sundarpichai‘s presentation on *Google Translate*, because small independent startups like Google can’t afford to hire someone with a primary-school knowledge of Arabic script. pic.twitter.com/pSEvHTFORv
— Rami Ismail (رامي) (@tha_rami) May 11, 2022
Google is trying to solve an immensely complicated problem. Translating words is easy; figuring out grammar is difficult but possible. But language and communication are far more complex than just those two things. A relatively simple example: Antonio’s mother speaks three languages (Italian, Spanish, and English). Sometimes she borrows words from one language to another mid-sentence, including words from her regional Italian dialect (which is like a fourth language). That sort of thing is relatively easy for a human to parse, but could Google’s prototype glasses handle it? Never mind the messier parts of conversation, such as unclear references, incomplete thoughts, or allusions.
It’s not that Google’s goal isn’t admirable. We absolutely want to live in a world where everyone gets to experience what the research participants in the video do, staring wide-eyed as the words of their loved ones appear before them. Breaking down language barriers and understanding each other in ways we couldn’t before is something the world needs much more of; it’s just that there’s still a long way to go before we reach that future. Machine translation is here and has been for a long time. But despite the plethora of languages it can handle, it still doesn’t speak human.