Some red flags around Google’s speech-to-text AR glasses

By S.A. Applin

Google seems to be working on AR glasses again, but this time it’s showing off a new feature that translates speech into readable text.

At last week’s Google I/O 2022, the company demonstrated a prototype of AR glasses that can translate spoken language into readable text on the display. Google hasn’t hinted whether, or when, it will develop these into a product, but the fact that it showed them to developers suggests it is thinking about extending the AR glasses model to leverage its massive data sets and existing technologies.

If Google moves forward with the product, it will likely frame it as a device that breaks down language barriers. Sounds great, right? No more searching for Google Translate on the web and pecking out phrases on our phones. When (or if) these glasses come to market, we’ll finally be able to read foreign signs, order correctly at restaurants, and even make new friends more easily when we travel. More importantly, they would offer a way to quickly translate communication in an emergency, when people might not all speak the same language. On another level, these “translation glasses” could also open communication channels for the deaf and hard of hearing, giving them a new way of communicating with those around them.

But as with all new technology ideas, Google’s translation glasses could come with huge social costs: to our privacy, our well-being, and our collaboration with one another in our communities. What would it mean if Google became the translator for our lives, and are we comfortable with that idea?

The problem with any kind of translation device is that it has to “listen” to the people around it to get the data to translate. And if the AR glasses are listening, we should know what or whom they listen to, and when. At the moment, we don’t know whether these glasses can distinguish more than one speaker at a time. We will also need to know whether it is legal for these glasses to listen without permission, and if you need someone’s permission to record them in order to translate them, do you need the glasses to translate the request for permission? We don’t know whether future versions of these glasses will have the capacity to record what they translate, whether they can identify whom they’re recording at any given time, or what range they can listen in. And if the glasses record audio, or even just the transcribed text, we need to know whether it’s stored somewhere it can be erased, and whether people can opt out in a public place without being recorded while doing so.

Let’s assume for a moment that these Google glasses aren’t recording us, and that Google manages to sort out permission and consent. Even then, in our busy, noisy world, the usual problems with speech-to-text will still abound: misunderstandings and misrepresentations in what Google “hears” and in what it writes as a result of that hearing. The tech can also produce plenty of spelling errors and confusion when languages mix. As The Verge notes, many of us “code-switch,” interspersing words from many different languages, with the added complexity that not all of them read left to right, which the display also has to accommodate.

Now add to that an entire population using these as they roam, which evokes much of what I wrote with Dr. Catherine Flick about Meta’s pre-Ray-Ban Stories Project Aria glasses. Many of the same problems remain, except that with these new Google glasses, people may be walking around reading transcripts, much like what happened in the early days of cell phones and divided attention, creating potentially dangerous outcomes when distracted people walk into traffic or fall into fountains.

One of the biggest concerns with the glasses is Google’s apparent assumption that technology can solve cultural problems, and that if the technology doesn’t work, the solution is to develop and adopt more technology. In this case, intercultural communication problems can’t be completely solved with language translation. Technology can help, but these glasses don’t translate culture or cultural norms, such as whether a person is comfortable being direct or indirect, or any of the many cultural nuances and cues found in the way different people in different groups communicate with each other. For that, we need other people to guide us.

S.A. Applin, PhD, is an anthropologist whose research explores the domains of human agency, algorithms, AI, and automation in the context of social systems and sociability. You can find more at @anthropunk and PoSR.org.