CIT Solutions Blog
Is Your Smart Assistant Undermining Your Security?
Smart assistants commonly appear in the office and home, so much so that the novelty seems to have finally worn off and they are now just another appliance—and, like any other appliance, there are a few quirks that can be frustrating to deal with. For instance, anyone living around these devices has shared a particular experience: the device registering something as a wake word that certainly wasn’t meant to be the wake word.
While this may seem like nothing more than a mildly amusing annoyance, the phenomenon has some concerning security ramifications. Let’s discuss how deep the rabbit hole goes, and what the impact on your security could be.
What Do Our Smart Assistants Actually Hear?
You’re certainly aware by now of how these smart assistants work. A small device lives in your home or office, either as a standalone device or built into your phone or another appliance. With a simple voice command, assorted information can be shared or activities can be completed with little effort. By default, this voice command is dictated by which device is being used:
- Amazon Alexa devices respond to the terms “Alexa,” “Computer,” “Amazon,” or “Echo.”
- Google Home devices wake up to “Okay/Hey, Google.”
- Apple’s Siri responds to “Hey Siri.”
- Microsoft’s Cortana reacts to its name, “Cortana,” or “Hey, Cortana.”
However, we’ve all also seen examples of these smart assistants picking up other sounds when we aren’t expecting them to react. How often have you seen someone say something, only to be interrupted as their smart assistant responds?
To be honest with yourself, how often have you been the one to say the wrong thing and trigger an out-of-context response?
You are far from alone. Many people have done the same, and there are some legitimate security concerns tied to this phenomenon. In fact, these incorrect wake words have even inspired academic research.
The Research
In their report, “Unacceptable, where is my privacy? Exploring Accidental Triggers of Smart Speakers,” researchers had a variety of smart devices listen to assorted audio samples, including popular television shows like Modern Family and Game of Thrones, news broadcasts, and the professional audio data used to train these speakers.
With this approach, the researchers identified which spoken terms activated the assistants, ultimately generating a list of over a thousand audio sequences. From there, they were even able to break down the words into their individual sounds and identify other potential false triggers that also activated the voice assistants.
For instance, depending on the pronunciation of the word, the following substitutions awakened the voice assistants:
- Alexa devices also responded to “unacceptable” and “election,” while “tobacco” could stand in for the wake word “Echo.” Furthermore, “and the zone” was mistaken for “Amazon.”
- Google Home devices would wake up to “Okay, cool.”
- Apple’s Siri also reacted to “a city.”
- Microsoft’s Cortana could be activated by “Montana.”
This phenomenon was not limited to devices trained in English, either. Speakers set to German, and some from Chinese manufacturers set to Chinese, were also tested. Some samples proved more resistant to accidental activation, while some new examples proved very effective. For instance, the German phrase for “On Sunday” (“Am Sonntag”) was commonly mistaken for “Amazon.”
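To get a feel for why words like “Montana” can stand in for “Cortana,” here is a minimal Python sketch that compares words sound by sound. To be clear, this is our own illustration and not the researchers’ method: real smart speakers use acoustic models on raw audio, not a simple comparison like this. The phoneme spellings below are rough, hand-written approximations, and the function names (edit_distance, similarity) are hypothetical names we made up for the example.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance: the fewest insertions, deletions, and
    substitutions needed to turn sequence a into sequence b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Score from 0.0 (nothing in common) to 1.0 (identical sounds)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


# ASSUMPTION: approximate, hand-written phoneme spellings for illustration only.
CORTANA = ["K", "AO", "R", "T", "AE", "N", "AH"]
MONTANA = ["M", "AA", "N", "T", "AE", "N", "AH"]
WEATHER = ["W", "EH", "DH", "ER"]

if __name__ == "__main__":
    for name, sounds in [("Montana", MONTANA), ("weather", WEATHER)]:
        print(f"Cortana vs. {name}: {similarity(CORTANA, sounds):.2f}")
```

In this toy comparison, “Montana” shares its entire ending with “Cortana” and scores much higher than an unrelated word like “weather.” A device that accepts anything “close enough” to its wake word has to draw that line somewhere, and the words listed above are the ones that land on the wrong side of it.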
What This Means to Privacy
While the results of this study are fascinating, the implications are more disconcerting. Let’s go back to the way these assistants work.
Once the wake word or phrase is recognized by the device, it actively begins listening. In an ideal world, the assistant would only recognize the predetermined words and activate when those specific words were spoken. However, we know that isn’t the case, as this study shows.
So, now we have a situation in which there are devices scattered around, waiting for something close enough to their trigger word to register. Keep that in mind.
On top of that, this data can be transcribed and reviewed manually to check for accuracy, which means that another person could potentially be given access to the recording. While we obviously can’t say that one of these people would use this access to their own personal advantage, we also can’t say that they wouldn’t.
Let’s put together a scenario: you’re on the phone with a coworker, talking about a client. Your coworker needs access to the client’s data, so you give them the access credentials to do so. Trouble is, at some point in the conversation, your smart assistant heard a potential trigger word and started recording.
As a result, there is now a recording of your client’s account credentials in the cloud, potentially being reviewed anonymously by a complete stranger. Setting aside the workplace for a moment, how easily could a smart assistant pick up some other piece of juicy or embarrassing personal information?
While we aren’t trying to scare you away from using smart speakers, we are trying to demonstrate how important it is that you use them mindfully. Unfortunately, there is not (as of yet) an option to use a customized word to tell the speaker to listen in, so for right now, just try to be more aware of what you’re saying when you’re within “earshot” of one. You should also make it a habit to disable the device when it is not in use, and especially when discussing sensitive information.
For more technology tips, best practices, and security advice, make sure you subscribe to our blog!