The NSA’s voice-recognition system raises hard questions for Echo and Google Home
Suppose you’re looking for a single person, somewhere
in the world. (We’ll call him Waldo.) You know who he is, nearly
everything about him, but you don’t know where he’s hiding. How do you
find him?
The scale is just too great for anything but a
computerized scan. The first chance is facial recognition — scan his
face against cameras at airports or photos on social media — although
you’ll be counting on Waldo walking past a friendly camera and giving it
a good view. But his voice could be even better: How long could Waldo
go without making a phone call on public lines? And even if he’s careful
about phone calls, the world is full of microphones — how long before
he gets picked up in the background while his friend talks to her Echo?
As it turns out, the NSA had roughly the same idea. In an Intercept piece on Friday,
reporter Ava Kofman detailed the secret history of the NSA’s speaker
recognition systems, dating back as far as 2004. One of the programs was
a system known as Voice RT, which was able to match speakers to a given
voiceprint (essentially solving the Waldo problem), along with
generating basic transcriptions. According to classified documents, the
system was deployed in 2009
to track the Pakistani army’s chief of staff, although officials
expressed concern that there were too few voice clips to build a viable
model. The same systems scanned voice traffic to more than 100 Iranian
delegates’ phones when President Mahmoud Ahmadinejad visited New York City in 2007.
We’ve seen voice recognition systems like this before, most recently with the Coast Guard,
but there’s never been one as far-reaching as Voice RT, and it
raises difficult new questions about voice recordings. The NSA has
always had broad access to US phone infrastructure, something driven
home by the early Snowden documents, but the last few years have seen an
explosion of voice assistants like the Amazon Echo and Google Home,
each of which floods more voice audio into the cloud where it could be
vulnerable to NSA interception. Is home assistant data a target for the
NSA’s voice scanning program? And if so, are Google and Amazon doing
enough to protect users?
In previous cases, law enforcement has chiefly been
interested in obtaining specific incriminating data picked up by a home
assistant. In the Bentonville murder case
last year, police sought recordings or transcripts from a specific
Echo, hoping the device might have triggered accidentally during a
pivotal moment. If that tactic worked consistently, it might be a
privacy concern for Echo and Google Home owners — but it almost never
does. Devices like the Echo and Google Home only retain data after
hearing their wake word (“Okay Google” or “Alexa”), which means all
police would get is a list of intentional commands. Security researchers
have been trying to break past that wake-word safeguard for years, but
so far, they can’t do it without an in-person firmware hack, at which point you might as well just install your own microphone.
But the NSA’s tool would be after a person’s voice
instead of any particular words, which would make the wake-word
safeguard much less of an issue. If you can get all the voice commands
sent back to Google or Amazon servers, you’re guaranteed a full profile
of the device owner’s voice, and you might even get an errant houseguest
in the background. And because speech-to-text algorithms are still
relatively new, both Google and Amazon keep audio files in the cloud as a
way to catalog transcription errors. It’s a lot of data, and The Intercept is right to think that it would make a tempting target for the NSA.
When police try to collect recordings from a voice
assistant, they have to play by roughly the same warrant rules as your
email or Dropbox files — but the NSA might have a way to get around the
warrant too. Collecting the data would still require a court order (in
the NSA’s case, one approved by the FISA court), but the data wouldn’t
necessarily need to be collected. In theory, the NSA could appeal to
platforms to scan their own archives, arguing they would be helping to
locate a dangerous terrorist. It would be similar to the scans companies
already run for child abuse, terrorism or copyright-protected material
on their networks, all of which are largely voluntary. If companies
complied, the issue could be kept out of conventional courts entirely.
Albert Gidari, director of privacy at the Stanford Center
for Internet and Society, says that kind of standoff is an inherent
problem when platforms are storing biometric-friendly data. After years
of sealed litigation, it’s still unclear how much help the government
has a right to compel. “To the extent platforms store biometrics, they
are vulnerable to government demands for access and disclosure,” says
Gidari. “I think the government could obtain a technical assistance
order to facilitate the scan, and under [the technical assistance provision in] FISA, perhaps to build the tool, too.”
We still don’t have any real evidence that those orders are being served. All The Intercept
article speaks to is how the program worked within the NSA, and no one
at Google or Amazon has ever suggested something like this might be
possible. But there’s still good reason to be suspicious: if such an order
were delivered to a tech company, it would probably come with a gag order preventing them from talking about what they’d done.
So far, there’s been little transparency about how much
data agencies are getting from personal voice assistants, if any. Amazon
has been noticeably shifty about listing requests for Echo data in its transparency report. Google treats the voice recordings as general user data,
and doesn’t break out requests that are specific to Google Home.
Reached for comment, an Amazon representative said the company “will not
release customer information without a valid and binding legal demand
properly served on us.”
The most ominous sign is how much data personal
assistants are still retaining. There’s no technical reason to store
audio of every request by default, particularly if it poses a privacy
risk. If Google and Amazon wanted to decrease the threat, they could
stop logging requests under specific users, tying them instead to an
anonymous identifier as Siri does. Failing that, they could retain text
instead of audio, or even process the speech-to-text conversion on the
device itself.
But the Echo and the Home weren’t made with the NSA in
mind. Google and Amazon were trying to build useful assistants, and they
likely didn’t consider that those assistants could also be tools of surveillance.
Even more, they didn’t consider that a person’s voice might be something
they would have to protect. Like ad-targeting
and cloud hosting itself, what started as information technology is
turning into a system of surveillance and control. What happens next is
up to Google, Amazon, and their customers. (Via The Verge)