Alexa Gives Amazon a Powerful Data Advantage

From the MIT Technology Review:

“Hey, Alexa”—a phrase that millions of people call out at home just before telling Amazon their desires at that moment. All those people asking Alexa to order kitchen supplies, turn on the lights, or play music give Amazon a valuable stockpile of data that it could use to fend off competitors and make breakthroughs in what voice-operated assistants can do.

“There are millions of these in households, and they’re not collecting dust,” Nikko Strom, a speech-recognition expert and founding member of the team at Amazon that built Alexa and Echo, said at the AI Frontiers conference in Santa Clara, California, last week. “We get an insane amount of data coming in that we can work on.”

Strom said that data had already helped the company make progress on a longstanding challenge in speech recognition known as the cocktail party problem, where the challenge is to pick out a single voice from a hubbub of many people talking.

Initially Alexa could easily tell that someone had called out its name, but—like other voice-recognition systems—it struggled to distinguish which of the words spoken around it made up the actual request. Then Strom’s team developed a system that notes the characteristics of the voice that calls out “Alexa” and uses them to home in on the words of the person asking for help.
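The idea Strom describes—anchor on the wake-word speaker’s voice, then keep only audio that matches it—can be illustrated with a toy sketch. The feature vectors, function names, and cosine-similarity threshold below are all illustrative assumptions, not Amazon’s implementation:

```python
import numpy as np

def voiceprint(frames):
    # Collapse per-frame acoustic feature vectors (e.g. spectral features)
    # into a single embedding characterizing the speaker.
    return np.mean(frames, axis=0)

def cosine(a, b):
    # Similarity between two voiceprints, in [-1, 1].
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def filter_segments(wake_frames, segments, threshold=0.8):
    """Keep only speech segments whose voiceprint resembles the
    speaker who uttered the wake word (threshold is an assumption)."""
    anchor = voiceprint(wake_frames)
    return [s for s in segments if cosine(voiceprint(s), anchor) >= threshold]
```

Given frames from the “Alexa” utterance, later segments from the same voice score high against the anchor and are kept, while bystanders’ speech scores low and is discarded—one simple way to frame the cocktail-party problem.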

The data Amazon is amassing to take on problems like that could be unique. Standard datasets available for training and testing speech-recognition systems don’t usually include audio captured in home environments, or captured using microphone arrays like the one the Echo uses to focus on speech from a particular direction, says Abeer Alwan, a professor at the University of California, Los Angeles, who works on speech recognition.

. . . .

Strom said he also hopes that his team’s data trove could eventually upgrade Alexa so that it can follow two people speaking simultaneously. “It’s hard, but there’s been some progress,” he said. “It’s super interesting for us if we could solve that problem.”

Strom didn’t say what Alexa might be able to do once that problem is solved. But it might make it more natural for multiple people to interact with an Echo or other device at once, whether that’s kids peppering Alexa with questions or their parents rattling off a shopping list.

The data piling up from Alexa could also help Amazon fend off Google Home, the Echo competitor Google launched late last year. Google can draw on years of work in Web search and voice search, and on sizeable investments in artificial intelligence. But its previous products and businesses don’t naturally collect speech like that of a person calling out to a device in the home, or capture the same kinds of requests people make of home assistants.

Link to the rest at MIT Technology Review