Amazon Hits Snags in Revamping Alexa Into Agentic AI
Two years into its planned revamp of Alexa into a smarter voice assistant with generative artificial intelligence (AI) at its core, Amazon is running into deployment roadblocks that range from hallucinations to organizational challenges.
In an interview with the Financial Times, Rohit Prasad, who leads the artificial general intelligence (AGI) team at Amazon, said the company is still trying to solve problems such as hallucinations, where the AI model makes things up, as well as latency, reliability and other issues.
Hallucinations have to be “close to zero,” Prasad said. Because large language models are probabilistic, they can hallucinate when asked questions or presented with scenarios outside their training data. That means when a customer orders through Alexa, the agentic AI assistant might purchase the wrong product or order the wrong quantity, for example.
A sticking point is that the eCommerce giant has to tread carefully because of Alexa’s global scale: It is used on half a billion devices in real time. Prasad said this was an “unprecedented” scale for AI assistants. Today, the goal is not only to enable a GenAI-powered Alexa but also to make it agentic, so it can accomplish tasks in addition to providing information, according to the FT.
To be sure, Amazon’s competitors have rolled out their own GenAI-powered virtual assistants to many users globally, although these are not yet agentic AI.
Meta has deployed Meta AI to its global user base. In a Dec. 6 post on Threads, CEO Mark Zuckerberg said Meta AI had nearly 600 million monthly active users. Meta’s open source family of language models, Llama, powers Meta AI.
A few days prior to that, OpenAI said in a Dec. 4 post on X (formerly Twitter) that ChatGPT had more than 300 million weekly active users, with one billion daily messages sent.
Fresh numbers shared by @sama earlier today:
300M weekly active ChatGPT users
1B user messages sent on ChatGPT every day
1.3M devs have built on OpenAI in the US
— OpenAI Newsroom (@OpenAINewsroom) December 4, 2024
Prasad said one complexity in making Alexa into an AI agent that operates at scale in real time is that it has to be able to call hundreds of third-party software and services to complete its tasks.
“Sometimes we underestimate how many services are integrated into Alexa, and it’s a massive number. These applications get billions of requests a week, so when you’re trying to make reliable actions happen at speed … you have to be able to do it in a very cost-effective way,” Prasad said.
Is Cost a Key Bottleneck?
While the cost of running GenAI has been coming down, it remains expensive at scale. Pre-training the foundation models can be costly, but inference, when the AI model applies its training to new data, doesn’t come cheap either. Amazon considered charging a subscription for an LLM-powered Alexa or taking a cut of eCommerce sales, a former employee told the FT.
But it is technical hurdles that are holding back most of the progress. Making Alexa smarter is not as simple as swapping in a large language model to replace its simpler algorithms. (Amazon recently introduced its own family of Nova foundation models but did not disclose their parameter sizes.)
“It’s not as simple as moving from one model to another,” Mike Finley, CTO and co-founder of AnswerRocket, told PYMNTS. “Agentic AI is a bit more nuanced. It needs more structure and guidance to get us a better result. We will have to give it the original ‘prompt’ like we would in the past, but there’s more work to shape the AI behavior we want.”
Moreover, “for Alexa to level-up in usefulness, we’re going to have to trust it more. Would you trust Alexa to send the babysitter some cash? What if it hallucinates a couple of zeros on the wrong side? Agentic models can use resources like the ability to access databases or financial accounts, websites, documents, spreadsheets. But powering Alexa with those tools means gaining consumer trust to click ‘allow,’” Finley said.
Alexa falling behind its tech giant competitors has irked former Alexa research scientist Mihail Eric, who posted on X in June that Amazon had dropped the ball. He said Alexa had been ahead of the pack, but frittered away its lead due to disorganization at the company, thinning engineering teams and competitive infighting among decentralized teams.
How Alexa dropped the ball on being the top conversational system on the planet
—
A few weeks ago OpenAI released GPT-4o ushering in a new standard for multimodal, conversational experiences with sophisticated reasoning capabilities. Several days later, my good friends at PolyAI…
— Mihail Eric (@mihail_eric) June 11, 2024
Amazon’s practice of protecting customer data with guardrails, which Eric acknowledged is a “crucial” policy, nevertheless meant that the internal infrastructure for developers was “agonizingly painful to work with.” It would take weeks to access any data for analysis or experimentation, he added.
Eric recommended the following ways to accelerate Alexa’s development: invest in robust developer infrastructure, especially around access to compute, data quality assurance and streamlined data collection processes; make LLMs the fundamental building block of the dialogue flows; and ensure product timelines don’t dictate science research time frames.
An Amazon spokesperson told PYMNTS: “Our vision for Alexa is to build the world’s best personal assistant. Generative AI offers a huge opportunity to make Alexa even better for our customers, and we are working hard to enable even more proactive and capable assistance on the over half-a-billion Alexa-enabled devices already in homes around the world.”
The post Amazon Hits Snags in Revamping Alexa Into Agentic AI appeared first on PYMNTS.com.