Nvidia’s Jensen Huang Says AI Compute Could Near $1 Trillion by 2027
When Conan O’Brien took the stage at the 98th Academy Awards on Sunday night, his opening line landed as both a joke and a warning. “I am honored to be the last human host of the Academy Awards,” he said, adding that next year’s host would be “a Waymo in a tux.”
Less than 24 hours later, a moment at Nvidia’s annual GTC conference made that joke feel a little less far-fetched.
At the end of his keynote Monday, Nvidia CEO Jensen Huang played an animated video showing several robots, along with a digital version of himself, sitting around a campfire singing a country-style song about the conference. The robots joked about tokens, open-source software and artificial intelligence development, closing the event with a playful reminder of how quickly AI is moving from research labs into everyday life.
The moment captured the unusual range of this year’s GTC keynote. The presentation ran from Disney’s Olaf appearing on stage to computing systems designed for space exploration to major announcements about the infrastructure needed to power the next wave of AI.
But behind the playful demonstrations was a much bigger message about where the AI industry is heading.
Huang said the industry is entering what he described as an “inference inflection point,” a phase where the demand for computing power is shifting rapidly from training AI models to running them continuously in real-world applications. The scale of that shift, Huang argued, could drive one of the largest technology infrastructure expansions in history.
“AI computing could approach a trillion dollars of data center infrastructure between now and 2027,” Huang said.
Inference and the Token Economy
Much of that demand is tied to inference, the process where a trained AI model generates responses for users. Every time a chatbot answers a question, an AI assistant writes an email, or a coding tool produces software, the system generates pieces of output known as tokens.
Tokens are the basic units of AI-generated text or data. A short sentence might contain a dozen tokens, while a longer response could contain hundreds.
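To make the idea concrete, here is a minimal sketch of how token counts scale with output length. It uses the common rule of thumb that English text averages roughly four characters per token; real tokenizers vary by model, and the helper function here is purely illustrative, not any vendor’s API.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate using the ~4-characters-per-token
    heuristic for English text. Real tokenizers differ by model."""
    return max(1, round(len(text) / chars_per_token))


short = "AI factories generate tokens."
longer = (
    "Every time a chatbot answers a question or a coding tool produces "
    "software, the system emits a stream of tokens, and longer responses "
    "emit proportionally more of them."
)

print(estimate_tokens(short))   # a handful of tokens
print(estimate_tokens(longer))  # far more for the longer response
```

The takeaway is simply that output volume, not the number of requests alone, drives token counts, which is why longer, more capable responses translate directly into more compute.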
Training a large model requires a massive but finite burst of computing power. Inference, by contrast, runs continuously as millions of users interact with AI services, so its cumulative demand can far exceed the resources needed to train the model in the first place.
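A back-of-envelope calculation shows why continuous inference can dwarf a one-time training run. Every number below is hypothetical, chosen only to illustrate the shape of the economics; none are Nvidia’s figures.

```python
# Hypothetical illustration: one-time training cost vs. continuous
# inference cost. All constants are assumptions, not real-world data.

TRAIN_GPU_HOURS = 1_000_000           # one-time training run (assumed)
INFER_GPU_SECONDS_PER_QUERY = 0.05    # compute to serve one response (assumed)
QUERIES_PER_DAY = 500_000_000         # daily user traffic (assumed)

# Convert per-query GPU-seconds into total GPU-hours burned per day.
daily_inference_gpu_hours = (
    QUERIES_PER_DAY * INFER_GPU_SECONDS_PER_QUERY / 3600
)

# How many days of serving traffic equal the entire training bill?
days_to_match_training = TRAIN_GPU_HOURS / daily_inference_gpu_hours

print(f"Inference burns ~{daily_inference_gpu_hours:,.0f} GPU-hours per day")
print(f"Serving matches the training bill in ~{days_to_match_training:.0f} days")
```

Under these assumed numbers, serving traffic overtakes the entire training budget within months, which is the dynamic behind Huang’s claim that inference, not training, now dominates long-run infrastructure demand.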
That shift means the long-term economics of AI are increasingly tied to how efficiently companies can generate tokens at scale.
“Inference is your new workload, tokens are your new commodity,” Huang said during the keynote. “You want to make sure that the architecture is as optimized as you can in the future.”
To underscore the point, Huang at one point lifted a championship-style belt on stage labeled “InferenceX,” referencing analysis from SemiAnalysis that ranked Nvidia’s systems as leaders in inference performance.
The visual framed Nvidia’s position in the AI market as something akin to a “token king,” highlighting the company’s focus on delivering the lowest cost per token as AI systems generate ever-larger volumes of output.
“The agentic AI inflection point has arrived,” Huang said while introducing Nvidia’s next-generation Vera Rubin platform.
The rise of AI agents, software systems that can perform tasks on behalf of users, is expected to significantly increase the number of tokens generated across enterprise software, digital assistants and automated workflows.
AI Factories and the Infrastructure Boom
Huang described the emerging AI economy using the concept of “AI factories,” specialized data centers designed to generate AI outputs at massive scale.
“In the age of AI, intelligence tokens are the new currency, and AI factories are the infrastructure that generates them,” he said.
To support that shift, Nvidia unveiled its next-generation AI computing platform called Vera Rubin. According to Nvidia, the system is designed to deliver up to 10 times higher inference performance per watt while reducing the cost of generating tokens by roughly 90%.
Taken together, Huang said the shift toward inference-driven workloads is transforming how the technology industry thinks about computing infrastructure. Instead of building data centers primarily for periodic model training, companies are now building massive systems designed to generate tokens continuously.
That shift, Huang suggested, could redefine the economics of computing itself. “The future of computing will be built around AI factories,” Huang said.
The post Nvidia’s Jensen Huang Says AI Compute Could Near $1 Trillion by 2027 appeared first on PYMNTS.com.