Google DeepMind’s Demis Hassabis on the long game of AI
In 1988, a London pre-teen with a penchant for programming and gaming wrote a version of the classic board game Othello—also known as Reversi—for his Amiga 500 home computer. Teaching a piece of software to play the game was an ambitious coding project for someone so young.
And with that, Demis Hassabis notched his first achievement in the field of artificial intelligence.
The Othello-playing app “beat my kid brother, who was only five at the time,” Hassabis remembers. “It was an ‘a-ha’ moment for me, because I just thought, ‘Wow, it’s incredible that you can make a program that’s inanimate and it can go off and do something on your behalf.’”
That proved to be a fateful epiphany. More than two decades later, it led to him cofounding DeepMind, the AI startup that did much to push the technology forward, both before and after its acquisition by Google in 2014. In 2023, Google merged DeepMind with Google Brain, its other highly productive AI arm, and named Hassabis as CEO of the combined operation, Google DeepMind. The AI model he oversees, Gemini, is now at the heart of Google products used by billions of people.
Long before the fruits of DeepMind’s work were everywhere, the company was a research lab whose early focus was on training algorithms to play games. That focus didn’t just connect the company back to Hassabis’s childhood Othello app. From the very dawn of AI, researchers have used games as a canvas for discovery. For example, back in 2019, I wrote about a 1960 TV special that documented IBM’s checkers-playing computer.
Games are so powerful as a research tool because they’re “a microcosm of something important in real life,” explains Hassabis. “And we get to practice it many times in an environment that’s serious, but not serious, in a sense.”
Last month marked the tenth anniversary of the capstone to that quest—a history-making moment not just for DeepMind, but for the entire AI field. The 2,500-year-old Chinese board game Go had been considered, in Hassabis’s words, “the Mount Everest of game AI”—so deep and mystical in its mechanics that for years, computers struggled to play it even poorly, let alone well. But from March 9 to 15, 2016, in a match held in Seoul, DeepMind’s AlphaGo software beat Lee Sedol, one of the strongest Go players in the world, four games to one.
The victory reverberated far beyond the crowd of obsessives who had wondered if it was even possible. “Maybe, looking back on it now, it was the beginning of what we would consider the modern AI era,” says Hassabis. It was certainly tangible proof that the tech could amaze even the people responsible for its breakthroughs. It was soon joined by other signs, such as Google Brain’s June 2017 research paper on “transformers”—the fundamental ingredient that would give us generative AI.
AlphaGo also marked a transition for DeepMind. Once its AI had mastered Go, gaming was short on obvious Mount Everests to conquer, and more consequential challenges beckoned. In 2018, DeepMind unveiled the first version of AlphaFold, its algorithm for predicting protein structures. That breakthrough’s transformative implications in areas such as drug discovery and materials research inspired the creation of Isomorphic Labs, a new startup within Google’s parent company Alphabet, and led to Hassabis and DeepMind distinguished scientist John Jumper sharing the 2024 Nobel Prize in Chemistry.
Today, Google DeepMind’s website reflects its wide-ranging research efforts, from predicting weather to error-correcting quantum computers to understanding how dolphins communicate. But Hassabis doesn’t talk about games like they’re a musty part of his past. Indeed, he’s as engaged and proud talking about the long road that led to AlphaGo’s big win as he is discussing Google DeepMind’s current activities. Game-playing software just happened to be the first kind of artificial intelligence that captured his imagination. What he learned along the way remains as relevant as ever.
“It was obvious to me from 16, 17 years old that AI was what I was going to do with my career,” he says. “And, if it could work, the biggest thing of all time.”
From chess to Pong to Go
By the time Hassabis tackled Othello on his Amiga, he was already an old hand at board-game wizardry. At four, he took up chess. At eight, he’d earned enough playing it competitively to buy his first computer. At 13, he became the world’s second-highest rated player under the age of 14, after the legendary Judit Polgár.
Hassabis credits his time as a chess prodigy with sharpening his skills at problem-solving, visualization, and thinking clearly under pressure; it doesn’t seem a stretch to guess that it might have been a boon to his self-confidence as well. “There aren’t many things children can do where they can compete against adults at the highest level when they’re five or six years old,” he says. (He recommends chess as part of school curriculums and still plays it online in the middle of the night as “a gym for the mind.”)
Still a wunderkind at age 17, Hassabis won an internship at computer game studio Bullfrog after entering a competition in a magazine for Amiga users. Before long, he’d co-created Theme Park, an amusement-park simulator that sold tens of millions of copies.
Theme Park didn’t just let players choose rides. They also set prices, hired staff, operated concessions, sold stock, and otherwise optimized the business to thrive. Unlike a board game or most computer games of its era, it offered entirely open-ended play, driven by a simulation that adapted to the player rather than a fixed script.
As Hassabis saw his creation behave in ways he hadn’t explicitly programmed into it, his mind reeled. “The key thing was that every time someone played the game, they had a unique experience, because the AI would react to how they were playing it,” he recalls. “We got letters from kids. They sent screenshots of these amazing end states they got their theme parks into. And we had no idea you could even do that, even though we’d made the game.”
Sixteen years elapsed between Theme Park’s release and DeepMind’s inception. During those years, Hassabis earned a BA in computer science and a PhD in cognitive neuroscience, with more time in the game business sandwiched in between.
When he and his friends Shane Legg and Mustafa Suleyman decided to start an AI company together, it was with the aspiration—even loftier in 2010 than now—of developing algorithms that could at least match human cognitive ability at typical tasks. (Legg called that artificial general intelligence, or AGI, a term the entire field embraced.) But the cofounders began with a vastly more manageable project: training AI to excel at early Atari home video games such as Pong, Breakout, and Space Invaders.
Not that it was a sure thing at the time. “We might have been 20 years too early,” says Hassabis. “Nobody knew. And so we had to try it.”
The fact that the video games in question were ultra-minimalist 1970s relics didn’t result in immediate gratification. “It took months to win a single point at Pong, the simplest Atari game,” Hassabis remembers. Eventually, though, “We won the game 21-nil,” he says. “And then we could play all Atari games after another year or so.”
The technique DeepMind used to trounce Pong—deep reinforcement learning—had broad applicability in AI beyond gaming. Heartened by its progress, the company turned its attention to Go.
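Before following DeepMind up that mountain, a quick aside for the technically curious. The Atari work was published as the Deep Q-Network, or DQN, and the core idea is simple: a neural network looks at the screen and estimates the long-term score available from each joystick action, and every update nudges those estimates toward the rewards the game actually pays out. The sketch below is a toy illustration of that loop, not DeepMind’s code; it omits the stabilizing refinements of the real system (experience replay, a separate target network), and the network sizes and stub environment are placeholder assumptions.

```python
import random
import torch
import torch.nn as nn

# Toy sketch of deep Q-learning, the idea behind DeepMind's Atari work.
# The "screen" is a stand-in vector and the environment is a random stub.

N_ACTIONS = 3    # e.g. Pong: move the paddle up, move it down, or stay put
STATE_DIM = 128  # placeholder for preprocessed screen pixels
GAMMA = 0.99     # discount factor: how much future points matter now

# The Q-network maps a game state to one estimated long-term score per action.
q_net = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def act(state, epsilon=0.1):
    """Epsilon-greedy: usually pick the best-looking action, sometimes explore."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax())

def learn(state, action, reward, next_state, done):
    """Nudge Q(state, action) toward reward + GAMMA * best Q at the next state."""
    with torch.no_grad():
        target = reward + (0.0 if done else GAMMA * q_net(next_state).max().item())
    loss = (q_net(state)[action] - target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Stub play loop: in the real system, states came from game frames and
# rewards from the on-screen score counter.
state = torch.randn(STATE_DIM)
for _ in range(100):
    action = act(state)
    next_state, reward = torch.randn(STATE_DIM), random.choice([0.0, 1.0])
    learn(state, action, reward, next_state, done=False)
    state = next_state
```

Scaled up and stabilized, essentially this loop learned to play dozens of Atari games from nothing but pixels and the score.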
Though leaping directly from some of the world’s most basic games to one of unrivaled complexity might sound jarring, it may have been inevitable. Teaching AI to play Go at the highest possible level had been an irresistibly audacious goal for computer scientists since the 1970s. It had also been on Hassabis’s own mind for 20 years, even though he was only an amateur at the game himself.
As a Cambridge undergrad, he’d discussed AI and Go with a classmate, David Silver. In 2008, a program Silver had co-created, MoGo, became the first software to beat a professional Go player, albeit while competing with the advantage of a handicap. Hassabis was reunited with his old friend when Silver joined DeepMind, where he worked on the Atari project and went on to lead AlphaGo’s development.
Decades of thought had also gone into chess-playing AI before IBM’s Deep Blue beat reigning world champion Garry Kasparov in 1997. But compared to Go, chess looked like Candyland. “In Go, there are 10 to the power 170 possible board positions—far more than there are atoms in the universe,” says Hassabis. That ruled out the kind of brute-force search IBM had relied on for Deep Blue, which used raw computing power to examine hundreds of millions of positions per second.
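A back-of-the-envelope calculation shows the scale of the gulf. A game tree grows roughly as b to the power d, where b is the number of legal moves available on a typical turn and d is the length of a typical game. The snippet below uses the conventional ballpark estimates for both games; the exact figures vary by source, but the conclusion does not.

```python
# Back-of-the-envelope game-tree sizes: branching_factor ** game_length.
# Branching factors and game lengths are the usual rough estimates.
chess = 35 ** 80    # ~35 legal moves per turn, games of ~80 plies (half-moves)
go = 250 ** 150     # ~250 legal moves per turn, games of ~150 plies

print(f"chess game tree ~ 10^{len(str(chess)) - 1}")  # ~ 10^123
print(f"go game tree    ~ 10^{len(str(go)) - 1}")     # ~ 10^359
```

And that only counts move sequences; the 10 to the power 170 figure Hassabis cites counts legal board positions. No conceivable hardware can exhaustively search numbers like these, which is why AlphaGo needed something smarter than brute force.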
DeepMind ended up training a deep neural network with reinforcement learning to explore only the meaningful moves for any given arrangement of stones on the Go board. Hassabis compares the approach to infusing the algorithm with human intuition. Except AlphaGo was capable of taking more data into consideration than even the most gifted and disciplined human player, providing it with the opportunity to make decisions that felt not just intuitive, but magical.
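One way to picture that intuition, again as a toy sketch rather than anything resembling DeepMind’s code: a policy network assigns a score to every point on the board, and the search bothers with only a handful of the best-rated legal moves instead of all of them. In the real AlphaGo, this filter was paired with a value network and Monte Carlo tree search; the layer sizes and inputs here are placeholder assumptions.

```python
import torch
import torch.nn as nn

# Toy sketch of policy-guided pruning: score all 361 points on a Go board,
# then hand the search only the handful of most promising legal moves.

BOARD_POINTS = 19 * 19  # a Go board has 361 intersections

# The policy network maps a board position to a score per candidate move.
policy_net = nn.Sequential(
    nn.Linear(BOARD_POINTS, 256), nn.ReLU(),
    nn.Linear(256, BOARD_POINTS),
)

def promising_moves(board, legal_mask, k=8):
    """Return the k legal moves the policy network rates most highly."""
    with torch.no_grad():
        logits = policy_net(board)
        logits[~legal_mask] = float("-inf")  # rule out illegal moves entirely
        probs = torch.softmax(logits, dim=0)
    return torch.topk(probs, k).indices.tolist()

# Each search node now branches ~8 ways instead of ~250, which is the
# difference between a search that drowns and one that can look far ahead.
board = torch.randn(BOARD_POINTS)                    # stand-in for a position
legal = torch.ones(BOARD_POINTS, dtype=torch.bool)   # stand-in legality mask
print(promising_moves(board, legal))
```

That filtering is what Hassabis means by intuition: the network’s learned sense of which moves are worth a second thought.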
That point was proven early in game two of AlphaGo’s match with Sedol, in a way that left jaws agape when it happened and still resonates today. For the game’s 37th move—forever after known as “Move 37”—the AI chose a play so unexpected that eyewitnesses wondered if Aja Huang, the DeepMind scientist responsible for placing AlphaGo’s stones on the board, had made it in error.
“Lee Sedol chose that moment to go and have a smoke on the balcony,” recounts Hassabis. “He comes back in, and he sees Move 37. You see his facial expression change, and he’s sort of amazed by it. And bemused, perhaps.”
Everyone involved knew that no human Go master would have made Move 37. But it wasn’t clear until much later in the game if it had been remarkably smart or remarkably dumb. Eventually, however, it turned out to be essential to beating Sedol—”almost as if AlphaGo put the piece there for 100 moves later,” says Hassabis. “Not only was it unusual, it was the pivotal move to win the game. That’s what makes it one of the greatest Go moves of all time.”
Maybe you’d need to be a serious Go aficionado—which I’m not—to truly appreciate what made Move 37 special. But it’s easy to get swept up in its drama when watching AlphaGo, the 2017 documentary about the match. It continues to be fodder for courses, presentations, blog posts, and podcasts, making it a strong candidate for the most-analyzed single decision made by AI to date.
Of course, if Move 37 was merely a startling bit of board-game play, it wouldn’t be so endlessly compelling. By making it, AlphaGo showed how AI is capable of not just simulating human thought, but going beyond it. Achieving that higher state of reasoning was why DeepMind took on Go in the first place.
Subsequent research efforts such as AlphaFold have aimed to catalyze a similar effect. “The real world’s a lot harder than a game,” says Hassabis, but “you need that element of finding a new insight or new structure in the data. That’s what you’re looking for in science.” He adds that Move 37-like thinking is also apparent in current Google products such as the Deep Think version of Gemini, which is tuned for applications in science, math, and engineering.
At its best, human game play—be it on a computer, a board, or an athletic field—is always an act of creativity. Hassabis doesn’t hesitate to call Move 37 creative. But mind-blowing though it was, he doesn’t consider it equal to human creativity at its most inspired.
“It’s not true out-of-the-box creativity,” he stresses. “Because that would be something like [telling] the AI system, ‘Come up with an elegant game that only takes a few hours to play. It takes five minutes to learn the rules, but several lifetimes to master. And it’s esoterically beautiful as well.’”
In other words, he says, AI must do more than conjure up additional moments like Move 37 to prove its creative bona fides: “It needs to invent a game as deep and as beautiful as Go—and obviously, with today’s systems, we’re nowhere near that.” That gives AI researchers at Google DeepMind and elsewhere another gaming Everest to scale—and we humans comforting evidence that we remain unbeatable, for now, on at least one meaningful front.