AI arms race: are Anthropic and OpenAI handing hackers the ultimate weapon?
Claims that new AI models can outperform humans at some hacking tasks have sparked widespread alarm about the future of digital security.
Tech firms “usually create buzz around products they plan to release”, said The Economist. American artificial intelligence lab Anthropic “has managed to create excitement – and a good deal of worry – around something it plans not to”, having announced that its new Claude Mythos model would not be released to the general public.
The problem is not that the new model is “buggy or unreliable” but rather “that it works so well that releasing it would put the world’s digital infrastructure at risk”.
What did the commentators say?
This next generation of AI models, such as Anthropic’s Mythos or OpenAI’s new closed-version GPT-5.4-Cyber, can not only write code but also recognise errors – or “bugs” – in it, which can be used both to identify potential weaknesses and to find ways to attack computer systems.
“It’s impressive – and, at the same time, worrying” – because it makes cyberattacks “easier”, said professor of cyber security Florian Tramèr on ETH Zurich university’s website. A lone hacker “can suddenly try out thousands of variants” and “if one attack fails, he or she can simply try with the next one.” “This increases the risks for companies, state institutions or even private individuals,” especially “if such models become cheaper and more efficient”.
Recognising the danger this might pose, Anthropic has limited access to Mythos to a handful of trusted tech companies under an initiative called Project Glasswing. Similarly, OpenAI is providing limited access to GPT-5.4-Cyber to vetted security professionals so they can use it for defensive cybersecurity measures.
Yet even Anthropic’s strict security protocols appear to have been breached, after the company confirmed it was investigating how a group of users gained “unauthorised access” to Mythos Preview “through one of our third-party vendor environments”.
The risk of unauthorised access will only “add to anxiety” about Mythos, and “raises concerns” about whether Anthropic “can keep the technology it develops out of the hands of bad actors”, said Cristina Criddle in the Financial Times.
News of these new models’ cyber capabilities had already “sent shockwaves through the markets and prompted high-level discussions among financial institutions and global regulators”, with finance ministers from across the G7 hosting bank bosses to discuss what AI-enabled hacking might mean for their businesses.
What next?
Capitalising on a “mix of fear and excitement over AI and its future impact” has “become a hallmark of the sector and its marketing strategies in recent years”, said BBC reporters Liv McMahon and Joe Tidy.
In the case of Mythos, “we still do not know enough about it to know whether these hopes or fears are justified, or more a reflection of the hype surrounding the industry”.
In reality, “like other tools from the long history of cybersecurity”, the latest AI models “can be used for both offence and defence”, said The New York Times.
There is still disagreement on “whether one side of this struggle has gained a significant advantage through AI” and experts are “unsure how the battle will play out in the coming years”. Most agree, however, that “the companies and governments that do not embrace the latest AI for defensive purposes will leave themselves enormously vulnerable”.
With the cyber environment experiencing the “most change” ever, said Francis deSouza, the chief operating officer and president of security products at Google Cloud, “you have to fight AI with AI.”