March 2010 April 2010 May 2010 June 2010 July 2010
August 2010
September 2010 October 2010 November 2010 December 2010 January 2011 February 2011 March 2011 April 2011 May 2011 June 2011 July 2011 August 2011 September 2011 October 2011 November 2011 December 2011 January 2012 February 2012 March 2012 April 2012 May 2012 June 2012 July 2012 August 2012 September 2012 October 2012 November 2012 December 2012 January 2013 February 2013 March 2013 April 2013 May 2013 June 2013 July 2013 August 2013 September 2013 October 2013 November 2013 December 2013 January 2014 February 2014 March 2014 April 2014 May 2014 June 2014 July 2014 August 2014 September 2014 October 2014 November 2014 December 2014 January 2015 February 2015 March 2015 April 2015 May 2015 June 2015 July 2015 August 2015 September 2015 October 2015 November 2015 December 2015 January 2016 February 2016 March 2016 April 2016 May 2016 June 2016 July 2016 August 2016 September 2016 October 2016 November 2016 December 2016 January 2017 February 2017 March 2017 April 2017 May 2017 June 2017 July 2017 August 2017 September 2017 October 2017 November 2017 December 2017 January 2018 February 2018 March 2018 April 2018 May 2018 June 2018 July 2018 August 2018 September 2018 October 2018 November 2018 December 2018 January 2019 February 2019 March 2019 April 2019 May 2019 June 2019 July 2019 August 2019 September 2019 October 2019 November 2019 December 2019 January 2020 February 2020 March 2020 April 2020 May 2020 June 2020 July 2020 August 2020 September 2020 October 2020 November 2020 December 2020 January 2021 February 2021 March 2021 April 2021 May 2021 June 2021 July 2021 August 2021 September 2021 October 2021 November 2021 December 2021 January 2022 February 2022 March 2022 April 2022 May 2022 June 2022 July 2022 August 2022 September 2022 October 2022 November 2022 December 2022 January 2023 February 2023 March 2023 April 2023 May 2023 June 2023 July 2023 August 2023 September 2023 October 2023 November 2023 December 2023 January 2024 February 2024 March 2024 April 2024 May 2024 June 2024 July 2024 August 2024 September 2024 October 2024 November 2024 December 2024 January 2025
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
News Every Day |

Inside the U.K.’s Bold Experiment in AI Safety

In May 2023, three of the most important CEOs in artificial intelligence walked through the iconic black front door of No. 10 Downing Street, the official residence of the U.K. Prime Minister, in London. Sam Altman of OpenAI, Demis Hassabis of Google DeepMind, and Dario Amodei of Anthropic were there to discuss AI, following the blockbuster release of ChatGPT six months earlier.

After posing for a photo opportunity with then Prime Minister Rishi Sunak in his private office, the men filed through into the cabinet room next door and took seats at its long, rectangular table. Sunak and U.K. government officials lined up on one side; the three CEOs and some of their advisers sat facing them. After a polite discussion about how AI could bring opportunities for the U.K. economy, Sunak surprised the visitors by saying he wanted to talk about the risks. The Prime Minister wanted to know more about why the CEOs had signed what he saw as a worrying declaration arguing that AI was as risky as pandemics or nuclear war, according to two people with knowledge of the meeting. He invited them to attend the world’s first AI Safety Summit, which the U.K. was planning to host that November. And he managed to get each to agree to grant his government prerelease access to their companies’ latest AI models, so that a task force of British officials, established a month earlier and modeled on the country’s COVID-19 vaccine unit, could test them for dangers.

[time-brightcove not-tgx=”true”]

Read More: Inside the U.K.’s AI Safety Summit

The U.K. was the first country in the world to reach this kind of agreement with the so-called frontier AI labs —the few groups responsible for the world’s most capable models. Six months later, Sunak formalized his task force as an official body called the AI Safety Institute (AISI), which in the year since has become the most advanced program inside any government for evaluating the risks of AI. With £100 million ($127 million) in public funding, the body has around 10 times the budget of the U.S. government’s own AI Safety Institute, which was established at the same time.

Inside the new U.K. AISI, teams of AI researchers and national-security officials began conducting tests to check whether new AIs were capable of facilitating biological, chemical, or cyberattacks, or escaping the control of their creators. Until then, such safety testing had been possible only inside the very AI companies that also had a market incentive to forge ahead regardless of what the tests found. In setting up the institute, government insiders argued that it was crucial for democratic nations to have the technical capabilities to audit and understand cutting-edge AI systems, if they wanted to have any hope of influencing pivotal decisions about the technology in the future. “You really want a public-interest body that is genuinely representing people to be making those decisions,” says Jade Leung, the AISI’s chief technology officer. “There aren’t really legitimate sources of those [decisions], aside from governments.”

In a remarkably short time, the AISI has won the respect of the AI industry by managing to carry out world-class AI safety testing within a government. It has poached big-name researchers from OpenAI and Google DeepMind. So far, they and their colleagues have tested 16 models, including at least three frontier models ahead of their public launches. One of them, which has not previously been reported, was Google’s Gemini Ultra model, according to three people with knowledge of the matter. This prerelease test found no significant previously unknown risks, two of those people said. The institute also tested OpenAI’s o1 model and Anthropic’s Claude 3.5 Sonnet model ahead of their releases, both companies said in documentation accompanying each launch. In May, the AISI launched an open-source tool for testing the capabilities of AI systems, which has become popular among businesses and other governments attempting to assess AI risks.

But despite these accolades, the AISI has not yet proved whether it can leverage its testing to actually make AI systems safer. It often does not publicly disclose the results of its evaluations, nor information about whether AI companies have acted upon what it has found, for what it says are security and intellectual-property reasons. The U.K., where it is housed, has an AI economy that was worth £5.8 billion ($7.3 billion) in 2023, but the government has minimal jurisdiction over the world’s most powerful AI companies. (While Google DeepMind is headquartered in London, it remains a part of the U.S.-based tech giant.) The British government, now controlled by Keir Starmer’s Labour Party, is incentivized not to antagonize the heads of these companies too much, because they have the power to grow or withdraw a local industry that leaders hope will become an even bigger contributor to the U.K.’s struggling economy. So a key question remains: Can the fledgling AI Safety Institute really hold billion-dollar tech giants accountable?

In the U.S., the extraordinary wealth and power of tech has deflected meaningful regulation. The U.K. AISI’s lesser-funded U.S. counterpart, housed in moldy offices in Maryland and Colorado, does not size up to be an exception. But that might soon change. In August, the U.S. AISI signed agreements to gain predeployment access to AI models from OpenAI and Anthropic. And in October, the Biden Administration released a sweeping national-security memorandum tasking the U.S. AISI with safety-testing new frontier models and collaborating with the NSA on classified evaluations.

While the U.K. and U.S. AISIs are currently partners, and have already carried out joint evaluations of AI models, the U.S. institute may be better positioned to take the lead by securing unilateral access to the world’s most powerful AI models should it come to that. But Donald Trump’s electoral victory has made the future of the U.S. AISI uncertain. Many Republicans are hostile to government regulation—and especially to bodies like the federally funded U.S. AISI that may be seen as placing obstacles in front of economic growth. Billionaire Elon Musk, who helped bankroll Trump’s re-election, and who has his own AI company called xAI, is set to co-lead a body tasked with slashing federal spending. Yet Musk himself has long expressed concern about the risks from advanced AI, and many rank-and-file Republicans are supportive of more national-security-focused AI regulations. Amid this uncertainty, the unique selling point of the U.K. AISI might simply be its stability—a place where researchers can make progress on AI safety away from the conflicts of interest they’d face in industry, and away from the political uncertainty of a Trumpian Washington.

On a warm June morning about three weeks after the big meeting at 10 Downing Street, Prime Minister Sunak stepped up to a lectern at a tech conference in London to give a keynote address. “The very pioneers of AI are warning us about the ways these technologies could undermine our values and freedoms, through to the most extreme risks of all,” he told the crowd. “And that’s why leading on AI also means leading on AI safety.” Explaining to the gathered tech industry that his was a government that “gets it,” he announced the deal that he had struck weeks earlier with the CEOs of the leading labs. “I’m pleased to announce they’ve committed to give early or priority access to models for research and safety purposes,” he said.

Behind the scenes, a small team inside Downing Street was still trying to work out exactly what that agreement meant. The wording itself had been negotiated with the labs, but the technical details had not, and “early or priority access” was a vague commitment. Would the U.K. be able to obtain the so-called weights—essentially the underlying neural network—of these cutting-edge AI models, which would allow a deeper form of interrogation than simply chatting with the model via text? Would the models be transferred to government hardware that was secure enough to test for their knowledge of classified information, like nuclear secrets or details of dangerous bioweapons? Or would this “access” simply be a link to a model hosted on private computers, thus allowing the maker of the model to snoop on the government’s evaluations? Nobody yet knew the answers to these questions.

In the weeks after the announcement, the relationship between the U.K. and the AI labs grew strained. In negotiations, the government had asked for full-blown access to model weights—a total handover of their most valuable intellectual property that the labs saw as a complete nonstarter. Giving one government access to model weights would open the door to doing the same for many others—democratic or not. For companies that had spent millions of dollars on hardening their own cybersecurity to prevent their models’ being exfiltrated by hostile actors, it was a hard sell. It quickly became clear that the type of testing the U.K. government wanted to do would be possible via a chat interface, so the U.K. government dropped its request for model weights, and officials privately conceded that it was a mistake to ever ask. The experience was an early lesson in where the real power lay between the British government and the tech companies. It was far more important to keep the labs friendly and collaborative, officials believed, than to antagonize them and risk torpedoing the access to models upon which the AISI relied to do its job.

Still, the question of snooping remained. If they were going to carry out their safety tests by connecting to computers owned by AI companies, then the U.K. wanted assurances that employees of those companies couldn’t watch its evaluations. Doing so might allow the companies to manipulate their models so that they concealed unsafe behaviors in ways that would pass the tests, some researchers worried. So they and the labs settled on a compromise. The labs would not keep logs of the tests being done on their servers by the AISI, nor would they require individual testers to identify themselves. For their part, safety testers inside the AISI would not input classified information into the models, and instead would use workarounds that still allowed them to test whether, for example, a model had the capability to advise a user on how to create a bioweapon or computer virus. “Instead of asking about a dangerous virus, you can ask about some harmless virus,” says Geoffrey Irving, the AISI’s chief scientist. “And if a model can do advanced experimental design or give detailed advice for the non-dangerous virus, it can do the same thing for the dangerous virus.” It was these kinds of tests that AISI workers applied to Claude 3.5 Sonnet, OpenAI’s o1, and Gemini Ultra, the models that they tested ahead of release.

And yet despite all these tests, the AISI does not—cannot—certify that these models are safe. It can only identify dangers. “The science of evaluations is not strong enough that we can confidently rule out all risks from doing these evaluations,” says Irving. “To have more confidence those behaviors are not there, you need a lot more resources devoted to it. And I think some of those experiments, at least with the current level of access, can only be conducted at the labs.” The AISI does not currently have the infrastructure, the right expertise, or indeed the model access that would be required to scrutinize the weights of frontier models for dangers. That science is a nascent field, mostly practiced behind closed doors at the major AI companies. But Irving doesn’t rule out asking for model weights again if the AISI spins up a team capable of doing similar work. “We will ask again, more intensely, if we need that access in the future,” he says.

On a typical day, AISI researchers test models not only for dangers but also for specific types of capability that might become dangerous in the future. The tests aren’t limited to assessing chemical, biological, and cyber-risks. They also include measuring the ability of AI systems to act autonomously as “agents,” carrying out strings of actions; the ease of “jailbreaking” an AI, or removing its safety features that prevent it from saying or doing things its creators did not intend; and the ability of an AI to manipulate users, by changing their beliefs or inducing them to act in certain ways. Recent joint tests by the U.K. and U.S. AISIs on a version of Claude found that the model was better than any other they had tested at software engineering tasks that might help to accelerate AI research. They also found that safeguards built into the model could be “routinely circumvented” via jailbreaking. “These evaluations give governments an insight into the risks developing at the frontier of AI, and an empirical basis to decide if, when, and how to intervene,” Leung and Oliver Illott, the AISI’s director, wrote in a blog post in November. The institute is now working on putting together a set of “capability thresholds” that would be indicative of severe risks, which could serve as triggers for more strenuous government regulations to kick in.

Whether the government will decide to intervene is another question altogether. Sunak, the AISI’s chief political cheerleader, was defeated in a landslide general election in the summer of 2024. His Conservative Party, which for all its hand-wringing about AI safety had advocated only light-touch AI regulation, was replaced by a Labour government that has signaled a greater willingness to legislate on AI. Labour promised ahead of the election to enact “binding regulation on the handful of companies developing the most powerful AI models,” though these regulations are yet to appear in Parliament. New laws could also formally require AI labs to share information with the U.K. government, replacing the voluntary agreements that currently exist. This might help turn the AISI into a body with more teeth, by reducing its need to keep the AI companies on friendly terms. “We want to preserve our relationships with labs,” Irving tells TIME of the current system. “It is hard to avoid that kind of relationship if you’re in a purely voluntary regime.”

Without any legal ability to compel labs to act, the AISI could be seen—from one angle—as a taxpayer-funded helper to several multibillion-dollar companies that are unilaterally releasing potentially dangerous AIs into the world. But for AISI insiders, the calculus is very different. They believe that building AI capacity inside a state—and nurturing a network of sister AISIs around the globe—is essential if governments want to have any say in the future of what could be the most transformative technology in human history. “Work on AI safety is a global public good,” says Ian Hogarth, the chair of the institute. “Fundamentally this is a global challenge, and it’s not going to work for any company or country to try to go it alone.”

Мода

Волочкова: не выгляжу на свой «полтос» благодаря здоровому образу жизни

Mastodon’s CEO and creator is handing control to a new nonprofit organization

SA20: Batting woes leave Sunrisers Eastern Cape searching for answers and win after three matches

Nvidia flatters Trump in scathing response to Biden’s new AI chip restrictions

TV show Chhathi Maiyya Ki Bitiya’s Brinda Dahal Shares an Inspiring Message on National Youth Day

Ria.city






Read also

New York state senator drops DNC chair bid, backs Martin

SNAP recipients may be barred from junk food purchases under new House GOP bill

Britain’s newest town unveiled with 4,500 homes & a LIDO as ‘derelict’ corner transformed into ‘woodland neighbourhood’

News, articles, comments, with a minute-by-minute update, now on Today24.pro

News Every Day

Nvidia flatters Trump in scathing response to Biden’s new AI chip restrictions

Today24.pro — latest news 24/7. You can add your news instantly now — here


News Every Day

SA20: Batting woes leave Sunrisers Eastern Cape searching for answers and win after three matches



Sports today


Новости тенниса
ATP

Медведев отреагировал на слова Циципаса о том, что его вымотала интенсивность ATP-тура



Спорт в России и мире
Москва

Плеймейкер «Динамо» застрял на родине // Московский клуб может потерять Бителлу



All sports news today





Sports in Russia today

Москва

"Она заботится только о себе": отец бывшего мужа Седоковой рассказал о расследовании гибели сына


Новости России

Game News

Scarlet Girls заняла топ-3 место в App Store на релизе в ЮВА


Russian.city


Архангельск

ТСД промышленного класса Saotron RT-T60


Губернаторы России
Сергей Собянин

Сергей Собянин: Наши школьники завоевали для сборной больше половины наград


Скидки для именинников в «Тропикана Парк»

Названы победители конкурса «Сказки на шелке»

Скидки для именинников в «Тропикана Парк»

Государственные аудиторы РФ и Казахстана обсудили направления совместной работы


Почему Певцу или Музыканту, особенно начинающему, стоит обратиться к Музыкальному Продюсеру.

Волочкова: не выгляжу на свой «полтос» благодаря здоровому образу жизни

Создание Рилс. Создание Reels. Создание Рилсов для Певцов, Музыкантов, Артистов.

Невестка Пугачевой сообщила, что 4-летняя внучка певицы занимается вокалом


Елена Рыбакина сделала заявление после выхода в третий круг Australian Open-2025

Медведев отреагировал на слова Циципаса о том, что его вымотала интенсивность ATP-тура

Коллинс ударила себя по пятой точке после матча с австралийкой. Её освистал весь стадион

Медведев проигрывает Лёнеру Тину и завершает выступления на турнире ATP



Скидки для именинников в «Тропикана Парк»

В январе 146 студентов профильных учебных заведений приступили к производственной практике на предприятиях филиала «Северный» ООО «ЛокоТех-Сервис»

Скидки для именинников в «Тропикана Парк»

В Подмосковье сотрудники Росгвардии задержали подозреваемого в совершении грабежа


Благотворительному Фонду Владимира Потанина исполнилось 26 лет

«Единая Россия» проведет день юридической помощи для участников СВО

Рэпер ST: «Я русский человек, праздную все, что можно праздновать»

Путин заявил о планах присвоить Кургану и Салехарду звание «Город трудовой доблести»


Футболист "Оренбурга" Малых: нереально прожить на 30 тысяч рублей в месяц

Владимир Путин поручил закрепить налоговый вычет для застраховавших жизнь

Вокзал Щербинка МЦД-2 открылся после реконструкции

Российскому актеру Юре Борисову прочат номинацию на «Оскар»



Путин в России и мире






Персональные новости Russian.city
Курт Кобейн

В Пермском крае обнаружили улицы Курта Кобейна и Джона Леннона



News Every Day

Pete Buttigieg has a few things to say on his way out




Friends of Today24

Музыкальные новости

Персональные новости