News Every Day | Yesterday, 21:16

OpenAI's GPT-5.2 is here: what enterprises need to know

The rumors were true: OpenAI on Thursday announced the release of its new frontier large language model (LLM) family, GPT-5.2.

It comes at a pivotal moment for the AI pioneer, which has faced intensifying pressure since rival Google’s Gemini 3 LLM seized the top spot on major third-party performance leaderboards and many key benchmarks last month. OpenAI leaders stressed in a press briefing, however, that the timing of this release had been discussed and worked on well before Gemini 3 came out.

OpenAI describes GPT-5.2 as its "most capable model series yet for professional knowledge work," aiming to reclaim the performance crown with significant gains in reasoning, coding, and agentic workflows.

"It’s our most advanced frontier model and the strongest yet in the market for professional use," Fidji Simo, OpenAI’s CEO of Applications, said during a press briefing today. "We designed 5.2 to unlock even more economic value for people. It's better at creating spreadsheets, building presentations, writing code, perceiving images, understanding long context, using tools, and handling complex, multi-step projects."

GPT-5.2 features a massive 400,000-token context window — allowing it to ingest hundreds of documents or large code repositories at once — and a 128,000 max output token limit, enabling it to generate extensive reports or full applications in a single go.
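To make those limits concrete, the following Python sketch checks whether a batch of documents would fit in a single request. It rests on two assumptions the announcement does not confirm: that the 128,000-token output allowance is carved out of the same 400,000-token window, and that text averages roughly four characters per token. The helper names `estimate_tokens` and `fits_in_context` are hypothetical.

```python
# Limits quoted in the article.
CONTEXT_WINDOW = 400_000
MAX_OUTPUT_TOKENS = 128_000

# Assumption: roughly 4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate; a real tokenizer will give different counts."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_in_context(documents: list[str], reserved_output: int = MAX_OUTPUT_TOKENS) -> bool:
    """Return True if the documents plus the reserved output fit in the window.

    Assumption: output tokens count against the same 400,000-token window.
    """
    if not 0 <= reserved_output <= MAX_OUTPUT_TOKENS:
        raise ValueError("reserved_output must be between 0 and MAX_OUTPUT_TOKENS")
    input_tokens = sum(estimate_tokens(doc) for doc in documents)
    return input_tokens + reserved_output <= CONTEXT_WINDOW
```

Under these assumptions, reserving the full output allowance leaves about 272,000 tokens for input, which is roughly a megabyte of plain text.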

The model also features a knowledge cutoff of August 31, 2025, ensuring it is up-to-date with relatively recent world events and technical documentation. It explicitly includes "Reasoning token support," confirming the underlying architecture uses the chain-of-thought processing popularized by the "o1" series.

The 'Code Red' Reality Check

The release arrives following The Information's report of an emergency "Code Red" directive to OpenAI staff from CEO Sam Altman to improve ChatGPT, a move reportedly designed to mobilize resources following the "quality gap" exposed by Gemini 3. The Verge similarly reported on the timing of GPT-5.2's release ahead of the official announcement.

During the briefing, OpenAI executives acknowledged the directive but pushed back on the narrative that the model was rushed solely to answer Google.

"It is important to note this has been in the works for many, many months," Simo told reporters. She clarified that while the "Code Red" helped focus the company, it wasn't the sole driver of the timeline.

"We announced this Code Red to really signal to the company that we want to marshal resources in one particular area... but that's not the reason it's coming out this week in particular."

Max Schwarzer, lead of OpenAI's post-training team, echoed this sentiment to dispel the idea of a panic launch. "We've been planning for this release since a very long time ago... this specific week we talked about many months ago."

A spokesperson from OpenAI further clarified that the "Code Red" call applied to ChatGPT as a product, not solely underlying model development or the release of new models.

Under the Hood: Instant, Thinking, and Pro

OpenAI is segmenting the GPT-5.2 release into three distinct tiers within ChatGPT, a strategy likely designed to balance the massive compute costs of "reasoning" models with user demand for speed:

GPT-5.2 Instant: Optimized for speed and daily tasks like writing, translation, and information seeking.
GPT-5.2 Thinking: Designed for "complex, structured work" and long-running agents, this model leverages deeper reasoning chains to handle coding, math, and multi-step projects.
GPT-5.2 Pro: The new heavyweight champion. OpenAI describes this as its "smartest and most trustworthy option," delivering the highest accuracy for difficult questions where quality outweighs latency.

For developers, the models are available immediately in the application programming interface (API) as gpt-5.2, gpt-5.2-chat-latest (Instant), and gpt-5.2-pro.
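For teams wiring these identifiers into their own code, a small lookup table keeps the tier names and model ids in one place. The sketch below is illustrative only: pairing "Thinking" with the plain `gpt-5.2` id is an inference from the pricing section of this article, not an explicit statement from OpenAI, and `model_for_tier` is a hypothetical helper.

```python
# Model identifiers as listed in the article. Mapping "thinking" to the plain
# "gpt-5.2" id is an inference from the pricing section, not an official statement.
TIER_TO_MODEL = {
    "instant": "gpt-5.2-chat-latest",
    "thinking": "gpt-5.2",
    "pro": "gpt-5.2-pro",
}


def model_for_tier(tier: str) -> str:
    """Return the API model id for a ChatGPT tier name (case-insensitive)."""
    key = tier.strip().lower()
    if key not in TIER_TO_MODEL:
        raise ValueError(f"unknown tier: {tier!r}")
    return TIER_TO_MODEL[key]
```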

The Numbers: Beating the Benchmarks

The GPT-5.2 release includes leading metrics across most domains — specifically those that target the "professional knowledge work" gap where competitors have recently gained ground.

OpenAI highlighted a new benchmark called GDPval, which measures performance on "well-specified knowledge work tasks" across 44 occupations.

"GPT-5.2 Thinking is now state-of-the-art on that benchmark... and beats or ties top industry professionals on 70.9% of well-specified professional tasks like spreadsheets, presentations, and document creation, according to expert human judges," Simo said.

In the critical arena of coding, OpenAI is claiming a decisive lead. Schwarzer noted that on SWE-bench Pro, a rigorous evaluation of real-world software engineering, GPT-5.2 Thinking sets a new state-of-the-art score of 55.6%.

He emphasized that this benchmark is "more contamination resistant, challenging, diverse, and industrially relevant than previous benchmarks like SWE-bench Verified."

Other key benchmark results include:

GPQA Diamond (Science): GPT-5.2 Pro scored 93.2%, edging out GPT-5.2 Thinking (92.4%) and surpassing GPT-5.1 Thinking (88.1%).
FrontierMath: On Tier 1-3 problems, GPT-5.2 Thinking solved 40.3%, a significant jump from the 31.0% achieved by its predecessor.
ARC-AGI-1: GPT-5.2 Pro is reportedly the first model to cross the 90% threshold on this general reasoning benchmark, scoring 90.5%.

The Price of Intelligence

Performance comes at a premium. While ChatGPT subscription pricing remains unchanged for now, the API costs for the new flagship models are steep compared to previous generations, reflecting the high compute demands of "thinking" mode. They also sit at the upper end of API pricing across the industry.

GPT-5.2 Thinking: Priced at $1.75 per 1 million input tokens and $14 per 1 million output tokens.
GPT-5.2 Pro: The costs jump significantly to $21 per 1 million input tokens and $168 per 1 million output tokens.

GPT-5.2 Thinking is priced 40% higher in the API than the standard GPT-5.1 ($1.25/$10), signaling that OpenAI views the new reasoning capabilities as a tangible value-add rather than a mere efficiency update.

The high-end GPT-5.2 Pro follows the same pattern, costing 40% more than the previous GPT-5 Pro ($15/$120). While expensive, it still undercuts OpenAI’s most specialized reasoning model, o1-pro, which remains the most costly offering on the menu at a staggering $150 per million input tokens and $600 per million output tokens.
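The arithmetic behind those percentages is easy to reproduce. The Python sketch below uses only the per-million-token prices quoted above; the dictionary keys are labels chosen for this example and not necessarily real API model ids, and `request_cost` and `price_increase_pct` are hypothetical helpers.

```python
from decimal import Decimal

# USD per 1 million tokens (input, output), as quoted in the article.
# Keys are labels for this sketch, not necessarily real API model ids.
PRICES = {
    "gpt-5.1": (Decimal("1.25"), Decimal("10")),
    "gpt-5.2": (Decimal("1.75"), Decimal("14")),
    "gpt-5-pro": (Decimal("15"), Decimal("120")),
    "gpt-5.2-pro": (Decimal("21"), Decimal("168")),
}
MILLION = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD of one request at the listed per-million-token prices."""
    price_in, price_out = PRICES[model]
    return (price_in * input_tokens + price_out * output_tokens) / MILLION


def price_increase_pct(old: str, new: str) -> tuple[Decimal, Decimal]:
    """Percentage increase in (input, output) price going from old to new."""
    old_in, old_out = PRICES[old]
    new_in, new_out = PRICES[new]
    return ((new_in / old_in - 1) * 100, (new_out / old_out - 1) * 100)
```

Both `price_increase_pct("gpt-5.1", "gpt-5.2")` and `price_increase_pct("gpt-5-pro", "gpt-5.2-pro")` come out to 40% on input and output alike. As a sense of scale, a request with 100,000 input tokens and 10,000 output tokens costs $0.315 at GPT-5.2 prices and $3.78 at GPT-5.2 Pro prices.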

OpenAI argues that despite the higher per-token cost, the model’s "greater token efficiency" and ability to solve tasks in fewer turns make it economically viable for high-value enterprise workflows.
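That efficiency claim can be sanity-checked with simple arithmetic. Assuming, purely for illustration, that a task's input and output token counts shrink by the same fraction, the sketch below computes how many fewer tokens GPT-5.2 would need to use to cost the same as GPT-5.1. The function names are hypothetical.

```python
# Per-token price ratio between GPT-5.2 and GPT-5.1 from the article
# ($1.75 vs $1.25 input, $14 vs $10 output): both work out to 1.4.
PRICE_RATIO = 1.75 / 1.25


def break_even_token_reduction(price_ratio: float) -> float:
    """Fraction of tokens the pricier model must save to cost the same.

    Assumes input and output token counts shrink by the same fraction.
    """
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return 1 - 1 / price_ratio


def is_cheaper(token_reduction: float, price_ratio: float = PRICE_RATIO) -> bool:
    """True if using `token_reduction` fewer tokens beats the old model's cost."""
    return (1 - token_reduction) * price_ratio < 1
```

At a 1.4x price ratio the break-even point is a reduction of about 28.6% (two sevenths). Under the stated assumption, a workflow that finishes with 30% fewer tokens would come out ahead, while one that saves only 25% would still cost more than before.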

Here's how it compares to the current API costs for other competing models across the LLM field:

Model	Input (/1M)	Output (/1M)	Total (Input + Output)	Source
Qwen 3 Turbo	$0.05	$0.20	$0.25	Alibaba Cloud
Grok 4.1 Fast (reasoning)	$0.20	$0.50	$0.70	xAI
Grok 4.1 Fast (non-reasoning)	$0.20	$0.50	$0.70	xAI
deepseek-chat (V3.2-Exp)	$0.28	$0.42	$0.70	DeepSeek
deepseek-reasoner (V3.2-Exp)	$0.28	$0.42	$0.70	DeepSeek
Qwen 3 Plus	$0.40	$1.20	$1.60	Alibaba Cloud
ERNIE 5.0	$0.85	$3.40	$4.25	Qianfan
Claude Haiku 4.5	$1.00	$5.00	$6.00	Anthropic
Qwen-Max	$1.60	$6.40	$8.00	Alibaba Cloud
Gemini 3 Pro (≤200K)	$2.00	$12.00	$14.00	Google
GPT-5.2	$1.75	$14.00	$15.75	OpenAI
Claude Sonnet 4.5	$3.00	$15.00	$18.00	Anthropic
Gemini 3 Pro (>200K)	$4.00	$18.00	$22.00	Google
Claude Opus 4.5	$5.00	$25.00	$30.00	Anthropic
GPT-5.2 Pro	$21.00	$168.00	$189.00	OpenAI
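The total column simply adds one million input tokens to one million output tokens, which few real workloads resemble. The Python sketch below re-ranks a handful of rows from the table above for an arbitrary input/output mix. The function names are hypothetical, and the Gemini 3 Pro row assumes every request stays at or under 200K tokens.

```python
# (input, output) USD per 1M tokens for a few rows of the table above.
TABLE = {
    "Claude Haiku 4.5": (1.00, 5.00),
    "Gemini 3 Pro (≤200K)": (2.00, 12.00),
    "GPT-5.2": (1.75, 14.00),
    "Claude Sonnet 4.5": (3.00, 15.00),
    "Claude Opus 4.5": (5.00, 25.00),
    "GPT-5.2 Pro": (21.00, 168.00),
}


def workload_cost(prices, input_millions, output_millions):
    """USD cost of a workload measured in millions of input and output tokens."""
    price_in, price_out = prices
    return price_in * input_millions + price_out * output_millions


def rank_by_cost(input_millions, output_millions):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        name: workload_cost(prices, input_millions, output_millions)
        for name, prices in TABLE.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])
```

With a 1:1 mix, `rank_by_cost(1, 1)` reproduces the table's totals. With an input-heavy mix such as `rank_by_cost(10, 1)`, GPT-5.2 ($31.50) edges below Gemini 3 Pro ($32.00), because its lower input price outweighs its higher output price.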

Image Generation: Nothing New Yet...But 'More to Come'

During the briefing, VentureBeat asked the OpenAI participants if the new release included any boost to image generation capabilities, noting the excitement around similar features in recent competitor launches like Google's Gemini 3 Image aka Nano Banana Pro.

Unfortunately for those hoping to recreate that kind of text- and information-heavy graphics and image editing, OpenAI executives clarified that GPT-5.2 brings no image improvements over the prior GPT-5.1 and OpenAI's integrated DALL-E 3 and gpt-4o native image generation models.

"On image Gen, nothing to announce today, but more to come," Simo said. She acknowledged the popularity of the feature, adding, "We know this is a very important use case that people love, that we introduced [to] the market, and so definitely more to come there."

Aidan Clark, OpenAI's lead of training, also declined to comment on visual generation specifics, stating simply, "I can't really speak to image Gen myself."

The 'Mega-Agent' Era

Beyond raw scores, OpenAI is positioning GPT-5.2 as the engine for a new generation of "long-running agents" capable of executing multi-step workflows without human hand-holding.

"Box found that 5.2 can extract information from long, complex documents about 40% faster, and also saw a 40% boost in reasoning accuracy for life sciences and healthcare," Simo said.

She also noted that Notion reported the model "outperforms 5.1 across every dimension... and it excels at the kind of really ambiguous, longer rising tasks that define real knowledge work."

Schwarzer added that coding startups like Augment Code found the model "delivered substantially stronger deep code capabilities than any prior model," which is why it was selected to power their new code review agent.

Visual capabilities have also seen an upgrade.

OpenAI's release blog post shows an example where "a traveler reports a delayed flight, a missed connection, an overnight stay in New York, and a medical seating requirement."

The outcome? "GPT‑5.2 manages the entire chain of tasks—rebooking, special-assistance seating, and compensation—delivering a more complete outcome than GPT‑5.1."

A new evaluation called ScreenSpot-Pro, which tests a model's ability to understand GUI screenshots, shows GPT-5.2 Thinking achieving 86.3% accuracy, compared to just 64.2% for GPT-5.1.

Science and Reliability

OpenAI leaders also stressed the model's utility for scientific research, attempting to move the conversation beyond simple chatbots to research assistants.

Aidan Clark, lead of the training team, shared an example of a senior immunology researcher testing the model.

"They tested it by asking it to generate the most important unanswered questions about the immune system," Clark said. "That immunology researcher reported that GPT-5.2 produced sharper questions and stronger explanations for why those questions... matter compared to any previous pro model."

Reliability was another key focus. Schwarzer claimed the new model "hallucinates substantially less than GPT-5.1," noting that on a set of de-identified queries, "responses contained errors 38% less often."

The 'Vibe' Shift

Interestingly, OpenAI acknowledged that not every user might immediately prefer the new models.

When asked why legacy models like GPT-5.1 would remain available, Schwarzer admitted that "models change a little bit every time."

"Some users may find that they prefer the vibes of the previous model, even though we think the latest one is across the board generally much better," Schwarzer said. He also noted that for some enterprise customers who have "really fine-tuned a prompt for a specific model," there might be "small regressions," necessitating access to the older versions.

Safety, 'Adult Mode,' and Future Roadmap

Addressing safety concerns, Simo confirmed that the company is preparing to roll out an "Adult Mode" in the first quarter of next year, following the implementation of a new age prediction system.

"We're in the process of improving that," Simo said regarding the age prediction technology.

"We want to do that ahead of launching adult mode."

Looking further ahead, industry reports suggest OpenAI is working on a more fundamental architectural shift under the codename "Project Garlic," targeting a flagship release in early 2026.

While executives did not comment on specific future roadmaps during the briefing, Simo remained optimistic about the economics of their current trajectory.

"If you look at historical trends, compute has increased about 3x every year for the last three years," she explained. "Revenue has also increased at the same pace... creating this virtuous cycle."

Clark added that efficiency is improving rapidly: "The model we're releasing today achieves an even better score [on ARC-AGI] with almost 400 times less cost and less compute associated with it" compared to models from a year ago.

GPT-5.2 Instant, Thinking, and Pro begin rolling out in ChatGPT today to paid users (Plus, Pro, Team, and Enterprise). The company notes the rollout will be gradual to maintain stability.
