News Every Day | Yesterday, 20:29

Anthropic says Claude Opus 4.7 has a 92% honesty rate, less sycophancy

Anthropic released a new hybrid reasoning model on Thursday: Claude Opus 4.7.

Anthropic has a reputation as a safety-first AI company, and the Opus 4.7 system card reports that the model is less likely to hallucinate or engage in sycophancy than both prior Anthropic models and other frontier AI models.

We dived into the Opus 4.7 system card to see exactly what Anthropic had to say about the model's safety, honesty, and sycophancy.


The TL;DR version

Why put the TL;DR version at the end?

Anthropic says Claude Opus 4.7 makes improvements on various types of hallucinations and overall honesty. Anthropic gives the new model top marks on sycophancy and encouragement of user delusions, too. (Anthropic's data also shows that Claude Opus 4.7 scores much better on these behaviors than Gemini 3.1 Pro and Grok 4.20.)

"Claude Opus 4.7 is more reliably honest than Opus 4.6 or Sonnet 4.6, with large reductions in the rate of important omissions, and moderate improvements in factuality and rates of hallucinated input," Anthropic reports.

False premises honesty rate: Will the model tell a user when they're incorrect? Credit: Anthropic

MASK honesty rate: Will the model contradict its own stated belief when pushed to do so by a user? Credit: Anthropic


Anthropic measures Claude's honesty and hallucination rates in multiple ways, but let's look at one representative example — the Model Alignment between Statements and Knowledge (MASK) benchmark. MASK was developed by Scale AI and the Center for AI Safety.

Claude Opus 4.7 had a MASK honesty rate of 91.7 percent, compared to 90.3 percent for Opus 4.6 and 89.1 percent for Sonnet 4.6. While that’s lower than the 95.4 percent score achieved by Claude Opus 4.5, the new model performs better on other hallucination scores (more on that below).

Interestingly, Claude Mythos was more honest still, with an honesty rate of 95.4 percent.

Claude Opus 4.7 lags behind Claude Mythos on overall performance

Since Anthropic repeatedly compares Opus 4.7 to Claude Mythos, let's quickly review the differences between the two models.

Claude Opus 4.7 is the latest hybrid reasoning model available to paid Claude subscribers. Claude Mythos is an unreleased model that Anthropic has only made available to partners via Project Glasswing.

Under normal circumstances, we would expect Claude Opus 4.7 to be Anthropic's most advanced and powerful model to date. However, Anthropic says it lags behind the unreleased Claude Mythos in key areas. Anthropic deemed Claude Mythos too dangerous to release to the public because of its advanced cybersecurity capabilities.

Still, Claude Opus 4.7 improves upon Opus 4.6 in many ways, particularly advanced coding, visual intelligence, and document analysis, Anthropic says.

More details on Claude Opus 4.7 hallucination rates

When using Opus 4.7, how likely is Claude to tell a lie, invent facts, or deceive users? There isn't a single hallucination rate that Anthropic provides, because there are multiple types of hallucinations.

So, this section is for the AI nerds.

Anthropic identifies a few different ways to measure hallucination and honesty:

Factual hallucinations: How likely is the model to provide accurate information, and how often does it admit that it doesn't know something?
Input hallucination: This occurs when an AI model ignores prompt instructions, hallucinates the content of files, or pretends to have access to a tool it doesn't have.
False premises honesty rate: Will the model tell a user when they're incorrect?
MASK honesty rate: This "tests whether a model will contradict its own stated belief when a user or system prompt pushes it to."

We've already covered the MASK honesty rate, and Claude Opus 4.7 shows similar gains on these other measures, according to Anthropic.

At this time, we cannot independently verify Anthropic's results.

To measure factual hallucinations, Anthropic used four different tests and recorded correct responses, incorrect responses, and abstentions. In this case, abstentions are good — the model should decline to answer a question rather than guessing. Across all four tests, Opus 4.7 scored higher than Opus 4.6 and Sonnet 4.6 but lower than Claude Mythos.
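For readers who want to see why abstentions are counted separately, here is a minimal sketch of that kind of three-way tally. The outcome labels, the `summarize` helper, and the two example models are hypothetical and made up for illustration; they are not Anthropic's tests or data.

```python
from collections import Counter


def summarize(outcomes):
    """Return the share of correct, incorrect, and abstained answers, in percent."""
    total = len(outcomes)
    if total == 0:
        raise ValueError("no outcomes to summarize")
    counts = Counter(outcomes)
    return {
        label: round(100 * counts[label] / total, 1)
        for label in ("correct", "incorrect", "abstain")
    }


# Made-up outcomes for ten questions each (illustrative only).
careful = ["correct"] * 6 + ["abstain"] * 3 + ["incorrect"] * 1
guesser = ["correct"] * 7 + ["incorrect"] * 3

print("careful:", summarize(careful))
print("guesser:", summarize(guesser))
```

In this toy example the "guesser" gets more answers right, but it is also wrong three times as often, which is the trade-off a plain accuracy number would hide.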

Chart showing Claude Opus 4.7's performance on accuracy tests. Credit: Anthropic

Anthropic measured Opus 4.7's input hallucination in two ways: "prompts requesting an unavailable tool" and "prompts referencing missing context."

Opus 4.7 scored 89.5 percent on the former, beating Claude Mythos's 84.8 percent; on the latter, Opus 4.7 scored 91.8 percent, two points lower than Claude Mythos's 93.8 percent.
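The arithmetic behind those comparisons is easy to check. The sketch below uses only the four scores quoted above; the `gap` and `implied_failure_rate` helpers are our own, and the second one rests on the assumption that each score is the share of prompts handled without an input hallucination.

```python
# Scores (percent) quoted in the article for the two input hallucination tests.
scores = {
    "unavailable tool": {"Claude Opus 4.7": 89.5, "Claude Mythos": 84.8},
    "missing context": {"Claude Opus 4.7": 91.8, "Claude Mythos": 93.8},
}


def gap(test):
    """Opus 4.7's score minus Mythos's score, in percentage points."""
    row = scores[test]
    return round(row["Claude Opus 4.7"] - row["Claude Mythos"], 1)


def implied_failure_rate(test, model):
    """Assumption: the score is a success rate, so 100 minus it is the failure rate."""
    return round(100 - scores[test][model], 1)


for test in scores:
    print(test, "gap:", gap(test), "points")
    print(test, "implied Opus 4.7 failure rate:",
          implied_failure_rate(test, "Claude Opus 4.7"), "percent")
```

Under that assumption, Opus 4.7 leads by 4.7 points on the first test, trails by 2 points on the second, and still slips on roughly one prompt in ten.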

This shows just how stubborn AI hallucinations are: even a leading AI company like Anthropic records scores of only around 90 percent on its input hallucination tests, which suggests the model still slips up on roughly one prompt in ten. That is broadly in line with the latest OpenAI models, which, per OpenAI, provide responses containing incorrect information between 5.8 percent of the time (with browsing enabled) and 10.9 percent of the time (with browsing disabled).

OpenAI most recently reported hallucination rates in the system card for GPT-5-2. Credit: OpenAI

What about Opus 4.7's honesty rate for false premises, i.e., will Claude tell a user they're wrong? According to the system card, Claude will push back on false premises 77.2 percent of the time. That's better than all other recent Anthropic models except for — you guessed it — Claude Mythos, which will reject false premises 80 percent of the time.

Claude Opus 4.7 sycophancy

There's not much new to report in terms of sycophancy. While Anthropic's expert red-team testers reported that Opus 4.7 was prone to "sycophantic agreement under pushback," it has very similar scores to prior models from Anthropic and OpenAI, and noticeably better scores than Gemini 3.1 Pro and Grok 4.20. Again, this is according to Anthropic.

To measure bad behaviors like sycophancy and "encouragement of user delusion," Anthropic uses Petri 2.0, its open-source behavioral audit tool. This test scores models on a 1-10 scale, with lower scores reflecting better behavior. The Petri score isn't akin to a percentage, as it measures both the rate of a behavior and the severity.

Anthropic gave Opus 4.7 strong scores (that is, low ones on this particular scale) on both sycophancy and user delusions.

Anthropic uses Petri 2.0, its open source AI safety tool, which scores bad behaviors from 1-10. The lower the score, the better. Credit: Anthropic

Mashable reached out to Anthropic for comment but did not receive a response in time for publication.

Disclosure: Ziff Davis, Mashable’s parent company, in April 2025 filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.

