AI’s Double Launch Day: Claude 4.7 Surges, OpenAI Reinvents Codex
Yesterday’s AI calendar bent around two flagship launches within hours of each other. If you do any agentic coding work, your stack just got upgraded. But fair warning: the time it takes you to hit your rate limits probably shrank, too.
Here’s what happened
- Anthropic shipped Claude Opus 4.7 at the same $5/$25-per-million-tokens pricing as 4.6.
- Visual reasoning (how well AI sees) jumped from 69.1% to 82.1%, with image processing at up to 2,576 pixels on the long edge (over 3x any prior Claude).
- SWE-bench Pro (coding benchmark) went from 53.4% to 64.3%, and Opus 4.7 is now #1 on Vals AI’s Vibe Code Benchmark at 71%.
- The catch buried in Anthropic’s own docs: the new tokenizer can use up to 35% more tokens for the same text. With Claude Code’s new xhigh-effort default, Pro and Max users will hit weekly caps faster unless they manually dial it down.
Hours after Anthropic’s release…
OpenAI overhauled Codex, its coding app, to be “Codex for (almost) everything”… a.k.a. OpenAI’s version of Claude Cowork. It comes with Mac-level computer use (agents click and type alongside you), an in-app browser, persistent memory, automations that wake up across days, and 90+ new plugins (Atlassian Rovo, CircleCI, Microsoft Suite, etc.). And it can do a lot more than that.
Keeping score on the SaaSpocalypse: Anthropic CPO Mike Krieger resigned from Figma’s board the same day reports surfaced that Anthropic is shipping design software. Figma stock slid. Rumor has it the tool was going to launch yesterday… but it did not.
Pro tips (straight from the Claude Code team): Boris Cherny and Cat Wu both shipped launch-day playbooks on how to actually use 4.7. The unifying framing, from Wu: treat it like an engineer you’re delegating to, not a pair programmer you’re guiding line by line.
The four moves that matter:
- Front-load your context. Put the goal, constraints, and acceptance criteria all in the first turn. 4.7 is built to take a full brief and run with it; give it a vague goal, and you’ll get 4.6-level results from a model capable of more.
- Turn on auto mode. Hit Shift+Tab in the terminal (Max, Team, Enterprise). Your permission prompts will pass through a safety classifier, so you can run multiple Claudes in parallel to get more work done without babysitting any single run.
- Tell it how to verify its own work. Both Cherny and Wu call this the 2-3× output-quality multiplier. Put your testing workflow in your claude.md so 4.7 runs tests every time, or install a /verify-app skill for your stack.
- xhigh is the new default for Claude Code. Use /effort to step down on routine work, and save the highest setting for the hardest tasks (the change is session-only, so it doesn’t persist into your next session).
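The verification tip above can be sketched as a short section in your claude.md. This is an illustrative example only, not the Claude Code team’s published template; the commands and file name assume a Node project with npm scripts, so swap in your own stack’s equivalents:

```markdown
## Verification workflow

After every code change, before declaring the task done:

1. Run the test suite: `npm test` (assumed test runner for this project)
2. Run the linter: `npm run lint`
3. If either step fails, fix the failure and re-run both steps.
4. Only report the task complete once both commands pass cleanly.
```

Because the model reads this file at the start of each session, the check runs every time without you restating it, which is the point of the 2-3× multiplier claim: verification becomes part of the brief, not a follow-up prompt.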
In the web app, there’s a new feature called adaptive thinking that lets Claude decide how long to think. In other words: it’s a thinking router? Booo! It’ll take time to “adapt” to this, but leave it on for now, because with it off the model barely thinks at all.
Our take
From what we’ve read, the real story is the tokenizer. Anthropic’s “same pricing as 4.6” claim is technically true; in practice, the same prompt maps to more tokens, the default effort runs higher, and the output is longer. A friend of ours hit their weekly Max limit in basically one prompt.
That said, Opus 4.7 is a better model. It’s also more expensive, even with unchanged sticker prices. The people who get the most out of it won’t be the ones running at defaults. They’ll be the ones who actually follow the best practices. Study up.
Editor’s note: This content originally ran in the newsletter of our sister publication, The Neuron. To read more from The Neuron, sign up for its newsletter here.
The post AI’s Double Launch Day: Claude 4.7 Surges, OpenAI Reinvents Codex appeared first on eWEEK.