Introducing Claude Opus 4.8


We’re upgrading Claude Opus to a brand new model: Claude Opus 4.8. It builds on Opus 4.7 with enhancements throughout benchmarks, and is a more practical collaborator. It’s obtainable as we speak for a similar value.

Opus 4.8 launches alongside a number of new options. Customers on claude.ai now have management over the quantity of effort Claude places right into a process. Claude Code has a brand new “dynamic workflows” function that enables it to sort out very large-scale issues. And quick mode for Opus 4.8—the place the mannequin can work at 2.5× the pace—is now thrice cheaper than it was for earlier fashions.

Opus 4.8’s capabilities

The desk under exhibits how Opus 4.8 compares to its predecessor and to different fashions on exams of coding, agentic abilities, reasoning, and sensible information work duties. Extra particulars and a a lot wider vary of functionality evaluations are offered within the Claude Opus 4.8 System Card.

Collaborating with Opus 4.8

Early testers have discovered Claude Opus 4.8 to be extra dependable and sharper in its judgement when it’s performing agentic duties. Under are quotes from many of those testers about their expertise collaborating with Opus 4.8:

Some of the distinguished enhancements in Opus 4.8 is its honesty. We prepare all our fashions to be sincere—for example, to keep away from making claims that they will’t help. However a common downside with AI fashions is that they often bounce to conclusions, confidently claiming to have made progress of their work regardless of the proof being skinny. Early testers report that Opus 4.8 is extra more likely to flag uncertainties about its work and fewer more likely to make unsupported claims. That is borne out in our evaluations, which present that Opus 4.8 is round 4 occasions much less seemingly than its predecessor to permit flaws in code it has written to go unremarked.

As at all times, we ran an in depth alignment evaluation on the mannequin earlier than launch. By way of optimistic traits, our Alignment workforce concluded that Opus 4.8 “reaches new highs on our measures of prosocial traits like supporting consumer autonomy and appearing within the consumer’s finest curiosity.” The evaluation additionally confirmed Opus 4.8 to have charges of misaligned conduct (resembling deception or cooperation with misuse) which are considerably decrease than Opus 4.7, and much like our best-aligned mannequin, Claude Mythos Preview. The total alignment evaluation, accompanied by a set of pre-deployment security exams, is reported within the Claude Opus 4.8 System Card.

Additionally launching as we speak

Along with Claude Opus 4.8, we’re making the next updates:

  • Dynamic workflows. This new function, obtainable in analysis preview, permits Claude to tackle even greater duties in Claude Code. Claude can plan the work after which run a whole bunch of parallel subagents in a single session (and with Opus 4.8, the brokers can run for even longer). It then verifies its outputs earlier than reporting again to the consumer. For instance, Claude Code with Opus 4.8 can now perform codebase-scale migrations throughout a whole bunch of hundreds of strains of code from kickoff to merge, with the present take a look at suite as its bar. You’ll be able to learn extra about dynamic workflows—obtainable in Claude Code for Enterprise, Staff, and Max plans—in this post.
  • Effort management in claude.ai and Cowork. A brand new management alongside the mannequin selector lets customers select how a lot effort Claude places right into a response. On greater effort settings, Claude will suppose extra regularly and extra deeply to offer higher responses. On decrease effort settings, Claude will reply quicker and deplete a consumer’s fee limits extra slowly. Customers now have this selection—the hassle management is accessible on all plans.
  • The Messages API now accepts system entries contained in the messages array. Builders can replace Claude’s directions mid-task with out breaking the immediate cache or routing the replace by means of a consumer flip. This can be utilized in a given harness to replace permissions, token budgets, or setting context as an agent runs.

A observe on effort

Opus 4.8 defaults to excessive effort, which we decide to be the very best general stability of high quality and consumer expertise. On coding duties, this effort stage spends the same variety of tokens as Opus 4.7’s default, however with higher efficiency. Customers can select “further” (“xhigh” in Claude Code) or “max,” and the mannequin will spend extra tokens to get higher outcomes; we suggest utilizing “further” for troublesome duties and long-running asynchronous workflows. We’ve got elevated fee limits in Claude Code to accommodate the upper token utilization of upper effort ranges; customers can choose whichever is sensible for his or her specific mission.

What’s subsequent?

Customers will discover Opus 4.8 to be a modest however tangible enchancment on its predecessor. There’s nonetheless extra to be executed: we’re engaged on creating and releasing fashions that present most of the identical capabilities as Opus at a decrease price.

Not solely that, however we plan to launch a brand new class of mannequin with even greater intelligence than Opus. As a part of Project Glasswing, a small variety of organizations are presently utilizing Claude Mythos Preview for cybersecurity work. Fashions of this functionality stage require stronger cyber safeguards earlier than they are often typically launched. We’re making swift progress on creating these safeguards and count on to have the ability to convey Mythos-class fashions to all our prospects within the coming weeks.

Availability

Claude Opus 4.8 is accessible all over the place as we speak. Pricing for normal utilization is unchanged from Opus 4.7: $5 per million enter tokens and $25 per million output tokens. Pricing for quick mode is $10 per million enter tokens and $50 per million output tokens. Builders can use claude-opus-4-8 through the Claude API.