Guides

Claude Opus 4.8: Model, API and Pricing

Claude Opus 4.8 Guide

Claude Opus 4.8 is Anthropic’s most capable Opus-tier model, launched May 28, 2026 — and the safeguard fallback model for the newer Claude Fable 5. Designed for agentic coding, long-horizon tasks, and complex reasoning — with improved reliability, better tool-calling efficiency, and significantly reduced fast mode pricing.

Specs & Pricing vs Opus 4.7 Adaptive Thinking Who Should Use It All Models

Claude Opus 4.8 is the latest model in Anthropic’s Opus line, succeeding Claude Opus 4.7. It ships with the same API pricing as Opus 4.7 but brings meaningful improvements in coding reliability, tool use efficiency, and long-session consistency — plus a 3× reduction in fast mode cost. This page covers everything developers and advanced users need to know.

Always verify current pricing and capabilities at platform.claude.com.

Claude Opus 4.8 Specs and Pricing

Property	Claude Opus 4.8
API model ID	`claude-opus-4-8`
Launch date	May 28, 2026
Input price (standard)	$5 per MTok
Output price (standard)	$25 per MTok
Fast mode (input / output)	$10 / $50 per MTok
Batch API (input / output)	$2.50 / $12.50 per MTok
Context window	1 million tokens
Max output (sync)	128,000 tokens
Knowledge cutoff	January 2026
Extended thinking	Not supported (returns 400 error)
Adaptive thinking	Yes — use `thinking: {type: "adaptive"}`
Effort default	`high` on all surfaces including API and Claude Code
xhigh effort	Yes (Opus 4.8 and 4.7 only)
Prompt cache minimum	1,024 tokens

Pricing verified from Anthropic’s official pricing page. All prices are per million tokens. API pricing is separate from claude.ai consumer plan pricing. Check official sources before making billing decisions.

Claude Opus 4.8 vs Opus 4.7

Opus 4.8 and Opus 4.7 share the same standard API pricing ($5 input / $25 output per MTok) and the same context window (1M tokens). The key differences:

Dimension	Opus 4.8	Opus 4.7
Standard pricing	$5 / $25 per MTok	$5 / $25 per MTok
Fast mode pricing	$10 / $50 per MTok	$30 / $150 per MTok
Coding reliability	~4× less likely to pass code flaws	Baseline
Tool-calling efficiency	Fewer steps to complete tool workflows	Baseline
Citation precision	Improved	Baseline
Long session consistency	Better context & style across sessions	Baseline
Mid-conversation system messages	Supported (new in 4.8)	Not supported (400 error)
Prompt cache minimum	1,024 tokens (lower)	Higher
Adaptive thinking	Yes	Yes
Deprecation status	Active — retirement not before May 2027	Active — retirement not before April 2027

Opus 4.7 remains active and is not deprecated. If you are already using Opus 4.7 and your use case does not involve agentic coding or long tool-calling sessions, there is no immediate urgency to upgrade. The main reasons to upgrade to Opus 4.8 are: fast mode cost reduction (3×), improved agentic coding reliability, and support for mid-conversation system messages.

Adaptive Thinking and Effort Control

Claude Opus 4.8 uses adaptive thinking — it dynamically decides how much reasoning to apply based on task complexity and the effort level you set. Manual extended thinking (the thinking parameter with budget_tokens) is not supported on Opus 4.8 and returns a 400 error.

Effort levels supported on Opus 4.8:

Level	Best for	Notes
`max`	Deepest analysis, no token constraints	Highest cost; use with large max_tokens
`xhigh`	Long-horizon agentic coding (30+ min runs)	Opus 4.8 and 4.7 only; set max_tokens ≥ 64k
`high`	Complex reasoning, difficult tasks (default)	Same as omitting the effort parameter
`medium`	Balanced speed, cost, performance	Good starting point for cost-sensitive workflows
`low`	Simple tasks, subagents, speed/cost priority	Significant token savings

The effort parameter is set via the output_config object in the API. No beta header is required. For agentic and coding workflows, Anthropic recommends starting with xhigh and tuning down if quality holds.

// Example: Opus 4.8 with xhigh effort for coding
{
  "model": "claude-opus-4-8",
  "max_tokens": 64000,
  "output_config": { "effort": "xhigh" },
  "messages": [{ "role": "user", "content": "Refactor this module..." }]
}

Who Should Use Claude Opus 4.8?

Claude Opus 4.8 is the right choice when you need maximum reliability for complex multi-step work. It is not always the right choice — Claude Sonnet 5 handles most everyday tasks well at a fraction of the price, and has closed most of the agentic-coding gap to Opus 4.8.

Agentic coding workflows: Complex multi-file tasks, debugging with Claude Code, architecture planning
Long tool-calling sessions: Tasks requiring many sequential tool calls where reliability matters
Long document analysis: Up to 1 million tokens — large codebases, legal documents, technical reports
Citation-sensitive work: Research, legal analysis, document review where source precision is critical
Extended autonomous sessions: Work that runs 30+ minutes with minimal supervision

Use Sonnet 5 ($2/$10 per MTok introductory) for most writing, research, and API workflows — it supports adaptive thinking and has a 1M context window. Use Haiku 4.5 ($1/$5 per MTok) for fast, high-volume, or cost-sensitive tasks.

Not sure which model fits your use case? Try the Claude Model Selector tool.

Opus 4.8 vs Claude Fable 5

On June 9, 2026, Anthropic launched Claude Fable 5 (claude-fable-5) as its most capable widely released model, positioned above Opus 4.8. Fable 5 costs $10 / $50 per MTok — double Opus 4.8’s standard pricing — with the same 1M context window and 128k max output. Opus 4.8 remains the top Opus-tier model and plays a specific role in the Fable 5 system: when Fable 5’s safety classifiers trigger (in under 5% of sessions, per Anthropic), the request is routed to Opus 4.8. For most complex reasoning and agentic coding, Opus 4.8 remains a strong and cheaper choice; reach for Fable 5 when you need the highest available capability for demanding reasoning, long-horizon agentic work, or vision-heavy tasks.

Claude Code and Dynamic Workflows

Claude Opus 4.8 is available in Claude Code, Anthropic’s AI coding tool. A research preview feature called Dynamic Workflows enables parallel subagent execution for large-scale tasks — currently available in Claude Code for Enterprise, Team, and Max plans.

Claude Code’s ultracode mode pairs the xhigh effort level with standing permission for multi-agent workflows, making Opus 4.8 particularly suited to complex, autonomous coding tasks. For developers using the raw API, the xhigh effort level achieves similar behavior.

API Usage

Access Claude Opus 4.8 via the Anthropic API using the model ID claude-opus-4-8:

import anthropic

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude Opus 4.8"}]
)
print(message.content)

See the Claude API Guide for getting started with keys and billing. Use the Claude API Cost Estimator to project costs before scaling.

Migrating from the original Claude Opus 4? The claude-opus-4-20250514 API model ID retires on June 15, 2026. Replace it with claude-opus-4-8 in your API calls — no other parameter changes are required to switch.

Frequently Asked Questions

What is Claude Opus 4.8?

Claude Opus 4.8 is Anthropic’s most capable Opus-tier model, launched May 28, 2026. Since June 9, 2026, Claude Fable 5 is Anthropic’s most capable widely released model overall. Opus 4.8 succeeds Claude Opus 4.7 with improved reliability for agentic coding, better tool-calling efficiency, and significantly reduced fast mode pricing (/ per MTok vs /0 for Opus 4.7).

What is the Claude Opus 4.8 API model ID?

The API model ID for Claude Opus 4.8 is claude-opus-4-8. Use this in your API calls when building with the Anthropic API.

What does Claude Opus 4.8 cost?

Claude Opus 4.8 standard pricing is per MTok (input) and per MTok (output). Fast mode is / per MTok — 3× cheaper than Opus 4.7’s fast mode. Batch API pricing is .50/.50 per MTok. Always verify at platform.claude.com/docs/en/about-claude/pricing.

Does Claude Opus 4.8 support extended thinking?

No. Claude Opus 4.8 does not support manual extended thinking — using a thinking budget returns a 400 error. Instead, use adaptive thinking and control reasoning depth with the effort parameter.

Should I upgrade from Opus 4.7 to Opus 4.8?

Opus 4.7 is still active and not deprecated. Upgrade to Opus 4.8 if you use agentic coding, long tool-calling sessions, or fast mode (3× cost reduction). For most other workloads, both models perform comparably. No breaking API changes are required to switch.

What is Claude Opus 4.8’s context window?

Claude Opus 4.8 has a 1 million token context window, the same as Opus 4.7. Max synchronous output is 128,000 tokens. Batch API requests can produce up to 300,000 output tokens with the appropriate beta header.