Claude Fable 5 · Reviewed & Scored

Claude Fable 5 Review: Anthropic's First Mythos-Class Model Is the New High-Water Mark

It's the first model Anthropic has shipped above the Opus tier, it tops nearly every benchmark it touches, and it costs exactly double Opus 4.8. Here's whether that math works for you.

By Marcus Thorne · Lead Analyst, AI Assistants · June 10, 2026
93
Claude Fable 5
Anthropic
Editors’ Choice
The Verdict

Claude Fable 5 is the most capable AI model you can actually buy a seat to right now, and it isn't particularly close on the long, gnarly stuff. It one-shots prototypes that used to take a hundred prompts, runs agentic coding sessions for hours without losing the plot, and posts an 80.3% on SWE-bench Pro while GPT-5.5 sits at 58.6%. The $10/$50 per million tokens pricing is the catch, exactly double Opus 4.8, and the safety reroute means roughly one in twenty requests quietly fall back to Opus anyway. But if you spend your day on long-horizon work that actually benefits from a bigger model, Fable 5 earns the Editors' Choice. For most daily chat, you'd still ride Opus 4.8 or Sonnet 4.6 and save the money.

I've spent the last day pushing Claude Fable 5 hard since it went generally available on June 9, across the kind of work I actually use Anthropic's models for: long agentic coding runs in Claude Code, a finance-analysis task over a 600K-token document dump, a vision job pulling numbers off scientific charts, and a stack of the same prompts I've been running against Opus 4.8 for weeks. So this isn't a launch-day vibe check. It's what the model feels like in the seat it was built to sit in.

The pitch matters. Fable 5 isn't another Opus point release, it's the first publicly available model from Anthropic's new "Mythos" tier, a step above the Opus line that the company has been holding back behind Project Glasswing for months. The same underlying weights ship as Mythos 5 to a vetted set of researchers and cyber defenders; Fable 5 is the version with safety classifiers bolted on for everyone else. When a request trips one of those classifiers (cybersecurity, biology/chemistry, or model distillation), Fable 5 hands it off to Opus 4.8 instead, and Anthropic says that happens in under 5% of sessions. The other 95% is the full Mythos-class model, and that's what you're paying for.

Pros

  • State-of-the-art on nearly every capability benchmark Anthropic tested: 80.3% on SWE-bench Pro vs. 58.6% for GPT-5.5, and 29.3% on Cognition's FrontierCode Diamond vs. 13.4% for Opus 4.8 and 5.7% for GPT-5.5
  • Long-horizon autonomy is the real upgrade: it stays coherent on multi-hour agentic coding runs and repo-wide refactors that Opus 4.8 starts losing the thread on
  • Genuinely better vision: it pulls precise numbers off detailed scientific figures and can reconstruct a web app's source code from screenshots
  • Token-efficient where it counts: Anthropic says it finishes spreadsheet runs 25–30% faster than Opus 4.8 with fewer turns, and scores highest on FrontierCode even at medium effort
  • Available everywhere on day one: Claude API, Pro/Max/Team/Enterprise subscriptions, Amazon Bedrock, Vertex AI, Microsoft Foundry, GitHub Copilot, and Claude Code 2.1.170+

Cons

  • Costs exactly double Opus 4.8 at $10/$50 per million input/output tokens, the most expensive of any major frontier model, and the new tokenizer can eat up to 35% more tokens for the same text
  • Safety classifiers reroute roughly 5% of sessions to Opus 4.8, and Anthropic admits they're tuned conservatively enough to catch some harmless requests
  • Subscription users only get it free through June 22; after that it draws on usage-based credits at the doubled rate, which can torch a Pro plan fast
  • For 90%+ of daily chat and quick coding tasks, Sonnet 4.6 or Opus 4.8 do the job for a fraction of the price and you won't feel the difference

What it’s actually good at

The thing that’s genuinely new here is long-horizon coherence. Every Anthropic release for the last year has nudged that number up; Fable 5 jumps it. It is state-of-the-art on nearly all tested benchmarks of AI capability, showing exceptional performance in software engineering, knowledge work, vision, scientific research, and many other areas. The longer and more complex the task, the larger Fable 5’s lead over our other models. That last sentence is the whole pitch. On a quick “rewrite this function” task you won’t feel the difference between Fable 5 and Opus 4.8. On a four-hour autonomous migration across a real codebase, you will.

The coding numbers back that up, and they’re the strongest case for paying the premium. On SWE-bench Pro, which measures a model’s ability to complete difficult software engineering tasks, Anthropic says Fable 5 and Mythos 5 reach 80.3%, vastly outperforming OpenAI’s latest and greatest general model GPT-5.5, which scored 58.6%.

On Cognition’s FrontierCode Diamond benchmark, which tests high-quality, maintainable agentic coding, the models score 29.3%, compared with 13.4% for Claude Opus 4.8 and 5.7% for GPT-5.5, according to the benchmark table included in Anthropic’s materials. Those are not rounding-error gaps. FrontierCode Diamond more than doubles Opus 4.8’s score, and Opus 4.8 was a state-of-the-art model two weeks ago.

The token efficiency is the quietly underrated story. Fable 5 is also more token-efficient than past Claude models: on Cognition’s FrontierCode evaluation, which tests whether models can pass difficult coding tasks while meeting the standards of high-quality production codebases, Fable 5 scores highest among frontier models, even at medium effort. That matters at $10/$50 per million tokens. If the model can clear the bar at medium effort, you don’t have to crank reasoning effort to the ceiling on every job, and the bill stays sane. Claude Fable 5 beats Opus 4.8 on our everyday spreadsheet suite at every effort level, and it does it with fewer turns, finishing runs 25–30% faster.

Knowledge work is the other place the upgrade is obvious. Fable 5 shows strong performance on complex analytical tasks. On Hebbia’s Finance Benchmark for senior-level reasoning, Fable 5 has the highest score of any model, with substantial gains in document-based reasoning, chart and table interpretation, and problem solving. I dropped a fat PDF stack on it and asked for a model-style analysis. The output read like something a competent junior had drafted overnight, not the usual “here are some bullet points” sludge. Claude Fable 5 is the first to break 90% on our core analytics benchmark of complex, long-running analytical tasks, a 10-point jump over Opus. On the hardest questions, it shows strong judgment and attention to nuance.

Vision genuinely leapt. Vision. Fable 5 is the new state-of-the-art model for tasks involving vision. It can extract precise numbers from detailed scientific figures and can perform complex vision-based tasks like rebuilding a web app’s source code from screenshots and that’s not a parlor trick, it’s the difference between “I can describe the chart” and “I can hand you a CSV of the chart.” If your work involves dashboards, scientific figures, or screenshot-to-code workflows, that alone may justify the upgrade.

The headline specs are what you’d expect at this tier. The headline specs: a 1 million token context window, up to 128,000 output tokens per response, adaptive thinking always on (there is no separate extended-thinking toggle), and a January 2026 knowledge cutoff, the same as Opus 4.8. Note the “always on” part. You don’t toggle extended thinking anymore, the model decides how much to reason. In practice that means fewer knobs and slightly less control, but the bills I saw came in lower than I expected because Fable 5 doesn’t burn tokens thinking out loud when it doesn’t need to.

Where it lets you down

The price is the price. Anthropic is pricing both Fable 5 and Mythos 5 at $10 per million input tokens and $50 per million output tokens. The company says that is less than half the price of Claude Mythos Preview, but still ranks as the most expensive of major AI models available globally. Exactly double Opus 4.8. If your daily work is chat, drafting, and quick code questions, paying twice as much for an answer you wouldn’t have been able to tell apart from Sonnet 4.6 is a waste. The framing that helped me: Fable 5 is the model you reach for when you’d otherwise be reaching for a senior teammate, not the one you reach for when you’d be Googling.

The tokenizer is a sneaky multiplier on that price. One caveat on the context number: this generation uses the tokenizer introduced with Opus 4.7, so the same text produces roughly 30% more tokens than older models, which eats into that million faster than you might expect and adds to the output bill. A workload that cost you $50 on Opus 4.5 doesn’t cost $100 on Fable 5, it can cost closer to $130 once the tokenizer change is in the mix. Worth modeling before you flip a production workload over.

The safety reroute is the second rough edge, and it’s a real one if you’re building agents on top of this. We’ve therefore launched the model with safeguards that mean queries on some topics will instead receive a response from our next-most-capable model, Claude Opus 4.8. To release the model both safely and quickly, we’ve tuned these safeguards conservatively—they’ll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. One in twenty isn’t constant, but it’s enough that any production system needs to handle the case where the user’s expected Fable 5 answer comes back as an Opus 4.8 answer instead, with a notification attached. If you’re billing customers a premium for “Fable 5 access,” that gets awkward fast.

The subscription situation also deserves a flag. Anthropic says Fable 5 will be included on Pro, Max, Team and seat-based Enterprise plans at no extra cost from today through June 22. After that window closes, Fable 5 usage starts drawing from your plan’s credit pool at the doubled rate, which means a Pro plan that comfortably ran Opus 4.8 all month will burn through its $20 credit pool roughly twice as fast on Fable 5. Auto mode and Sonnet 4.6 are still the right defaults for daily work; reserve Fable 5 for the jobs it was built for.

How it stacks up to Opus 4.8 in practice

I ran the same five tasks against both models. On a 300-line refactor with a clearly defined target, the diffs were nearly identical and Opus 4.8 was faster. On a “plan and execute a migration from one auth library to another, touching seventeen files,” Fable 5 produced a cleaner plan, asked one good clarifying question Opus 4.8 didn’t think to ask, and finished with fewer broken tests. On a finance-analysis task over a long document, Fable 5’s read was sharper and it caught a footnote-level caveat Opus 4.8 papered over. On simple chat (explain this concept, draft this email) they were indistinguishable.

That tracks with the third-party read. Opus 4.8 stays the default in Claude Code, and that is the right call for most work. Switch to Fable 5 for the jobs where its lead is largest: a multi-hour autonomous migration, a repo-wide refactor, deep research across hundreds of files, or a long-horizon agent run that has to stay coherent across millions of tokens. Anthropic itself is signaling the same thing by keeping Opus 4.8 as the Claude Code default. They know most coding work doesn’t need Mythos-class capability.

The external testimony from launch partners is unusually consistent for an Anthropic release. Claude Fable 5 understands what builders mean, not just what they type. Apps that took a hundred prompts a year ago, it now one-shots.

Claude Fable 5’s reasoning is a clear step beyond Opus 4.8. It works at senior research scientist grade, picking directions, allocating resources, killing its incorrect beliefs, and producing novel first-principles outputs. When the launch quotes are this aligned across customers in different verticals, the model is usually real.

Should you pay for it?

Yes, if your work fits the shape. If you ship code daily and your bottleneck is “the AI gets confused on big tasks” rather than “the AI is slow on small ones,” Fable 5 is the upgrade you’ve been waiting for. If you do deep document analysis (finance, legal, research) the chart-and-table reasoning alone is worth the bump. If you’re running autonomous agents that need to stay coherent for hours, this is genuinely a different class of model.

No, if you’re mostly chatting. Sonnet 4.6 at $3/$15 is still preferred over Opus 4.5 in 59% of Claude Code coding sessions according to Anthropic’s own data; for most users, most of the time, the cheaper model is the right call. Fable 5 is a precision tool, not a daily driver.

The cleanest way to think about it: Opus 4.8 is the new default and the right answer for most paying users. Fable 5 is the model you reach for when the stakes (the autonomy, the document depth, the agent runtime) actually justify doubling the bill. Used that way, it earns its keep and the Editors’ Choice. Used as a Sonnet replacement, it’s an expensive mistake.

The bottom line

Claude Fable 5 is the new ceiling of what you can buy access to today. The benchmarks are real, the long-horizon improvement is the kind of thing you feel inside an hour of using it, and the vision and analytics upgrades are genuine. The price is steep, the tokenizer change makes it steeper than it looks on paper, and the safety reroute is a real consideration for anyone building on top. But for the work it was built for (autonomous coding, deep analysis, agent runs that have to stay sharp across millions of tokens) nothing on the market right now is close. It’s the one to beat, and it earns the Editors’ Choice.

Sources

FAQ

What did Claude Fable 5 score?

A 93 out of 100. That clears our 90 threshold, so it takes the Editors' Choice for AI assistants on the work it was built for: long-horizon coding, deep analysis, and agentic runs. It loses points on the doubled pricing and the 5% safety-reroute behavior, both of which matter in real use.

Is Fable 5 worth double the price of Opus 4.8?

Only for the work where its lead is largest: multi-hour autonomous coding, repo-wide refactors, deep research over big document sets, and long agentic runs that have to stay coherent across millions of tokens. For everyday chat, drafting, and quick code questions, Opus 4.8 or Sonnet 4.6 give you the same felt quality for half or a sixth of the price.

What's the deal with the safety reroute to Opus 4.8?

Fable 5 ships with three classifier families (cybersecurity, biology/chemistry, and model distillation) that intercept high-risk requests and route them to Opus 4.8 instead, with the user notified. Anthropic tuned the classifiers conservatively, so they trigger in under 5% of sessions but will sometimes catch harmless requests. Plan for it if you're building production agents.

How do I actually use Fable 5?

On the Claude API it's claude-fable-5. Subscribers on Pro, Max, Team, and seat-based Enterprise plans get it included at no extra cost from June 9 through June 22, after which usage draws on credits at the $10/$50 per million tokens rate. Claude Code added support in version 2.1.170. It's also live on Amazon Bedrock, Vertex AI, Microsoft Foundry, and GitHub Copilot from day one.