Anthropic just confirmed Claude Mythos, the model it says is too capable to release publicly

Last week, code references to a model called Mythos started circulating online. Anthropic confirmed its existence shortly after. Today, through an initiative called Project Glasswing, the company gave its first detailed account of what the model actually does.

The short version: Mythos finds software vulnerabilities that humans have missed for decades, and does it autonomously.

Anthropic says the model found zero-day vulnerabilities in every major operating system and every major web browser. Two of the cases it disclosed: a 27-year-old bug in OpenBSD, a security-hardened operating system used to run firewalls and critical infrastructure, which had survived decades of human review; and a 16-year-old flaw in FFmpeg, sitting in a line of code that automated testing tools had hit five million times without catching it.

The numbers on autonomous exploit development are stark: where Claude Opus 4.6 turned browser vulnerabilities into working exploits twice out of hundreds of attempts, Mythos did it 181 times.

The benchmarks:

SWE-bench Pro: Opus 4.6 scored 53.4%. Mythos scored 77.8%
SWE-bench Verified: Opus 4.6 scored 80.8%. Mythos scored 93.9%
SWE-bench Multimodal: Opus 4.6 scored 27.1%. Mythos scored 59.0%
Humanity’s Last Exam (no tools): Opus 4.6 scored 40.0%. Mythos scored 56.8%
CyberGym (vulnerability reproduction): Opus 4.6 scored 66.6%. Mythos scored 83.1%

The SWE-bench Multimodal jump is the one worth pausing on. From Opus 4.6 to Mythos, the score more than doubles, from 27.1% to 59.0%. On a benchmark designed to test real-world software engineering across modalities, that is not a small gap.
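To put the five jumps side by side, here is a minimal Python sketch that computes the absolute gain (in percentage points) and the Mythos-to-Opus ratio from the scores listed above. The names scores, gains, and results are illustrative, not anything Anthropic published.

```python
# Scores as reported above: (Opus 4.6, Mythos), in percent.
scores = {
    "SWE-bench Pro": (53.4, 77.8),
    "SWE-bench Verified": (80.8, 93.9),
    "SWE-bench Multimodal": (27.1, 59.0),
    "Humanity's Last Exam (no tools)": (40.0, 56.8),
    "CyberGym": (66.6, 83.1),
}


def gains(old: float, new: float) -> tuple[float, float]:
    """Return (gain in percentage points, ratio new/old), both rounded."""
    return round(new - old, 1), round(new / old, 2)


results = {name: gains(old, new) for name, (old, new) in scores.items()}

for name, (points, ratio) in results.items():
    print(f"{name}: +{points} points, {ratio}x")
```

On these figures, SWE-bench Multimodal is the only benchmark where the ratio exceeds 2; SWE-bench Verified shows the smallest gain, though it also started from the highest baseline.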

Anthropic says Mythos has outrun its standard evaluation benchmarks entirely. The company has started using real-world vulnerability discovery as the new measure of capability.

The access list:

Anthropic is giving a group of launch partners access to Mythos. Among them:

AWS, Google, Microsoft, Apple — all participating in defensive security work
Cisco, CrowdStrike, Palo Alto Networks — security-focused partners
NVIDIA, Broadcom, JPMorganChase, the Linux Foundation — rounding out the coalition

Anthropic is committing $100M in usage credits to cover the research preview period. After that, Mythos will be available to participants at $25 per million input tokens and $125 per million output tokens. The model is accessible through the Claude API, Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry.
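For a sense of what those post-preview rates mean per workload, here is a rough Python sketch of the arithmetic. The two rates come from the paragraph above; the function name and the token counts in the example are hypothetical, and the estimate ignores any discounts or platform-specific pricing, which the announcement does not detail.

```python
# Post-preview rates quoted above, in USD per million tokens.
INPUT_USD_PER_MTOK = 25.0
OUTPUT_USD_PER_MTOK = 125.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload at the quoted per-token rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Hypothetical workload: 2M input tokens and 400K output tokens.
example_cost = estimate_cost_usd(2_000_000, 400_000)
print(f"${example_cost:,.2f}")
```

Because output tokens cost five times as much as input tokens, the 400K output tokens in this made-up workload cost as much as the 2M input tokens.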

The company has also donated $4M to open-source security organizations, including Alpha-Omega, OpenSSF, and the Apache Software Foundation.

There are no plans for a general public release. Anthropic says it needs to develop new cybersecurity safeguards before that becomes possible, and plans to test those safeguards with an upcoming Claude Opus model first.

Bottom line: Anthropic just ran one of the more effective product launches in recent AI memory. The model is, by Anthropic's account, too dangerous to release publicly, so access goes to Apple, Google, Microsoft, and friends. Withholding a model because it’s too dangerous for the public is, regardless of whether that’s the full story, about as good a marketing move as it gets.
