Cursor launches Composer 2, claiming it beats Claude Opus 4.6 in benchmarks

Anysphere launched Composer 2 for the Cursor editor as a highly affordable alternative to frontier models like Claude Opus 4.6 and GPT-5.4.

The pricing: Composer 2 Standard costs $0.50 per million input tokens, far below typical frontier-model pricing.
The integration: The model is built directly into the Cursor environment to handle autonomous file modifications and parallel agent execution.
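To put the quoted input price in concrete terms, here is a minimal sketch of the arithmetic. Only the $0.50 per million input tokens figure comes from the announcement; the workload numbers (200 agent runs of 150,000 input tokens each) and the function name are hypothetical, and output-token pricing is left out because it is not quoted here.

```python
from decimal import Decimal

# Input price quoted for Composer 2 Standard (USD per million input tokens).
COMPOSER_2_INPUT_USD_PER_MTOK = Decimal("0.50")


def input_cost_usd(input_tokens: int,
                   usd_per_mtok: Decimal = COMPOSER_2_INPUT_USD_PER_MTOK) -> Decimal:
    """Return the input-side cost in USD for a given number of input tokens."""
    return Decimal(input_tokens) / Decimal(1_000_000) * usd_per_mtok


# Hypothetical workload: 200 agent runs, each reading 150,000 input tokens.
runs = 200
tokens_per_run = 150_000
total = input_cost_usd(runs * tokens_per_run)
print(f"${total:.2f}")  # prints $15.00
```

At that rate, 30 million input tokens come to $15 on the input side, before any output-token charges.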

Cursor announced that Composer 2 scored 61.3 percent on CursorBench, which the company says beats Claude Opus 4.6. However, CursorBench is a self-reported, proprietary evaluation. It is tuned to models running inside Cursor’s own controlled environment, which makes direct comparisons with general-purpose models unreliable.

The community also immediately questioned Cursor’s claim of releasing an “in-house” model after users found evidence of external foundations.

The Kimi connection: Developers discovered leaked model IDs referencing “kimi-k2p5-rl” variants within the system.
The foundation: The developer community widely believes Composer 2 is actually a heavily fine-tuned version of Moonshot AI’s Kimi K2.5, rather than a model built from scratch.
The silent launch: Cursor has not publicly confirmed the exact base model, focusing entirely on its reinforcement learning improvements instead.

was messing with the OpenAI base URL in Cursor and caught this

accounts/anysphere/models/kimi-k2p5-rl-0317-s515-fast

so composer 2 is just Kimi K2.5 with RL
at least rename the model ID https://t.co/MQOuEuF3Pd pic.twitter.com/fyUWbo1InF
— Fynn (@fynnso) March 19, 2026
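For readers unfamiliar with how such an identifier is put together, the sketch below splits the ID from the tweet into its visible pieces. The `accounts/<account>/models/<name>` layout is inferred only from the string itself, and the function name is hypothetical. Reading "k2p5" as K2.5 and "rl" as reinforcement learning is the community's interpretation, not something Cursor has documented, and the remaining tokens are left uninterpreted.

```python
# Model ID exactly as it appears in the tweet.
MODEL_ID = "accounts/anysphere/models/kimi-k2p5-rl-0317-s515-fast"


def split_model_id(model_id: str) -> tuple[str, list[str]]:
    """Split an ID of the assumed form accounts/<account>/models/<name>.

    Returns the account segment and the hyphen-separated tokens of the name.
    """
    parts = model_id.split("/")
    if len(parts) != 4 or parts[0] != "accounts" or parts[2] != "models":
        raise ValueError(f"unexpected model ID layout: {model_id!r}")
    account, name = parts[1], parts[3]
    return account, name.split("-")


account, tokens = split_model_id(MODEL_ID)
print(account)  # prints anysphere
print(tokens)   # prints ['kimi', 'k2p5', 'rl', '0317', 's515', 'fast']
```

The "kimi" and "k2p5" tokens are what led users to connect the model to Moonshot AI’s Kimi K2.5.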

The Bottom Line: Cursor built a highly capable, very cheap coding agent by deeply optimizing a model for its specific application. The community debate suggests that self-reported benchmark scores matter much less than how well a tool actually functions inside a developer’s daily workflow.
