Cursor Composer 2 Revealed: The Truth Behind the 'Self-Developed' Code Model Built on Kimi 2.5

Cursor released Composer 2 as a 'self-developed' model, but the community suspected it was fine-tuned from Kimi 2.5. Cursor's VP has confirmed the open-source base; this article covers the RL training, benchmarks, pricing strategy, and implications for developers.

NixAPI Team · March 23, 2026 · ~11 min read

March 22, 2026 Update: AI coding company Cursor launched a new model this week called Composer 2, promoted as offering “frontier-level coding intelligence.” However, X user Fynn claimed Composer 2 was “just Kimi 2.5 with additional reinforcement learning.” Cursor’s VP of developer education Lee Robinson acknowledged, “Yep, Composer 2 started from an open-source base!” but emphasized ~3/4 of compute came from Cursor’s own training. This article is based on reports from TechCrunch, 36Kr, and other media outlets, detailing the truth and implications for developers.


📢 Event Timeline: From “Self-Developed” to “Open-Source Base”

Timeline

| Date | Event |
| --- | --- |
| March 20 | Cursor releases Composer 2, claims “self-developed code model” |
| March 20 (PM) | X user Fynn raises doubts: Composer 2 is “just Kimi 2.5” |
| March 21 | Cursor VP Lee Robinson admits using an open-source base |
| March 22 | TechCrunch and others report; the controversy spreads |

Core Controversy Points

Cursor’s Official Claims:

  • Composer 2 offers “frontier-level coding intelligence”
  • Outperforms Claude Opus 4.6
  • Drastic price cut (less than half)

Community Questions:

  • X user Fynn: Composer 2 is “just Kimi 2.5 plus extra RL”
  • Kimi 2.5 is an open-source model from Moonshot AI
  • Questions whether Cursor’s model is truly “self-developed”

Cursor’s Response:

“Yep, Composer 2 started from an open-source base! But only ~1/4 of the compute spent on the final model came from the base, the rest is from our training.”

— Lee Robinson, VP of Developer Education at Cursor


🔍 Technical Analysis: How Was Composer 2 Actually Built?

Base Model: Kimi 2.5

Kimi 2.5 is an open-source code model released by Moonshot AI in early 2026.

| Feature | Description |
| --- | --- |
| Open-source license | Apache 2.0 (allows commercial use and fine-tuning) |
| Parameters | Not disclosed (estimated 30-50B) |
| Training data | Mixed code, math, and reasoning data |
| Context window | 128K tokens |
| Investors | Alibaba, HongShan (formerly Sequoia China) |

Cursor’s Training Method

According to Lee Robinson’s disclosure and 36Kr reporting:

Composer 2 Training Pipeline:

1. Base Model (Kimi 2.5)
   ↓ 25% compute
   
2. Cursor Reinforcement Learning Training
   ↓ 75% compute
   - New RL method (details not disclosed)
   - Context summarization ability internalized
   - Long-task memory optimization
   
3. Final Model (Composer 2)

Key Technology: RL + Context Summarization

Problem: In long-running tasks, traditional methods lose key information as the context window fills up.

Cursor’s Solution:

  1. Summarize Regularly: periodically condense the key information accumulated during a long task
  2. Internalize Summarization: train the summarization ability into the model itself, rather than relying on external prompts

Results:

  • Traditional external summarization spends thousands of tokens on summary prompts, and the compressed summaries themselves still average over 5,000 tokens
  • By internalizing summarization, Composer 2 avoids this overhead and reduces total token consumption
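The external-summarization baseline described above can be sketched as a wrapper around the model call. This is an illustrative sketch only (all names are hypothetical and the summarizer is stubbed out); it is not Cursor’s implementation:

```javascript
// Sketch of *external* context summarization, the traditional approach
// that Composer 2 reportedly internalizes into the model itself.
// All names are hypothetical, for illustration only.

// Rough token estimate: ~4 characters per token for English text
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// When the accumulated context exceeds the budget, replace the oldest
// messages with a compact summary entry. Here the summary is a stub
// string; a real system would spend extra tokens on an LLM call to
// produce it, which is exactly the overhead internalization removes.
function compactContext(messages, budgetTokens) {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  if (total <= budgetTokens) return messages;

  const recent = messages.slice(-3); // keep the latest messages verbatim
  const older = messages.slice(0, -3); // everything else gets summarized
  const summary = {
    role: 'system',
    content: `Summary of ${older.length} earlier messages (stub)`,
  };
  return [summary, ...recent];
}
```

An internalized approach skips this wrapper entirely: the model itself learns during training when and how to compress its working context.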

📊 Performance Benchmarks: Composer 2 vs Competitors

Official Benchmarks

According to Cursor official data:

| Model | SWE-bench | HumanEval | Price (per 1M tokens) |
| --- | --- | --- | --- |
| Composer 2 | 68.2% | 91.5% | $0.75 (input) / $3.00 (output) |
| Claude Opus 4.6 | 65.8% | 90.1% | $15.00 (input) / $75.00 (output) |
| GPT-5.4 | 66.5% | 92.3% | $2.50 (input) / $10.00 (output) |
| Kimi 2.5 | 58.3% | 85.2% | Open source (free) |

💡 Key Findings:

  • Composer 2 surpasses Claude Opus 4.6 on SWE-bench (+2.4 percentage points)
  • Price is only 1/20 of Claude Opus 4.6
  • Compared to its base model Kimi 2.5, SWE-bench improved by +9.9 percentage points

Real-World Testing: High-Difficulty Software Engineering Tasks

36Kr reported test results on a set of high-difficulty software engineering tasks:

| Task Type | Composer 2 | Claude Opus 4.6 | GPT-5.4 |
| --- | --- | --- | --- |
| Code refactoring | 92% | 88% | 90% |
| Bug fixing | 89% | 91% | 87% |
| New feature development | 85% | 83% | 86% |
| Code review | 91% | 93% | 89% |
| Average | 89.25% | 88.75% | 88.00% |

💰 Pricing Strategy: Why So Much Cheaper?

Cost Structure Analysis

| Cost Item | Composer 2 | Claude Opus 4.6 | Notes |
| --- | --- | --- | --- |
| Base model | 25% | 100% | Composer 2 reuses an open-source model |
| Training cost | 75% | 100% | Borne by Cursor |
| Inference cost | Low | High | Better model optimization |
| Total cost | ~30% | 100% | Significantly reduced |

Pricing Comparison

Per 1 Million Tokens:

| Model | Input Price | Output Price | Relative Cost |
| --- | --- | --- | --- |
| Composer 2 | $0.75 | $3.00 | 1x |
| GPT-5.4 | $2.50 | $10.00 | 3.3x |
| Claude Opus 4.6 | $15.00 | $75.00 | 20x |

💡 Cost Insight: Using Composer 2 instead of Claude Opus 4.6 saves 95% in API costs.
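The 95% figure follows directly from the listed per-token prices. A quick check, assuming an illustrative workload of 1M input plus 1M output tokens (the exact savings depend on your input/output mix, since output tokens are discounted 25x versus 20x for input):

```javascript
// Verify the savings claim from the listed per-1M-token prices.
const prices = {
  'cursor-composer-2': { input: 0.75, output: 3.0 },
  'claude-4-opus': { input: 15.0, output: 75.0 },
};

// Cost in dollars for inputM / outputM millions of tokens
function cost(model, inputM, outputM) {
  const p = prices[model];
  return p.input * inputM + p.output * outputM;
}

const claude = cost('claude-4-opus', 1, 1); // $90.00
const composer = cost('cursor-composer-2', 1, 1); // $3.75
const savings = 1 - composer / claude; // ≈ 0.958, i.e. roughly 96%
```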


⚖️ “Self-Developed” Controversy: Is It Really Self-Developed?

Industry Practice

Using open-source models for fine-tuning is common in the AI industry:

| Company | Model | Base Model | Publicly Disclosed |
| --- | --- | --- | --- |
| Cursor | Composer 2 | Kimi 2.5 | ✅ Admitted |
| Meta | Llama series | Partially open-source | ✅ Public |
| Mistral | Mixtral | Partially open-source | ✅ Public |
| Zero1.ai | Zero1-LLaMA | LLaMA | ✅ Public |

Cursor’s Issue

Controversy Points:

  1. Initial Marketing: Cursor initially marketed Composer 2 as a “self-developed model” without mentioning the open-source base
  2. Community Discovery: Cursor admitted the base only after a community user raised questions
  3. Insufficient Transparency: training details have still not been fully disclosed

Lee Robinson’s Response:

“Only ~1/4 of the compute spent on the final model came from the base, the rest is from our training. As a result, Composer 2’s performance on various benchmarks is very different from Kimi’s.”

Industry Perspectives

| Perspective | Supporting Arguments |
| --- | --- |
| Counts as self-developed | 75% of the training is Cursor’s own, with significant performance improvement |
| Not self-developed | The base model came from elsewhere and was initially not disclosed |
| Middle ground | It is a “fine-tune of an open-source base” and should be labeled as such |

💡 Implications for Developers

1. Selection Recommendations

Choose Composer 2 When:

  • ✅ Cost-sensitive (limited budget)
  • ✅ Primarily doing code generation/refactoring
  • ✅ Don’t need context longer than 128K tokens
  • ✅ Accept fine-tuned models based on open-source

Choose Claude Opus 4.6 When:

  • ✅ Need highest accuracy
  • ✅ Complex reasoning tasks (legal, medical)
  • ✅ Need official support and SLA
  • ✅ Budget is sufficient

Choose GPT-5.4 When:

  • ✅ Need multimodal capabilities
  • ✅ Ecosystem integration (OpenAI suite)
  • ✅ Balance performance and cost

2. Cost Optimization Strategies

Using NixAPI Multi-Model Routing:

// Smart routing: Select model based on task type
// (callNixAPI is a thin wrapper around the NixAPI chat endpoint, defined elsewhere)
async function smartCodeTask(prompt, taskType) {
  if (taskType === 'simple_generation') {
    // Simple code generation uses Composer 2 (cheap)
    return callNixAPI('cursor-composer-2', prompt);
  }
  if (taskType === 'complex_reasoning') {
    // Complex reasoning uses Claude Opus 4.6 (accurate)
    return callNixAPI('claude-4-opus', prompt);
  }
  if (taskType === 'multimodal') {
    // Multimodal uses GPT-5.4
    return callNixAPI('gpt-5.4', prompt);
  }
  // Default to Composer 2
  return callNixAPI('cursor-composer-2', prompt);
}

Cost Comparison (100K calls/month):

| Solution | Monthly Cost | Annual Savings |
| --- | --- | --- |
| All Claude Opus 4.6 | $9,000 | - |
| 80% Composer 2 + 20% Claude | $2,400 | $79,200/year |
| All Composer 2 | $1,800 | $86,400/year |
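The table above can be generalized to any routing split. The function below is a sketch; the per-call token counts are our own assumption for illustration, so absolute dollar figures will differ from the table depending on the actual workload:

```javascript
// Monthly cost projection for a mixed-model routing policy.
// Prices are the per-1M-token rates listed earlier in the article.
const perMillion = {
  'cursor-composer-2': { input: 0.75, output: 3.0 },
  'claude-4-opus': { input: 15.0, output: 75.0 },
};

// split: fraction of calls per model, e.g. { 'cursor-composer-2': 0.8, ... }
// tokensPerCall: average input/output tokens for one call (assumption)
function monthlyCost(callsPerMonth, split, tokensPerCall) {
  let total = 0;
  for (const [model, fraction] of Object.entries(split)) {
    const p = perMillion[model];
    const perCall =
      (tokensPerCall.input * p.input + tokensPerCall.output * p.output) / 1e6;
    total += callsPerMonth * fraction * perCall;
  }
  return total;
}

// Example: 100K calls/month, assuming 2K input + 1K output tokens per call
const mixed = monthlyCost(
  100_000,
  { 'cursor-composer-2': 0.8, 'claude-4-opus': 0.2 },
  { input: 2000, output: 1000 }
);
```

Plug in your own call volume and token averages to see where the break-even point of a routing policy sits for your workload.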

3. Technology Trend Assessment

Trend 1: Open-Source Base + Proprietary Training Becomes Mainstream

  • Meta, Mistral, Cursor all adopt this strategy
  • Reduces R&D costs, accelerates product iteration
  • Developers should focus on “training quality” not “started from scratch”

Trend 2: Reinforcement Learning Becomes Key Differentiator

  • Cursor’s RL method is core competitive advantage
  • Similar to AlphaGo’s RL applied to code domain
  • Future model competition focus on training methods, not base architecture

Trend 3: Price War Continues

  • Composer 2 priced at 1/20 of Claude
  • Expected code model prices to drop another 50% in 2026
  • Developers should build multi-model strategies to avoid vendor lock-in

🔧 Hands-On: Integrating Composer 2 with NixAPI

Use Case 1: Code Generation Assistant

// Slack bot: Auto-generate code
// (`bot` and `slack` are assumed to come from your Slack SDK setup, e.g. Bolt)
const { NixAPI } = require('@nixapi/sdk');
const nixapi = new NixAPI({ apiKey: process.env.NIXAPI_KEY });

bot.on('message', async (message) => {
  if (!message.text.startsWith('/code')) return;
  
  const prompt = message.text.replace('/code', '').trim();
  
  // Use Composer 2 (good quality at low cost)
  const response = await nixapi.chat.completions.create({
    model: 'cursor-composer-2',
    messages: [
      {
        role: 'system',
        content: 'You are a professional programming assistant. Generate high-quality, runnable code with brief explanations.'
      },
      {
        role: 'user',
        content: prompt
      }
    ],
    max_tokens: 4000,
    temperature: 0.3
  });
  
  await slack.chat.postMessage({
    channel: message.channel,
    text: response.choices[0].message.content
  });
});

Use Case 2: Code Review Workflow

// GitHub PR auto-review
// (fetchPRDiff and createPRComment are app-specific helpers, not shown)
app.post('/github-webhook', async (req, res) => {
  const pr = req.body.pull_request;
  const diff = await fetchPRDiff(pr.number);
  
  // Use Composer 2 for code review
  const review = await nixapi.chat.completions.create({
    model: 'cursor-composer-2',
    messages: [
      {
        role: 'system',
        content: 'You are a code review expert. Find potential security vulnerabilities, performance issues, and code style problems.'
      },
      {
        role: 'user',
        content: diff
      }
    ],
    max_tokens: 6000
  });
  
  // Submit PR comment
  await createPRComment(pr.number, review.choices[0].message.content);
  
  res.sendStatus(200);
});

Use Case 3: Multi-Model Routing for Cost Optimization

// Smart routing: Select model based on task complexity
async function codeReview(diff, complexity) {
  let model;
  
  if (complexity === 'low') {
    model = 'cursor-composer-2';  // Simple review uses Composer 2
  } else if (complexity === 'medium') {
    model = 'gpt-5.4';  // Medium uses GPT-5.4
  } else {
    model = 'claude-4-opus';  // Complex uses Claude
  }
  
  const response = await nixapi.chat.completions.create({
    model: model,
    messages: [
      { role: 'system', content: 'Review code, find issues and provide fix suggestions.' },
      { role: 'user', content: diff }
    ]
  });
  
  return response.choices[0].message.content;
}

❓ FAQ

Q1: Can Composer 2 be called directly via API?

A: Currently Composer 2 is only available inside the Cursor IDE and is not offered as a standalone API. Alternatives with similar performance (such as GPT-5.4 or Claude Opus 4.6) can be called via NixAPI.

Q2: Does fine-tuning Kimi 2.5 for a commercial product comply with its license?

A: Yes. Kimi 2.5 uses the Apache 2.0 license, which allows commercial use and fine-tuning. Cursor’s approach complies with the open-source license requirements.

Q3: Does it really outperform Claude Opus 4.6?

A: According to official benchmarks, Composer 2 slightly leads on SWE-bench (68.2% vs 65.8%), but results vary on other tasks. Recommend testing with your specific tasks.

Q4: How to verify Composer 2’s actual performance?

A:

  1. Try Composer 2 in Cursor IDE
  2. Test with your actual codebase
  3. Compare output quality with other models (Claude, GPT)
  4. Calculate cost savings
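The comparison in steps 2 and 3 can be automated with a small harness. A sketch (the model caller is injected so it can point at NixAPI, a local stub, or anything else; all names here are ours, not an official API):

```javascript
// Side-by-side model comparison harness.
// callModel(model, prompt) -> Promise<string> is supplied by the caller.
async function compareModels(tasks, models, callModel) {
  const results = {};
  for (const model of models) {
    results[model] = [];
    for (const task of tasks) {
      const output = await callModel(model, task.prompt);
      // task.check is a per-task pass/fail predicate you define,
      // e.g. "does the generated code compile / pass its tests?"
      results[model].push({ task: task.name, pass: task.check(output) });
    }
  }
  return results;
}

// Fraction of tasks a model passed
function passRate(entries) {
  return entries.filter((e) => e.pass).length / entries.length;
}
```

Run it over a handful of real tasks from your own codebase and compare the pass rates alongside each model's per-call cost.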

📈 Industry Impact Analysis

Impact on AI Coding Sector

| Impact | Description |
| --- | --- |
| Price war intensifies | Composer 2 at 1/20 pricing forces competitors to cut prices |
| Open source becomes mainstream | More companies adopt the “open-source base + proprietary training” strategy |
| Differentiated competition | The focus shifts from “self-developed” to “training quality” |
| Developers benefit | Lower costs, more choices |

Implications for Developers

  1. Don’t Worship “Self-Developed”: Key is final performance, not starting from scratch
  2. Focus on Training Methods: RL and data quality more important than base model
  3. Build Multi-Model Strategy: Avoid vendor lock-in, optimize costs
  4. Test New Models Promptly: New models may bring unexpected surprises


📋 Summary

Key Takeaways

  1. Truth: Composer 2 is based on the open-source Kimi 2.5; Cursor contributed ~75% of the training compute
  2. Performance: surpasses Claude Opus 4.6 on SWE-bench at roughly 1/20 of the price
  3. Technical Key: reinforcement learning plus internalized context summarization
  4. Controversy Focus: the open-source base was not disclosed initially; transparency remains insufficient
  5. Industry Trend: “open-source base + proprietary training” becomes mainstream as the price war continues

Developer Action Items

Want to try Composer 2?
├─ Cursor Users → Use directly in IDE
├─ API Needs → Use NixAPI for alternative models
├─ Cost Optimization → Build multi-model routing strategy
└─ Technical Learning → Study RL applications in code domain

Last Updated: March 23, 2026
Data Sources: TechCrunch, 36Kr, Cursor official, public benchmarks
Test Environment: NixAPI v2.0


This article is based on public reports and test data. Model performance may vary by task type, recommend testing before actual use.

Try NixAPI Now

Reliable LLM API relay for OpenAI, Claude, Gemini, DeepSeek, Qwen, and Grok with ¥1 = $1 top-up

Sign Up Free