MAI-Code-1-Flash
Microsoft's Copilot-native agentic coding model, derived from MAI-Thinking-1's base and tuned specifically for the GitHub Copilot harness. Per the official model card: 137B parameters, sparse MoE Transformer, 256K context, trained March–May 2026 with a December 2025 data cut-off. User-facing positioning markets it as a 5B-class model, which suggests roughly 5B active parameters per token (the model card does not explicitly disclose the active count). English-only.
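To put the sparse MoE figures in proportion, here is a minimal sketch of the implied active-parameter share. The 137B total comes from the model card; the 5B active figure is only an assumption inferred from the "5B-class" positioning, and the function name `active_fraction` is hypothetical.

```python
# Rough arithmetic on the sparse-MoE figures quoted above.
TOTAL_PARAMS_B = 137.0   # total parameters in billions, per the model card
ASSUMED_ACTIVE_B = 5.0   # ASSUMPTION: inferred from "5B-class" marketing, not disclosed


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters used per token in a sparse MoE model."""
    if total_b <= 0 or not 0 < active_b <= total_b:
        raise ValueError("expected 0 < active_b <= total_b")
    return active_b / total_b


fraction = active_fraction(TOTAL_PARAMS_B, ASSUMED_ACTIVE_B)
print(f"Assumed active share per token: {fraction:.1%}")  # prints 3.6%
```

If the assumption holds, fewer than 4% of the weights would be exercised per token. That would fit the small-model positioning, but it remains unconfirmed.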
Trained inside Copilot's production harness rather than benchmarked externally and then deployed into it; Microsoft frames this as a reliability advantage for agentic workflows. Pipeline: starts from the MAI-Thinking-1 checkpoint → SFT on ~2M synthetic agentic tasks → RL across 150,000+ environments. Features adaptive solution-length control (stays concise on easy tasks, spends more reasoning budget on hard ones).
Benchmarks (per model card): SWE-Bench Verified 71.6% (vs Claude Haiku 4.5's 66.6%), SWE-Bench Pro 51.2% (vs 35.2%), SWE-Bench Multilingual 65.5%, Terminal-Bench 2 54.8%. Launch post claims up to 60% fewer tokens on hard coding tasks.
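The two head-to-head comparisons above can be restated as gaps. This is a small sketch that assumes the unlabeled 35.2% SWE-Bench Pro baseline also refers to Claude Haiku 4.5, which the text implies but does not state. The helper name `gap` is hypothetical.

```python
# Scores in percent: (MAI-Code-1-Flash, baseline), as quoted from the model card.
# ASSUMPTION: the 35.2% SWE-Bench Pro baseline is Claude Haiku 4.5.
SCORES = {
    "SWE-Bench Verified": (71.6, 66.6),
    "SWE-Bench Pro": (51.2, 35.2),
}


def gap(model: float, baseline: float) -> tuple[float, float]:
    """Return (absolute gap in percentage points, relative gain in percent)."""
    diff = model - baseline
    return round(diff, 1), round(diff / baseline * 100, 1)


gaps = {name: gap(model, base) for name, (model, base) in SCORES.items()}
for name, (points, relative) in gaps.items():
    print(f"{name}: +{points} points ({relative}% relative)")
```

Under that assumption, the lead is 5.0 points on SWE-Bench Verified and 16.0 points on SWE-Bench Pro, so the margin is much wider on the harder benchmark.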
Status: rolling out to all GitHub Copilot tiers (Free, Pro, Pro+, Max) in VS Code from June 2, 2026; CLI / Foundry / OpenRouter / Fireworks / Baseten access listed as future. Pricing "to be finalized." Proprietary, deployment-tied license. Not on HuggingFace as of release.
Model Details
Benchmark Scores
| Benchmark | Score | Mode |
|---|---|---|
| SWE-Bench Verified | 71.6% | — |
| SWE-Bench Pro | 51.2% | — |
| SWE-Bench Multilingual | 65.5% | — |
| Terminal-Bench 2 | 54.8% | — |