MiMo-V2.5-Pro
1.02T total / 42B active MoE with hybrid attention (sliding-window + global layers at a 6:1 ratio, 128-token window) and Multi-Token Prediction. 1M-token context. AA Intelligence Index: 54. Developed under Luo Fuli (former DeepSeek core member). Open-sourced under the MIT license, compatible with 5 domestic Chinese chip platforms from day one.
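A minimal sketch of the 6:1 hybrid attention layout described above. The layer indexing, mask construction, and helper name are illustrative assumptions, not MiMo-V2.5-Pro's actual implementation:

```python
import torch

SLIDING_WINDOW = 128  # sliding-window span per the spec above
GLOBAL_EVERY = 7      # 6 sliding layers for every 1 global layer (6:1)

def attention_mask(layer_idx: int, seq_len: int) -> torch.Tensor:
    """Boolean mask: True where a query position may attend to a key position."""
    causal = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
    if layer_idx % GLOBAL_EVERY == GLOBAL_EVERY - 1:
        return causal  # global layer: full causal attention
    # sliding-window layer: each token sees only the previous 128 positions
    return torch.triu(causal, diagonal=-(SLIDING_WINDOW - 1))

# In a repeating 7-layer block, layers 0-5 use the 128-token window and
# layer 6 attends globally over the full context.
```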
ClawEval: 63.8%, using ~40-60% fewer tokens than Opus 4.6, Gemini 3.1 Pro, and GPT-5.4. GDPval-AA Elo: 1581, surpassing Kimi K2.6 and GLM-5.1. KV cache reduced ~7x via a learnable attention-pool bias. By the Xiaomi MiMo team.
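As a back-of-envelope check, the ~7x KV-cache figure is consistent with the 6:1 sliding/global layer split at full context. The card credits a learnable attention-pool bias; the arithmetic below only shows the scaling of the hybrid layout itself and is purely illustrative:

```python
CONTEXT = 1_000_000        # tokens
WINDOW = 128               # sliding-window span
SLIDING, GLOBAL = 6, 1     # layers per repeating 7-layer block

dense_cache = (SLIDING + GLOBAL) * CONTEXT           # every layer caches full context
hybrid_cache = SLIDING * WINDOW + GLOBAL * CONTEXT   # windowed layers cache 128 tokens
print(f"reduction: {dense_cache / hybrid_cache:.2f}x")  # -> 6.99x
```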
Model Details
Architecture MoE
Parameters 1.02T
Active params 42B
Context window 1,000,000 tokens
AA Intelligence Index 54
License MIT
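For reference, the spec table maps onto a config sketch like the following; the class and field names are hypothetical, not the model's actual configuration schema:

```python
from dataclasses import dataclass

@dataclass
class MiMoV25ProSpec:
    # Values from the spec table above; names are illustrative only.
    total_params: float = 1.02e12      # 1.02T total (MoE)
    active_params: float = 42e9       # 42B active per token
    context_window: int = 1_000_000   # tokens
    sliding_window: int = 128         # tokens, hybrid attention
    sliding_to_global_ratio: int = 6  # 6:1 sliding:global layers
    license: str = "MIT"
```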