119B-total / 6.5B-active MoE (128 experts, top-4 routing). The first Mistral model to unify instruct, reasoning, multimodal, and agentic coding in a single architecture. 256K context window. 3x the throughput of Mistral Small 3 with 40% lower latency. Configurable reasoning effort.
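
For orientation, below is a minimal sketch of top-k expert routing, the mechanism behind the 128-expert / top-4 figures above. The router shape, the softmax-over-selected-logits normalization, and the toy experts are illustrative assumptions, not Mistral's implementation.

```python
import numpy as np

def top_k_route(x, gate_w, experts, k=4):
    """Route one token through the top-k of the available experts.

    x: token hidden state, shape (d_model,)
    gate_w: router weight matrix, shape (n_experts, d_model)
    experts: list of callables, one per expert FFN
    """
    logits = gate_w @ x                # one router score per expert
    top = np.argsort(logits)[-k:]      # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the selected k only
    # Only k experts execute per token, so the active parameter count
    # stays a small fraction of the total parameter count.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Toy usage: 128 random linear "experts", d_model = 8 (assumed for illustration)
rng = np.random.default_rng(0)
d, n = 8, 128
experts = [lambda x, W=rng.normal(size=(d, d)) / d: W @ x for _ in range(n)]
gate_w = rng.normal(size=(n, d))
print(top_k_route(rng.normal(size=d), gate_w, experts))
```

With top-4 of 128 experts, only about 1/32 of the expert parameters run per token; shared components such as attention and embeddings are always active, which is consistent with 6.5B active out of 119B total.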

AA Intelligence Index: 19 (non-reasoning, ranked #6 of 38). Apache 2.0 license.

Model Details

Architecture: MoE
Parameters: 119B total
Active parameters: 6.5B
Context window: 256,000 tokens
Tags: moe, open-weight, reasoning, multimodal, agentic