Reasoning model family trained with pure reinforcement learning, with no distillation from external reasoning models. Roughly 50% boost in AIME-24 (pass@1) over the Mistral Medium 3 base checkpoint.
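
As a rough illustration of the pass@1 metric cited above, the sketch below scores each problem by the fraction of sampled answers that are correct, then averages across problems. The grading data, the function name `pass_at_1`, and the variable names are hypothetical and are not taken from the paper. Because the text here does not say whether the boost is measured in percentage points or in relative terms, the sketch computes both.

```python
def pass_at_1(samples_correct):
    """Mean over problems of the fraction of sampled answers that are correct."""
    per_problem = [sum(s) / len(s) for s in samples_correct]
    return sum(per_problem) / len(per_problem)


# Hypothetical grading results: one inner list per problem,
# one boolean per sampled answer (True = graded correct).
base = [[True, False, False, False], [False, False, False, False]]
tuned = [[True, True, True, False], [True, False, False, False]]

base_score = pass_at_1(base)
tuned_score = pass_at_1(tuned)

# Two ways to read a "boost": absolute percentage points vs. relative change.
gain_points = (tuned_score - base_score) * 100
gain_relative = (tuned_score - base_score) / base_score * 100

print(f"base={base_score:.3f} tuned={tuned_score:.3f}")
print(f"absolute gain: {gain_points:.1f} points, relative gain: {gain_relative:.0f}%")
```

The two readings can differ widely on the same data, so it helps to check which one a reported figure uses before comparing models.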

RL on text alone maintains multimodal understanding, instruction following, and function calling. The family comprises Magistral Small (24B parameters, Apache 2.0) and Magistral Medium (proprietary).

Model Details

Architecture: Dense

Paper

arXiv: 2506.10910

reasoning, open-weight