Nemotron 3 Super
NVIDIA's current flagship model. 120B total parameters / 12B active per token, using a LatentMoE + Mamba-2 + Attention hybrid architecture with Multi-Token Prediction. Trained in NVFP4 precision. 1M-token context window. 5× the throughput of the previous generation.
AA Intelligence Index: 36 (#2 in class; 293 t/s, fastest among top models). AIME25: 90.21. HMMT Feb25: 93.67. GPQA: 79.23 (82.70 with tools). MMLU-Pro: 83.73. SWE-Bench: 60.47 (OpenHands). RULER@1M: 91.75. Open weights, with the complete training data (10T+ tokens) and training recipes released.
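The headline MoE numbers (12B active out of 120B total) follow from top-k expert routing: each token is dispatched to only a few experts, so most of the weights stay idle on any given step. A minimal sketch of such a router, with hypothetical gate scores and expert counts (the model's actual routing scheme is not reproduced here):

```python
def route_top_k(scores, k):
    """Return indices of the k highest-scoring experts for one token."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

# Hypothetical gate scores for 8 experts on a single token.
gate_scores = [0.02, 0.31, 0.05, 0.22, 0.01, 0.18, 0.11, 0.10]
active = route_top_k(gate_scores, k=2)
print(active)  # -> [1, 3]: only these experts' weights run for this token

# Active-to-total parameter ratio reported for the model: 12B / 120B
print(12 / 120)  # -> 0.1, i.e. ~10% of weights active per token
```

This is why a 120B-parameter model can run with the per-token compute cost of a much smaller dense model.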
Model Details
Architecture MoE (hybrid)
Parameters 120B
Active params 12B
Context window 1,000,000
Paper
arXiv: 2512.20856