Nemotron Nano V2
Hybrid Mamba-Transformer reasoning models (12B, plus a 9B variant compressed via Minitron). Pretrained on 20T tokens in FP8. On par with Qwen3-8B on reasoning benchmarks with up to 6x higher inference throughput. 128K context; the 9B variant fits on a single A10G (22 GiB) in BF16.
Nemotron Nano V2 VL adds a vision encoder for document understanding, long-video comprehension, and reasoning; it delivers 35% higher throughput than its predecessor on multi-page document tasks.
Paper (arXiv) · Paper (VL, arXiv) · HuggingFace (12B) · Artificial Analysis · OpenRouter (9B) · OpenRouter (12B VL)
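
The OpenRouter endpoints linked above speak the standard OpenAI-compatible chat API, so a quick test needs only the `openai` client. A minimal sketch; the model slug is an assumption, so check the OpenRouter page for the current identifier:

```python
# Minimal sketch: querying Nemotron Nano 9B V2 through OpenRouter's
# OpenAI-compatible endpoint. The model slug below is an assumption;
# verify it against the OpenRouter listing linked above.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",  # replace with a real key
)

resp = client.chat.completions.create(
    model="nvidia/nemotron-nano-9b-v2",  # assumed slug
    messages=[
        {"role": "user", "content": "Summarize the Mamba architecture in two sentences."}
    ],
)
print(resp.choices[0].message.content)
```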
Model Details
Architecture DENSE
Parameters 12B
Context window 128,000 tokens
Variants
| Name | Parameters | Notes |
|---|---|---|
| Nemotron Nano 12B V2 | 12B | — |
| Nemotron Nano 9B V2 | 9B | Compressed via Minitron |
| Nemotron Nano 12B V2 VL | 12B | Vision-language variant |
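
For local inference, a minimal sketch of loading the 9B variant in BF16 with Hugging Face `transformers`. The repo id is an assumption (see the HuggingFace link above), and the hybrid Mamba-Transformer stack may require `trust_remote_code=True`:

```python
# Minimal sketch: single-GPU BF16 inference with the 9B variant.
# Repo id is an assumption; check the HuggingFace page linked above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "nvidia/NVIDIA-Nemotron-Nano-9B-v2"  # assumed repo id
tok = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # BF16, per the memory claim above
    device_map="auto",
    trust_remote_code=True,
)

prompt = tok.apply_chat_template(
    [{"role": "user", "content": "What is a state-space model?"}],
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```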
Paper
arXiv: 2508.14444