Nemotron-Labs Diffusion

NVIDIA Labs tri-mode LLM family: at each step the model can run as autoregressive, diffusion, or self-speculation, with a learned controller picking the regime. Reports 2.6–6.4× throughput over AR baselines while matching or beating Qwen3 accuracy on standard benchmarks.
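
The card does not describe how the self-speculation mode works internally. As a rough illustration only, the generic draft-and-verify idea behind speculative decoding can be sketched as follows: a parallel pass proposes several tokens, and an autoregressive pass keeps the longest prefix it agrees with. This is a toy sketch, not NVIDIA's implementation; `speculative_step`, `toy_draft`, and `toy_next` are hypothetical names, and the learned controller mentioned above is not modeled.

```python
from typing import Callable, List


def speculative_step(
    prefix: List[int],
    draft_fn: Callable[[List[int], int], List[int]],
    next_token_fn: Callable[[List[int]], int],
    k: int = 4,
) -> List[int]:
    """One greedy draft-and-verify step (generic technique, hypothetical API).

    draft_fn proposes k tokens at once (stand-in for a parallel pass).
    next_token_fn returns the single next token (stand-in for an AR pass).
    Returns the tokens to append to the prefix.
    """
    draft = draft_fn(prefix, k)
    accepted: List[int] = []
    for tok in draft:
        # Real systems verify all draft positions in one batched forward
        # pass; the verifier is called position by position here for clarity.
        target = next_token_fn(prefix + accepted)
        if tok != target:
            # First mismatch: keep the verifier's token and stop.
            accepted.append(target)
            return accepted
        accepted.append(tok)
    # Every draft token matched, so the verifier contributes one more token.
    accepted.append(next_token_fn(prefix + accepted))
    return accepted


def toy_next(seq: List[int]) -> int:
    """Toy verifier: the next token is the last token plus one, modulo 10."""
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[int], k: int) -> List[int]:
    """Toy drafter: gets the first two tokens right, then guesses 0."""
    guesses = [(seq[-1] + 1) % 10, (seq[-1] + 2) % 10, 0, 0]
    return guesses[:k]


print(speculative_step([1], toy_draft, toy_next, k=4))  # -> [2, 3, 4]
```

In this toy run the step emits three tokens for one draft pass; how much such a scheme helps in practice depends on how often the draft matches the verifier.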

Three sizes (3B / 8B / 14B): Base and Instruct at every size, plus VL at 3B and 8B; eight model uploads total on May 23. Released under the Nemotron Open Model License.

HuggingFace Blog | HuggingFace (14B) | NVIDIA Research publication

Model Details

Parameters 14B

Variants

Name	Parameters	Notes
Nemotron-Labs-Diffusion-3B (Base/Instruct/VL)	3B	—
Nemotron-Labs-Diffusion-8B (Base/Instruct/VL)	8B	—
Nemotron-Labs-Diffusion-14B (Base/Instruct)	14B	—

Tags: foundational, diffusion, open-weight, efficiency
