DeepSeek-V3.2
model · paper
V3.2 family introducing DeepSeek Sparse Attention for long-context efficiency. V3.2-Speciale achieves gold-medal performance on IMO and IOI 2025, surpassing GPT-5.
Outputs 2
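DeepSeek Sparse Attention restricts each query to a small, selected subset of the context instead of attending over every token, which is what makes very long contexts cheaper. The sketch below shows generic top-k sparse attention in Python as an illustration of that idea only; the selection rule, dimensions, and function names are assumptions for the example, not DeepSeek's published implementation.

```python
# Minimal sketch of top-k sparse attention (illustrative only, not DeepSeek's DSA).
import numpy as np

def sparse_attention(q, K, V, k=64):
    """Attend from one query to only its k highest-scoring keys.

    q: (d,) query vector, K: (T, d) keys, V: (T, d) values.
    """
    scores = K @ q / np.sqrt(q.shape[-1])          # (T,) scaled dot-product scores
    k = min(k, scores.shape[0])
    topk = np.argpartition(scores, -k)[-k:]        # indices of the k largest scores
    w = np.exp(scores[topk] - scores[topk].max())  # softmax over the selected keys only
    w /= w.sum()
    return w @ V[topk]                             # (d,) weighted sum of the selected values

# Example: 8k-token context, but each query only touches 64 keys.
T, d = 8192, 128
rng = np.random.default_rng(0)
out = sparse_attention(rng.standard_normal(d), rng.standard_normal((T, d)),
                       rng.standard_normal((T, d)))
print(out.shape)  # (128,)
```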
DeepSeek-V3.2
model
Architecture MoE
Parameters 685B
Active params 37B
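The 685B-total / 37B-active split comes from Mixture-of-Experts routing: for each token a router activates only a few experts, so most parameters stay idle. Below is a minimal sketch of top-k expert routing in Python; the expert count, hidden sizes, and softmax gating rule are illustrative assumptions, not the model's actual configuration.

```python
# Minimal sketch of top-k MoE routing (illustrative configuration, not DeepSeek-V3.2's).
import numpy as np

def moe_layer(x, gate_w, experts, k=2):
    """Route one token's hidden state x through its top-k experts only.

    x: (d,) token hidden state, gate_w: (n_experts, d) router weights,
    experts: list of (W1, W2) weight pairs, one feed-forward expert each.
    """
    logits = gate_w @ x                               # (n_experts,) router scores
    topk = np.argsort(logits)[-k:]                    # indices of the k highest-scoring experts
    gates = np.exp(logits[topk] - logits[topk].max())
    gates /= gates.sum()                              # normalized gate weights for the chosen experts
    out = np.zeros_like(x)
    for g, idx in zip(gates, topk):                   # only k experts run; the rest stay idle
        W1, W2 = experts[idx]
        out += g * (W2 @ np.maximum(W1 @ x, 0.0))     # simple ReLU feed-forward expert
    return out

# Example: 16 experts with 2 active per token -> roughly 1/8 of expert parameters used per token.
d, d_ff, n_experts = 64, 256, 16
rng = np.random.default_rng(0)
experts = [(rng.standard_normal((d_ff, d)) * 0.02,
            rng.standard_normal((d, d_ff)) * 0.02) for _ in range(n_experts)]
y = moe_layer(rng.standard_normal(d), rng.standard_normal((n_experts, d)) * 0.02, experts)
print(y.shape)  # (64,)
```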
Variants
| Name | Parameters | Notes |
|---|---|---|
| DeepSeek-V3.2-Exp | — | Released 2025-09-29 |
| DeepSeek-V3.2 | — | — |
| DeepSeek-V3.2-Speciale | — | Max reasoning variant |
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
paper
Technical report introducing DeepSeek Sparse Attention, a scalable RL framework, and large-scale agentic task synthesis. 685B-parameter MoE model.
arXiv: 2512.02556