Incentivizing reasoning capability in LLMs via reinforcement learning. R1-Lite-Preview released 2024-11-20. Full R1 paper released 2025-01-20. R1-0528 update released 2025-05-28.

Model Details

Architecture: MoE (Mixture of Experts)
Parameters: 671B (total)
Active params: 37B (per token)

Variants

Name                      Parameters  Notes
DeepSeek-R1-Lite-Preview  -           Released 2024-11-20
DeepSeek-R1               -           -
DeepSeek-R1-0528          -           Released 2025-05-28

Paper

arXiv: 2501.12948

reasoning, open-weight, training