Open reasoning models (7B and 32B) trained with RL on top of DeepSeek-R1-Distill bases, producing long chain-of-thought outputs. Fully open: weights, training code, and datasets.

OR1-32B: AIME24 82.2, AIME25 73.3, LiveCodeBench 63.0 — outperforms DeepSeek-R1 and Qwen3-32B on math benchmarks. OR1-7B: AIME24 70.2, AIME25 54.6.

Model Details

Architecture: dense
Parameters: 7B / 32B

Variants

Name             Parameters  Notes
Skywork-OR1-7B   7B          trained from a DeepSeek-R1-Distill 7B base
Skywork-OR1-32B  32B         trained from a DeepSeek-R1-Distill 32B base
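Since the weights are open, inference can be sketched with the Hugging Face `transformers` API. This is a minimal sketch, not an official usage snippet: the repo id `Skywork/Skywork-OR1-7B` and the helper names `build_prompt` / `generate_answer` are assumptions to verify against the actual release, and generation settings (e.g. token budget) should follow the model card's recommendations.

```python
MODEL_ID = "Skywork/Skywork-OR1-7B"  # assumed Hugging Face repo id; verify before use


def build_prompt(question: str) -> str:
    """Normalize a user question; the tokenizer's chat template adds special tokens."""
    return question.strip()


def generate_answer(question: str, max_new_tokens: int = 8192) -> str:
    """Run one chat turn through the model (requires a GPU and the model weights)."""
    # Imported lazily so build_prompt is usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    messages = [{"role": "user", "content": build_prompt(question)}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    # Long chain-of-thought models need a generous new-token budget.
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
```

A large `max_new_tokens` matters here: long chain-of-thought models emit their reasoning before the final answer, and a tight budget truncates it.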

Paper

arXiv: 2505.22312

reasoning · open-weight · open-source