Second-generation reasoning model, advancing o1's test-time compute scaling. 200K token context. Released alongside o4-mini, a smaller reasoning variant. o3-mini launched earlier (January 2025) as a cost-efficient option with selectable reasoning effort (low/medium/high).

o3 achieved 96.7% on AIME 2024 and scored 87.7% on GPQA-Diamond. o3-pro (June 2025) used parallel test-time compute for the highest reasoning accuracy. AA Intelligence Index: 38 (o3), 26 (o3-mini), 33 (o4-mini), 41 (o3-pro). Proprietary.

Model Details

Context window 200,000

Variants

Name Parameters Notes
o3-mini Cost-efficient, January 2025
o3 Full model, April 2025
o4-mini Smaller reasoning variant, April 2025
o3-pro Parallel test-time compute, June 2025
frontierreasoning

Related