o3 (OpenAI)

OpenAI announces o3, a reasoning-focused model that achieves breakthrough scores on ARC-AGI and math benchmarks using extended chain-of-thought.

Model Release

What Happened

OpenAI announced o3, the successor to o1, as part of its reasoning-focused model line. The o3 model demonstrated dramatic improvements on difficult reasoning benchmarks, most notably scoring 87.5% on the ARC-AGI benchmark in its high-compute configuration (up from o1's 32%). It also achieved strong results on competition mathematics, competitive programming (a Codeforces Elo rating of 2727), and formal reasoning tasks.

Why It Matters

o3 represented a significant advance in AI reasoning capabilities. The ARC-AGI benchmark, designed to measure fluid intelligence and novel problem-solving, had been considered a major challenge for AI systems. o3's performance suggested that scaling test-time compute (allowing models to "think longer") could unlock qualitatively new reasoning abilities, complementing the gains from scaling training compute. The result reignited debates about the trajectory toward artificial general intelligence.

Technical Details