OpenAI o1 Results on ARC-AGI-Pub



OpenAI's new o1 models demonstrate incremental progress toward AGI, but new ideas are still needed. Over the past 24 hours, o1-preview and o1-mini, both of which feature improved chain-of-thought (CoT) reasoning, were tested on ARC Prize. The results are promising, but they do not signal the arrival of AGI. The models' accuracy rises log-linearly with test-time compute, meaning it grows roughly in proportion to the logarithm of the compute spent, which motivates the pursuit of more efficient refinement and search methods. ARC Prize calls on the community to contribute innovative approaches.

o1-preview scored 21.2% accuracy on ARC-AGI.
o1-preview is on par with Claude 3.5 Sonnet.
o1-mini scored 12.8% accuracy on ARC-AGI.
Accuracy under chain-of-thought (CoT) reasoning scales log-linearly with test-time compute.
New ideas beyond fitting curves to data distributions are needed for AGI.
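
To make the log-linear claim concrete, here is a minimal sketch of how such a relationship can be fitted, assuming the form accuracy = a + b * ln(compute). The function name `fit_log_linear` and the data points are hypothetical and purely synthetic; they are not measured o1 results. Under this form, every 10x increase in compute adds the same fixed number of accuracy points, which is why returns diminish quickly.

```python
import math

def fit_log_linear(compute, accuracy):
    """Ordinary least-squares fit of accuracy = a + b * ln(compute)."""
    xs = [math.log(c) for c in compute]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(accuracy) / n
    # Slope = covariance(x, y) / variance(x), with x = ln(compute)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, accuracy))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Synthetic, made-up points for illustration only (NOT real benchmark data):
# relative compute units vs. accuracy in percent.
compute = [1, 10, 100, 1000]
accuracy = [10.0, 15.0, 20.0, 25.0]

a, b = fit_log_linear(compute, accuracy)

# Under a log-linear law, each 10x of compute adds a constant amount.
gain_per_10x = b * math.log(10)
print(f"intercept={a:.2f}, gain per 10x compute={gain_per_10x:.2f} points")
```

With a fit like this, the cost of reaching a target accuracy grows exponentially in the number of points still missing, which is one way to read the call for more efficient search and refinement methods.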
