A new, challenging AGI test stumps most AI models

The New Frontier of AI Intelligence: ARC-AGI-2

A New Challenge for AI Models

The Arc Prize Foundation, a nonprofit co-founded by prominent AI researcher François Chollet, has announced a new, challenging test to measure the general intelligence of leading AI models. The test, called ARC-AGI-2, has already stumped most models: even the best-performing AI systems score only about 1% to 1.3%.

How the Test Works

The ARC-AGI tests consist of puzzle-like problems where an AI has to identify visual patterns from a collection of different-colored squares and generate the correct "answer" grid. The problems were designed to force an AI to adapt to new problems it hasn’t seen before.
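As a rough illustration of the task format described above, the sketch below represents a puzzle as small grids of integers, with each integer standing for a color, and counts a prediction as correct only if it matches the expected grid exactly. The toy puzzle, the `mirror_horizontally` rule, and the function names are hypothetical and invented for this example; they are not taken from the ARC-AGI-2 dataset or the foundation's tooling.

```python
from typing import List

Grid = List[List[int]]  # each integer stands for one cell color (assumption)


def mirror_horizontally(grid: Grid) -> Grid:
    """Hypothetical hidden rule for this toy puzzle: flip each row left to right."""
    return [list(reversed(row)) for row in grid]


def is_correct(predicted: Grid, expected: Grid) -> bool:
    """A prediction counts only if every cell matches the expected grid."""
    return predicted == expected


# Demonstration pairs a solver would study to infer the rule.
train_pairs = [
    ([[1, 0, 0], [0, 2, 0]], [[0, 0, 1], [0, 2, 0]]),
    ([[3, 3, 0], [0, 0, 4]], [[0, 3, 3], [4, 0, 0]]),
]

# Check that the candidate rule explains every demonstration pair.
rule_fits = all(mirror_horizontally(inp) == out for inp, out in train_pairs)

# Apply the rule to an unseen test input to produce the "answer" grid.
test_input = [[5, 0, 0], [0, 0, 6]]
answer = mirror_horizontally(test_input)
```

The point of the sketch is that the rule is never given to the solver. It has to be inferred from a handful of demonstration pairs and then applied to an input it has not seen before.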

Human vs. AI Performance

To establish a human baseline, the Arc Prize Foundation had over 400 people take the ARC-AGI-2 test. On average, "panels" of these people got 60% of the test’s questions right, far better than any AI model’s score.

A New Metric: Efficiency

Unlike its predecessor, ARC-AGI-1, ARC-AGI-2 introduces a new metric: efficiency. It also requires models to interpret patterns on the fly instead of relying on memorization, which makes it harder for AI systems to perform well.

The Future of AI Intelligence

The arrival of ARC-AGI-2 comes as many in the tech industry are calling for new, unsaturated benchmarks to measure AI progress. The Arc Prize Foundation’s tests are aimed at evaluating whether an AI system can efficiently acquire new skills outside the data it was trained on.

Conclusion

The new test is a significant step forward in measuring the intelligence of AI models. It challenges AI systems to adapt to new situations and demonstrates the importance of efficiency in their performance. The Arc Prize Foundation’s goal is to encourage the development of more effective and efficient AI systems that can benefit humanity.

FAQs

Q: What is ARC-AGI-2?
A: ARC-AGI-2 is a new test designed to measure the general intelligence of leading AI models.

Q: How does ARC-AGI-2 differ from ARC-AGI-1?
A: ARC-AGI-2 introduces a new metric: efficiency, and requires models to interpret patterns on the fly instead of relying on memorization.

Q: What is the goal of the Arc Prize Foundation?
A: The Arc Prize Foundation aims to encourage the development of more effective and efficient AI systems that can benefit humanity.

Q: What is the new Arc Prize 2025 contest?
A: The Arc Prize 2025 contest challenges developers to reach 85% accuracy on the ARC-AGI-2 test while spending only $0.42 per task.
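To make the contest target concrete, here is a minimal sketch of how one might check a set of results against the two figures quoted above. The 85% and $0.42 values come from the contest description; the function name, the result format, and the choice to average cost across all attempted tasks are assumptions made for illustration, not the contest's official scoring procedure.

```python
from typing import List, Tuple

TARGET_ACCURACY = 0.85     # 85% accuracy, from the contest description
MAX_COST_PER_TASK = 0.42   # dollars per task, from the contest description


def meets_contest_target(results: List[Tuple[bool, float]]) -> bool:
    """Check hypothetical results, given as (solved, cost_in_dollars) per task."""
    if not results:
        return False
    accuracy = sum(1 for solved, _ in results if solved) / len(results)
    # Assumption: cost is averaged over every attempted task.
    avg_cost = sum(cost for _, cost in results) / len(results)
    return accuracy >= TARGET_ACCURACY and avg_cost <= MAX_COST_PER_TASK


# Hypothetical runs: 18 of 20 tasks solved at two different price points.
cheap_run = [(True, 0.40)] * 18 + [(False, 0.40)] * 2
costly_run = [(True, 1.00)] * 18 + [(False, 1.00)] * 2
```

Under these assumptions `meets_contest_target(cheap_run)` passes both conditions, while `costly_run` fails on cost despite identical accuracy, which is the trade-off the efficiency metric is meant to capture.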
