AI’s “IQ” and the Flawed Benchmark
AI CEO Sam Altman’s Claim
During a recent press appearance, OpenAI CEO Sam Altman said that he’s observed the “IQ” of AI rapidly improve over the past several years. He stated: “Very roughly, it feels to me like — this is not scientifically accurate, this is just a vibe or spiritual answer — every year we move one standard deviation of IQ.”
The Problem with IQ as a Benchmark
Altman isn’t the first to use IQ, an estimate of a person’s intelligence, as a benchmark for AI progress. AI influencers on social media have given models IQ tests and ranked the results. However, many experts say that IQ is a poor — and misleading — measure of a model’s capabilities.
IQ: A Flawed Measure
Sandra Wachter, a researcher studying tech and regulation at Oxford, noted, "It can be very tempting to use the same measures we use for humans to describe capabilities or progress, but this is like comparing apples with oranges." IQ tests are relative — not objective — measures of certain kinds of intelligence. There’s some consensus that IQ is a reasonable test of logic and abstract reasoning. But it doesn’t measure practical intelligence — knowing how to make things work — and it’s at best a snapshot.
AI’s Unfair Advantage
AI likely has an unfair advantage on IQ tests as well, considering that models have massive amounts of memory and internalized knowledge at their disposal. Models are often trained on public web data, and the web is full of example questions taken from IQ tests.
A Different Way of Solving Problems
Mike Cook, a research fellow at King’s College London specializing in AI, noted, "When I learn something, I don’t get it piped into my brain with perfect clarity 1 million times, unlike AI, and I can’t process it with no noise or signal loss, either." IQ tests were designed for humans, intended as a way to evaluate general problem-solving abilities. They’re inappropriate for a technology that approaches solving problems in a very different way than people do.
The Need for Better AI Tests
Heidy Khlaaf, chief AI scientist at the AI Now Institute, emphasized the need for better AI tests. "In the history of computation, we haven’t compared computing abilities to that of humans’ precisely because the nature of computation means systems have always been able to complete tasks already beyond human ability."
Conclusion
IQ is a flawed measure of AI progress: it’s a benchmark designed for humans, not machines, and one on which AI holds an unfair advantage. It’s time to develop better tests that account for the distinct ways artificial intelligence operates.
FAQs
Q: What is IQ, and how is it used to measure intelligence?
A: IQ is a score that estimates an individual’s intelligence based on performance on standardized tests. It’s a complex and controversial measure even for humans, and experts argue it’s not suitable for evaluating artificial intelligence.
Q: Why is IQ a flawed measure for AI progress?
A: IQ is a benchmark designed for humans, not machines. It rests on assumptions about human cognition and problem-solving that don’t hold for AI. Models also have access to vast amounts of training data — including IQ test questions themselves — and processing power, giving them an unfair advantage.
Q: What are some alternative ways to measure AI’s capabilities?
A: There are various ways to evaluate AI’s abilities, such as task-based assessments, problem-solving tests, and domain-specific evaluations. These approaches can provide a more accurate picture of AI’s capabilities and progress.