Google DeepMind Publishes Exhaustive Paper on AGI Safety
Google DeepMind on Wednesday published an exhaustive paper on its safety approach to AGI, roughly defined as AI that can accomplish any task a human can.
AGI Controversy
AGI is a bit of a controversial subject in the AI field, with naysayers suggesting that it’s little more than a pipe dream. Others, including major AI labs like Anthropic, warn that it’s around the corner and could result in catastrophic harms if appropriate safeguards aren’t put in place.
DeepMind’s Predictions and Concerns
DeepMind’s 145-page document, which was co-authored by DeepMind co-founder Shane Legg, predicts that AGI could arrive by 2030, and that it may result in what the authors call “severe harm.” The paper doesn’t concretely define this, but gives the alarmist example of “existential risks” that “permanently destroy humanity.”
Exceptional AGI
“[We anticipate] the development of an Exceptional AGI before the end of the current decade,” the authors wrote. “An Exceptional AGI is a system that has a capability matching at least 99th percentile of skilled adults on a wide range of non-physical tasks, including metacognitive tasks like learning new skills.”
Off the bat, the paper contrasts DeepMind’s treatment of AGI risk mitigation with Anthropic’s and OpenAI’s. Anthropic, it says, places less emphasis on “robust training, monitoring, and security,” while OpenAI is overly bullish on “automating” a form of AI safety research known as alignment research.
Superintelligence and Recursive AI Improvement
The paper also casts doubt on the viability of superintelligent AI — AI that can perform jobs better than any human. (OpenAI recently claimed that it’s turning its aim from AGI to superintelligence.) Absent “significant architectural innovation,” the DeepMind authors aren’t convinced that superintelligent systems will emerge soon — if ever.
The paper does find it plausible, though, that current paradigms will enable “recursive AI improvement”: a positive feedback loop in which AI conducts its own AI research to create more sophisticated AI systems. This, the authors assert, could be incredibly dangerous.
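To see why a positive feedback loop behaves differently from ordinary, linear progress, here is a minimal toy sketch in Python. It is not from DeepMind’s paper; the function name, the 10% gain rate, and the generation count are all made-up assumptions, chosen only to show how capability compounds when each system’s gains scale with the capability it already has.

```python
# Toy illustration of a positive feedback loop (NOT from the DeepMind paper):
# each "generation" of AI improves the next one's research ability, so
# capability compounds instead of growing linearly. All numbers here are
# arbitrary assumptions chosen only to show the qualitative dynamic.

def recursive_improvement(capability: float = 1.0,
                          gain_per_unit: float = 0.1,
                          generations: int = 10) -> None:
    """Print capability over generations when gains scale with capability."""
    for gen in range(1, generations + 1):
        # The improvement an AI produces is proportional to how capable it
        # already is -- that proportionality is the feedback loop.
        capability += gain_per_unit * capability
        print(f"generation {gen:2d}: capability = {capability:.2f}")

recursive_improvement()
```

At these made-up numbers, capability grows roughly exponentially (about 2.6x after ten generations). That compounding, rather than steady, growth is the qualitative dynamic behind singularity arguments, and the dynamic Guzdial says below has never been observed working in practice.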
Proposed Solutions and Challenges
At a high level, the paper advocates developing techniques to block bad actors’ access to hypothetical AGI, improve the understanding of AI systems’ actions, and “harden” the environments in which AI can act. It acknowledges that many of these techniques are nascent and have “open research problems,” but cautions against ignoring the safety challenges possibly on the horizon.
“The transformative nature of AGI has the potential for both incredible benefits as well as severe harms,” the authors write. “As a result, to build AGI responsibly, it is critical for frontier AI developers to proactively plan to mitigate severe harms.”
Expert Disagreements
Some experts disagree with the paper’s premises, however.
Heidy Khlaaf, chief AI scientist at the nonprofit AI Now Institute, told TechCrunch that she thinks the concept of AGI is too ill-defined to be “rigorously evaluated scientifically.” Another AI researcher, Matthew Guzdial, an assistant professor at the University of Alberta, said that he doesn’t believe recursive AI improvement is realistic at present.
“[Recursive improvement] is the basis for the intelligence singularity arguments,” Guzdial told TechCrunch, “but we’ve never seen any evidence for it working.”
Conclusion
Comprehensive as it may be, DeepMind’s paper seems unlikely to settle the debates over just how realistic AGI is — and the areas of AI safety in most urgent need of attention.
FAQs
Q: What is AGI?
A: AGI is roughly defined as AI that can accomplish any task a human can.
Q: When can we expect AGI to arrive?
A: DeepMind predicts that AGI could arrive by 2030.
Q: What are the potential risks of AGI?
A: The authors of the paper warn of “severe harm” and even “existential risks” that could “permanently destroy humanity.”
Q: What is recursive AI improvement?
A: It is a positive feedback loop where AI conducts its own AI research to create more sophisticated AI systems.
Q: What is the proposed solution to mitigate AGI risks?
A: The paper proposes techniques to block bad actors’ access to hypothetical AGI, improve the understanding of AI systems’ actions, and “harden” the environments in which AI can act.

