DeepSeek’s AI Breakthrough Stuns the World
A Small Chinese Lab Reveals the Technical Recipe for Its Cutting-Edge Model
A small Chinese artificial intelligence lab stunned the world this week by revealing the technical recipe for its cutting-edge model, turning its reclusive leader into a national hero who has defied US attempts to stop China’s high-tech ambitions.
DeepSeek’s R1 Model Release
DeepSeek, founded by hedge fund manager Liang Wenfeng, released its R1 model on Monday, explaining in a detailed paper how to build a large language model on a bootstrapped budget that can automatically learn and improve itself without human supervision.
US Companies’ Reaction
US companies including OpenAI and Google DeepMind pioneered developments in reasoning models, a relatively new field of AI research that attempts to make models match human cognitive capabilities. In December, San Francisco-based OpenAI released the full version of its o1 model but kept its methods secret. DeepSeek’s R1 release sparked a frenzied debate in Silicon Valley about whether better-resourced US AI companies, including Meta and Anthropic, can defend their technical edge.
Liang’s Rise to Fame
Liang has become a focal point of national pride at home. This week, he was the only AI leader selected to attend a publicised meeting of entrepreneurs with the country’s second-most powerful leader, Li Qiang. The entrepreneurs were told to "concentrate efforts to break through key core technologies."
Liang’s Background
In 2021, Liang started buying thousands of Nvidia graphics processing units for his AI side project while running his quant trading fund High-Flyer. Industry insiders dismissed the purchases as the eccentric behaviour of a billionaire looking for a new hobby. "When we first met him, he was this very nerdy guy with a terrible hairstyle talking about building a 10,000-chip cluster to train his own models. We didn’t take him seriously," said one of Liang’s business partners.
DeepSeek’s Focus on Research
Industry insiders say DeepSeek’s singular focus on research makes it a dangerous competitor because it is willing to share its breakthroughs rather than protect them for commercial gain. DeepSeek has not raised money from outside funds or made significant moves to monetise its models.
Conclusion
DeepSeek’s recent model releases demonstrate that "there is no moat when it comes to AI capabilities," said Ritwik Gupta, AI policy researcher at the University of California, Berkeley. The company’s ability to train a model with 671bn parameters using just 2,048 Nvidia H800s and $5.6mn has stunned the industry.
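To put the quoted training figures in perspective, the short Python sketch below works through the arithmetic. The chip count and budget come from the figures above; the rental price per GPU-hour is an assumption introduced purely for illustration and is not stated in the article, so the derived GPU-hours and training time are rough estimates rather than reported numbers.

```python
# Figures quoted in the article.
NUM_GPUS = 2_048                 # Nvidia H800 chips
TRAINING_COST_USD = 5_600_000    # $5.6mn

# ASSUMPTION (not from the article): a hypothetical rental price per GPU-hour.
ASSUMED_USD_PER_GPU_HOUR = 2.0


def training_estimates(num_gpus, cost_usd, usd_per_gpu_hour):
    """Derive rough per-chip and time figures from a total training budget."""
    # Total GPU-hours the budget would buy at the assumed hourly price.
    gpu_hours = cost_usd / usd_per_gpu_hour
    # Hours each chip would run if all chips were used in parallel.
    hours_per_gpu = gpu_hours / num_gpus
    return {
        "cost_per_gpu_usd": cost_usd / num_gpus,
        "gpu_hours": gpu_hours,
        "wall_clock_days": hours_per_gpu / 24,
    }


estimates = training_estimates(NUM_GPUS, TRAINING_COST_USD, ASSUMED_USD_PER_GPU_HOUR)
for name, value in estimates.items():
    print(f"{name}: {value:,.1f}")
```

Under that assumed price, the budget works out to roughly $2,700 per chip and just under two months of continuous training across the whole cluster. Changing the assumed hourly rate scales the time estimate inversely.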
FAQs
Q: What is DeepSeek’s R1 model?
A: R1 is DeepSeek’s reasoning-focused large language model. According to the company’s paper, it can automatically learn and improve itself without human supervision.
Q: How did Liang Wenfeng, the founder of DeepSeek, start his AI journey?
A: In 2021, while running his quant trading fund High-Flyer, Liang started buying thousands of Nvidia graphics processing units for his AI side project.
Q: What is the significance of DeepSeek’s model release?
A: DeepSeek’s model release has sparked a frenzied debate in Silicon Valley about whether better-resourced US AI companies can defend their technical edge.
Q: What is the focus of DeepSeek’s research?
A: DeepSeek’s focus is on research and engineering, with a goal of developing human-level AI.
Q: What is the future of DeepSeek?
A: It remains an open question whether DeepSeek can continue to be competitive as the industry evolves, with US rivals building mega "clusters" of Nvidia’s next-generation Blackwell chips.

