How Small Chinese AI Start-up DeepSeek Shocked Silicon Valley

DeepSeek’s AI Breakthrough Stuns the World

A Small Chinese Artificial Intelligence Lab Reveals the Technical Recipe for Its Cutting-Edge Model

A small Chinese artificial intelligence lab stunned the world this week by revealing the technical recipe for its cutting-edge model, turning its reclusive leader into a national hero who has defied US attempts to stop China’s high-tech ambitions.

DeepSeek’s R1 Model Release

DeepSeek, founded by hedge fund manager Liang Wenfeng, released its R1 model on Monday, explaining in a detailed paper how to build a large language model on a bootstrapped budget that can automatically learn and improve itself without human supervision.

US Companies’ Reaction

US companies including OpenAI and Google DeepMind pioneered developments in reasoning models, a relatively new field of AI research that is attempting to make models match human cognitive capabilities. In December, the San Francisco-based OpenAI released the full version of its o1 model but kept its methods secret. DeepSeek’s R1 release sparked a frenzied debate in Silicon Valley about whether better-resourced US AI companies, including Meta and Anthropic, can defend their technical edge.

Liang’s Rise to Fame

Liang has become a focal point of national pride at home. This week, he was the only AI leader selected to attend a publicised meeting of entrepreneurs with the country’s second-most powerful leader, Li Qiang. The entrepreneurs were told to "concentrate efforts to break through key core technologies."

Liang’s Background

In 2021, Liang started buying thousands of Nvidia graphics processing units for his AI side project while running his quant trading fund High-Flyer. Industry insiders viewed it as the eccentric behaviour of a billionaire looking for a new hobby. "When we first met him, he was this very nerdy guy with a terrible hairstyle talking about building a 10,000-chip cluster to train his own models. We didn’t take him seriously," said one of Liang’s business partners.

DeepSeek’s Focus on Research

Industry insiders say DeepSeek’s singular focus on research makes it a dangerous competitor because it is willing to share its breakthroughs rather than protect them for commercial gains. DeepSeek has not raised money from outside funds or made significant moves to monetise its models.

Conclusion

DeepSeek’s recent model releases demonstrate that "there is no moat when it comes to AI capabilities," said Ritwik Gupta, AI policy researcher at the University of California, Berkeley. The company’s ability to train a model with 671bn parameters using just 2,048 Nvidia H800s and $5.6mn has stunned the industry.

FAQs

Q: What is DeepSeek’s R1 model?
A: DeepSeek’s R1 model is a large language model that can automatically learn and improve itself without human supervision.

Q: How did Liang Wenfeng, the founder of DeepSeek, start his AI journey?
A: In 2021, while running his quant trading fund High-Flyer, Liang started buying thousands of Nvidia graphics processing units for an AI side project.

Q: What is the significance of DeepSeek’s model release?
A: DeepSeek’s model release has sparked a frenzied debate in Silicon Valley about whether better-resourced US AI companies can defend their technical edge.

Q: What is the focus of DeepSeek’s research?
A: DeepSeek’s focus is on research and engineering, with a goal of developing human-level AI.

Q: What is the future of DeepSeek?
A: It remains an open question whether DeepSeek can continue to be competitive as the industry evolves, with US rivals building mega "clusters" of Nvidia’s next-generation Blackwell chips.
