The True Price of DeepSeek’s New Models: A Game Changer for AI
The Cost of Innovation
The true cost of developing DeepSeek’s new models remains unknown: the one figure in circulation comes from a single research paper and may not capture the full picture. "I don’t believe it’s $6 million, but even if it’s $60 million, it’s a game changer," says Umesh Padval, managing director of Thomvest Ventures, a company that has invested in Cohere and other AI firms. "It will put pressure on the profitability of companies which are focused on consumer AI."
Cutting Costs with DeepSeek’s Techniques
Ali Ghodsi, CEO of Databricks, says that shortly after DeepSeek revealed the details of its latest model, customers began asking whether they could use it, along with DeepSeek’s underlying techniques, to cut costs at their own organizations. He adds that distillation, one approach employed by DeepSeek’s engineers in which the output of one large language model is used to train another model, is relatively cheap and straightforward.
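To make the idea of distillation concrete, here is a minimal toy sketch in Python. It is not DeepSeek’s method or code: the "teacher" is a fixed formula standing in for a large model, the "student" is a two-parameter logistic model, and every name and number (teacher_predict, distill, the learning rate, the example count) is a hypothetical chosen for illustration. It only shows the basic loop of collecting a teacher’s outputs and training a second model to reproduce them.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def teacher_predict(x: float) -> float:
    """Hypothetical stand-in for a large model: returns P(label = 1 | x)."""
    return sigmoid(3.0 * x - 1.5)


def distill(num_examples: int = 200, epochs: int = 2000,
            lr: float = 1.0, seed: int = 0) -> tuple[float, float]:
    """Train a tiny student on the teacher's outputs; return its (w, b)."""
    rng = random.Random(seed)
    inputs = [rng.uniform(-2.0, 2.0) for _ in range(num_examples)]

    # Step 1: query the teacher and record its outputs as "soft labels".
    soft_labels = [teacher_predict(x) for x in inputs]

    # Step 2: fit the student by gradient descent on the cross-entropy
    # between its predictions and the teacher's soft labels.
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, target in zip(inputs, soft_labels):
            error = sigmoid(w * x + b) - target
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / num_examples
        b -= lr * grad_b / num_examples
    return w, b


if __name__ == "__main__":
    w, b = distill()
    print(f"student parameters: w={w:.3f}, b={b:.3f}")
    for x in (-1.0, 0.0, 0.5, 1.0):
        student = sigmoid(w * x + b)
        print(f"x={x:+.1f}  teacher={teacher_predict(x):.3f}  student={student:.3f}")
```

No human-labeled data appears anywhere in the loop: the student learns only from what the teacher says. That is broadly why the approach can be cheap, although distilling a real language model involves far larger models and generated text rather than a single number.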
Benefits and Concerns
Padval says that models like DeepSeek’s will ultimately benefit companies looking to spend less on AI, but that many firms may have reservations about relying on a Chinese model for sensitive tasks. So far, at least one prominent AI firm, Perplexity, has publicly announced that it is using DeepSeek’s R1 model, though it says the model is being hosted "completely independent of China."
Industry Reaction
Amjad Masad, the CEO of Replit, a startup that provides AI coding tools, told WIRED that he thinks DeepSeek’s latest models are impressive. While he still finds Anthropic’s Sonnet model to be better at many computer engineering tasks, he has found that R1 is especially good at turning text commands into code that can be executed on a computer. "We’re exploring using it especially for agent reasoning," he adds.
DeepSeek’s Capabilities
DeepSeek’s two latest offerings, DeepSeek R1 and DeepSeek R1-Zero, are capable of the same kind of simulated reasoning as the most advanced systems from OpenAI and Google. These systems all work by breaking problems into constituent parts in order to tackle them more effectively, a process that requires a considerable amount of additional training to ensure that the AI reliably reaches the correct answer.
Research Paper
A paper posted by DeepSeek researchers last week outlines the approach the company used to create its R1 models, which it claims perform about as well on some benchmarks as OpenAI’s groundbreaking reasoning model, known as o1. The tactics DeepSeek used include a more automated method for learning to solve problems correctly, as well as a strategy for transferring skills from larger models to smaller ones.
Hardware Speculation
One of the hottest topics of speculation about DeepSeek is the hardware it might have used. The question is especially noteworthy because the US government has introduced a series of export controls and other trade restrictions over the last few years aimed at limiting China’s ability to acquire and manufacture cutting-edge chips that are needed for building advanced AI.
Conclusion
DeepSeek’s latest models could be a game changer for the AI industry, offering a less expensive and more accessible alternative to traditional approaches. Some firms may hesitate to rely on a Chinese model for sensitive tasks, but companies looking to spend less on AI stand to benefit. The open questions are what the models really cost to develop and what hardware was used to build them.
FAQs
Q: What are DeepSeek’s latest models?
A: DeepSeek’s latest models are R1 and R1-Zero. Both are capable of simulated reasoning, and DeepSeek claims they perform about as well as OpenAI’s o1 model on some benchmarks.
Q: How much did it cost to develop DeepSeek’s models?
A: The true cost remains unknown. The one figure in circulation comes from a single research paper and may not capture the full picture.
Q: Can I use DeepSeek’s models to cut costs at my own organization?
A: Possibly. According to Ghodsi of Databricks, customers are already asking whether they can use DeepSeek’s model and its underlying techniques to cut costs, and he describes one of those techniques, distillation, as relatively cheap and straightforward.
Q: Are there concerns about relying on a Chinese model for sensitive tasks?
A: Yes, according to Padval, many firms may have reservations about relying on a Chinese model for sensitive tasks.

