Apple Reveals Secret Sauce Behind DeepSeek AI

The Artificial Intelligence Market is Rocked by the Sudden Popularity of DeepSeek

The artificial intelligence market — and the entire stock market — was rocked on Monday by the sudden popularity of DeepSeek, the open-source large language model developed by a China-based hedge fund that has bested OpenAI’s best on some tasks while costing far less.

Why Does DeepSeek Work So Well?

It turns out the answer is a broad approach within deep learning that squeezes more out of computer chips by exploiting a phenomenon known as “sparsity.”

Sparsity and its Role in AI

The ability to use only some of the total parameters of a large language model and shut off the rest is an example of sparsity, and that choice can have a major impact on how big or small an AI model’s computing budget needs to be.
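
To make the idea concrete, here is a minimal sketch, in plain NumPy, of one common form of sparsity, the mixture-of-experts style: a router picks a few “expert” parameter blocks per input and leaves the rest switched off. The layer sizes, router, and top-k choice here are illustrative assumptions, not DeepSeek’s actual architecture.

```python
import numpy as np

# Minimal sketch of sparse activation (mixture-of-experts style), not
# DeepSeek's actual architecture: a router scores 8 "expert" weight
# blocks and runs only the top 2 per input, leaving the rest shut off.

rng = np.random.default_rng(0)

n_experts, d = 8, 16                              # expert count, hidden size
top_k = 2                                         # experts active per token
experts = rng.standard_normal((n_experts, d, d))  # per-expert weight matrices
gate_w = rng.standard_normal((d, n_experts))      # router weights

def sparse_layer(x):
    """Route a d-dimensional input through only top_k of the experts."""
    scores = x @ gate_w                    # one router score per expert
    active = np.argsort(scores)[-top_k:]   # indices of the winning experts
    gates = np.exp(scores[active])
    gates /= gates.sum()                   # softmax over the winners only
    # Only the selected experts' parameters do any work; the other
    # n_experts - top_k blocks cost nothing for this token.
    return sum(g * (experts[i] @ x) for i, g in zip(active, gates))

x = rng.standard_normal(d)
print(sparse_layer(x).shape)  # (16,), computed using 2 of 8 expert blocks
```

The output has the same shape as a dense layer would produce; only a fraction of the parameters were touched to produce it, which is where the compute savings come from.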

Optimizing AI with Fewer Parameters

As Abnar and team, the Apple researchers behind the study, put it in technical terms: “Increasing sparsity while proportionally expanding the total number of parameters consistently leads to a lower pretraining loss, even when constrained by a fixed training compute budget.” Pretraining loss is the AI term for how far a neural net’s predictions stray from its training data; a lower pretraining loss means more accurate results.
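
To see why a fixed compute budget still allows bigger models, note that training compute scales with the parameters that are active per token, not the total. The sketch below runs that arithmetic using the common ~6 × (active parameters) × (tokens) estimate of training FLOPs; the model sizes and sparsity fractions are invented for illustration and are not figures from the Apple study.

```python
# Illustrative arithmetic for the quoted claim: hold training compute
# fixed while growing total parameters and sparsity together. The model
# sizes and sparsity levels below are assumptions, not the paper's data.

tokens = 1_000_000_000          # pretraining tokens (held fixed)

configs = [
    # (total parameters, fraction active per token)
    (7e9,  1.0),    # dense 7B baseline
    (28e9, 0.25),   # 4x the parameters, 75% shut off per token
    (56e9, 0.125),  # 8x the parameters, 87.5% shut off per token
]

for total, frac in configs:
    active = total * frac                # parameters actually computed
    flops = 6 * active * tokens          # common ~6*N*D training estimate
    print(f"total {total/1e9:4.0f}B | active {active/1e9:.0f}B | "
          f"FLOPs ~{flops:.1e}")

# Every row costs about the same 4.2e19 FLOPs to train; Abnar and team's
# finding is that the larger, sparser configurations reach lower
# pretraining loss under exactly this kind of fixed budget.
```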

The Future of Sparsity Research

Details aside, the most profound point about all this is that sparsity as a phenomenon is not new in AI research, nor is it a new approach in engineering.

The Magic Dial of Sparsity

The magic dial of sparsity doesn’t only shave computing costs, as in the case of DeepSeek; turned in the other direction, it can also make bigger and bigger AI computers more efficient.

Conclusion

DeepSeek is only one example of a broad area of research that many labs are already following, and that many more will now jump on in order to replicate DeepSeek’s success. The magic dial of sparsity is profound because it not only improves economics for a small budget, as in the case of DeepSeek, it also works in the other direction: Spend more, and you’ll get even better benefits via sparsity.

FAQs

Q: What is DeepSeek?
A: DeepSeek is an open-source large language model developed by a China-based hedge fund that has bested OpenAI’s best on some tasks while costing far less.

Q: What is sparsity in AI?
A: Sparsity in AI refers to the ability to use only some of the total parameters of a large language model and shut off the rest, which can have a major impact on how big or small the computing budget is for an AI model.

Q: How does sparsity improve AI performance?
A: Sparsity lets a model deliver comparable or better accuracy while touching fewer parameters, and therefore using less computing power, which makes it a key avenue of research for advancing the state of the art in the field.

Q: What are the implications of DeepSeek’s success?
A: The implications of DeepSeek’s success are that it highlights a sea change in AI that could empower smaller labs and researchers to create competitive models and diversify the field of available options.
