Apple Reveals Secret Sauce Behind DeepSeek AI

The Artificial Intelligence Market is Rocked by the Sudden Popularity of DeepSeek

The artificial intelligence market — and the entire stock market — was rocked on Monday by the sudden popularity of DeepSeek, the open-source large language model developed by a China-based hedge fund that has bested OpenAI’s best on some tasks while costing far less.

Why Does DeepSeek Work So Well?

It turns out the answer is a broad approach within deep learning that squeezes more out of computer chips by exploiting a phenomenon known as “sparsity.”

Sparsity and its Role in AI

The ability to use only some of the total parameters of a large language model and shut off the rest is an example of sparsity, and that choice can have a major impact on how big or small an AI model’s computing budget needs to be.
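
To make the idea concrete, here is a minimal sketch, in plain NumPy, of one common form of sparsity, the mixture-of-experts style: a router picks a few “expert” parameter blocks per input and leaves the rest switched off. The layer sizes, router, and top-k choice here are illustrative assumptions, not DeepSeek’s actual architecture.

```python
import numpy as np

# Minimal sketch of sparse activation (mixture-of-experts style), not
# DeepSeek's actual architecture: a router scores 8 "expert" weight
# blocks and runs only the top 2 per input, leaving the rest shut off.

rng = np.random.default_rng(0)

n_experts, d = 8, 16                              # expert count, hidden size
top_k = 2                                         # experts active per token
experts = rng.standard_normal((n_experts, d, d))  # per-expert weight matrices
gate_w = rng.standard_normal((d, n_experts))      # router weights

def sparse_layer(x):
    """Route a d-dimensional input through only top_k of the experts."""
    scores = x @ gate_w                    # one router score per expert
    active = np.argsort(scores)[-top_k:]   # indices of the winning experts
    gates = np.exp(scores[active])
    gates /= gates.sum()                   # softmax over the winners only
    # Only the selected experts' parameters do any work; the other
    # n_experts - top_k blocks cost nothing for this token.
    return sum(g * (experts[i] @ x) for i, g in zip(active, gates))

x = rng.standard_normal(d)
print(sparse_layer(x).shape)  # (16,), computed using 2 of 8 expert blocks
```

The output has the same shape as a dense layer would produce; only a fraction of the parameters were touched to produce it, which is where the compute savings come from.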

Optimizing AI with Fewer Parameters

As Abnar and team, the Apple researchers behind the study, put it in technical terms: “Increasing sparsity while proportionally expanding the total number of parameters consistently leads to a lower pretraining loss, even when constrained by a fixed training compute budget.” Pretraining loss is the AI term for how far a neural net’s predictions stray from its training data; a lower pretraining loss means more accurate results.
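
To see why a fixed compute budget still allows bigger models, note that training compute scales with the parameters that are active per token, not the total. The sketch below runs that arithmetic using the common ~6 × (active parameters) × (tokens) estimate of training FLOPs; the model sizes and sparsity fractions are invented for illustration and are not figures from the Apple study.

```python
# Illustrative arithmetic for the quoted claim: hold training compute
# fixed while growing total parameters and sparsity together. The model
# sizes and sparsity levels below are assumptions, not the paper's data.

tokens = 1_000_000_000          # pretraining tokens (held fixed)

configs = [
    # (total parameters, fraction active per token)
    (7e9,  1.0),    # dense 7B baseline
    (28e9, 0.25),   # 4x the parameters, 75% shut off per token
    (56e9, 0.125),  # 8x the parameters, 87.5% shut off per token
]

for total, frac in configs:
    active = total * frac                # parameters actually computed
    flops = 6 * active * tokens          # common ~6*N*D training estimate
    print(f"total {total/1e9:4.0f}B | active {active/1e9:.0f}B | "
          f"FLOPs ~{flops:.1e}")

# Every row costs about the same 4.2e19 FLOPs to train; Abnar and team's
# finding is that the larger, sparser configurations reach lower
# pretraining loss under exactly this kind of fixed budget.
```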

The Future of Sparsity Research

Details aside, the most profound point about all this is that sparsity as a phenomenon is not new in AI research, nor is it a new approach in engineering.

The Magic Dial of Sparsity

The magic dial of sparsity doesn’t only shave computing costs, as in the case of DeepSeek; turned in the other direction, it can also make bigger and bigger AI computers more efficient.

Conclusion

DeepSeek is only one example of a broad area of research that many labs are already following, and that many more will now jump on in order to replicate DeepSeek’s success. The magic dial of sparsity is profound because it not only improves economics for a small budget, as in the case of DeepSeek, it also works in the other direction: Spend more, and you’ll get even better benefits via sparsity.

FAQs

Q: What is DeepSeek?
A: DeepSeek is an open-source large language model developed by a China-based hedge fund that has bested OpenAI’s best on some tasks while costing far less.

Q: What is sparsity in AI?
A: Sparsity in AI refers to the ability to use only some of the total parameters of a large language model and shut off the rest, which can have a major impact on how big or small the computing budget is for an AI model.

Q: How does sparsity improve AI performance?
A: Sparsity lets a model deliver comparable or better accuracy while touching fewer parameters, and therefore using less computing power, which makes it a key avenue of research for advancing the state of the art in the field.

Q: What are the implications of DeepSeek’s success?
A: The implications of DeepSeek’s success are that it highlights a sea change in AI that could empower smaller labs and researchers to create competitive models and diversify the field of available options.
