Apple Reveals Secret Sauce Behind DeepSeek AI

The Artificial Intelligence Market is Rocked by the Sudden Popularity of DeepSeek

The artificial intelligence market — and the entire stock market — was rocked on Monday by the sudden popularity of DeepSeek, the open-source large language model developed by a China-based hedge fund that has bested OpenAI’s best on some tasks while costing far less.

Why Does DeepSeek Work So Well?

It turns out the answer lies in a broad approach within deep learning that squeezes more out of computer chips by exploiting a phenomenon known as “sparsity.”

Sparsity and its Role in AI

The ability to use only some of the total parameters of a large language model and shut off the rest is an example of sparsity. That sparsity can have a major impact on how big or small the computing budget is for an AI model.
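
To make that concrete, here is a minimal sketch in Python of top-k expert routing, the kind of gating a mixture-of-experts language model uses to activate only a slice of its parameters per input. The function names, shapes, and sizes here are hypothetical illustrations, not DeepSeek’s actual architecture:

```python
import numpy as np

def topk_sparse_forward(x, experts, gate, k=2):
    """Route input x through only the top-k experts; the rest stay shut off."""
    scores = x @ gate                          # one relevance score per expert
    top = np.argsort(scores)[-k:]              # indices of the k best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                   # softmax over the chosen experts only

    # Only k experts actually run; every other expert's parameters are
    # inactive for this input, which is what "sparsity" means here.
    return sum(w * (x @ W + b)
               for w, (W, b) in zip(weights, (experts[i] for i in top)))

# Toy usage: 8 experts exist, but each input activates only 2 of them.
rng = np.random.default_rng(0)
d = 16
experts = [(rng.normal(size=(d, d)), rng.normal(size=d)) for _ in range(8)]
gate = rng.normal(size=(d, 8))
y = topk_sparse_forward(rng.normal(size=d), experts, gate, k=2)
```

With 8 experts and k=2, three quarters of the expert parameters sit idle on any one forward pass, which is exactly the computing-budget lever described above.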

Optimizing AI with Fewer Parameters

As Abnar and team at Apple put it in technical terms: “Increasing sparsity while proportionally expanding the total number of parameters consistently leads to a lower pretraining loss, even when constrained by a fixed training compute budget.” “Pretraining loss” is the AI term for a measure of how accurate a neural net’s predictions are; lower pretraining loss means more accurate results.
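
As a rough back-of-the-envelope illustration of the quantities involved (the numbers below are made up for the example, not taken from the paper): sparsity can be read as the fraction of parameters left inactive on any one forward pass, so holding active parameters fixed while growing the total pool turns the sparsity dial up.

```python
def sparsity(total_params, active_params):
    """Fraction of a model's parameters left inactive on a single forward pass."""
    return 1 - active_params / total_params

# Hypothetical models: compute per token (5B active parameters) stays fixed
# while the total parameter pool grows, so sparsity rises with model size.
for total in (10e9, 50e9, 100e9):
    print(f"total={total/1e9:.0f}B  active=5B  sparsity={sparsity(total, 5e9):.2f}")
```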

The Future of Sparsity Research

Details aside, the most profound point about all this is that sparsity as a phenomenon is not new in AI research, nor is it a new approach in engineering.

The Magic Dial of Sparsity

The magic dial of sparsity doesn’t only shave computing costs, as in the case of DeepSeek. Turned the other way, it can also make ever-larger AI computers more efficient.

Conclusion

DeepSeek is only one example of a broad area of research that many labs are already following, and that many more will now jump on in order to replicate DeepSeek’s success. The magic dial of sparsity is profound because it not only improves economics for a small budget, as in the case of DeepSeek, it also works in the other direction: Spend more, and you’ll get even better benefits via sparsity.

FAQs

Q: What is DeepSeek?
A: DeepSeek is an open-source large language model developed by a China-based hedge fund that has bested OpenAI’s best on some tasks while costing far less.

Q: What is sparsity in AI?
A: Sparsity in AI refers to the ability to use only some of the total parameters of a large language model and shut off the rest, which can have a major impact on how big or small the computing budget is for an AI model.

Q: How does sparsity improve AI performance?
A: Sparsity lets a model activate only the parameters it needs for a given input, achieving comparable or better accuracy with less computing power. That trade-off makes it a key avenue of research for changing the state of the art in the field.

Q: What are the implications of DeepSeek’s success?
A: DeepSeek’s success highlights a sea change in AI that could empower smaller labs and researchers to create competitive models and diversify the field of available options.
