Nvidia Seeks to Capitalize on Cheaper AI with New Software and Hardware
In January, the emergence of DeepSeek’s R1 artificial intelligence program prompted a stock market selloff. Seven weeks later, Nvidia, the chip giant that dominates AI processing, is seeking to place itself squarely at the center of the dramatic economics of cheaper AI that DeepSeek represents.
Accelerating DeepSeek R1 with Blackwell Chips
On Tuesday, at the SAP Center in San Jose, Calif., Nvidia co-founder and CEO Jensen Huang discussed how the company’s Blackwell chips can dramatically accelerate DeepSeek R1. Nvidia claims that, using new open-source software called Nvidia Dynamo, its GPUs can deliver 30 times the throughput, measured in tokens per second, that DeepSeek R1 would otherwise achieve in a data center.
Dynamo Software
Dynamo can capture that benefit and deliver 30 times more performance with the same number of GPUs on the same architecture for reasoning models such as DeepSeek, said Ian Buck, Nvidia’s head of hyperscale and high-performance computing, in a media briefing before Huang’s keynote at the company’s GTC conference.
The Dynamo software, available today on GitHub, distributes inference work across as many as 1,000 Nvidia GPU chips. More work can be accomplished per second of machine time by breaking up the work to run in parallel.
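The scaling idea behind distributing inference across many GPUs can be sketched with a toy throughput model. The per-GPU rate, GPU count, and efficiency factor below are illustrative assumptions, not Nvidia benchmarks:

```python
# Toy model of aggregate throughput when inference work is sharded
# across parallel GPU workers. All numbers here are illustrative
# assumptions, not figures from Nvidia or Dynamo.

def aggregate_throughput(tokens_per_sec_per_gpu: float,
                         num_gpus: int,
                         scaling_efficiency: float = 1.0) -> float:
    """Total tokens/sec across num_gpus parallel workers.

    scaling_efficiency < 1.0 models coordination overhead; a scheduler's
    job is to keep this factor as close to 1.0 as possible.
    """
    return tokens_per_sec_per_gpu * num_gpus * scaling_efficiency

single = aggregate_throughput(100.0, 1)           # one GPU: 100 tokens/sec
cluster = aggregate_throughput(100.0, 1000, 0.9)  # 1,000 GPUs at 90% efficiency
print(single, cluster)  # 100.0 90000.0
```

The point of the model: as long as per-GPU efficiency holds up, total tokens per second grows almost linearly with the number of GPUs the work is spread across.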
Premium Services
The result: For an inference task priced at $1 per million tokens, more tokens can be processed each second, boosting revenue per second for the services providing the GPUs. Providers can then decide whether to run more customer queries on DeepSeek or to devote more processing to a single user in order to charge more for a "premium" service.
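The revenue arithmetic above is straightforward to work through. The throughput figures in this sketch are illustrative assumptions; only the $1-per-million-tokens price comes from the article:

```python
# Back-of-the-envelope revenue model for an inference service priced
# per million tokens. The price is from the article; the throughput
# numbers are illustrative assumptions.

PRICE_PER_MILLION_TOKENS = 1.00  # dollars

def revenue_per_second(tokens_per_second: float) -> float:
    """Dollars earned per second at the given token throughput."""
    return tokens_per_second / 1_000_000 * PRICE_PER_MILLION_TOKENS

baseline = revenue_per_second(3_000)   # hypothetical baseline throughput
boosted = revenue_per_second(90_000)   # 30x the throughput
print(boosted / baseline)  # 30.0 -- revenue/sec scales with tokens/sec
```

At a fixed price per token, a 30x throughput gain translates directly into 30x revenue per second of GPU time, which is why throughput is the metric Nvidia leads with.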
AI Factories
"AI factories can offer a higher premium service at premium dollar per million tokens," said Buck, "and also increase the total token volume of their whole factory." The term "AI factory" is Nvidia’s coinage for large-scale services that run a heavy volume of AI work using the company’s chips, software, and rack-based equipment.
New Software and Hardware Announcements
The prospect of using more chips to increase throughput (and therefore business) for AI inference is Nvidia’s answer to investor concerns that less computing would be used overall because DeepSeek can cut the amount of processing needed for each query.
Paired with Blackwell, the current generation of Nvidia’s flagship AI GPU, Dynamo can make such AI data centers produce 50 times as much revenue as with the older generation, Hopper, said Buck.
Additional Announcements
Nvidia has posted its own tweaked version of DeepSeek R1 on Hugging Face. The Nvidia version reduces the number of bits R1 uses to manipulate variables to what’s known as "FP4," or 4-bit floating point, a fraction of the computing needed for standard 32-bit floating point (FP32) or BF16 (brain floating point 16).
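The memory savings from that bit-width reduction are simple arithmetic. The parameter count below is a hypothetical example, not R1's actual size:

```python
# Rough memory arithmetic for storing model weights at different
# floating-point precisions. The parameter count is a hypothetical
# illustration, not DeepSeek R1's actual size.

BITS_PER_WEIGHT = {"FP32": 32, "BF16": 16, "FP4": 4}

def weight_bytes(num_params: int, fmt: str) -> int:
    """Bytes needed to store num_params weights in the given format."""
    return num_params * BITS_PER_WEIGHT[fmt] // 8

params = 1_000_000_000  # a hypothetical 1-billion-parameter model
print(weight_bytes(params, "FP32") / weight_bytes(params, "FP4"))  # 8.0
print(weight_bytes(params, "BF16") / weight_bytes(params, "FP4"))  # 4.0
```

Cutting weights from 32 bits to 4 bits shrinks their storage eightfold (fourfold versus BF16), which reduces the memory traffic per token and is the main source of the compute savings described above.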
Conclusion
Nvidia’s announcements at the GTC conference aim to position the company as a leader in the rapidly evolving AI landscape, with new software and hardware offerings that can help accelerate the development and deployment of AI applications.