DeepSeek-R1 Now Live With NVIDIA NIM

DeepSeek-R1: A State-of-the-Art Reasoning Model for Agentic AI Inference

DeepSeek-R1 is an open model with state-of-the-art reasoning capabilities. Instead of offering direct responses, reasoning models like DeepSeek-R1 perform multiple inference passes over a query, applying chain-of-thought, consensus, and search techniques to generate the best answer.
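To make the consensus idea concrete, here is a minimal sketch of how multiple reasoning passes can be combined by majority vote. The sampler below is a stand-in stub for illustration only, not the DeepSeek-R1 model or the NIM API.

```python
# Minimal sketch of the "consensus" idea behind reasoning models: run several
# independent reasoning passes over the same query and majority-vote the final
# answers. The sampler is a stand-in stub, not DeepSeek-R1 itself.
import random
from collections import Counter

def sample_reasoning_path(query: str) -> str:
    """Stand-in for one chain-of-thought inference pass that ends in an answer."""
    # A real pass would produce a long reasoning trace; here we simulate that
    # most paths converge on the correct result while a few diverge.
    return random.choices(["408", "398"], weights=[0.8, 0.2])[0]

query = "What is 17 * 24?"
answers = [sample_reasoning_path(query) for _ in range(8)]  # multiple passes
consensus = Counter(answers).most_common(1)[0][0]           # majority vote
print(f"Sampled answers: {answers} -> consensus: {consensus}")
```

The same pattern scales up directly: each additional pass costs another full generation, which is exactly why test-time compute matters.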

The Importance of Test-Time Scaling

Performing this sequence of inference passes – using reasoning to arrive at the best answer – is known as test-time scaling. DeepSeek-R1 is a perfect example of this scaling law, demonstrating why accelerated computing is critical for the demands of agentic AI inference.

Model Quality and Test-Time Compute

As models are allowed to iteratively “think” through the problem, they create more output tokens and longer generation cycles, so model quality continues to scale. Significant test-time compute is critical to enable both real-time inference and higher-quality responses from reasoning models like DeepSeek-R1, requiring larger inference deployments.
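A rough back-of-the-envelope sketch shows why longer "thinking" drives up compute requirements. The roughly 37 billion active parameters per token (a figure reported for DeepSeek-R1's MoE design, not stated in this article) and the 2-FLOPs-per-parameter rule of thumb are assumptions for illustration.

```python
# Back-of-the-envelope: decode compute grows linearly with generated tokens.
# Assumptions (illustrative): ~37e9 active parameters per token for a large
# MoE model, ~2 FLOPs per active parameter per generated token.
ACTIVE_PARAMS = 37e9
FLOPS_PER_PARAM = 2

for output_tokens in (1_000, 10_000, 100_000):
    flops = ACTIVE_PARAMS * FLOPS_PER_PARAM * output_tokens
    print(f"{output_tokens:>7} output tokens ≈ {flops / 1e15:.2f} PFLOPs of decode compute")
```

Longer reasoning traces therefore translate directly into more inference work per query, which is what makes accelerated infrastructure necessary for real-time responses.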

High-Quality Inference and Real-Time Performance

DeepSeek-R1 delivers leading accuracy on tasks that demand logical inference, reasoning, math, coding, and language understanding, while also offering high inference efficiency.

Introducing the DeepSeek-R1 NIM Microservice

To help developers securely experiment with these capabilities and build their own specialized agents, the 671-billion-parameter DeepSeek-R1 model is now available as an NVIDIA NIM microservice on build.nvidia.com. The DeepSeek-R1 NIM microservice can deliver up to 3,872 tokens per second on a single NVIDIA HGX H200 system.
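For developers who want to try the hosted preview, a minimal call is sketched below. The base URL, model identifier, and placeholder API key follow NVIDIA's usual OpenAI-compatible NIM conventions and are assumptions here; check build.nvidia.com for the exact values.

```python
# Minimal sketch: querying the hosted DeepSeek-R1 preview through an
# OpenAI-compatible client. Base URL, model ID, and API key are assumptions;
# consult build.nvidia.com for the current values.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed hosted endpoint
    api_key="NVIDIA_API_KEY",                        # placeholder credential
)

response = client.chat.completions.create(
    model="deepseek-ai/deepseek-r1",                 # assumed model identifier
    messages=[{"role": "user",
               "content": "Prove that the square root of 2 is irrational."}],
    temperature=0.6,
    max_tokens=4096,   # leave room for a long reasoning trace
)
print(response.choices[0].message.content)
```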

Getting Started with the DeepSeek-R1 NIM Microservice

Developers can test and experiment with the application programming interface (API), which is expected to be available soon as a downloadable NIM microservice, part of the NVIDIA AI Enterprise software platform. The DeepSeek-R1 NIM microservice simplifies deployments with support for industry-standard APIs. Enterprises can maximize security and data privacy by running the NIM microservice on their preferred accelerated computing infrastructure.
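Because the downloadable microservice exposes industry-standard APIs, a self-hosted deployment can be queried in the same way. The sketch below assumes a running container serving an OpenAI-compatible endpoint on localhost port 8000; the host, port, and model name are illustrative assumptions rather than details confirmed by this article.

```python
# Sketch of querying a self-hosted DeepSeek-R1 NIM microservice through an
# OpenAI-compatible REST API. Host, port, and model name are assumptions;
# a real deployment's values come from its NIM documentation.
import requests

payload = {
    "model": "deepseek-ai/deepseek-r1",   # assumed model identifier
    "messages": [{"role": "user",
                  "content": "Summarize test-time scaling in two sentences."}],
    "max_tokens": 512,
}
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed local NIM endpoint
    json=payload,
    timeout=300,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```

Keeping the request on local infrastructure is what lets enterprises retain control over security and data privacy while still using a standard client interface.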

Creating Customized DeepSeek-R1 NIM Microservices

Using NVIDIA AI Foundry with NVIDIA NeMo software, enterprises will also be able to create customized DeepSeek-R1 NIM microservices for specialized AI agents.

DeepSeek-R1 Model Architecture

The DeepSeek-R1 model is a large mixture-of-experts (MoE) model with an impressive 671 billion parameters and support for a long input context of 128,000 tokens. The model uses an extreme number of experts per layer: each MoE layer has 256 experts, and each token is routed to eight experts in parallel for evaluation.
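A toy sketch of that top-8 routing follows: a router scores all 256 experts for each token, and only the eight highest-scoring experts process it. The dimensions and the softmax-over-selected-scores weighting are simplifications for illustration, not DeepSeek-R1's exact routing function.

```python
# Toy sketch of mixture-of-experts routing: score 256 experts per token and
# send each token to its top 8. Shapes and weighting are simplified; this is
# not DeepSeek-R1's exact router.
import numpy as np

num_experts, top_k, d_model = 256, 8, 64
rng = np.random.default_rng(0)

router_w = rng.standard_normal((d_model, num_experts)) / np.sqrt(d_model)
token = rng.standard_normal(d_model)           # one token's hidden state

scores = token @ router_w                      # one score per expert
top_idx = np.argsort(scores)[-top_k:]          # indices of the 8 best experts
top_scores = scores[top_idx]
weights = np.exp(top_scores - top_scores.max())
weights /= weights.sum()                       # softmax over selected experts

# Each selected expert is a small feed-forward net; outputs are mixed by weight.
experts = rng.standard_normal((num_experts, d_model, d_model)) / np.sqrt(d_model)
output = sum(w * (token @ experts[i]) for w, i in zip(weights, top_idx))

print(f"Routed token to experts {sorted(top_idx.tolist())}")
print(f"Mixed output vector shape: {output.shape}")
```

Only the selected experts run for a given token, which is how a 671-billion-parameter model keeps per-token compute far below its total parameter count.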

Frequently Asked Questions

Q: What is the main goal of the DeepSeek-R1 model?
A: The main goal of the DeepSeek-R1 model is to provide state-of-the-art reasoning capabilities for agentic AI inference.

Q: What is test-time scaling?
A: Test-time scaling refers to performing multiple inference passes over a query at inference time, using additional compute to reason toward the best answer.

Q: What is the importance of test-time compute for reasoning models like DeepSeek-R1?
A: Test-time compute is critical to enable both real-time inference and higher-quality responses from reasoning models like DeepSeek-R1, requiring larger inference deployments.

Q: How can I get started with the DeepSeek-R1 NIM microservice?
A: Developers can experience the DeepSeek-R1 NIM microservice on build.nvidia.com and can test and experiment with the API, which is expected to be available soon as a downloadable NIM microservice, part of the NVIDIA AI Enterprise software platform.
