NVIDIA NIM on AWS Supercharges AI Inference

Generative AI on AWS: NVIDIA NIM Microservices for Secure, High-Performance Inference Solutions

Expanding Collaboration between AWS and NVIDIA

Amazon Web Services (AWS) and NVIDIA have expanded their collaboration, announcing that NVIDIA NIM microservices are now available directly from the AWS Marketplace, Amazon Bedrock Marketplace, and Amazon SageMaker JumpStart. This lets developers deploy NVIDIA-optimized inference for commonly used models at scale, with higher throughput and lower latency for generative AI applications.

What are NVIDIA NIM Microservices?

NVIDIA NIM microservices are a set of easy-to-use microservices designed for secure, reliable deployment of high-performance, enterprise-grade AI model inference across clouds, data centers, and workstations. They are built on robust inference engines, such as NVIDIA Triton Inference Server, NVIDIA TensorRT, NVIDIA TensorRT-LLM, and PyTorch, and support a broad spectrum of AI models, from open-source community models to NVIDIA AI Foundation models and custom models.

Key Features and Benefits

  • Prebuilt, easy-to-use containers for secure, reliable deployment of high-performance, enterprise-grade AI model inference
  • Support for a broad spectrum of AI models, including open-source community models, NVIDIA AI Foundation models, and custom models
  • Deployment across various AWS services, including Amazon Elastic Compute Cloud (EC2), Amazon Elastic Kubernetes Service (EKS), and Amazon SageMaker
  • Over 100 NIM microservices, built from commonly used models and model families, available to preview
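For the SageMaker deployment path mentioned above, a hedged sketch of what the hosting configuration might look like follows. The ECR image URI, IAM role ARN, and instance type below are illustrative placeholders (assumptions), not values from this article; check NVIDIA's and AWS's documentation for the actual container images and supported instance types.

```python
def build_nim_sagemaker_config(
    model_name: str,
    image_uri: str,
    role_arn: str,
    instance_type: str = "ml.g5.2xlarge",  # assumed GPU instance type
):
    """Assemble the dicts that boto3's SageMaker client expects for
    create_model and create_endpoint_config when hosting a NIM-style
    inference container. All identifiers here are placeholders."""
    model = {
        "ModelName": model_name,
        "PrimaryContainer": {"Image": image_uri},
        "ExecutionRoleArn": role_arn,
    }
    endpoint_config = {
        "EndpointConfigName": f"{model_name}-config",
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
            }
        ],
    }
    return model, endpoint_config
```

In practice, these dicts would be passed to `boto3.client("sagemaker").create_model(**model)` and `create_endpoint_config(**endpoint_config)`, followed by `create_endpoint`, to stand up a hosted inference endpoint.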

NIM Microservices Available on AWS

The following NIM microservices are now available on AWS:

  • NVIDIA Nemotron-4
  • Llama 3.1 8B-Instruct
  • Llama 3.1 70B-Instruct
  • Mixtral 8x7B Instruct v0.1

Case Studies: SoftServe’s Generative AI Solutions

SoftServe, an IT consulting and digital services provider, has developed six generative AI solutions fully deployed on AWS and accelerated by NVIDIA NIM and AWS services. These solutions are available on AWS Marketplace and include:

  • SoftServe Gen AI Drug Discovery
  • SoftServe Gen AI Industrial Assistant
  • Digital Concierge
  • Multimodal RAG System
  • Content Creator
  • Speech Recognition Platform

Getting Started with NIM on AWS

Developers can deploy NVIDIA NIM microservices on AWS in whichever environment best fits their workload, whether on Amazon EC2 instances, Amazon EKS clusters, or Amazon SageMaker endpoints. In each case they get high-performance AI with NVIDIA-optimized inference containers.
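NIM microservices typically expose an OpenAI-compatible HTTP API for chat completions. As a minimal sketch of invoking a deployed NIM, the helper below builds such a request; the `localhost` base URL and the `meta/llama-3.1-8b-instruct` model id in the usage comment are assumptions for illustration, and should be replaced with the values for your own deployment.

```python
import json
from urllib import request


def build_chat_request(base_url: str, model: str, prompt: str) -> request.Request:
    """Build an OpenAI-compatible chat-completion request for a NIM-style
    endpoint. Endpoint URL and model id are illustrative placeholders."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Against a running NIM container, the request could then be sent with:
# resp = request.urlopen(
#     build_chat_request("http://localhost:8000", "meta/llama-3.1-8b-instruct", "Hello")
# )
```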

Conclusion

The collaboration between AWS and NVIDIA has paved the way for widespread adoption of generative AI technology. With NIM microservices now available on AWS, developers and enterprises can deploy high-performance AI models with ease, efficiency, and security.

FAQs

Q: What are NVIDIA NIM microservices?
A: NVIDIA NIM microservices are a set of easy-to-use microservices designed for secure, reliable deployment of high-performance, enterprise-grade AI model inference across clouds, data centers, and workstations.

Q: What are the key features and benefits of NVIDIA NIM microservices?
A: Key benefits include prebuilt containers for secure, reliable, high-performance inference; support for a broad spectrum of AI models, from open-source community models to NVIDIA AI Foundation and custom models; and deployment across AWS services such as Amazon EC2, Amazon EKS, and Amazon SageMaker.

Q: What NIM microservices are available on AWS?
A: The following NIM microservices are available on AWS: NVIDIA Nemotron-4, Llama 3.1 8B-Instruct, Llama 3.1 70B-Instruct, and Mixtral 8x7B Instruct v0.1.

Q: What are the benefits of using NIM microservices on AWS?
A: The benefits of using NIM microservices on AWS include high-performance AI, lower latency, and cost-effectiveness, as well as ease of deployment and management.
