NVIDIA NIM on AWS Supercharges AI Inference

Generative AI on AWS: NVIDIA NIM Microservices for Secure, High-Performance Inference Solutions

Expanding Collaboration between AWS and NVIDIA

Amazon Web Services (AWS) and NVIDIA have expanded their collaboration, announcing that NVIDIA NIM microservices are now available directly from the AWS Marketplace, Amazon Bedrock Marketplace, and Amazon SageMaker JumpStart. This lets developers deploy NVIDIA-optimized inference for commonly used models at scale, with higher throughput and lower latency for generative AI applications.

What are NVIDIA NIM Microservices?

NVIDIA NIM microservices are a set of easy-to-use microservices designed for secure, reliable deployment of high-performance, enterprise-grade AI model inference across clouds, data centers, and workstations. They are built on robust inference engines, such as NVIDIA Triton Inference Server, NVIDIA TensorRT, NVIDIA TensorRT-LLM, and PyTorch, and support a broad spectrum of AI models, from open-source community models to NVIDIA AI Foundation models and custom models.

Key Features and Benefits

  • Prebuilt, easy-to-use containers for secure, reliable deployment of high-performance, enterprise-grade AI model inference
  • Support for a broad spectrum of AI models, including open-source community models, NVIDIA AI Foundation models, and custom models
  • Deployment across various AWS services, including Amazon Elastic Compute Cloud (EC2), Amazon Elastic Kubernetes Service (EKS), and Amazon SageMaker
  • Over 100 NIM microservices, built from commonly used models and model families, available to preview
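For the SageMaker deployment path mentioned above, a hedged sketch of what the hosting configuration might look like follows. The ECR image URI, IAM role ARN, and instance type below are illustrative placeholders (assumptions), not values from this article; check NVIDIA's and AWS's documentation for the actual container images and supported instance types.

```python
def build_nim_sagemaker_config(
    model_name: str,
    image_uri: str,
    role_arn: str,
    instance_type: str = "ml.g5.2xlarge",  # assumed GPU instance type
):
    """Assemble the dicts that boto3's SageMaker client expects for
    create_model and create_endpoint_config when hosting a NIM-style
    inference container. All identifiers here are placeholders."""
    model = {
        "ModelName": model_name,
        "PrimaryContainer": {"Image": image_uri},
        "ExecutionRoleArn": role_arn,
    }
    endpoint_config = {
        "EndpointConfigName": f"{model_name}-config",
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
            }
        ],
    }
    return model, endpoint_config
```

In practice, these dicts would be passed to `boto3.client("sagemaker").create_model(**model)` and `create_endpoint_config(**endpoint_config)`, followed by `create_endpoint`, to stand up a hosted inference endpoint.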

NIM Microservices Available on AWS

The following NIM microservices are now available on AWS:

  • NVIDIA Nemotron-4
  • Llama 3.1 8B-Instruct
  • Llama 3.1 70B-Instruct
  • Mixtral 8x7B Instruct v0.1

Case Studies: SoftServe’s Generative AI Solutions

SoftServe, an IT consulting and digital services provider, has developed six generative AI solutions fully deployed on AWS and accelerated by NVIDIA NIM and AWS services. These solutions are available on AWS Marketplace and include:

  • SoftServe Gen AI Drug Discovery
  • SoftServe Gen AI Industrial Assistant
  • Digital Concierge
  • Multimodal RAG System
  • Content Creator
  • Speech Recognition Platform

Getting Started with NIM on AWS

Developers can deploy NVIDIA NIM microservices on AWS in whichever environment best fits their workload, whether on Amazon EC2 instances, Amazon EKS clusters, or Amazon SageMaker endpoints. In each case they get high-performance AI with NVIDIA-optimized inference containers.
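NIM microservices typically expose an OpenAI-compatible HTTP API for chat completions. As a minimal sketch of invoking a deployed NIM, the helper below builds such a request; the `localhost` base URL and the `meta/llama-3.1-8b-instruct` model id in the usage comment are assumptions for illustration, and should be replaced with the values for your own deployment.

```python
import json
from urllib import request


def build_chat_request(base_url: str, model: str, prompt: str) -> request.Request:
    """Build an OpenAI-compatible chat-completion request for a NIM-style
    endpoint. Endpoint URL and model id are illustrative placeholders."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Against a running NIM container, the request could then be sent with:
# resp = request.urlopen(
#     build_chat_request("http://localhost:8000", "meta/llama-3.1-8b-instruct", "Hello")
# )
```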

Conclusion

The collaboration between AWS and NVIDIA has paved the way for widespread adoption of generative AI technology. With NIM microservices now available on AWS, developers and enterprises can deploy high-performance AI models with ease, efficiency, and security.

FAQs

Q: What are NVIDIA NIM microservices?
A: NVIDIA NIM microservices are a set of easy-to-use microservices designed for secure, reliable deployment of high-performance, enterprise-grade AI model inference across clouds, data centers, and workstations.

Q: What are the key features and benefits of NVIDIA NIM microservices?
A: Key benefits include prebuilt containers for secure, reliable, high-performance inference; support for a broad spectrum of AI models, from open-source community models to NVIDIA AI Foundation and custom models; and deployment across AWS services such as Amazon EC2, Amazon EKS, and Amazon SageMaker.

Q: What NIM microservices are available on AWS?
A: The following NIM microservices are available on AWS: NVIDIA Nemotron-4, Llama 3.1 8B-Instruct, Llama 3.1 70B-Instruct, and Mixtral 8x7B Instruct v0.1.

Q: What are the benefits of using NIM microservices on AWS?
A: The benefits of using NIM microservices on AWS include high-performance AI, lower latency, and cost-effectiveness, as well as ease of deployment and management.
