Date:

Building an Enterprise RAG Pipeline Blueprint with NVIDIA

Building an Enterprise RAG Pipeline with NVIDIA AI Blueprint

Architecture Diagram

Key Features

  • OpenAI-compatible APIs
  • Multi-turn conversations
  • Multi-collection
  • Multi-session support
  • Multilingual and cross-lingual retrieval
  • Optimized data storage
  • Configurability options for NIM selection and NIM endpoints
  • Reranking usage

Minimum System Requirements

Hardware Requirements

  • The blueprint by default uses API endpoints, making it very easy to experience without needing GPUs.
  • It is expected that the NIM microservices will need to be self-hosted as you progress in your RAG development. For self-hosting the blueprint with these microservices locally deployed, the recommended system requirement is 5 H100 or A100 GPUs with the Llama 3.1 70b NIM, the NeMo Retriever embedding and reranking NIM, and the Milvus database accelerated with NVIDIA cuVS.

OS Requirements

Deployment Options

Software used in this blueprint

NIM microservices

NVIDIA Technology

3rd Party Software

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI concerns here.

License

Use of the models in this blueprint is governed by the NVIDIA AI Foundation Models Community License.

Terms of Use

GOVERNING TERMS: The software and materials are governed by the NVIDIA Software License Agreement and the Product-Specific Terms for NVIDIA AI Products, except that models are governed by the AI Foundation Models Community License Agreement and the NVIDIA RAG dataset is governed by the NVIDIA Asset License Agreement. ADDITIONAL INFORMATION: for Meta/llama-3.1-70b-instruct model the Llama 3.1 Community License Agreement, for nvidia/llama-3.2-nv-embedqa-1b-v2model the Llama 3.2 Community License Agreement, and for the nvidia/llama-3.2-nv-rerankqa-1b-v2 model the Llama 3.2 Community License Agreement. Built with Llama.

Conclusion

The NVIDIA AI Blueprint for RAG provides developers with a foundational starting point for building scalable and customizable retrieval pipelines that deliver high-accuracy and throughput. With its openAI-compatible APIs, multi-turn conversations, and multilingual and cross-lingual retrieval capabilities, this blueprint enables the creation of context-aware responses that connect LLMs to large corpora of enterprise data, enabling actionable insights grounded in relevant data.

FAQs

Q: What are the minimum system requirements for this blueprint?

A: The recommended system requirement is 5 H100 or A100 GPUs with the Llama 3.1 70b NIM, the NeMo Retriever embedding and reranking NIM, and the Milvus database accelerated with NVIDIA cuVS for self-hosting the blueprint with NIM microservices locally deployed.

Q: What is the license for this blueprint?

A: The use of the models in this blueprint is governed by the NVIDIA AI Foundation Models Community License.

Q: What are the ethical considerations for this blueprint?

A: NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards.

Latest stories

Read More

Hitachi Ventures Raises $400M Fund

Hitachi Ventures Secures $400 Million for Fourth Fund Hitachi Ventures...

Gesture Drawing Essentials: 2 & 5 Minute Practice Methods

Gesture Drawing Basics One of an artist's greatest tools to...

AI Startups Raised $8 Billion in 2024

Artificial Intelligence Summit: France's Thriving AI Ecosystem The Rise of...

We Need to Talk About Austen

Penguin's 'TikTok-ified' Covers for Jane Austen's Novels Spark Outrage Publishers...

Revamped ChatGPT: The Ultimate Messaging Revolution

New ChatGPT Experience in WhatsApp On Monday, OpenAI announced that...

Pixelated Perfection: ASCII Art Revival

Amid all the fuss over DeepSeek, OpenAI has pushed...

Titanfall Battle Royale

A Surprising Turn: How Titanfall 3 Became Apex Legends The...

AI-Powered Skin Cancer Prevention

AI-Assisted Cancer Diagnosis: The Future of Skin Cancer Detection Remarkable...

LEAVE A REPLY

Please enter your comment!
Please enter your name here