Load Balancing with Auto-Scaling
Traditional data processing often relies on batch processing, in which large volumes of data are accumulated and then processed together, one stage at a time. This introduces two primary problems:
* Efficiency: It’s difficult to use heterogeneous resources efficiently. While the batch is in a CPU-heavy stage, GPU resources sit underused, and vice versa.
* Latency: The need to store and load intermediate data products to and from the cluster storage between processing stages introduces significant latency.
In contrast, streaming processing pipes intermediate data products directly between stages and begins next-stage processing on each item as soon as the previous stage completes.
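The contrast can be sketched with a toy Python pipeline. The stage functions and pipeline structure below are hypothetical stand-ins, not NeMo Curator's actual API: the batch version materializes every intermediate result before the next stage starts, while the streaming version chains generators so each item flows to the next stage as soon as it is ready.

```python
def decode(item):
    # Hypothetical CPU-heavy stage (stand-in for video decoding)
    return item * 2

def embed(item):
    # Hypothetical GPU-heavy stage (stand-in for model inference)
    return item + 1

def batch_pipeline(items):
    # Batch: all of stage 1 finishes before any of stage 2 starts, and the
    # intermediates must be materialized (in practice, written to storage).
    intermediates = [decode(x) for x in items]
    return [embed(x) for x in intermediates]

def streaming_pipeline(items):
    # Streaming: generators pipe each item straight to the next stage,
    # so CPU and GPU stages overlap and no intermediate store is needed.
    def stage1(source):
        for x in source:
            yield decode(x)

    def stage2(source):
        for x in source:
            yield embed(x)

    yield from stage2(stage1(iter(items)))
```

Both pipelines produce identical results; the difference is when each stage runs and whether intermediates must be stored, which is where the efficiency and latency gains come from.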
GPU Workers During a Run of NeMo Curator
The efficient use of GPU resources can be further demonstrated by tracking worker utilization for the GPU stages of the NeMo Curator pipeline during execution.
This experiment was done with 32 GPUs on four nodes. If you run with fewer GPUs, GPU worker usage will be lower because the auto-scaler has a coarser granularity for GPU allocation; with more GPUs, GPU worker usage will be even closer to 100%.
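The granularity effect can be illustrated with a back-of-envelope model (a hypothetical sketch, not the actual NeMo Curator scheduler): if a scaler can only assign whole GPUs to each stage, rounding away from each stage's ideal fractional share leaves some capacity idle, and the relative loss shrinks as the total GPU count grows.

```python
def max_gpu_utilization(total_gpus, demand_fractions):
    # Hypothetical whole-GPU allocation model (illustration only; not the
    # actual NeMo Curator auto-scaler). Each stage's ideal fractional share
    # of the cluster is rounded to a whole number of GPUs, and end-to-end
    # throughput is capped by the stage with the least capacity relative
    # to its demand.
    ideal = [total_gpus * f for f in demand_fractions]
    assigned = [max(1, round(x)) for x in ideal]
    throughput = min(a / f for a, f in zip(assigned, demand_fractions))
    busy = [throughput * f for f in demand_fractions]  # GPUs doing useful work
    return sum(busy) / sum(assigned)
```

Under an assumed 30/70 demand split between two GPU stages, this bound is noticeably higher at 32 GPUs than at 4, because the whole-GPU assignments land much closer to the ideal fractional shares.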
Request Early Access
We have been actively working with a variety of early-access partners to enable best-in-class TCO for video data curation, unlocking the next generation of multimodal models for physical AI and beyond:
- Blackforest Labs
- Canva
- Deluxe Media
- Getty Images
- Linker Vision
- Milestone Systems
- Nexar
- Twelvelabs
- Uber
To get started without your own compute infrastructure, NeMo Video Curator is available through early access on NVIDIA DGX Cloud: submit your interest in video data curation and model fine-tuning to the NVIDIA NeMo Curator Early Access Program and select “managed services.” NeMo Video Curator is also available through the same program as a downloadable SDK, suitable for running on your own infrastructure.
Conclusion
The NeMo Curator pipeline delivers significant improvements in efficiency and throughput while providing a scalable, flexible solution for video data curation and model fine-tuning. With its efficient use of GPU resources and support for auto-scaling, it is well-positioned to meet the demands of the growing physical AI market.
FAQs
Q: What is the NeMo Curator pipeline?
A: The NeMo Curator pipeline is a flexible, GPU-accelerated streaming pipeline for large-scale video data curation and model fine-tuning.
Q: What are the benefits of using the NeMo Curator pipeline?
A: The NeMo Curator pipeline provides a scalable and flexible solution for video data curation and model fine-tuning, offering significant improvements in efficiency and throughput while providing a lower total cost of ownership (TCO).
Q: How does the NeMo Curator pipeline use GPU resources?
A: The NeMo Curator pipeline uses GPU resources efficiently, with the auto-scaler rearranging resources to keep GPU stage workers busy over 99.5% of the time.
Q: How can I get started with the NeMo Curator pipeline?
A: You can get started with the NeMo Curator pipeline by submitting your interest in video data curation and model fine-tuning to the NVIDIA NeMo Curator Early Access Program and selecting “managed services” or “downloadable SDK” when submitting your interest.