Nvidia Sweeps MLPerf Benchmark with New Focus on Generative AI
Introduction
The latest round of MLPerf benchmark results is out, and Nvidia’s general-purpose GPU chips have once again made a nearly clean sweep of the tests. The focus this time is on generative AI applications, including large language models (LLMs) and graph neural networks.
Background
The MLPerf benchmark test, organized by the MLCommons industry consortium, measures the speed of machines in producing tokens, processing queries, or outputting samples of data. The test is designed to simulate real-world scenarios, such as chatbots, where response time is crucial.
New Tests and Results
The latest round of MLPerf benchmarks features two new tests representing common generative AI uses. The first measures how fast the chips run Meta’s open-source LLM Llama 3.1 405b, one of the larger generative AI programs in common use. The second is an interactive version of Meta’s smaller Llama 2 70b, simulating the need for a quick response after someone types a prompt.
The results show that Nvidia’s GPUs produced top results in almost every test in the closed division, where the rules for the software setup are strictest. The only exception was in two tests of Llama 2 70b, where AMD’s MI300X GPU took the top score.
Competitors and Results
Other competitors in the MLPerf benchmark included Google, which submitted a system with its Trillium chip, and startup MangoBoost, which makes plug-in cards that can speed data transfer between GPU racks. Google’s system trailed far behind Nvidia’s Blackwell in the Stable Diffusion image-generation test, which measures how fast a computer can answer queries.
Conclusion
The latest round of MLPerf benchmark results demonstrates Nvidia’s continued dominance in artificial intelligence hardware: its general-purpose GPU chips again produced top results in almost every test, in a round focused squarely on generative AI applications.
FAQs
Q: What is the MLPerf benchmark?
A: The MLPerf benchmark is a test organized by the MLCommons industry consortium, designed to measure the speed of machines in producing tokens, processing queries, or outputting samples of data.
Q: What is the focus of this round of the MLPerf benchmark?
A: The focus of this round is on generative AI applications, including large language models (LLMs) and graph neural networks.
Q: Which company took the top score in two tests of Llama 2 70b?
A: AMD’s MI300X GPU took the top score in two tests of Llama 2 70b.
Q: Which company submitted a system with its Trillium chip?
A: Google submitted a system with its Trillium chip.
Q: What is the name of the startup that makes plug-in cards that can speed data transfer between GPU racks?
A: The startup is called MangoBoost.