FlashBlade//EXA Moves Data at 10+ TB/sec

FlashBlade//EXA: A New All-Flash Storage Array for AI Factories and Multi-Modal AI Training

Pure Storage today unveiled FlashBlade//EXA, a new all-flash storage array designed to meet the demanding needs of AI factories and multi-modal AI training. FlashBlade//EXA separates the metadata layer from the data path in the I/O stream, which Pure says enables the array to move data at rates exceeding 10 terabytes per second per namespace.

Architecture and Design

The new array splits the high-speed I/O into two parts. The metadata is routed through the metadata core component of the FlashBlade//EXA, which is based on high-speed DirectFlash Module (DFM) nodes that house the company’s scale-out distributed key-value store. The metadata core nodes run on the Purity//FB operating system, which has been bolstered with support for Parallel NFS (pNFS) to communicate with the compute nodes.

Block data is routed separately over Remote Direct Memory Access (RDMA) to the data nodes, which in this release are industry-standard Linux-based servers (the company plans to incorporate its DFM technology in a future release). This architecture allows FlashBlade//EXA to reach the maximum available bandwidth between the data storage and the compute nodes.
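The split described above can be sketched in a few lines: a small metadata request returns a layout describing where each stripe of a file lives, and the client then fetches the blocks directly from the data nodes. This is a conceptual toy (hypothetical names, a dict standing in for the distributed key-value store, a function call standing in for an RDMA read), not Pure's implementation:

```python
# Conceptual sketch of a pNFS-style split I/O path. The metadata core
# answers "where does the data live?"; the data path then moves blocks
# directly between data nodes and the client, bypassing the metadata
# service entirely.

# Data nodes: stripe index -> block contents (stand-in for flash storage).
data_nodes = {
    "node-a": {0: b"AAAA", 2: b"CCCC"},
    "node-b": {1: b"BBBB", 3: b"DDDD"},
}

# Metadata core: stripe index -> data node holding that stripe. In a real
# system this is a scale-out distributed key-value store; here it is a dict.
layout = {0: "node-a", 1: "node-b", 2: "node-a", 3: "node-b"}

def get_layout(stripes):
    """Metadata path: one small request returns where the data lives."""
    return {s: layout[s] for s in stripes}

def read_stripes(stripes):
    """Data path: fetch each stripe directly from its data node
    (stand-in for an RDMA read); no controller sits in the middle."""
    where = get_layout(stripes)
    return b"".join(data_nodes[where[s]][s] for s in stripes)

print(read_stripes([0, 1, 2, 3]))  # b'AAAABBBBCCCCDDDD'
```

Because the data path never funnels through the metadata service, adding data nodes adds bandwidth without adding a shared choke point, which is the scaling property Pure attributes to the design.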

Eliminating I/O Bottlenecks

"This segregation provides non-blocking data access that increases exponentially in high-performance computing scenarios where the metadata requests can equal, if not outnumber, data I/O operations," writes Alex Castro, a Pure Storage vice president, in a blog post.

Comparison to Other Solutions

When Sun Microsystems created NFS back in 1984, functionality was the primary focus, not performance, Castro says. Legacy NAS devices, which require additional I/O controllers with each new data node, have since become a performance bottleneck. Splitting the I/O, he says, is the key to removing the bottleneck created by legacy NAS arrays.

Some storage vendors have resorted to using specialized file systems, such as Lustre, to deliver the parallelism needed for large-scale projects, Castro writes, but these environments were prone to metadata latency and required Ph.D.-level skills to manage. Other vendors, meanwhile, have inserted a compute aggregation layer between the compute clients and the data source.

AI Factories and Multi-Modal AI Training

Pure says it developed the FlashBlade//EXA to meet the emerging needs of "AI factories," and in particular the need to keep thousands of high-end GPUs fed with data. In terms of scale, AI factories sit in the middle. On the low end are enterprise AI workloads, such as inference and retrieval-augmented generation (RAG), that work on 50 TB to 100 PB of data, while AI factories will need access to up to 10,000 GPUs on data sets from 100 PB to multiple exabytes. At the high end, hyperscalers can have upwards of 100 EB and more than 10,000 GPUs. At all levels, idle GPUs are an impediment to productivity.
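A quick back-of-envelope check connects the article's two headline numbers. Using the 10,000-GPU figure for an AI factory and an assumed per-GPU ingest rate of 1 GB/s (an illustrative assumption, not a Pure specification), the required aggregate bandwidth lands at the 10 TB/s order of magnitude FlashBlade//EXA targets:

```python
# Back-of-envelope: aggregate storage bandwidth needed to keep a GPU
# fleet fed. The 10,000-GPU count comes from the article; the 1 GB/s
# per-GPU ingest rate is an illustrative assumption.
gpus = 10_000
per_gpu_gb_s = 1.0  # assumed ingest per GPU, in GB/s

# Convert the fleet total from GB/s to TB/s (1 TB/s = 1,000 GB/s).
aggregate_tb_s = gpus * per_gpu_gb_s / 1_000

print(aggregate_tb_s)  # 10.0 -- TB/s, the scale EXA claims per namespace
```

The point of the arithmetic is simply that at AI-factory scale, storage bandwidth requirements grow linearly with GPU count, so any per-controller ceiling quickly becomes the limiting factor.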

Conclusion

Pure Storage says it expects to start shipping FlashBlade//EXA this summer.

FAQs

Q: What is FlashBlade//EXA?
A: FlashBlade//EXA is a new all-flash storage array designed to meet the demanding needs of AI factories and multi-modal AI training.

Q: What are the key features of FlashBlade//EXA?
A: The key feature of FlashBlade//EXA is its separation of the metadata layer from the data path in the I/O stream, which enables the array to move data at rates exceeding 10 terabytes per second per namespace.

Q: How does FlashBlade//EXA differ from other storage solutions?
A: FlashBlade//EXA differs from other storage solutions in that it splits the high-speed I/O into two parts, allowing it to reach the maximum allowable bandwidth between the data storage and the compute nodes.
