Monday, March 18th 2024

MemVerge and Micron Boost NVIDIA GPU Utilization with CXL Memory

MemVerge, a leader in AI-first Big Memory Software, has joined forces with Micron to unveil a groundbreaking solution that leverages intelligent tiering of CXL memory, boosting the performance of large language models (LLMs) by offloading from GPU HBM to CXL memory. This innovative collaboration is being showcased in Micron booth #1030 at GTC, where attendees can witness firsthand the transformative impact of tiered memory on AI workloads.

Charles Fan, CEO and Co-founder of MemVerge, emphasized the critical importance of overcoming the bottleneck of HBM capacity. "Scaling LLM performance cost-effectively means keeping the GPUs fed with data," stated Fan. "Our demo at GTC demonstrates that pools of tiered memory not only drive performance higher but also maximize the utilization of precious GPU resources."

The demonstration, conducted by engineers from MemVerge and Micron, featured the FlexGen high-throughput generation engine and the OPT-66B large language model running on a Supermicro Petascale Server equipped with an AMD Genoa CPU, NVIDIA A10 GPU, Micron DDR5-4800 DIMMs, CZ120 CXL memory modules, and MemVerge Memory Machine X intelligent tiering software.

The results of the demonstration were impressive. The FlexGen benchmark, utilizing tiered memory, completed tasks in less than half the time compared to conventional NVMe storage methods. Simultaneously, GPU utilization soared from 51.8% to 91.8%, thanks to the transparent management of data tiering across DIMMs and CXL modules facilitated by MemVerge Memory Machine X software.
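
To put the utilization figures in perspective, the short Python sketch below works out the gain both in percentage points and as a relative ratio. Only the two utilization values come from the article; the variable names and the helper function `utilization_gain` are illustrative assumptions, not part of any MemVerge or Micron tooling.

```python
# The two utilization figures are quoted in the article; the names and the
# helper below are hypothetical and exist only for this illustration.
util_nvme = 51.8    # GPU utilization (%) with NVMe offload
util_tiered = 91.8  # GPU utilization (%) with DIMM + CXL tiering


def utilization_gain(before: float, after: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative ratio after/before)."""
    return after - before, after / before


points, ratio = utilization_gain(util_nvme, util_tiered)
print(f"{points:.1f} percentage points, {ratio:.2f}x relative")
```

In other words, the quoted jump is 40 percentage points, or roughly 1.77 times the NVMe-based utilization.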

This collaboration between MemVerge, Micron, and Supermicro marks a significant milestone in advancing the capabilities of AI workloads, enabling organizations to achieve unprecedented levels of performance, efficiency, and time-to-insight. By harnessing the power of CXL memory and intelligent tiering, businesses can unlock new opportunities for innovation and accelerate their journey towards AI-driven success.

"Through our collaboration with MemVerge, Micron is able to demonstrate the substantial benefits of CXL memory modules to improve effective GPU throughput for AI applications, resulting in faster time to insights for customers. Micron's innovations across the memory portfolio provide compute with the necessary memory capacity and bandwidth to scale AI use cases from cloud to the edge," said Raj Narasimhan, senior vice president and general manager of Micron's Compute and Networking Business Unit.
Source: MemVerge

3 Comments on MemVerge and Micron Boost NVIDIA GPU Utilization with CXL Memory

#2
Wirko
docnorth: "@TheLostSwede I think CXL memory needs PCIe-5 protocol? Thanks."
That's correct. In this case, CXL devices connect to Epyc's PCIe bus. They don't connect directly to the GPU (if that's what confused you).
#3
docnorth
Wirko: "That's correct. In this case, CXL devices connect to Epyc's PCIe bus. They don't connect directly to the GPU (if that's what confused you)."
No, I meant the CPU, that was the discussion when Alder Lake came out. Thanks.