Tuesday, November 19th 2024
NVIDIA Prepares GB200 NVL4: Four "Blackwell" GPUs and Two "Grace" CPUs in a 5,400 W Server
At SC24, NVIDIA announced its latest compute-dense AI accelerators in the form of GB200 NVL4, a single-server solution that expands the company's "Blackwell" series portfolio. The new platform features an impressive combination of four "Blackwell" GPUs and two "Grace" CPUs on a single board. The GB200 NVL4 boasts remarkable specifications for a single-server system, including 768 GB of HBM3E memory across its four Blackwell GPUs, delivering a combined memory bandwidth of 32 TB/s. The system's two Grace CPUs have 960 GB of LPDDR5X memory, making it a powerhouse for demanding AI workloads. A key feature of the NVL4 design is its NVLink interconnect technology, which enables communication between all processors on the board. This integration is important for maintaining optimal performance across the system's multiple processing units, especially during large training runs or inferencing a multi-trillion parameter model.
Performance comparisons with previous generations show significant improvements, with NVIDIA claiming the GB200 GPUs deliver 2.2x faster overall performance and 1.8x quicker training capabilities compared to their GH200 NVL4 predecessor. The system's power consumption reaches 5,400 watts, which effectively doubles the 2,700-watt requirement of the GB200 NVL2 model, its smaller sibling that features two GPUs instead of four. NVIDIA is working closely with OEM partners to bring various Blackwell solutions to market, including the DGX B200, GB200 Grace Blackwell Superchip, GB200 Grace Blackwell NVL2, GB200 Grace Blackwell NVL4, and GB200 Grace Blackwell NVL72. Fitting 5,400 W of TDP in a single server will require liquid cooling for optimal performance, and the GB200 NVL4 is expected to go inside server racks for hyperscaler customers, which usually have a custom liquid cooling systems inside their data centers.
Sources:
HardwareLuxx, via VideoCardz
Performance comparisons with previous generations show significant improvements, with NVIDIA claiming the GB200 GPUs deliver 2.2x faster overall performance and 1.8x quicker training capabilities compared to their GH200 NVL4 predecessor. The system's power consumption reaches 5,400 watts, which effectively doubles the 2,700-watt requirement of the GB200 NVL2 model, its smaller sibling that features two GPUs instead of four. NVIDIA is working closely with OEM partners to bring various Blackwell solutions to market, including the DGX B200, GB200 Grace Blackwell Superchip, GB200 Grace Blackwell NVL2, GB200 Grace Blackwell NVL4, and GB200 Grace Blackwell NVL72. Fitting 5,400 W of TDP in a single server will require liquid cooling for optimal performance, and the GB200 NVL4 is expected to go inside server racks for hyperscaler customers, which usually have a custom liquid cooling systems inside their data centers.
6 Comments on NVIDIA Prepares GB200 NVL4: Four "Blackwell" GPUs and Two "Grace" CPUs in a 5,400 W Server
wait for it........wait for it........
Tah Dah...
It's ALL nGreediya's fault, hahahahaha :D