NVIDIA Launches Blackwell-Powered DGX SuperPOD for Generative AI Supercomputing at Trillion-Parameter Scale

TheLostSwede · Mar 18, 2024

NVIDIA today announced its next-generation AI supercomputer—the NVIDIA DGX SuperPOD powered by NVIDIA GB200 Grace Blackwell Superchips—for processing trillion-parameter models with constant uptime for superscale generative AI training and inference workloads.

Featuring a new, highly efficient, liquid-cooled rack-scale architecture, the new DGX SuperPOD is built with NVIDIA DGX GB200 systems and provides 11.5 exaflops of AI supercomputing at FP4 precision and 240 terabytes of fast memory—scaling to more with additional racks.

Each DGX GB200 system features 36 NVIDIA GB200 Superchips—which include 36 NVIDIA Grace CPUs and 72 NVIDIA Blackwell GPUs—connected as one supercomputer via fifth-generation NVIDIA NVLink. GB200 Superchips deliver up to a 30x performance increase compared to the NVIDIA H100 Tensor Core GPU for large language model inference workloads.

"NVIDIA DGX AI supercomputers are the factories of the AI industrial revolution," said Jensen Huang, founder and CEO of NVIDIA. "The new DGX SuperPOD combines the latest advancements in NVIDIA accelerated computing, networking and software to enable every company, industry and country to refine and generate their own AI."

The Grace Blackwell-powered DGX SuperPOD features eight or more DGX GB200 systems and can scale to tens of thousands of GB200 Superchips connected via NVIDIA Quantum InfiniBand. For a massive shared memory space to power next-generation AI models, customers can deploy a configuration that connects the 576 Blackwell GPUs in eight DGX GB200 systems connected via NVLink.

New Rack-Scale DGX SuperPOD Architecture for Era of Generative AI
The new DGX SuperPOD with DGX GB200 systems features a unified compute fabric. In addition to fifth-generation NVIDIA NVLink, the fabric includes NVIDIA BlueField -3 DPUs and will support NVIDIA Quantum-X800 InfiniBand networking, announced separately today. This architecture provides up to 1,800 gigabytes per second of bandwidth to each GPU in the platform.

Additionally, fourth-generation NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP) technology provides 14.4 TeraFLOPS of In-Network Computing, a 4x increase in the next-generation DGX SuperPOD architecture compared to the prior generation.

Turnkey Architecture Pairs With Advanced Software for Unprecedented Uptime
The new DGX SuperPOD is a complete, data-center-scale AI supercomputer that integrates with high-performance storage from NVIDIA-certified partners to meet the demands of generative AI workloads. Each is built, cabled and tested in the factory to dramatically speed deployment at customer data centers.

The Grace Blackwell-powered DGX SuperPOD features intelligent predictive-management capabilities to continuously monitor thousands of data points across hardware and software to predict and intercept sources of downtime and inefficiency—saving time, energy and computing costs.

The software can identify areas of concern and plan for maintenance, flexibly adjust compute resources, and automatically save and resume jobs to prevent downtime, even without system administrators present.

If the software detects that a replacement component is needed, the cluster will activate standby capacity to ensure work finishes in time. Any required hardware replacements can be scheduled to avoid unplanned downtime.

NVIDIA DGX B200 Systems Advance AI Supercomputing for Industries
NVIDIA also unveiled the NVIDIA DGX B200 system, a unified AI supercomputing platform for AI model training, fine-tuning and inference.

DGX B200 is the sixth generation of air-cooled, traditional rack-mounted DGX designs used by industries worldwide. The new Blackwell architecture DGX B200 system includes eight NVIDIA Blackwell GPUs and two 5th Gen Intel Xeon processors. Customers can also build DGX SuperPOD using DGX B200 systems to create AI Centers of Excellence that can power the work of large teams of developers running many different jobs.

DGX B200 systems include the FP4 precision feature in the new Blackwell architecture, providing up to 144 petaflops of AI performance, a massive 1.4 TB of GPU memory and 64 TB/s of memory bandwidth. This delivers 15x faster real-time inference for trillion-parameter models over the previous generation.

DGX B200 systems include advanced networking with eight NVIDIA ConnectX -7 NICs and two BlueField-3 DPUs. These provide up to 400 gigabits per second bandwidth per connection—delivering fast AI performance with NVIDIA Quantum-2 InfiniBand and NVIDIA Spectrum -X Ethernet networking platforms.

Software and Expert Support to Scale Production AI
All NVIDIA DGX platforms include NVIDIA AI Enterprise software for enterprise-grade development and deployment. DGX customers can accelerate their work with the pretrained NVIDIA foundation models, frameworks, toolkits and new NVIDIA NIM microservices included in the software platform.

NVIDIA DGX experts and select NVIDIA partners certified to support DGX platforms assist customers throughout every step of deployment, so they can quickly move AI into production. Once systems are operational, DGX experts continue to support customers in optimizing their AI pipelines and infrastructure.

Availability
NVIDIA DGX SuperPOD with DGX GB200 and DGX B200 systems are expected to be available later this year from NVIDIA's global partners.

View at TechPowerUp Main Site | Source

Space Lynx · Mar 18, 2024

nvidia stock going to break 1k a share I think now lol

edit: only an 8 dollar gain, I think Nvidia maybe hit its cap lol

Denver · Mar 18, 2024

I'm curious about the effectiveness of a model operating at FP4 with sparsity.

System Name	Overlord Mk MLI
Processor	AMD Ryzen 7 7800X3D
Motherboard	Gigabyte X670E Aorus Master
Cooling	Noctua NH-D15 SE with offsets
Memory	32GB Team T-Create Expert DDR5 6000 MHz @ CL30-34-34-68
Video Card(s)	Gainward GeForce RTX 4080 Phantom GS
Storage	1TB Solidigm P44 Pro, 2 TB Corsair MP600 Pro, 2TB Kingston KC3000
Display(s)	Acer XV272K LVbmiipruzx 4K@160Hz
Case	Fractal Design Torrent Compact
Audio Device(s)	Corsair Virtuoso SE
Power Supply	be quiet! Pure Power 12 M 850 W
Mouse	Logitech G502 Lightspeed
Keyboard	Corsair K70 Max
Software	Windows 10 Pro
Benchmark Scores	https://valid.x86.fr/yfsd9w

Processor	7800X3D -25 all core
Motherboard	B650 Steel Legend
Cooling	Frost Commander 140
Video Card(s)	Merc 310 7900 XT @3100 core -.75v
Display(s)	Agon 27" QD-OLED Glossy 240hz 1440p
Case	NZXT H710 (Red/Black)
Audio Device(s)	Asgard 2, Modi 3, HD58X
Power Supply	Corsair RM850x Gold

NVIDIA Launches Blackwell-Powered DGX SuperPOD for Generative AI Supercomputing at Trillion-Parameter Scale

TheLostSwede

News Editor

Space Lynx

Astronaut

Denver