
Microsoft Azure Announces New Scalable Generative AI VMs Featuring NVIDIA H100

Joined
May 30, 2015
Messages
1,924 (0.56/day)
Location
Seattle, WA
Microsoft Azure announced their new ND H100 v5 virtual machine, which pairs Intel's Sapphire Rapids Xeon Scalable processors with NVIDIA's Hopper H100 GPUs and NVIDIA's Quantum-2 CX7 interconnect. Inside each physical machine sit eight H100s (presumably the SXM5 variant, packing a whopping 132 SMs and 528 4th generation tensor cores) interconnected by NVLink 4.0, which ties them all together with 3.6 TB/s bisectional bandwidth. Outside each local machine is a network of thousands more H100s connected together with 400 Gb/s Quantum-2 CX7 InfiniBand per GPU, which Microsoft says allows 3.2 Tb/s per VM for on-demand scaling to accelerate the largest AI training workloads.

Generative AI solutions like ChatGPT have accelerated demand for multi-ExaOP cloud services that can handle the large training sets and utilize the latest development tools. Azure's new ND H100 v5 VMs offer that capability to organizations of any size, whether you're a smaller startup or a larger company looking to implement large-scale AI training deployments. While Microsoft is not making any direct claims for performance, NVIDIA has advertised H100 as running up to 30x faster than the preceding Ampere architecture that is currently offered with the ND A100 v4 VMs.




Microsoft Azure provides the following technical specifications for the new VMs:
  • 8x NVIDIA H100 Tensor Core GPUs interconnected via next gen NVSwitch and NVLink 4.0
  • 400 Gb/s NVIDIA Quantum-2 CX7 InfiniBand per GPU with 3.2 Tb/s per VM in a non-blocking fat-tree network
  • NVSwitch and NVLink 4.0 with 3.6 TB/s bisectional bandwidth between 8 local GPUs within each VM
  • 4th Gen Intel Xeon Scalable processors
  • PCIE Gen 5 host to GPU interconnect with 64 GB/s bandwidth per GPU
  • 16 Channels of 4800 MHz DDR5 DIMMs
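
As a quick sanity check on the list above, the per-VM InfiniBand figure follows directly from the per-GPU one. Here is a minimal Python sketch of that arithmetic. The variable names are my own, and the DDR5 line is an assumption on my part: it treats "4800 MHz" as 4800 MT/s on a standard 64-bit (8-byte) channel, which the spec list does not state, so read that number as a theoretical peak only.

```python
# Figures taken from Microsoft's spec list
GPUS_PER_VM = 8
IB_GBIT_PER_GPU = 400      # Gb/s of InfiniBand per GPU
PCIE_GBYTE_PER_GPU = 64    # GB/s host-to-GPU over PCIe Gen 5

# Aggregate InfiniBand per VM, in terabits and in gigabytes per second
ib_tbit_per_vm = GPUS_PER_VM * IB_GBIT_PER_GPU / 1000
ib_gbyte_per_vm = GPUS_PER_VM * IB_GBIT_PER_GPU / 8

# Aggregate host-to-GPU PCIe bandwidth across all eight GPUs
pcie_gbyte_per_vm = GPUS_PER_VM * PCIE_GBYTE_PER_GPU

# ASSUMPTION (not in the spec list): 4800 MT/s, 8 bytes per transfer per channel
DDR5_CHANNELS = 16
DDR5_MT_PER_S = 4800
BYTES_PER_TRANSFER = 8
ddr5_peak_gbyte = DDR5_CHANNELS * DDR5_MT_PER_S * BYTES_PER_TRANSFER / 1000

print(f"InfiniBand per VM: {ib_tbit_per_vm} Tb/s ({ib_gbyte_per_vm} GB/s)")
print(f"PCIe host-to-GPU per VM: {pcie_gbyte_per_vm} GB/s")
print(f"DDR5 theoretical peak (assumed): {ddr5_peak_gbyte} GB/s")
```

Note the bits-versus-bytes gap: 3.2 Tb/s of InfiniBand works out to 400 GB/s per VM, well below the 3.6 TB/s quoted for NVLink inside the box.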

Judging by what we know of NVIDIA Hopper, this likely means Microsoft is either using its own racks filled with DGX H100s or utilizing NVIDIA's DGX SuperPOD, which packs the DGX H100s five-high and as many as 16 across for a total of 640 GPUs packing 337,920 tensor cores. Don't forget that each DGX H100 also contains two Intel Xeon Scalable processors. Since Microsoft has already specified that its systems use Intel's latest Sapphire Rapids Xeons, which can feature as many as 60 cores each, there are potentially 9,600 x86 cores available to help feed those massive GPUs.
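
For anyone who wants to check the SuperPOD totals, the numbers can be reproduced with a few multiplications. This is only a sketch using the figures quoted in the post; the variable names are mine, and the four tensor cores per SM is simply 528 divided by 132 rather than anything Microsoft has confirmed about its deployment.

```python
# Layout and per-unit figures as quoted in the post
DGX_HIGH = 5             # DGX H100 systems stacked per rack
DGX_ACROSS = 16          # racks across
GPUS_PER_DGX = 8
SMS_PER_GPU = 132
TENSOR_CORES_PER_SM = 4  # derived: 528 tensor cores / 132 SMs
CPUS_PER_DGX = 2
CORES_PER_CPU = 60       # upper bound for Sapphire Rapids cited in the post

dgx_systems = DGX_HIGH * DGX_ACROSS
gpus = dgx_systems * GPUS_PER_DGX
tensor_cores = gpus * SMS_PER_GPU * TENSOR_CORES_PER_SM
x86_cores = dgx_systems * CPUS_PER_DGX * CORES_PER_CPU

print(f"DGX systems:  {dgx_systems}")
print(f"GPUs:         {gpus}")
print(f"Tensor cores: {tensor_cores}")
print(f"x86 cores:    {x86_cores}")
```

The 9,600-core figure is a ceiling: it only holds if every socket carries a 60-core part.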

Microsoft Azure has opened up the preview of the ND H100 v5 VM service and you can sign up to request access here.

 
Joined
Oct 6, 2021
Messages
1,605 (1.42/day)
Wow, Xeon still exists and Microsoft insists on using it for some random and unknown reason lol
 
Joined
May 19, 2009
Messages
1,861 (0.33/day)
Location
Latvia
System Name Personal \\ Work - HP EliteBook 840 G6
Processor 7700X \\ i7-8565U
Motherboard Asrock X670E PG Lightning
Cooling Noctua DH-15
Memory G.SKILL Trident Z5 RGB Black 32GB 6000MHz CL36 \\ 16GB DDR4-2400
Video Card(s) ASUS RoG Strix 1070 Ti \\ Intel UHD Graphics 620
Storage 2x KC3000 2TB, Samsung 970 EVO 512GB \\ OEM 256GB NVMe SSD
Display(s) BenQ XL2411Z \\ FullHD + 2x HP Z24i external screens via docking station
Case Fractal Design Define Arc Midi R2 with window
Audio Device(s) Realtek ALC1150 with Logitech Z533
Power Supply Corsair AX860i
Mouse Logitech G502
Keyboard Corsair K55 RGB PRO
Software Windows 11 \\ Windows 10
Wow, Xeon still exists and Microsoft insists on using it for some random and unknown reason lol
Uhhh... what now? Is this supposed to be a joke?
 
Joined
Dec 30, 2010
Messages
2,194 (0.43/day)
Buying large quantities of CPUs kind of guarantees you a discount as well. Intel is known for it.
 
Joined
Oct 6, 2021
Messages
1,605 (1.42/day)
Joined
May 19, 2009
Messages
1,861 (0.33/day)
Location
Latvia

Considering the Xeon loses in pretty much every possible way, I think it's a pretty good joke. Maybe Intel has returned to the strategy of giving generous discounts; I hope it doesn't suffer any more lawsuits. :p

Every major purchase at that level has bulk discounts. You can also be sure that this deal was made quite some time ago, likely even before that Epyc came out. Also... let's be honest, a 70% difference screams cherry-picking at best.
 