Monday, February 26th 2024

NVIDIA GH200 72-core Grace CPU Benched Against AMD Threadripper 7000 Series

GPTshop.ai is building prototypes of their "ultimate high-end desktop supercomputer," running the NVIDIA GH200 "Grace" CPU for AI and HPC workloads. Michael Larabel, founder and principal author of Phoronix, was first given remote access to a GPTshop.ai GH200 576 GB model converted to workstation form in early February, for the purpose of benchmarking it against systems based on AMD EPYC Zen 4 and Intel Xeon Emerald Rapids processors. Larabel noted that "it was a very interesting battle" that demonstrated the capabilities of the 72 Arm Neoverse-V2 cores in Grace. He continued: "With this GPTshop.ai GH200 system actually being in workstation form, I also ran some additional benchmarks looking at the CPU capabilities of the GH200 compared to AMD Ryzen Threadripper 7000 series workstations."

Larabel had on-site access to two different Threadripper systems—a Hewlett-Packard (HP) Z6 G5 A workstation and a System76 Thelio Major semi-custom build. No comparable Intel "Xeon W hardware" was within reach, so the Team Green desktop supercomputer was only pitched against AMD HEDT processors. The HP review sample was configured with an AMD Ryzen Threadripper PRO 7995WX 96-core / 192-thread Zen 4 processor, 8 x 16 GB DDR5-5200 memory, and NVIDIA RTX A4000 GPU. Larabel said that it was an "all around nice high-end AMD workstation." The System76 Thelio Major was specced with an AMD Ryzen Threadripper 7980X processor "as the top-end non-PRO SKU." It is a 64-core / 128-thread part, working alongside 4 x 32 GB DDR5-4800 memory and a Radeon PRO W7900 graphics card.
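The two Threadripper memory configurations listed above differ in both DIMM count and speed, which matters for bandwidth-sensitive workloads. As a rough sketch, the theoretical peak DRAM bandwidth of each system can be estimated as follows. This assumes each DIMM sits on its own 64-bit memory channel (8 bytes per transfer) and that the memory runs at its rated speed; the function and variable names are illustrative and do not come from the Phoronix article.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s.

    Assumption: one DIMM per channel, 64-bit (8-byte) channels,
    memory running at its rated transfer rate.
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


# HP Z6 G5 A: 8 x 16 GB DDR5-5200 (Threadripper PRO 7995WX)
hp_z6_gbs = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=5200)

# System76 Thelio Major: 4 x 32 GB DDR5-4800 (Threadripper 7980X)
thelio_gbs = peak_bandwidth_gbs(channels=4, mega_transfers_per_s=4800)

# Ratio of the two theoretical peaks
ratio = hp_z6_gbs / thelio_gbs

print(f"HP Z6 G5 A:   {hp_z6_gbs:.1f} GB/s")
print(f"Thelio Major: {thelio_gbs:.1f} GB/s")
print(f"Ratio:        {ratio:.2f}x")
```

These are upper bounds on paper only; measured bandwidth in any given benchmark can be considerably lower and depends on the workload.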
Larabel noted that the three computer systems were: "all freshly tested with Ubuntu 23.10 with the Linux 6.5 kernel, the performance CPU frequency scaling governor, GCC 13.2, and other defaults for this latest Ubuntu release...All the CPUs were running at stock speeds. The high Threadripper frequencies reported is a known AMD P-State bug AMD is working to fix. As with the prior GH200 benchmarking article of the seventy-two Neoverse-V2 CPU cores with Grace, the current tests are just looking at the processor/system performance. I'm waiting on remote access again to the GH200 for running the GPU-accelerated portion of the tests so this article is intended at looking at how the Grace CPU compares to the AMD Ryzen Threadripper 7980X and PRO 7995WX x86_64 Linux workstations for various workloads. As noted in the prior article, no CPU power consumption numbers unfortunately due to no RAPL/PowerCap driver or similar exposure of just the CPU power consumption data currently under Linux for the GH200."
Phoronix's conclusion did not include any overall performance metrics for the competing processors/workstations across the 39 benchmarks. According to further analysis conducted by Tom's Hardware, the GH200 Grace CPU beat the Threadripper 7980X in 17 tests and the 7995WX in 15. The NVIDIA processor is more of an efficiency-oriented server product, while the AMD Threadrippers are purpose-built for top desktop performance (as advertised).
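To put the Tom's Hardware tally in proportion, the win counts can be converted to win rates. This is a minimal sketch that assumes all 39 benchmarks were counted in both comparisons; the variable names are illustrative.

```python
TOTAL_BENCHMARKS = 39  # benchmark count cited above

# Tests in which the GH200 Grace CPU beat each Threadripper (per Tom's Hardware)
gh200_wins = {
    "Threadripper 7980X": 17,
    "Threadripper PRO 7995WX": 15,
}

# Win rate as a percentage of all benchmarks, rounded to one decimal place
win_rates = {
    cpu: round(100 * wins / TOTAL_BENCHMARKS, 1)
    for cpu, wins in gh200_wins.items()
}

for cpu, rate in win_rates.items():
    print(f"GH200 vs {cpu}: {gh200_wins[cpu]}/{TOTAL_BENCHMARKS} wins ({rate}%)")
```

Under that assumption, the GH200 won a bit under 44% of the tests against the 7980X and a bit under 39% against the 7995WX. A win count says nothing about the margin of each result.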

The Phoronix chief author signed off with: "Those wanting to go through dozens more benchmarks of these three Linux workstations can find all of my raw data via this OB result page. So while there was a lot of NVIDIA GH200 vs. AMD EPYC vs. Intel Xeon benchmarks looking at the CPU performance earlier this month, those weighing the NVIDIA GH200 use for Linux workstation uses will hopefully find today's performance results against AMD Ryzen Threadripper useful. For HPC workloads that are AArch64-tuned and can leverage the available system memory effectively, the GH200 could deliver great performance against these Zen 4 Threadripper workstations. But for software extensively tuned for x86_64 and/or not as heavily dependent upon system memory bandwidth, the Threadripper 7980X and Threadripper PRO 7995WX are excellent workstation options."
Sources: Phoronix, Tom's Hardware, GPT Shop Dot AI

6 Comments on NVIDIA GH200 72-core Grace CPU Benched Against AMD Threadripper 7000 Series

#2
Tropick
AnotherReader wrote: "The wins are primarily due to much higher memory bandwidth. For purely CPU bound situations, the ARM cores are spanked by Zen 4."

Power metrics will definitely be interesting once NV decides to expose those stats to the Linux driver, but on a raw performance basis it sounds like Threadripper/EPYC/Zen 4 in general will remain the standard to beat.
#3
AnotherReader
Tropick wrote: "Power metrics will definitely be interesting once NV decides to expose those stats to the Linux driver, but on a raw performance basis it sounds like Threadripper/EPYC/Zen 4 in general will remain the standard to beat."
I expect the GH200's CPU to consume less power, but it isn't due to the CPUs being inherently any more power-efficient. Rather they don't have an IO die responsible for a large chunk of power.
#4
Aquinus
Resident Wat-man
AnotherReader wrote: "The wins are primarily due to much higher memory bandwidth."
Are they really though? The strong matrix math showing I think tells a different story. Memory copy performance doesn't suggest bandwidth differs by a huge margin.
#5
AnotherReader
Aquinus wrote: "Are they really though? The strong matrix math showing I think tells a different story. Memory copy performance doesn't suggest bandwidth differs by a huge margin."
The results of the memory copy sub-test of stress-ng seem very odd; the 7995WX should have double the DRAM bandwidth of the 7980X, but the difference is barely 13%. Other matrix multiplication benchmarks like DGEMM show the GH200 to lag the Zen 4 based Threadrippers appreciably. I think we need to be very careful when interpreting these results.

#6
theouto
Huh, I thought GH would do a lot better than it did, I am surprised.