Monday, January 8th 2024

AVX-512 Doubles Intel 5th Gen "Emerald Rapids" Xeon Processor Performance, Up to 10x Improvement in AI Workloads

Jan 8th, 2024 03:15

The latest round of tests by Phoronix shows the substantial performance gains that Intel's 5th Gen Xeon "Emerald Rapids" server CPUs deliver when using AVX-512 vector instructions. Enabling AVX-512 doubled throughput on average across a range of workloads, with specific AI tasks running more than 10x faster than with it disabled. On the top-end 64-core Platinum 8592+ SKU, the benchmarks saw minimal frequency differences between the AVX-512 on and off states, yet the 512-bit vector processing unlocked dramatic speedups, most visibly in the OpenVINO AI framework. Weld porosity detection, a workload with real-world applications, showed the biggest gains. Power draw also increased moderately, the expected tradeoff for a performance gain of this size.

With robust optimizations, the potential of the vector engine has now been clearly demonstrated. Workloads spanning AI, visualization, simulation, and analytics could see multi-fold speedups by upgrading to Emerald Rapids. Of course, the developer work needed to implement AVX-512 remains non-trivial. But for the data center applications that can take advantage, AVX-512 lets Intel partially close raw throughput gaps against AMD's core count leadership. Whether those targeted gains offset EPYC's wider general-purpose value depends on customer workloads, and AMD also supports the AVX-512 instruction set. Still, with tests showing dramatic upside, Intel is betting big on vector acceleration as its ace card. The geometric mean of all test results, along with the full review and benchmarks, is available from Phoronix.
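For readers unfamiliar with the metric, a geometric mean summarizes per-test speedup ratios without letting a single 10x outlier dominate the way an arithmetic mean would. The following is a minimal sketch of the calculation only; the function name and the sample ratios are made up for illustration and are not Phoronix's data or tooling.

```python
import math


def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed in log space."""
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical per-test speedups (AVX-512 on divided by AVX-512 off).
# These numbers are invented for the example.
example_speedups = [1.0, 4.0, 2.0]
overall = geometric_mean(example_speedups)  # cube root of 1 * 4 * 2
```

With these invented ratios the arithmetic mean would be about 2.33, while the geometric mean is the cube root of 8.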

Source: Phoronix


30 Comments on AVX-512 Doubles Intel 5th Gen "Emerald Rapids" Xeon Processor Performance, Up to 10x Improvement in AI Workloads

#26

Panther_Seraphin

Squared said: I don't think Intel had any reason to take away AVX-512 when E-cores are disabled other than product segmentation, unless it was literally removed from the die design to make for a smaller die.

Then that means the non-K 12600 and below could have AVX-512 but the ones above could not?

So in certain workloads a 90 dollar i3 could wipe the floor with a near 800 dollar i9? You really think Intel would let that fly? :D

#27

TumbleGeorge

Squared said: So why doesn't Meteor Lake support AVX-512?

Easy:
1. Product positioning.
2. The way Intel implements, or used to implement, AVX-512 drew a lot of power when in use. For AMD's implementation, by contrast, we've already seen some tests where the increase in consumption is negligible.
3. Meteor Lake is a mobile series in which Intel is betting on maximum performance from the graphics chiplet without exceeding the energy budget, so as to get more usage time out of one battery charge. For this purpose, it has even reduced the IPC slightly…

#28

ncrs

Squared said: So why doesn't Meteor Lake support AVX-512? I don't think it even supports AVX10. It has a new E-core, so this difference could've been solved. To follow your thinking and answer my own question, probably because Intel felt AVX-512 required too much silicon for an E-core and AVX10 didn't exist before Meteor Lake was finalized.

From the GCC source code we know (P_PROC_AVX2 instead of P_PROC_AVX512F) that Arrow Lake, Lunar Lake and Panther Lake all won't have AVX-512. At least for now. AVX10 is a complicated issue despite being an attempt to disentangle the AVX mess. I don't think it's even fully upstreamed and wired up in GCC.
Another reason is that Meteor Lake implements another level of E-cores called L(ow) P(ower) E-cores.
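To check an actual machine rather than compiler tables, one option on x86 Linux is to look for the `avx512f` flag in `/proc/cpuinfo`. This is only a rough sketch: the helper names `cpu_flags` and `has_avx512f` are made up for illustration, it assumes the x86 Linux "flags" line format, and it only tests the AVX-512 Foundation flag, not the other AVX-512 subsets.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line as a set."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx512f(cpuinfo_text):
    """True if the AVX-512 Foundation flag is listed (x86 Linux naming)."""
    return "avx512f" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print("AVX-512F reported:", has_avx512f(f.read()))
    except OSError:
        # Not Linux, or /proc is unavailable.
        print("could not read /proc/cpuinfo")
```

Note that the flag can be absent even when the silicon has the units, for example when the feature is disabled in firmware, so this reports what the OS sees and nothing more.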

#29

ThrashZone

TumbleGeorge said: Easy:
1. Product positioning.
2. The way Intel implements, or used to implement, AVX-512 drew a lot of power when in use. For AMD's implementation, by contrast, we've already seen some tests where the increase in consumption is negligible.
3. Meteor Lake is a mobile series in which Intel is betting on maximum performance from the graphics chiplet without exceeding the energy budget, so as to get more usage time out of one battery charge. For this purpose, it has even reduced the IPC slightly…

Hi,
Yep, a lot of power and a lot of heat.

#30

Squared

TumbleGeorge said: 2. The way Intel implements, or used to implement, AVX-512 drew a lot of power when in use. For AMD's implementation, by contrast, we've already seen some tests where the increase in consumption is negligible.

The Phoronix article linked by this article shows that the power consumption of AVX-512 in Emerald Rapids isn't bad. Since Emerald Rapids uses Raptor Cove cores, which are just Golden Cove cores (which came out 2 years ago) with more cache, and since the Redwood Cove architecture in Meteor Lake is just the die-shrunk version of Raptor Cove, Intel would've known that AVX-512 wouldn't be an efficiency problem in Meteor Lake.

TumbleGeorge said: 3. Meteor Lake is a mobile series in which Intel is betting on maximum performance from the graphics chiplet without exceeding the energy budget, so as to get more usage time out of one battery charge. For this purpose, it has even reduced the IPC slightly…

Meteor Lake wasn't designed to be mobile-only. As recently as last summer, Intel was updating Linux code to support Meteor Lake-S, the desktop version. It seems the decision to cut the desktop line came very late. Perhaps it couldn't reach high enough clock speeds to compete with Raptor Lake-S. I've never seen evidence of an IPC decrease in Meteor Lake, nor any evidence of an architectural difference between Raptor Cove and Redwood Cove. Surely if an architecture update was made to improve efficiency, Intel would've brought it up? Actually isolating a reduction in instructions per clock cycle would require comparing multiple Raptor Lake laptops to multiple Meteor Lake laptops running multiple benchmarks while monitoring the frequency, looking for a pattern of similar frequency but lower performance. No one has done this test.
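The comparison described above could be roughed out as follows, using benchmark score divided by average clock as a crude stand-in for IPC. This is a sketch under assumptions: the function names are hypothetical, the (score, GHz) pairs are invented and not measurements of any real laptop, and a real test would also need to control for memory, power limits and thermals.

```python
from statistics import median


def score_per_ghz(score, avg_ghz):
    """Benchmark score normalized by average clock, a crude IPC proxy."""
    if avg_ghz <= 0:
        raise ValueError("average frequency must be positive")
    return score / avg_ghz


def relative_ipc(runs_old, runs_new):
    """Median score-per-GHz of the new platform divided by the old one.

    Each argument is a list of (score, avg_ghz) pairs, one per laptop
    and benchmark run. A result below 1.0 would hint at lower IPC.
    """
    old = median(score_per_ghz(s, f) for s, f in runs_old)
    new = median(score_per_ghz(s, f) for s, f in runs_new)
    return new / old


# Invented numbers purely to show the arithmetic.
hypothetical_old = [(1000, 4.0), (1200, 4.8)]
hypothetical_new = [(900, 4.0), (1080, 4.8)]
ratio = relative_ipc(hypothetical_old, hypothetical_new)  # about 0.9 here
```

Using the median across many runs is one way to look for the "similar frequency but lower performance" pattern without a single throttled laptop skewing the result.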
