Friday, December 8th 2023
Intel "Sierra Forest" Xeon System Surfaces, Fails in Comparison to AMD Bergamo
Intel's upcoming Sierra Forest Xeon server chip has debuted on Geekbench 6, giving a first look at its multi-core performance. Slated for release in the first half of 2024, Sierra Forest is equipped with up to 288 Efficiency cores, positioning it to compete with AMD's Zen 4c Bergamo server CPUs and with ARM-based server chips like those from Ampere for the favor of cloud service providers (CSPs). In the Geekbench 6 benchmark, a dual-socket configuration featuring two 144-core Sierra Forest CPUs was tested. The benchmark revealed a multi-core score of 7,770, surpassing most dual-socket systems powered by Intel's high-end Xeon Platinum 8480+, which typically score between 6,500 and 7,500. However, Sierra Forest's single-core score of 855 points was considerably lower, not even reaching half of the 8480+'s 1,897 points.
The difference in single-core performance is a matter of choice, as Sierra Forest uses Crestmont-derived Sierra Glen E-cores, which are more power- and area-efficient than the Golden Cove P-cores in the Sapphire Rapids-based 8480+. This design choice is particularly advantageous for server environments where high core counts are crucial, as CSPs usually partition their instances by the number of CPU cores. However, compared to AMD's Bergamo CPUs, which use Zen 4c cores, Sierra Forest lacks pure computing performance, especially in multi-core. Sierra Forest lacks Hyper-Threading, while Bergamo offers SMT with 256 threads on the 128-core SKU. Set against the Geekbench 6 scores of the AMD Bergamo EPYC 9754, the Sierra Forest results look a lot less impressive. Bergamo scored 1,597 points in single-core, almost double that of Sierra Forest, and 16,455 points in the multi-core benchmark, which is more than double. This is a significant advantage of the Zen 4c core, which cuts down on caches instead of being an entirely different core, as is the case with Intel's P- and E-cores. However, these are just preliminary numbers; we must wait for real-world benchmarks to see the actual performance.
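As a quick sanity check on the "almost double" and "more than double" comparisons, the ratios can be worked out directly from the scores quoted above. This is only a minimal sketch: the dictionary layout and the `ratio` helper are illustrative names, and the numbers are the preliminary Geekbench 6 results from this article, not new measurements.

```python
# Geekbench 6 scores as quoted in the article (preliminary leak figures).
scores = {
    "sierra_forest_2s": {"single": 855, "multi": 7770},   # 2x 144-core Sierra Forest
    "xeon_8480_plus": {"single": 1897},                   # multi-core quoted only as a 6,500-7,500 range
    "epyc_9754": {"single": 1597, "multi": 16455},        # AMD Bergamo
}


def ratio(numerator, denominator):
    """Return numerator / denominator rounded to two decimals."""
    return round(numerator / denominator, 2)


sf = scores["sierra_forest_2s"]
bergamo = scores["epyc_9754"]
spr = scores["xeon_8480_plus"]

# Bergamo relative to Sierra Forest.
bergamo_single_vs_sf = ratio(bergamo["single"], sf["single"])
bergamo_multi_vs_sf = ratio(bergamo["multi"], sf["multi"])

# Sierra Forest relative to the Xeon Platinum 8480+ in single-core.
sf_single_vs_spr = ratio(sf["single"], spr["single"])

print(f"Bergamo vs Sierra Forest, single-core: {bergamo_single_vs_sf}x")
print(f"Bergamo vs Sierra Forest, multi-core:  {bergamo_multi_vs_sf}x")
print(f"Sierra Forest vs 8480+, single-core:   {sf_single_vs_spr}x")
```

With the quoted scores this gives roughly 1.87x and 2.12x in Bergamo's favor, and puts Sierra Forest at about 0.45x of the 8480+'s single-core result.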
Sources:
BenchLeaks, Tom's Hardware
76 Comments on Intel "Sierra Forest" Xeon System Surfaces, Fails in Comparison to AMD Bergamo
www.techpowerup.com/310057/amd-zen-4c-not-an-e-core-35-smaller-than-zen-4-but-with-identical-ipc?amp
This is about server SKUs, individual performance is irrelevant, all that matters is perf/W.
Lowering the design frequency just for the Zen 4c cores allows the synthesis software to optimise those cores to be smaller, purely because of the lower frequency target.
The weakness of AMD's chiplet design is uncore power overhead in relation to core count. IFOP power hasn't really improved monumentally in the server space, but Zen 4c doubling core count for a given #CCD count is a huge point in Bergamo's favour for perf/W.
You could say that E-cores are pushed too hard in Core i, and are best in their efficiency band running Xeon clocks, but the same goes for Bergamo. Server Zen 4c is also close to its happy place.
I am not disagreeing with you, by the way, just saying that “lower” is relative in this case.
Whether that's because CCD 4c runs into heat density issues, is incapable of clocking higher due to physical constraints, or Vcore requirements become prohibitive, who knows. Fair
The single core score is no surprise to me.
Regarding the multi core score, my opinion is that the inter core capability of those e-cores is known to be kinda bad/slow and the socket to socket penalty for this 144+144 core system is an additional limit vs one big Bergamo CPU.
Again, how valuable those numbers are to the target market, I dunno.
Regarding Zen4 to Zen4c, think of the "c" as for compactified, those cores are denser and thus have different electrical properties like lower sweetspot clockspeed.
Additionally one Bergamo CCD houses two CCX, each with 16MB L3 Cache like Zen2 had, but with 8 cores per CCX where Zen2 only had 4 cores per CCX.
Bergamo looks to me like it has lower inter-CCX capability than normal Zen 4, because the IO-die link to a Zen 4c CCD (two CCXs) has only the same performance as the link to a Zen 4 CCD (one CCX).
Compared to desktop ratings, Intel server parts do a lot better at sticking to the designated rating so using that would be comparing perf/W.
Agreed about Geekbench multicore scores petering out as the core count gets extreme. Although, the compared parts are both extreme so maybe still comparable.
I wouldn't be surprised if the top 144 core version closes even the low-thread gap over Bergamo using higher frequencies than Bergamo.
After looking this up, Phoenix (Zen 4) maxes out at 16MB L3 / 8 cores, Phoenix 2 (Zen 4 + 4c) 16MB L3 / 2+4 cores, Genoa (Zen 4) 384MB / 96 cores (32MB / CCX), and Bergamo (Zen 4c) 256MB / 128 cores (16MB / CCX).
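The Genoa and Bergamo figures above can be cross-checked with a little arithmetic. The sketch below assumes only the totals quoted in this comment (total L3, core count, L3 per CCX); the `cache_breakdown` helper and the dictionary keys are made-up names for illustration.

```python
# Figures quoted in the comment: (total L3 in MB, core count, L3 per CCX in MB).
parts = {
    "Genoa (Zen 4)": (384, 96, 32),
    "Bergamo (Zen 4c)": (256, 128, 16),
}


def cache_breakdown(total_l3_mb, cores, l3_per_ccx_mb):
    """Derive CCX count, cores per CCX and L3 per core from the quoted totals."""
    ccx_count = total_l3_mb // l3_per_ccx_mb
    return {
        "ccx_count": ccx_count,
        "cores_per_ccx": cores // ccx_count,
        "l3_mb_per_core": total_l3_mb / cores,
    }


breakdown = {name: cache_breakdown(*spec) for name, spec in parts.items()}

for name, info in breakdown.items():
    print(name, info)
```

Both parts come out at 8 cores per CCX, which matches the earlier comment, while L3 per core drops from 4 MB on Genoa to 2 MB on Bergamo.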
Old man once said: 'If you can't beat 'em, change the benchmark'.
:D
GB6:
Running File Compression
Running Navigation
Running HTML5 Browser
Running PDF Renderer
Running Photo Library
Running Clang
Running Text Processing
Running Asset Compression
Running Object Detection
Running Background Blur
Running Horizon Detection
Running Object Remover
Running HDR
Running Photo Filter
Running Ray Tracer
Running Structure from Motion
GB5:
Running AES-XTS
Running Text Compression
Running Image Compression
Running Navigation
Running HTML5
Running SQLite
Running PDF Rendering
Running Text Rendering
Running Clang
Running Camera
Running N-Body Physics
Running Rigid Body Physics
Running Gaussian Blur
Running Face Detection
Running Horizon Detection
Running Image Inpainting
Running HDR
Running Ray Tracing
Running Structure from Motion
Running Speech Recognition
Running Machine Learning