Friday, September 8th 2023

Intel Demos 6th Gen Xeon Scalable CPUs, Core Counts Leaked

Intel demonstrated its advanced packaging prowess this week, and attendees were able to get an early-ish look at Team Blue's sixth-generation Xeon Scalable "Granite Rapids" processors. This multi-tile datacenter-oriented CPU family is projected to hit the market within the first half of 2024, but reports suggest that key enterprise clients have recently received evaluation samples. Coincidentally, renowned hardware leaker Yuuki_AnS has managed to source more information from industry insiders. This follows their complete blowout of more mainstream Raptor Lake Refresh desktop SKUs.

The leaked slide presents a bunch of evaluation sample "Granite Rapids-SP" XCC and "Sierra Forest" HCC SKUs. Intel has not officially published core counts for these upcoming "Avenue City" platform product lines. According to their official marketing blurb: "Intel Xeon processors with P-cores (Granite Rapids) are optimized to deliver the lowest total cost of ownership (TCO) for high-core performance-sensitive workloads and general-purpose compute workloads. Today, Xeon enables better AI performance than any other CPU, and Granite Rapids will further enhance AI performance. Built-in accelerators give an additional boost to targeted workloads for even greater performance and efficiency."
The more frugal family is described as: "Intel Xeon processors with E-cores (Sierra Forest) are enhanced to deliver density-optimized compute in the most power-efficient manner. Xeon processors with E-cores provide best-in-class power-performance density, offering distinct advantages for cloud-native and hyperscale workloads."

The leaked information suggests that the listed "Granite Rapids-SP" ES1 units max out at 56 cores and 288 MB of cache, spread across two compute chiplets with an eight-channel memory subsystem. It is possible that each tile carries either 28 or 30 cores, with two cores per chiplet possibly disabled for redundancy purposes. Final production processors could up the ante to around 84 to 90 cores, which would correspond to three such tiles. A Tom's Hardware analysis of Yuuki_AnS's slide proposes that: "the compute chiplets are made on Intel 3 (3 nm-class) process technology, whereas HSIO chiplets are fabbed on a 7 nm-class production node, which is a proven technology and is considered to be optimal for modern I/O chiplets in terms of performance and costs."
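
The tile arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the function name `active_cores` is made up for this example, and the three-tile layout for production parts is an assumption inferred from the 84 to 90 core estimate, not something Intel or the leak confirms.

```python
def active_cores(tiles: int, cores_per_tile: int, disabled_per_tile: int = 0) -> int:
    """Return the enabled core count for a package built from identical tiles."""
    if tiles < 1 or cores_per_tile < 1:
        raise ValueError("tiles and cores_per_tile must be positive")
    if not 0 <= disabled_per_tile <= cores_per_tile:
        raise ValueError("disabled_per_tile must be between 0 and cores_per_tile")
    return tiles * (cores_per_tile - disabled_per_tile)


# Two ways to arrive at the leaked 56-core ES1 figure with two tiles:
# 28-core tiles fully enabled, or 30-core tiles with two cores fused off each.
es1_full_tiles = active_cores(tiles=2, cores_per_tile=28)
es1_fused_tiles = active_cores(tiles=2, cores_per_tile=30, disabled_per_tile=2)

# ASSUMPTION: production parts use three tiles (not confirmed by the source).
production_range = (active_cores(3, 28), active_cores(3, 30))

print(es1_full_tiles, es1_fused_tiles, production_range)  # 56 56 (84, 90)
```

Both two-tile readings land on the same 56 cores, which is why the leak alone cannot settle whether the tiles physically carry 28 or 30 cores.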
Source: Tom's Hardware

12 Comments on Intel Demos 6th Gen Xeon Scalable CPUs, Core Counts Leaked

#1
Panther_Seraphin
Interesting to see the shift away from SMT due to all the vulnerabilities that have been discovered in recent years.
#2
Wye
It looks like pretty low frequency: min 1.2-1.6, max 2.4-2.7 GHz.
Single-threaded flows would have low performance.
#3
lemonadesoda
I am not happy to see TDP as high as 350 W.
#4
marios15
Same clocks
Same core counts
Higher TDP

Such innovation!
#5
unwind-protect
Wye said: "It looks like pretty low frequency: min 1.2-1.6, max 2.4-2.7 GHz. Single-threaded flows would have low performance."
You are supposed to have applications that these Xeons contain special accelerators for.

(I don't, except for the new zstd compression.)
#6
Toothless
Tech, Games, and TPU!
Wye said: "It looks like pretty low frequency: min 1.2-1.6, max 2.4-2.7 GHz. Single-threaded flows would have low performance."
Built for multi-core workloads. Cores used > speed.
#7
Minus Infinity
Panther_Seraphin said: "Interesting to see the shift away from SMT due to all the vulnerabilities that have been discovered in recent years."
That's not why they are doing this. They cannot get SMT to work with the new chiplet designs. It won't feature in Arrow Lake either, but they say it may come back later, post Lunar Lake. The new architecture, chiplets, substrate, etc. are already proving enough of a headache for them. They are still claiming Arrow Lake will beat Raptor Lake in multithreaded apps despite losing SMT, but I'm not sure if that's only for the halo 8P+32E variant or all variants like-for-like.
#8
ncrs
Wye said: "It looks like pretty low frequency: min 1.2-1.6, max 2.4-2.7 GHz. Single-threaded flows would have low performance."
I am pretty certain those specific clocks are due to the "ES1" suggesting an engineering sample.
I would be very surprised to see those clocks in the final product, especially since AMD can do way better:
This was even in a less-than-stellar cooling server where the temps were in the 75C range, yet all 256 threads were loaded and all 128 cores sat at 3.1GHz.
This is a next-next product manufactured on a completely new node, so it's expected that early samples have lower clocks.
Panther_Seraphin said: "Interesting to see the shift away from SMT due to all the vulnerabilities that have been discovered in recent years."
This is only for the E-core based Xeons while P-core ones will have SMT.
Minus Infinity said: "That's not why they are doing this. They cannot get SMT to work with the new chiplet designs. It won't feature in Arrow Lake either, but they say it may come back later, post Lunar Lake. The new architecture, chiplets, substrate, etc. are already proving enough of a headache for them. They are still claiming Arrow Lake will beat Raptor Lake in multithreaded apps despite losing SMT, but I'm not sure if that's only for the halo 8P+32E variant or all variants like-for-like."
I don't think it's related to having chiplets, since the current Sapphire Rapids Xeons are also chiplet-based and feature SMT. The SMT-less E-core Xeons are targeted at a specific segment, mostly cloud computing, for which SMT is not a desirable feature. AMD also has the EPYC 9754S with factory-disabled SMT, which I find unusual given that you can already disable SMT in BIOS. Not that it matters much, since cloud vendors get specific off-market SKUs anyway.
#9
Wirko
ncrs said: "I would be very surprised to see those clocks in the final product, especially since AMD can do way better"
Not a fair comparison. At least in current products, AMD's Zen 4c core is 2/3 the size of a Zen 4 core while Intel's E core is 1/3 the size of a P core.
#10
ncrs
Wirko said: "Not a fair comparison. At least in current products, AMD's Zen 4c core is 2/3 the size of a Zen 4 core while Intel's E core is 1/3 the size of a P core."
I'm not sure what you're getting at. Zen 4c is not directly comparable to Intel E-cores either since it retains all the features of Zen 4. It's a space-optimized version of the same architecture while E-cores employ a completely different design.
Both Intel 4th gen Xeons and Zen 4 EPYCs have higher clocks than what this ES1 presents.
My point was that this is just an engineering sample so the clocks shouldn't be taken as final. I probably shouldn't have compared it to AMD but to Intel's current gen, however that was the only solid source of raw clocks I remembered at the time. It's not something tested often.
#11
Wirko
ncrs said: "I'm not sure what you're getting at. Zen 4c is not directly comparable to Intel E-cores either since it retains all the features of Zen 4. It's a space-optimized version of the same architecture while E-cores employ a completely different design. Both Intel 4th gen Xeons and Zen 4 EPYCs have higher clocks than what this ES1 presents. My point was that this is just an engineering sample so the clocks shouldn't be taken as final. I probably shouldn't have compared it to AMD but to Intel's current gen, however that was the only solid source of raw clocks I remembered at the time. It's not something tested often."
Both 4c and E are optimised for space, and the result of this optimisation (performance per mm² gained vs large cores) will probably be similar between AMD's and Intel's approach. That was the whole point of my comment.
#12
ncrs
Wirko said: "Both 4c and E are optimised for space, and the result of this optimisation (performance per mm² gained vs large cores) will probably be similar between AMD's and Intel's approach. That was the whole point of my comment."
It's not going to be similar because E-cores are not just smaller P-cores. They do not have the same microarchitecture. E-cores are missing both AVX-512 and AMX when compared to 4th gen Xeon Scalable. If your workload can utilize AMX, then even going back to AVX-512 is going to decrease performance dramatically. Going further back to AVX2, which is what E-core Sierra Forest will have, yields an even bigger loss of performance. The cache structure is also significantly different between them. There are more differences in core design as well.
On the other hand, Zen 4c is the same core as Zen 4, with the same capabilities, just with less cache, lower frequency, and a slightly different structure due to having 2 CCXs on the CCD.
Your metric of perf per mm² gained can be easily calculated for Zen 4c, but for E-cores it's significantly harder due to its differences. It might work for workloads not utilizing anything above AVX2, but even then the cache structure complicates MT measurements.
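
For readers who want to play with the perf-per-mm² metric discussed in this exchange, here is a rough Python sketch. The area ratios (2/3 for Zen 4c vs Zen 4, 1/3 for E-core vs P-core) are the ones quoted in comment #9; the function names are invented for this example, and the relative-performance input is a purely hypothetical placeholder, not a measurement. As noted above, a single ratio like this only makes sense for workloads that use nothing beyond AVX2.

```python
from fractions import Fraction


def density_gain(relative_perf: Fraction, relative_area: Fraction) -> Fraction:
    """Perf-per-area of a small core relative to its big sibling (1 means parity)."""
    if relative_area <= 0:
        raise ValueError("relative_area must be positive")
    return Fraction(relative_perf) / Fraction(relative_area)


def break_even_perf(relative_area: Fraction) -> Fraction:
    """Relative per-core performance at which the small core merely ties on density."""
    return Fraction(relative_area)


# Area ratios as quoted in the thread (small core area / big core area).
ZEN4C_AREA = Fraction(2, 3)
E_CORE_AREA = Fraction(1, 3)

# HYPOTHETICAL placeholder: a small core delivering half the big core's throughput.
HYPOTHETICAL_PERF = Fraction(1, 2)

print(break_even_perf(ZEN4C_AREA), break_even_perf(E_CORE_AREA))
print(density_gain(HYPOTHETICAL_PERF, ZEN4C_AREA))
print(density_gain(HYPOTHETICAL_PERF, E_CORE_AREA))
```

The only thing the area ratios establish by themselves is the break-even point: a core at 1/3 of the area wins on density as soon as it delivers more than 1/3 of the big core's throughput, while a core at 2/3 of the area needs more than 2/3. Whether the two approaches end up similar depends entirely on the measured per-core performance, which the thread does not provide.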