Wednesday, September 20th 2023

Intel 288 E-core Xeon "Sierra Forest" Out to Eat AMD EPYC Bergamo's Lunch

At its Innovation 2023 event, Intel unveiled a 288-core, extreme core-count variant of the Xeon "Sierra Forest" processor for high-density servers in scale-out, cloud-native environments. It doubles the core count of the previously disclosed 144-core model. "Sierra Forest" is a server processor built entirely from efficiency cores, or E-cores, using the "Sierra Glen" core microarchitecture, a server-grade derivative of "Crestmont," Intel's second-generation E-core that's making its client debut with "Meteor Lake."

Xeon "Sierra Forest" is a chiplet-based processor, much like "Meteor Lake" and the upcoming "Emerald Rapids" server processor. It features a total of five tiles—two Compute tiles, two I/O tiles, and a base tile (interposer). Each of the two Compute tiles is built on the Intel 3 foundry node, a more advanced node than Intel 4, featuring higher-density libraries, and an undisclosed performance/Watt increase. Each tile has 36 "Sierra Glen" E-core clusters, 108 MB of shared L3 cache, 6-channel (12 sub-channel) DDR5 memory controllers, and Foveros tile-to-tile interfaces.
Each "Sierra Glen" E-core cluster features four CPU cores that share a 4 MB local L2 cache, and a 3 MB segment contributing to the tile's 108 MB L3 cache. Unlike the "Meteor Lake" Compute tile, which uses a ringbus to connect its E-core clusters and P-cores, the "Sierra Forest" Compute tile uses a Mesh topology interconnect for its large array of 36 E-core clusters. With 144 cores per tile, in its maximum configuration with two such tiles, "Sierra Forest" achieves 288 cores. "Sierra Glen" lacks SMT, just like "Crestmont," and so the OS only has 288 logical processors to address.
Besides the two Compute tiles, the processor has two I/O tiles. Unlike the similarly named "I/O tile" of the client "Meteor Lake" processor, the ones on "Sierra Forest" serve the functions of both the SoC and I/O PHY. With the memory controllers located on the Compute tiles, in its maximum 288-core variant, "Sierra Forest" features a 12-channel DDR5 memory interface.
The I/O tile is left with the UPI interconnect for 2P servers, application-specific accelerators, a 68-lane PCI-Express Gen 5 root complex that's flexible between PCIe Gen 5 and CXL 2.0, and the I/O Fabric. Despite being based on an advanced node like Intel 3, each of the two Compute tiles is an enormous 578 mm² in die area, while each of the two I/O tiles is 241 mm².
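The per-tile figures quoted above can be checked with a little arithmetic. The following is a minimal Python sketch, assuming the two-Compute-tile, two-I/O-tile maximum configuration from the tile breakdown; the class and function names are made up for illustration, and the silicon total leaves out the base tile, whose area the article does not give.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeTile:
    """Per-tile figures as quoted in the article (names are illustrative)."""
    clusters: int = 36           # "Sierra Glen" E-core clusters per tile
    cores_per_cluster: int = 4   # cores sharing one L2 cache
    l2_mb_per_cluster: int = 4   # shared L2 per cluster, in MB
    l3_mb_per_cluster: int = 3   # L3 segment contributed per cluster, in MB
    ddr5_channels: int = 6       # DDR5 channels per tile
    area_mm2: int = 578          # die area per Compute tile

    @property
    def cores(self) -> int:
        return self.clusters * self.cores_per_cluster

    @property
    def l3_mb(self) -> int:
        return self.clusters * self.l3_mb_per_cluster


def package_totals(tile: ComputeTile, compute_tiles: int = 2,
                   io_tiles: int = 2, io_tile_area_mm2: int = 241) -> dict:
    """Sum the per-tile figures over the whole package (base tile excluded)."""
    cores = tile.cores * compute_tiles
    return {
        "cores": cores,
        "logical_processors": cores,  # no SMT, so one thread per core
        "l2_mb": tile.clusters * tile.l2_mb_per_cluster * compute_tiles,
        "l3_mb": tile.l3_mb * compute_tiles,
        "ddr5_channels": tile.ddr5_channels * compute_tiles,
        "silicon_mm2": tile.area_mm2 * compute_tiles
                       + io_tile_area_mm2 * io_tiles,
    }


totals = package_totals(ComputeTile())
for name, value in totals.items():
    print(f"{name}: {value}")
```

Under these assumptions the package comes to 288 cores, 216 MB of L3 cache, 12 DDR5 channels, and 1,638 mm² of Compute and I/O silicon.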

The up to 12-channel memory interface of "Sierra Forest" comes with native support for ECC DDR5-6400. The accelerators are carried over from the current "Sapphire Rapids" processor, and provide speed-ups for popular cryptography, file-streaming, and data-compression operations.
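For a rough sense of what 12 channels of DDR5-6400 mean for 288 cores, the theoretical peak bandwidth can be estimated as below. This is a back-of-the-envelope Python sketch, not a measured figure: it assumes a 64-bit (8-byte) data path per channel with ECC bits excluded, and real sustained bandwidth will be lower. The function name is illustrative.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int,
                       bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal gigabytes).

    Assumes 8 bytes of data per transfer per channel (64-bit data path,
    ECC bits not counted).
    """
    return channels * mega_transfers_per_s * bytes_per_transfer / 1000


CHANNELS = 12    # maximum "Sierra Forest" configuration per the article
SPEED_MTS = 6400  # DDR5-6400
CORES = 288

total_gbs = peak_bandwidth_gbs(CHANNELS, SPEED_MTS)
per_core_gbs = total_gbs / CORES

print(f"peak: {total_gbs:.1f} GB/s, per core: {per_core_gbs:.2f} GB/s")
```

Under these assumptions the socket peaks at about 614 GB/s, or a little over 2 GB/s per core when all 288 cores are busy.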

When it arrives in the first half of 2024, Xeon "Sierra Forest" will square off against AMD's EPYC "Bergamo" processor. "Bergamo" is based on a slightly different philosophy than "Sierra Forest." It is a 128-core/256-thread processor based on "Zen 4c" cores that don't quite qualify as E-cores, and have an identical IPC to regular "Zen 4" cores, an identical ISA, and SMT.
Source: Tom's Hardware

40 Comments on Intel 288 E-core Xeon "Sierra Forest" Out to Eat AMD EPYC Bergamo's Lunch

#1
Jism
288 E-cores, yes - not 288 real P-cores or half-baked cores. AMD still holds the crown for the number of real cores at this point. And AMD has a 192-core/384-thread behemoth upcoming too.
#2
not_my_real_name
Each tile has 36 "Sierra Glen" E-core clusters, 108 MB of shared L3 cache, 6-channel (12 sub-channel) DDR5 memory controllers, and Foveros tile-to-tile interfaces.
The diagram says that the DDR controllers are located in the I/O tile.


Interesting solution. It might be better at inter-core communication and have better timings overall due to EMIB, but it looks like it will have lower performance than Bergamo. And huge tiles, EMIB, it won't be cheap...
#3
JustBenching
Jism: 288 E-cores, yes - not 288 real P-cores or half-baked cores. AMD still holds the crown for the number of real cores at this point. And AMD has a 192-core/384-thread behemoth upcoming too.
What is a real core? Please elaborate
#4
Wirko
not_my_real_name: And huge tiles, EMIB, it won't be cheap...
That's very probably true. We know quite a few technical details about these advanced packaging technologies, at least at Intel and TSMC. Unfortunately, we get to learn almost nothing about costs, yields, capacity constraints, logistics issues, suppliers other than Intel and TSMC, and more.
#5
NutZInTheHead
This is good. AMD and Intel going at each other's necks. We need the GPU side to be this fierce in competition. Not saying that AMD cards are weak by any means, but...
#6
Daven
Due to lack of SMT, Sierra Forest will need a huge IPC and/or clock increase to compete with Bergamo.

Sierra Forest - 288 threads, little E-cores
Bergamo - 256 threads, BIG Zen 4c cores
#7
qcmadness
fevgatos: What is a real core? Please elaborate
The total multi-thread performance of this 288-core one is probably lower than that of 128-core Epyc.
#8
JustBenching
qcmadness: The total multi-thread performance of this 288-core one is probably lower than that of 128-core Epyc.
And that answers the question how?
#9
ARF
fevgatos: What is a real core? Please elaborate
fevgatos: And that answers the question how?
:laugh:

A real core is a core which is a non-Atom type, with relatively high IPC.
Those E-cores are quite weak, hence the term "not real".
Normally, Intel uses these not-real cores for background tasks and processes which do not require super high IPC/performance.
#10
Chaitanya
Jism: 288 E-cores, yes - not 288 real P-cores or half-baked cores. AMD still holds the crown for the number of real cores at this point. And AMD has a 192-core/384-thread behemoth upcoming too.

Only 144 Cores/socket maximum not 288 Cores/socket.
#11
JustBenching
ARF: :laugh:

A real core is a core which is a non-Atom type, with relatively high IPC.
Those E-cores are quite weak, hence the term "not real".
Normally, Intel uses these not-real cores for background tasks and processes which do not require super high IPC/performance.
That's your definition of a core?? A non-Atom type? Okay buddy, great definition. I say a real core is a non-Ryzen type.
#12
AnarchoPrimitiv
It'll be interesting to see how the Sierra Forest core competes against a Zen 5c core (with the rumored 20% IPC improvement plus whatever improvements come from being on a 3 nm node; Zen 5, AMD has stated, is like a whole new architecture, unlike Zen 4). I guess the Bergamo successor will be 192 cores/384 threads. I wonder if AMD will ever have a 4-socket platform for Zen dense... it'd seem like a good application to develop one... 4x Zen 5c Turin 192-core chips making a 768-core/1536-thread system on a single board... imagine that'd be a huge board though with the socket size, haha
#13
mrnagant
Chaitanya: Only 144 Cores/socket maximum not 288 Cores/socket.
Dang, seems like everyone is running with the 288 cores on the chip. Even looking at a picture of the Intel presentation, the Sierra Forest demo says "2 processors with 288 cores". The confusion isn't even caused by an asterisk or anything. It's right there in your face.

So really, the one-up here, I think, is that (iirc) Bergamo is only a single-socket design.
#14
JustBenching
AnarchoPrimitiv: Elaborate
What's there to elaborate? If your core is named Ryzen it's not a real core. Especially the older Ryzen CPUs with Zen 2 cores, those are too slow and shouldn't be called real cores.
#15
las
Jism: 288 E-cores, yes - not 288 real P-cores or half-baked cores. AMD still holds the crown for the number of real cores at this point. And AMD has a 192-core/384-thread behemoth upcoming too.
Even ARM gains more and more server marketshare.

E-cores might be more suitable for servers than fewer P-cores

Depends on workload really.

E-cores are real cores anyway. Just slower but smaller. Depending on the task, E-cores can make more sense. 4 E-cores use the same die space as 1 P-core.
#16
ARF
las: E-cores are real cores anyway. Just slower but smaller. Depending on the task, E-cores can make more sense. 4 E-cores use the same die space as 1 P-core.
This means that the e-cores lack important components - like cache, some other units inside them which make them much weaker and slower than the real cores in Ryzen or p-cores in Core-i9.
#17
las
ARF: This means that the e-cores lack important components - like cache, some other units inside them which make them much weaker and slower than the real cores in Ryzen or p-cores in Core-i9.
E-cores have cache. And which components do they lack?

Even ARM is gaining more and more market share in the enterprise markets. Many enterprise workloads prefer a lot of cores; speed is not crucial.

E-cores and ARM do not have to be much slower; it depends on the software.
#18
AnotherReader
las: E-cores have cache. And which components do they lack?

Even ARM is gaining more and more market share in the enterprise markets. Many enterprise workloads prefer a lot of cores; speed is not crucial.

E-cores and ARM do not have to be much slower; it depends on the software.
E cores are slower than Zen 4 or Intel's big cores, but they are still fairly fast: approaching Skylake in IPC. For high core count servers, this performance level is adequate. I would have concerns around 12 channels being insufficient for 288 cores, but we'll see. For integer code, the E cores can be more efficient than the P cores if clocked sufficiently low. However, Zen 4 is very power efficient and Intel, despite the new node, will have a difficult time matching its power efficiency. Even Zen 2 beats E cores at performance per watt.

#19
TumbleGeorge
AnarchoPrimitiv: Zen 5, AMD has stated, is like a whole new architecture
I also believed this, years ago. After reading more, I believe the only main change is in the front end. You know: decoder, encoder, L1 I-cache, branch predictor... and a few other things.
#20
ARF
las: E-cores have cache. And which components do they lack?
The components which enable hyper-threading at least. Also, I think I said less cache, not without any cache.

I am not going to dig into the details of the corresponding architectures... though...
#21
AnotherReader
TumbleGeorge: I also believed this, years ago. After reading more, I believe the only main change is in the front end. You know: decoder, encoder, L1 I-cache, branch predictor... and a few other things.
You might be thinking of Zen 4 which is a refined Zen 3. There isn't enough public information about Zen 5 to determine how different it is from Zen 4.
#22
zlobby
Well, they will be eating something, that's for sure! Sure as a load of manure!
#23
Patriot
mrnagant: Dang, seems like everyone is running with the 288 cores on the chip. Even looking at a picture of the Intel presentation, the Sierra Forest demo says "2 processors with 288 cores". The confusion isn't even caused by an asterisk or anything. It's right there in your face.

So really, the one-up here, I think, is that (iirc) Bergamo is only a single-socket design.
Siena (SP6) is the 1P-only Zen 4c part; Bergamo works in all SP5 boards that take Genoa.
www.amd.com/content/dam/amd/en/documents/products/epyc/epyc-9004-series-processors-data-sheet.pdf
Good catch on the Tom foolery

Sierra is still only 144c/144t per socket.

edit: up to 144c/die, up to 2 dies per socket.
#25
Patriot
AnotherReader: If it's only 144 cores per socket, then it's DOA against Bergamo.
www.servethehome.com/intel-announces-288-e-core-sierra-forest-variant-at-innovation-2023/
Never mind, looks like they have a dual-die version with 2x 144 E-cores per socket.
www.intc.com/news-events/press-releases/detail/1648/intel-innovation-2023-empowering-developers-to-bring-ai
The slide referenced at Hot Chips was accurate; Intel was keeping the 288c dual-die version hidden, and it will be rare and probably very low clocked.

I guess now the question is... since they said >205 W/socket: if 205 W is the default 144c wattage... how low are clocks going to be, and how much power are 2 dies going to need?
www.intel.com/content/www/us/en/newsroom/news/2023-intel-innovation-day-1-livestream-replay.html#gs.5z3mbk
48 min into the livestream: "we kept a little secret."