Wednesday, September 20th 2023
Intel 288 E-core Xeon "Sierra Forest" Out to Eat AMD EPYC Bergamo's Lunch
At its 2023 InnovatiON event, Intel unveiled a 288-core extreme core-count variant of the Xeon "Sierra Forest" processor, aimed at high-density servers in scale-out, cloud-native environments. It sits above the previously announced 144-core model. "Sierra Forest" is a server processor built entirely from efficiency cores, or E-cores, using the "Sierra Glen" core microarchitecture, a server-grade derivative of "Crestmont," Intel's second-generation E-core that's making its client debut with "Meteor Lake."
Xeon "Sierra Forest" is a chiplet-based processor, much like "Meteor Lake" and the upcoming "Emerald Rapids" server processor. It features a total of five tiles: two Compute tiles, two I/O tiles, and a base tile (interposer). Each of the two Compute tiles is built on the Intel 3 foundry node, a more advanced node than Intel 4, featuring higher-density libraries and an undisclosed performance/Watt increase. Each tile has 36 "Sierra Glen" E-core clusters, 108 MB of shared L3 cache, 6-channel (12 sub-channel) DDR5 memory controllers, and Foveros tile-to-tile interfaces.
Each "Sierra Glen" E-core cluster features four CPU cores that share a 4 MB local L2 cache, and a 3 MB segment contributing to the tile's 108 MB L3 cache. Unlike the "Meteor Lake" Compute tile, which uses a ringbus to connect its E-core clusters and P-cores, the "Sierra Forest" Compute tile uses a Mesh topology interconnect for its large array of 36 E-core clusters. With 144 cores per tile, in its maximum configuration with two such tiles, "Sierra Forest" achieves 288 cores. "Sierra Glen" lacks SMT, just like "Crestmont," and so the OS only has 288 logical processors to address.
Besides the two Compute tiles, the processor has two I/O tiles. Unlike the similarly named "I/O tile" of the client "Meteor Lake" processor, the ones on "Sierra Forest" serve the functions of both the SoC and I/O PHY. With the memory controllers located on the Compute tiles, in its maximum 288-core variant, "Sierra Forest" features a 12-channel DDR5 memory interface.
The I/O tile is left with the UPI interconnect for 2P servers, application-specific accelerators, a 68-lane PCI-Express Gen 5 root complex that's flexible between PCIe Gen 5 and CXL 2.0, and the I/O Fabric. Despite being based on an advanced node like Intel 3, each of the two Compute tiles is an enormous 578 mm² in die area, while each of the two I/O tiles is 241 mm².
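The per-tile and per-package figures quoted above follow from simple multiplication. A minimal Python sketch using only the numbers given in the article (the variable names are illustrative, and the silicon-area total leaves out the base tile, whose size is not stated):

```python
# Figures as quoted in the article; names are illustrative only.
CLUSTERS_PER_TILE = 36        # "Sierra Glen" E-core clusters per Compute tile
CORES_PER_CLUSTER = 4         # cores sharing one L2 cache
L2_PER_CLUSTER_MB = 4         # local L2 per cluster
L3_SEGMENT_MB = 3             # L3 slice contributed by each cluster
MEM_CHANNELS_PER_TILE = 6     # DDR5 channels per Compute tile
COMPUTE_TILES = 2             # maximum configuration
IO_TILES = 2
COMPUTE_TILE_MM2 = 578
IO_TILE_MM2 = 241

cores_per_tile = CLUSTERS_PER_TILE * CORES_PER_CLUSTER
l3_per_tile_mb = CLUSTERS_PER_TILE * L3_SEGMENT_MB

total_cores = cores_per_tile * COMPUTE_TILES
total_l2_mb = CLUSTERS_PER_TILE * L2_PER_CLUSTER_MB * COMPUTE_TILES
total_l3_mb = l3_per_tile_mb * COMPUTE_TILES
total_mem_channels = MEM_CHANNELS_PER_TILE * COMPUTE_TILES

# Compute + I/O tiles only; the base tile's area is not given in the article.
tile_area_mm2 = COMPUTE_TILES * COMPUTE_TILE_MM2 + IO_TILES * IO_TILE_MM2

print(f"cores per tile:       {cores_per_tile}")
print(f"L3 per tile:          {l3_per_tile_mb} MB")
print(f"cores per package:    {total_cores}")
print(f"L2 per package:       {total_l2_mb} MB")
print(f"L3 per package:       {total_l3_mb} MB")
print(f"DDR5 channels:        {total_mem_channels}")
print(f"compute + I/O area:   {tile_area_mm2} mm^2")
```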
The up to 12-channel memory interface of "Sierra Forest" comes with native support for ECC DDR5-6400. The accelerators are carried over from the current "Sapphire Rapids" processor, and provide speed-ups for popular cryptography, file-streaming, and data-compression operations.
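To put the 12-channel DDR5-6400 interface in perspective, its theoretical peak bandwidth can be estimated as below. This is a back-of-the-envelope sketch and not an Intel figure: it assumes a 64-bit data path per channel (two 32-bit sub-channels, ECC bits excluded) and decimal gigabytes, and sustained bandwidth in practice will be lower.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int,
                        bytes_per_transfer: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (decimal).

    bytes_per_transfer=8 is an assumption: one 64-bit DDR5 channel
    (2 x 32-bit sub-channels), data bits only.
    """
    megabytes_per_s = channels * mega_transfers_per_s * bytes_per_transfer
    return megabytes_per_s / 1000.0


per_channel = peak_bandwidth_gb_s(1, 6400)
per_tile = peak_bandwidth_gb_s(6, 6400)
per_package = peak_bandwidth_gb_s(12, 6400)

print(f"per channel: {per_channel:.1f} GB/s")
print(f"per tile:    {per_tile:.1f} GB/s")
print(f"per package: {per_package:.1f} GB/s")
```

Under these assumptions, that works out to 51.2 GB/s per channel and 614.4 GB/s for the full 12-channel package.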
When it arrives in the first half of 2024, Xeon "Sierra Forest" will square off against AMD's EPYC "Bergamo" processor. "Bergamo" is based on a slightly different philosophy than "Sierra Forest." It is a 128-core/256-thread processor based on "Zen 4c" cores that don't quite qualify as E-cores, and have an identical IPC to regular "Zen 4" cores, an identical ISA, and SMT.
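The thread counts behind this comparison can be laid out in a few lines. The sketch below uses only the core counts and SMT details quoted in the article; the `Cpu` class and its field names are hypothetical, and the result says nothing about per-thread performance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    cores: int
    threads_per_core: int  # 1 = no SMT, 2 = 2-way SMT

    @property
    def logical_processors(self) -> int:
        return self.cores * self.threads_per_core


sierra_forest = Cpu("Xeon Sierra Forest (288-core)", cores=288, threads_per_core=1)
bergamo = Cpu("EPYC Bergamo", cores=128, threads_per_core=2)

for cpu in (sierra_forest, bergamo):
    print(f"{cpu.name}: {cpu.cores} cores, {cpu.logical_processors} logical processors")
```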
Source: Tom's Hardware
40 Comments on Intel 288 E-core Xeon "Sierra Forest" Out to Eat AMD EPYC Bergamo's Lunch
Interesting solution. It might be better at inter-core communication and have better timings overall due to EMIB, but it looks like it will have lower performance than Bergamo. And huge tiles, EMIB, it won't be cheap...
Sierra Forest - 288 threads, little E-cores
Bergamo - 256 threads, BIG Zen 4c cores
A real core is a core which is a non-Atom type, with relatively high IPC.
Those E-cores are quite weak, hence the term "not real."
Normally, Intel uses these "not real" cores for background tasks and processes which do not require super high IPC/performance.
Only 144 Cores/socket maximum not 288 Cores/socket.
So really, the one up here I think, iirc Bergamo is only a single socket design.
E-cores might be more suitable for servers than fewer P-cores
Depends on workload really.
E-cores are real cores anyway. Just slower but smaller. Depending on the task, E-cores can make more sense. 4 E-cores use the same die space as 1 P-core.
Even ARM is gaining more and more market share in the enterprise markets. Many enterprise workloads prefer a lot of cores; speed is not crucial.
E-cores and ARM do not have to be much slower; it depends on the software.
I am not going to dig into the details of the corresponding architectures... though...
www.amd.com/content/dam/amd/en/documents/products/epyc/epyc-9004-series-processors-data-sheet.pdf
Good catch on the Tom foolery
Sierra is still only 144c/144t per socket. Edit: up to 144c/die, up to 2 dies per socket.
Nevermind, looks like they have a dual die version with 2x 144e cores per socket.
www.intc.com/news-events/press-releases/detail/1648/intel-innovation-2023-empowering-developers-to-bring-ai
The slide referenced at hot chips was accurate, Intel was keeping the 288c dual die hidden and it will be rare and probably very low clocked.
I guess now the question is... since they said >205w /socket. If 205w is default 144c wattage... how low are clocks going to be, and how high of power are 2 dies going to need?
www.intel.com/content/www/us/en/newsroom/news/2023-intel-innovation-day-1-livestream-replay.html#gs.5z3mbk
48min into the livestream, we kept a little secret.