Wednesday, June 14th 2023

AMD Zen 4c Not an E-core, 35% Smaller than Zen 4, but with Identical IPC

AMD on Tuesday (June 13) launched the EPYC 9004 "Bergamo" 128-core/256-thread high-density compute server processor, and with it debuted the new "Zen 4c" CPU microarchitecture. Much had been made of Zen 4c in the run-up to yesterday's launch, including rumors that it is a Zen 4 "lite" core with less number-crunching muscle, and hence lower IPC, and that it is AMD's answer to Intel's E-core architectures, such as "Gracemont" and "Crestmont." It turns out to be neither a lite version of Zen 4 nor an E-core, but a physically compacted version of the Zen 4 core with identical number-crunching machinery.

First things first: Zen 4c has exactly the same IPC as Zen 4 (that is, the same performance at a given clock speed). This is because its front-end, execution stage, load/store component, and internal cache hierarchy are exactly the same. It has the same 88-deep load queue, the same 64-deep store queue, the same 6.75K-entry µop cache, the same INT+FP issue width of 10+6, the same INT register file, the same scheduler, and the same cache latencies. The L1I and L1D caches are the same 32 KB in size as on "Zen 4," and so is the dedicated L2 cache, at 1 MB.
The only thing that's changed is that the effective L3 cache per core has been reduced to 2 MB, from 4 MB on the 8-core "Zen 4" CCD. While the regular 8-core "Zen 4" CCD has eight "Zen 4" cores sharing a 32 MB L3 cache, the new 16-core "Zen 4c" CCD that AMD introduced with "Bergamo" packs two 8-core CCXs (CPU core complexes), each with 16 MB of L3 cache shared among the 8 cores of the CCX. In this respect, the last-level cache and CPU core organization of the "Zen 4c" CCD has some similarities to the "Zen 2" CCD (which used two 4-core CCXs).
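
The per-core L3 figures above follow directly from the CCD layouts described. As a rough illustration, the sketch below works them out; the `Ccd` class and its field names are made up for this example, and the capacities are simply the ones quoted in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ccd:
    """Hypothetical description of a CCD layout, for illustration only."""
    name: str
    ccx_count: int       # core complexes (CCXs) per chiplet
    cores_per_ccx: int   # cores sharing one L3 cache
    l3_mb_per_ccx: int   # L3 capacity per CCX, in MB

    @property
    def cores(self) -> int:
        # Total cores on the chiplet.
        return self.ccx_count * self.cores_per_ccx

    @property
    def l3_mb_per_core(self) -> float:
        # Effective L3 share per core within one CCX.
        return self.l3_mb_per_ccx / self.cores_per_ccx


# Figures as quoted in the article.
zen4 = Ccd("Zen 4", ccx_count=1, cores_per_ccx=8, l3_mb_per_ccx=32)
zen4c = Ccd("Zen 4c", ccx_count=2, cores_per_ccx=8, l3_mb_per_ccx=16)

for ccd in (zen4, zen4c):
    print(f"{ccd.name}: {ccd.cores} cores per CCD, "
          f"{ccd.l3_mb_per_core:.0f} MB L3 per core")
```

Note that each Zen 4c core can only reach the 16 MB of its own CCX, not the 32 MB total on the chiplet, which is why the per-CCX capacity is the relevant number here.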

What's interesting is that the 16-core "Zen 4c" CCD isn't AMD's first product from this generation with less last-level cache per core. The "Phoenix" APU silicon used in Ryzen 7040 series mobile processors sees eight "Zen 4" cores share a 16 MB L3 cache. For math-heavy compute workloads with a small memory footprint, "Zen 4c" should offer performance identical to "Zen 4" at a given clock speed; however, the smaller L3 cache should hurt performance in bandwidth-sensitive workloads with large data sets.
The Zen 4c CCD is built on the exact same TSMC 5 nm EUV foundry node that the company uses for its regular 8-core Zen 4 CCD. However, the Zen 4c CPU core is 35% smaller than the Zen 4 core, with a per-core die area of just 2.48 mm², compared to 3.84 mm². The die-size savings probably come from AMD "compacting" the various core components without reducing their form or function in any way. As we said earlier, the counts of the various core components remain the same, as do the sizes of the µop, L1, and L2 caches. EPYC 9004 "Bergamo" achieves its core count of 128 using eight of these 16-core Zen 4c CCDs. In comparison, the regular "Genoa" processor achieves 96 cores over twelve 8-core Zen 4 CCDs.
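
For readers who want to check the arithmetic, here is a small sketch that reproduces the area and core-count figures from the numbers quoted above. The variable names are chosen for this example only, and the areas cover the core itself as stated in the article, not the L3 cache or the rest of the CCD.

```python
# Per-core die areas quoted in the article, in mm^2.
ZEN4_CORE_MM2 = 3.84
ZEN4C_CORE_MM2 = 2.48

# Fraction of core area saved by the compacted design.
area_reduction = 1 - ZEN4C_CORE_MM2 / ZEN4_CORE_MM2

# How many Zen 4c cores fit in the area of one Zen 4 core (core area only).
zen4c_per_zen4_area = ZEN4_CORE_MM2 / ZEN4C_CORE_MM2

# Core counts: number of CCDs times cores per CCD, as given in the article.
bergamo_cores = 8 * 16   # eight 16-core Zen 4c CCDs
genoa_cores = 12 * 8     # twelve 8-core Zen 4 CCDs

print(f"Area reduction per core: {area_reduction:.1%}")
print(f"Zen 4c cores per Zen 4 core area: {zen4c_per_zen4_area:.2f}")
print(f"Bergamo: {bergamo_cores} cores, Genoa: {genoa_cores} cores")
```

The area reduction works out to roughly 35%, matching the headline figure, and about 1.55 Zen 4c cores fit in the footprint of one Zen 4 core.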

153 Comments on AMD Zen 4c Not an E-core, 35% Smaller than Zen 4, but with Identical IPC

#1
TheoneandonlyMrK
So all that's left unknown is the effect on max boost clocks.

I don't think the enterprise version ever needed the high-frequency capability that Zen has, so these probably cannot run as fast.

But it's intriguing.
#2
Daven
TheoneandonlyMrK wrote: "So all that's left unknown is the effect on max boost clocks. I don't think the enterprise version ever needed the high-frequency capability that Zen has, so these probably cannot run as fast. But it's intriguing."
There are clock specs listed at AMD.com.

www.amd.com/en/processors/epyc-9004-series

I guess those are the 4c products given the core counts.
#3
Mussels
Freshwater Moderator
Zen 3D - add more cache
This is Zen 1D, less cache!

(For certain workloads the cache matters less, so the product makes sense)
#4
Od1sseas
Intel can pack 4 E-Cores in the same size as 1 P-Core. What about AMD? How many Zen4c cores for one Zen 4 core?
#5
R0H1T
Does it matter? Two zen4c cores probably beat their best E (quad)cores across the board & are cheaper(?) to make with the chiplet approach!
#6
Daven
Od1sseas wrote: "Intel can pack 4 E-Cores in the same size as 1 P-Core. What about AMD? How many Zen4c cores for one Zen 4 core?"
As the headline states, Zen 4c is not an E-core after all.
#7
Mussels
Freshwater Moderator
Od1sseas wrote: "Intel can pack 4 E-Cores in the same size as 1 P-Core. What about AMD? How many Zen4c cores for one Zen 4 core?"
It's literally there in the post - they used this to fit 128 cores in the space they previously fit 96.
Intel's E-cores are also drastically inferior, whereas these are designed to have the exact same performance (at their intended task).
Intel's E-cores are as bad as half the speed, whilst still being 20% less efficient.


They even had a nice picture showing the sizes for you :(
#9
Mussels
Freshwater Moderator
It's all good, I just loathe Intel's E-cores because they used a name that is the exact opposite of the product to mislead people about them.

They're more efficient at single-threaded tasks, and then Intel uses them exclusively for multi-threaded tasks.
Just... Ugh.
#10
Chry
So a 35% increase in temperatures?
#12
Mussels
Freshwater Moderator
Chry wrote: "So a 35% increase in temperatures?"
Did you pull that from the 35% decrease in size, and just hope the math is the same?

Cause uh, halving the cache likely decreases those quite a bit
#13
Cippo95
I think that in the future AMD will combine this idea with 3D vertical cache: a compact design with normal L1 and L2 and a small L3, plus a big L3 on a vertical layer.
#14
Daven
Maybe 4c stands for Zen 4 ‘compact’. Siena might introduce Zen 4e as an E-core-like architecture. AFAIK, Intel’s E-core removes basic functional blocks to achieve a smaller footprint. Zen 4c does not remove any basic or even specialized (bfloat, VNNI, AVX512) functional blocks.
#15
R0H1T
Should've gone with zen 4l for lite or something, why the heck did they think 4c was better?
#16
Daven
Mussels wrote: "It's all good, I just loathe Intel's E-cores because they used a name that is the exact opposite of the product to mislead people about them. They're more efficient at single-threaded tasks, and then Intel uses them exclusively for multi-threaded tasks. Just... Ugh."
To be fair, I think the Thread Director will schedule E-cores for background tasks first regardless of how many threads are needed. But yeah, foreground tasks will be scheduled first to P-cores until more than 8 cores (16 threads?) are needed. I am not sure if the Thread Director is smart enough to know when a foreground task doesn’t need the computing might of P-cores and can therefore fall back to E-cores to save power.
#17
KellyNyanbinary
R0H1T wrote: "Should've gone with zen 4l for lite or something, why the heck did they think 4c was better?"
“c” for “cloud”. Gotta get the buzzwords in.
#18
Daven
R0H1T wrote: "Should've gone with zen 4l for lite or something, why the heck did they think 4c was better?"
C might literally stand for ‘compact’.
KellyNyanbinary wrote: "“c” for “cloud”. Gotta get the buzzwords in."
That makes sense too.
#19
R0H1T
Or maybe because they went with 3D V-Cache on the (EPYC) X variants? D with more cache & C for less? Still think 4l sounds better :ohwell:
#20
Darmok N Jalad
I believe AMD stated the "C" is for cloud. These are intended to offer more cores for processes to complete, as opposed to max performance.
#21
AusWolf
Thank god we're saved from heterogeneous scheduler nightmares on AMD for now (let me just pretend that the 7900X3D and 7950X3D don't exist). :D

I wonder if this design will ever make its way into consumer desktop.
#22
TheoneandonlyMrK
Od1sseas wrote: "Intel can pack 4 E-Cores in the same size as 1 P-Core. What about AMD? How many Zen4c cores for one Zen 4 core?"
Apples and oranges; there is a bigger gap between the E-cores and P-cores than this.

E-cores are single-threaded and have fewer resources, less capability, and a reduced ISA (no AVX, for example).

So yes, Intel does go smaller, but they are also weaker, less capable, and actually require process scheduler interaction.

A Zen 4c does everything a Zen 4 does, just probably a bit slower.
#23
R0H1T
TheoneandonlyMrK wrote: "Apples and oranges; there is a bigger gap between the E-cores and P-cores than this. E-cores are single-threaded and have fewer resources, less capability, and a reduced ISA (no AVX, for example). So yes, Intel does go smaller, but they are also weaker, less capable, and actually require process scheduler interaction. A Zen 4c does everything a Zen 4 does, just probably a bit slower."
They have AVX2 from ADL onwards, though Intel disabled AVX512 for that very reason on the P cores :slap:
#24
AusWolf
TheoneandonlyMrK wrote: "A Zen 4c does everything a Zen 4 does, just probably a bit slower."
That will depend on the application, I think. In non-gaming workloads, I expect this to be just as good as regular Zen 4.
#25
R0H1T
It should theoretically be cooler as well because it'll be clocked lower. I hope we see mainstream 32 cores soon :pimp: