No, it needs stupid high power to compete
in single-threaded tasks. Remember, it uses a lot less power per core in MT than in ST! Loading the first core adds
71W over idle power! True, this number also includes the uncore, memory controller etc. ramping up, but as Anandtech says, even when accounting for this: "
we’re still looking at ~55-60 W enabled for a single core." (their emphasis). Compare this to Zen3,
which peaks at less than 21W/core, no matter what. In other words, you can run three 5.05GHz (
nominally 4.9GHz, but higher in real life according to AT's per-core testing) Zen3 cores for each Golden Cove core at 5.2GHz in an instruction-dense workload.
So, in short, ADL is quite wide, and largely because of that width it is a power hog at high clocks in heavy workloads. When the GC architecture was first discussed,
people expressed concerns about the power hungry nature of expanding an x86 decoder beyond 4-wide - and that's likely part of what we're seeing here. As instruction density rises, power consumption skyrockets, and ADL
needs that 250W power budget to beat - or even keep up with! - Zen3 in lightly threaded applications. Remember, the 12900K reaches 160W with just 4 P-cores active, and creeps up on 200W with six of them active. None of which are at that 5.2GHz clock, of course.
Now, I spotted something looking back at the AT review that I missed yesterday: they note that in their SPEC testing, they saw 25-30W/core for ADL - which is obviously much better than 55-60W. They put this down to most SPEC workloads having much lower instruction density than POV-Ray, which they use for power testing. Still, assuming that ADL sees that 50% power drop in SPEC compared to POV-Ray and Zen3 sees
none (which is quite unlikely!), that's still a 25-50% power consumption advantage per core for AMD. That is somewhat offset by AMD's ~10W higher uncore power - though that ADL advantage disappears once more than two cores are active, thanks to Zen3's lower per-core power.
Going back to the ST/MT comparison and clock/power scaling for ADL: extrapolating from Anandtech's per-core load testing, and assuming uncore under load is 23W (78W package power minus their low 1c estimate of 55W): 2P = 44W/c, 3P = 36.7W/c, 4P = 34W/c, 5P = 30.4W/c, 6P = 29W/c, 7P = 27.3W/c, 8P = 27W/c. That last number is, according to the same AT test, at around 4.7GHz. Which is definitely
a lot better than 55W/core! But it's still 28% higher than Zen3's 21W (technically 20.6W) @ 4.9GHz. Now, ADL at 5.2GHz wins in SPEC ST by up to 16% (116% score vs. Zen3's 100%), with a 3% clock advantage (5.2 v 5.05GHz). Dropping its clocks to 4.7GHz, assuming a linear drop in performance, would drop that 16% advantage to a 4.8% advantage - and still at a presumable power disadvantage - or at best roughly on par. Sadly we don't have power scaling numbers per core for SPEC, but it's safe to assume that it doesn't see the same dramatic drop as POV-Ray, simply because it doesn't start as high.
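For transparency, the arithmetic behind both the per-core estimates and that 16%-to-4.8% figure can be sketched like this (a rough sanity check; the 23W uncore figure and linear performance-vs-clock scaling are assumptions from above, not measurements):

```python
# Per-core estimate: (package power - assumed 23W uncore) / active P-cores.
UNCORE_W = 23.0

def per_core_watts(package_w: float, active_cores: int) -> float:
    """Estimated watts per active P-core."""
    return (package_w - UNCORE_W) / active_cores

print(per_core_watts(78.0, 1))  # -> 55.0 W for the 1c case

# SPEC ST lead scaled linearly from 5.2GHz down to 4.7GHz:
adl_score, zen3_score = 116.0, 100.0
scaled = adl_score * 4.7 / 5.2
print(f"{scaled:.1f} vs {zen3_score:.0f} -> ~{scaled / zen3_score - 1:.1%} lead")
```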
And, crucially, all of this is comparing against Zen3's peak 1c power - and it too drops off noticeably as thread counts increase, with small clock speed losses.
The 5950X maintains 20W/c up to 4 active cores, then drops to <17W at 5 cores (@4.675GHz), and ~14-15W at 8c active (@4.6GHz). Zen3 also shows
massive efficiency gains at lower clocks, going as low as ~8W/core
@4GHz (13 cores active) or ~6W/core @3.775GHz (16 cores active).
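Laid out numerically, those figures show Zen3's per-core power falling much faster than clock (the wattages are approximate readings from AT's data, as quoted above):

```python
# (GHz, ~W/core, active cores) for the 5950X, from the figures above.
zen3_points = [
    (4.9,   20.6, 1),
    (4.675, 17.0, 5),
    (4.6,   14.5, 8),
    (4.0,    8.0, 13),
    (3.775,  6.0, 16),
]
for ghz, w, n in zen3_points:
    # W/GHz drops from ~4.2 at peak clock to ~1.6 at 3.775GHz
    print(f"{n:2d}c: {ghz:.3f} GHz, {w:4.1f} W/core, {w / ghz:.2f} W/GHz")
```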
If we graph out the respective clock/power scaling seen here, we get these two graphs (with the caveat that we don't have precise frequency numbers for ADL per core, and I have instead extrapolated these linearly between its 5.2GHz spec and the 4.7GHz seen in AT's 8P-core power testing):
View attachment 257237
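For anyone wanting to reproduce the ADL curve, the underlying points can be reconstructed like this (the linear clock interpolation between 5.2GHz at 1c and 4.7GHz at 8c is my assumption, as noted above):

```python
# Per-core watts from the earlier extrapolation; clocks linearly
# interpolated (assumed, not measured) from 5.2GHz (1P) to 4.7GHz (8P).
adl_w_per_core = [55.0, 44.0, 36.7, 34.0, 30.4, 29.0, 27.3, 27.0]
for i, w in enumerate(adl_w_per_core):
    ghz = 5.2 - 0.5 * i / 7
    print(f"{i + 1}P: ~{ghz:.2f} GHz, {w} W/core")
```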
What do we see there? That Zen3's power is still dropping quite sharply at 16 cores active (3.775GHz), while ADL's P-cores are flattening out in terms of power draw at just 8 P-cores active. We obviously can't extrapolate a line from these graphs down towards zero clock speed and expect it to match reality, but it still says something about the power and clock scaling of these two implementations - and it demonstrates how Zen3 scales very well towards lower power. As an added comparison, look at EPYC Milan:
the core (not including IF/uncore) power of the 64-core 7763 is just 164W in SPECint, translating to a staggeringly low 2.6W/core, presumably at its 2450MHz base clock.
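That per-core figure is just the straight division:

```python
# EPYC 7763: 164W total core power (uncore/IF excluded) over 64 cores.
core_w, cores = 164.0, 64
print(f"{core_w / cores:.2f} W/core")  # -> 2.56 W/core
```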
It is entirely possible -
even essentially established knowledge, given the much better efficiency of lower-spec ADL chips like the 12300 - that ADL/GC sees a downward step or a steeper power/clock drop at some clock below what the 12900K reaches with 8 P-cores active, but there's still no question that Zen3 scales
far better than ADL towards lower clocks, an advantage
somewhat offset by its higher uncore power, but nowhere near completely. ADL still has a slight IPC advantage, and wins out in ST applications that can take advantage of its high per-core boost even for lower spec chips + its low latencies. And it doesn't suffer as badly power-wise in less instruction dense workloads overall. But that doesn't make it more efficient than Zen3 - that simply isn't the case.
And I don't care much about your pulled-from-thin-air numbers that don't account for the increased interconnect complexity and resultant core-to-core latency increase of such a large die, or the other changes necessary for that implementation. Nor do you seem to grasp the problem with treating CB, a single point of reference, as the be-all, end-all of performance. It's a single tiled rendering benchmark, with all the peculiarities and characteristics of such a workload - and it isn't applicable to other nT workloads, let alone general workloads.
You mean the up-to-60c, 350W TDP, perennially delayed datacenter CPU that Intel
still hasn't released any actual specs for? Yeah, that seems like a very well functioning and unproblematic comparison, sure.
"Huge"? It's a 15% drop (or the 5800X runs 18% faster, depending on your baseline). It's clearly notable, but ... so what? Remember, at that point the 5800X draws literally twice the power per core compared to the 5950X. As seen above, Zen3 scales extremely well with lower clocks, and doesn't flatline until very low clocks. ADL, on the other hand, seems to flatline (at least for a while) in the high 4GHz range.
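The "15% vs 18%" is the same gap seen from the two baselines:

```python
# A 15% deficit from one baseline is a ~17.6% (i.e. "18%") advantage
# when measured from the other side.
drop = 0.15
faster = 1 / (1 - drop) - 1
print(f"{faster:.1%}")  # -> 17.6%
```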
Heck, if you want to provide more data to challenge what I'm taking from AT's testing: run POV-Ray at your various power limits and record the clock speeds. CB is far, far less instruction dense than POV-Ray (in other words: it's a much lighter and lower power workload) and would thus let a power limited ADL CPU clock far higher, so you can't use your CB numbers as a counterpoint to AT's POV-Ray numbers.
Oh, man, this made me laugh so hard. "You don't need an industry standard benchmark with a wide variety of different real-world tests baked in, there are hundreds of single, far less representative workloads you can try." Like, do you even understand what you are saying here? Cinebench is
one test. SPEC2017 is
twenty different tests (for ST,
23 for nT). SPEC is a widely recognized industry standard, and includes rendering workloads, compression workloads, simulation workloads, and a lot more. What you mentioned were, let's see, a rendering workload, a rendering workload, a rendering workload, and a compression workload. Hmmmmm - I wonder which of these gives a more representative overview of the performance of a CPU?
Come on, man. Arguing for CB being a better benchmark than SPEC is like arguing that a truck stop hot dog is a better meal than a 7-course menu at a Michelin-starred restaurant. It's not even in the same ballpark.
As I said, I would if I had time and we had the ability to normalize for all the variables involved - storage speed, cooling, other software on the system, etc. Sadly I don't have that time, and certainly don't have the spare hardware to run a new Windows install or reconfigure my cooling to match your system. We do not have what is necessary to do a like-for-like, comparable test run - and running a single benchmark like you're arguing for wouldn't be a representative view of performance anyhow. So doing so would at best produce rough ballpark results with all kinds of unknown variables. Hardly a good way of doing a comparison.