Monday, November 12th 2018

AMD "Zen 2" IPC 29 Percent Higher than "Zen"

AMD reportedly put out its IPC (instructions per clock) performance guidance for its upcoming "Zen 2" micro-architecture in a version of its Next Horizon investor meeting, and the numbers are staggering. The next-generation CPU architecture provides a massive 29 percent IPC uplift over the original "Zen" architecture. While not developed for the enterprise segment, the stopgap "Zen+" architecture brought 3-5 percent IPC uplifts over "Zen" on the back of faster on-die caches and improved Precision Boost algorithms. "Zen 2" is being developed for the 7 nm silicon fabrication process, and on the "Rome" MCM, it is part of the 8-core chiplets that aren't subdivided into CCXs (8 cores per CCX).

According to Expreview, AMD ran DKERN + RSA tests on the integer and floating-point units to arrive at a performance index of 4.53, compared to 3.5 for first-generation "Zen," which is a 29.4 percent IPC uplift (loosely interchangeable with single-core performance). "Zen 2" goes a step beyond "Zen+," with its designers turning their attention to critical components that contribute significantly toward IPC - the core's front-end, and the number-crunching machinery, the FPU. The front-ends of the "Zen" and "Zen+" cores are believed to be refinements of previous-generation architectures such as "Excavator." "Zen 2" gets a brand-new front-end that's better optimized to distribute and collect workloads between the various on-die components of the core. The number-crunching machinery gets bolstered by 256-bit FPUs, and generally wider execution pipelines and windows. These come together to yield the IPC uplift. "Zen 2" will get its first commercial outing with AMD's 2nd generation EPYC "Rome" 64-core enterprise processors.
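
As a quick check on the arithmetic, the 29.4 percent figure follows directly from the two reported index values. A minimal Python sketch (the function and variable names are illustrative, not taken from AMD or Expreview):

```python
def uplift_percent(new_value: float, old_value: float) -> float:
    """Relative uplift of new_value over old_value, in percent."""
    return (new_value / old_value - 1.0) * 100.0


zen2_index = 4.53  # DKERN + RSA figure reported for "Zen 2"
zen1_index = 3.5   # DKERN + RSA figure reported for "Zen"

print(f"{uplift_percent(zen2_index, zen1_index):.1f}%")  # prints 29.4%
```

This only reproduces the ratio of the two quoted numbers; it says nothing about how representative the underlying microbenchmark is.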

Update Nov 14: AMD has issued the following statement regarding these claims.
As we demonstrated at our Next Horizon event last week, our next-generation AMD EPYC server processor based on the new 'Zen 2' core delivers significant performance improvements as a result of both architectural advances and 7nm process technology. Some news media interpreted a 'Zen 2' comment in the press release footnotes to be a specific IPC uplift claim. The data in the footnote represented the performance improvement in a microbenchmark for a specific financial services workload which benefits from both integer and floating point performance improvements and is not intended to quantify the IPC increase a user should expect to see across a wide range of applications. We will provide additional details on 'Zen 2' IPC improvements, and more importantly how the combination of our next-generation architecture and advanced 7nm process technology deliver more performance per socket, when the products launch.
Source: Expreview

162 Comments on AMD "Zen 2" IPC 29 Percent Higher than "Zen"

#1
Prima.Vera
Bulldozer, Excavator, ... no thank you. No more hyping until the community benches are out. :rolleyes:
#2
londiste
Well, that definitely is not IPC, at least not until we know the clocks. The headline is bullshit.

For the rest of it, what exact tests are those? Zen2 apparently gets proper AVX which will indeed boost certain workloads considerably.
#3
Vayra86
Small disclaimer: *potentially* 29% higher than Zen, if nothing else gets in the way - which it always does.
#4
R0H1T
In case of AVX heavy benches, they will give similar real world throughput i.e. 29% or more. They pretty much doubled their AVX throughput in one go, the avg (across many other applications) could be half or a third of this.
londiste wrote: Well, that definitely is not IPC, at least not until we know the clocks. The headline is bullshit.

For the rest of it, what exact tests are those? Zen2 apparently gets proper AVX which will indeed boost certain workloads considerably.
AMD's probably given their best case performance numbers, why do you need to know the clocks when they've said the IPC is higher based on a performance index? Do you suppose they'll do an Intel here?
#5
btarunr
Editor & Senior Moderator
AMD's "59% higher" claims for Zen1 over Excavator invited the same ridicule.

Lisa Su is very careful about the guidance she puts out.
#6
kastriot
Is this 29% based on the same clock speeds for Zen 1 vs. Zen 2, or on a boosted Zen 2 core clock (like 4.5-4.8 GHz)?
#7
londiste
R0H1T wrote: AMD's probably given their best case performance numbers, why do you need to know the clocks when they've said the IPC is higher based on a performance index? Do you suppose they'll do an Intel here?
IPC = Instructions Per Clock.

Edit:
I was wrong, AMD does say these tests measure IPC.
ir.amd.com/news-releases/news-release-details/amd-takes-high-performance-datacenter-computing-next-horizon
Estimated increase in instructions per cycle (IPC) is based on AMD internal testing for “Zen 2” across microbenchmarks, measured at 4.53 IPC for DKERN +RSA compared to prior “Zen 1” generation CPU (measured at 3.5 IPC for DKERN + RSA) using combined floating point and integer benchmarks.
Didn't Zen have hardware acceleration for RSA?
#8
R0H1T
londiste wrote: IPC = Instructions Per Clock.
I mean, we sure use the term incorrectly already but the clock part there is still crucial. I suppose the Performance Index comes from test results. Tests are run at some clock speed, which is much more likely to be higher than in the Zen/Zen+ results, especially as AMD itself makes no note of IPC.
Yes but we don't even know what performance index indicates in this case, for instance do you know if the tests were carried out using fixed clocks? But when AMD says (officially?) that the IPC gain is about 30% they can't be lying about it, IPC is a specific term & AFAIK Intel & AMD know exactly what it means. The point being ~ take this application/result as a best case scenario given what we already know about Zen2 like better AVX, deriving anything more from the headline grabbing number is pointless.
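
To make the fixed-clock point concrete: if the 4.53 and 3.5 figures were raw scores rather than per-clock measurements, and a score scaled as IPC times clock speed, the same score gap would imply a different IPC gap depending on the clocks used. A rough Python sketch under that simple model; the clock speeds are hypothetical and the function name is made up for illustration:

```python
def implied_ipc_ratio(score_ratio: float, new_clock_ghz: float, old_clock_ghz: float) -> float:
    """IPC ratio implied by a score ratio, assuming score = IPC * clock."""
    return score_ratio * old_clock_ghz / new_clock_ghz


score_ratio = 4.53 / 3.5  # ratio of the two quoted figures
old_clock = 3.0           # hypothetical clock for the older chip, in GHz

for new_clock in (3.0, 3.3, 3.6):  # hypothetical clocks for the newer chip
    ratio = implied_ipc_ratio(score_ratio, new_clock, old_clock)
    print(f"{new_clock:.1f} GHz vs {old_clock:.1f} GHz: IPC uplift {(ratio - 1) * 100:.1f}%")
```

Only the equal-clock case gives back the full score gap as an IPC gap, which is why the test conditions matter.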
#9
randomUser
Simple math.

If Zen1 IPC is 1.00
Zen2 IPC is 29% higher than Zen1, so it will be 1.29

This means, that:
Zen1 will handle 1 instruction per 1 clock cycle
Zen2 will handle 1.29 instructions per 1 clock cycle.

If your task requires 1000 instructions to be completed, then:
Zen1 will finish this task in 1000 clock cycles;
Zen2 will finish this task in 775 clock cycles.
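
The same arithmetic as a small Python sketch, using the normalized IPC values from this post (1.00 and 1.29 are illustrative, not measured figures):

```python
def cycles_needed(instructions: int, ipc: float) -> float:
    """Clock cycles needed to retire a given number of instructions."""
    return instructions / ipc


zen1_cycles = cycles_needed(1000, 1.00)
zen2_cycles = cycles_needed(1000, 1.29)

print(round(zen1_cycles), round(zen2_cycles))  # prints 1000 775
print(f"{(1 - zen2_cycles / zen1_cycles) * 100:.1f}% fewer cycles")  # prints 22.5% fewer cycles
```

Note that 29% higher IPC works out to roughly 22.5% fewer cycles for the same work, not 29% fewer.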
#10
TheGuruStud
So 15% real world seems very doable. Oh, intel, luz. Better luck next time with your 15% in 8 yrs lol
#11
Lionheart
29% seems like a pipe dream but hey, I welcome it with open arms, I suspect 15% which is still a decent bump IMO :toast:
#12
dj-electric
If Zen2's gaming performance is similar per-core to Coffee Lake across the board, I'd have to slap my face a few times.
That would be waking up to a new reality, one that last existed over 12 years ago. Point some guns at me, but I'm skeptical about that.
#13
MDDB
"(...) is part of the 8-core chiplets that aren't subdivided into CCX (8 cores per CCX). "

Is this confirmed, that the CCXs are 8 cores now? I don't think I've seen it stated explicitly anywhere, would there be a source?
#14
dj-electric
MDDB wrote: "(...) is part of the 8-core chiplets that aren't subdivided into CCX (8 cores per CCX). "

Is this confirmed, that the CCXs are 8 cores now? I don't think I've seen it stated explicitly anywhere, would there be a source?
It is clear as day from the design of new EPYC. It includes 8 chiplets of 8 cores each next to the IO controller to complete 64 cores.
The chiplets themselves are quite small, and 2 of them could very possibly fit into a dual-chiplet AM4 CPU with 16 cores.

#15
bubbleawsome
I like this news quite a bit. One of the quicker 6 core chips from this could be the replacement for my 4670k.
#16
R0H1T
dj-electric wrote: It is clear as day from the design of new EPYC. It includes 8 chiplets of 8 cores each next to the IO controller to complete 64 cores.
The chiplets themselves are quite small, and 2 of them could very possibly fit into a dual-chiplet AM4 CPU with 16 cores.

It could still be 4 cores per CCX, from AT ~
The biggest downside from this being the insane number of IF links to make Rome o_O
#17
bug
Prima.Vera wrote: Bulldozer, Excavator, ... no thank you. No more hyping until the community benches are out. :rolleyes:
You're right to point out that, historically, numbers given in advance didn't do AMD any favors. However, in this case we already know there was work left to do, mainly around the memory controller. Some at AMD confirmed this much around the Zen launch. So we knew there was (at least theoretical) untapped potential in Zen. Of course, the proof is still in the pudding, but unlike Bulldozer and Excavator (which everyone knew were built on shaky ground), I believe AMD is at least worth the benefit of the doubt this time around. Plus, even if on average the improvement isn't 29%, but 20%, it would still be enough to gain a solid lead on Intel.
#18
dj-electric
R0H1T wrote: It could still be 4 cores per CCX, from AT ~
Could very well be, but I'm not too sure how economically efficient it would be to separate them, since the die is a much smaller one.
If I had to bet, my guess is that they will always appear in full physical form, and of course AMD is going to take the liberty of shutting down cores, letting us also enjoy 10-12 core parts on AM4.

With Zen gen 1 they were huge compared to those.
#19
EntropyZ
I hope for AMD's sake they aren't getting a bit overconfident. I'll wait until reviews come out to show how the improvements translate to performance gains in gaming and workstation workloads. They are surely keeping their momentum to steamroll Intel, they are winning some battles, but they haven't won the war.
#20
Aquinus
Resident Wat-man
R0H1T wrote: The biggest downside from this being the insane number of IF links to make Rome o_O
The biggest benefit of moving I/O off to a different die is that it makes the CCXs smaller if you don't make them bigger because all of that logic isn't in the CCX anymore and is instead located in the centralized I/O hub. Smaller dies means better yields, better yields means an opportunity to add more cores.

Personally, my concern is with latency, but I'm not sure if that's an unfounded issue or not. It's likely the case that it's more beneficial to move the I/O components. It's also possible that the I/O hub might not need to be done on the same process as the CCXs, which might further improve yields if the larger die is being done on a more mature process.
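
For a feel of why smaller dies tend to yield better, here is a rough Python sketch using the simple Poisson yield model (yield = exp(-area x defect density)). The model choice, the die areas, and the defect density are all assumptions for illustration, not AMD or foundry figures:

```python
import math


def poisson_yield(area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of defect-free dies under a simple Poisson defect model."""
    area_cm2 = area_mm2 / 100.0  # 1 cm^2 = 100 mm^2
    return math.exp(-area_cm2 * defects_per_cm2)


defect_density = 0.2  # hypothetical defects per cm^2

for area in (75, 200, 400):  # hypothetical die areas in mm^2
    print(f"{area} mm^2 die: {poisson_yield(area, defect_density) * 100:.1f}% yield")
```

Under this model the yield falls off exponentially with die area, so splitting the logic into small chiplets and keeping the large I/O die on a mature process (with a lower defect density) both help.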

I'm interested to see how Rome turns out because if it turns out well, it means that AMD is keeping up the pace that started with the first Zen chips, which is necessary to keep Intel on the defensive. If AMD can effectively double the number of cores without too much more cost, then Intel is going to remain on the defensive.

Intel: We can make mainstream 8c/16t CPUs too.
AMD: Hold my beer.
#21
Assimilator
Aquinus wrote: Intel: We can make mainstream 8c/16t CPUs too.
AMD: Hold my beer.
TBH I wouldn't call the 9900K "mainstream" due to its heat, price and availability. It's pretty clearly showing the limit of the Core uarch on 14nm, and I suspect that its successor will only show up once 10nm is fixed.
#22
TheGuruStud
bubbleawsome wrote: I like this news quite a bit. One of the quicker 6 core chips from this could be the replacement for my 4670k.
It appears to be a waste of materials to make anything less than 8 core to me.
#23
nemesis.ie
@Aquinus It was confirmed at the NH event that the I/O chip is on 14nm.

My guess is that it could be from GF which keeps GF in the game.

@TheGuruStud I would think they will make all the chiplets 8c, but should still be able to cut them down for market segmentation and using ones with faulty parts. I'm sure that's what @bubbleawsome meant, buying a 6-core CPU that could be 1 x 8 core with 2 faulty cores or, if space allows on the AM4 package, potentially 2 x 8 cores with 10 faulty cores between them (the latter being less likely, those would more likely go to TR or Epyc parts depending on the clock speeds but it could be done).
#24
bug
Assimilator wrote: TBH I wouldn't call the 9900K "mainstream" due to its heat, price and availability. It's pretty clearly showing the limit of the Core uarch on 14nm, and I suspect that its successor will only show up once 10nm is fixed.
95W+ or scarcity are not new to the mainstream market ;)
Even the price is not that out of this world, but at $500 it won't gain 10% market share, so yeah, not that mainstream after all.
#25
windwhirl
I think I'll keep my hopes for IPC improvement at 10-15 percent. Nearly 30% improvement is a bit too much to ask, although if it happens, well, that'd be nice.