Monday, October 7th 2019

AMD Zen 3 Could Bid the CCX Farewell, Feature Updated SMT

With its next-generation "Zen 3" CPU microarchitecture designed for the 7 nm EUV silicon fabrication process, AMD could bid the "Zen" compute complex, or CCX, farewell, heralding chiplets with monolithic last-level caches (L3 caches) that are shared across all cores on the chiplet. AMD embraced a quad-core compute complex approach to building multi-core processors with "Zen." At the time, the 8-core "Zeppelin" die featured two CCXs with four cores each. With "Zen 2," AMD reduced the CPU chiplet to contain only CPU cores, L3 cache, and an Infinity Fabric interface, talking to an I/O controller die elsewhere on the processor package. This reduces the economic or technical utility of retaining the CCX topology, which limits the amount of L3 cache individual cores can access.

This and more juicy details about "Zen 3" were revealed in a leaked (later deleted) technical presentation by company CTO Mark Papermaster. On the EPYC side of things, AMD's design efforts will be spearheaded by the "Milan" multi-chip module, featuring up to 64 cores spread across eight 8-core chiplets. Papermaster talked about how the individual chiplets will feature a "unified" 32 MB of last-level cache, which means a deprecation of the CCX topology. He also detailed an updated SMT implementation that doubles the number of logical processors per physical core. The I/O interface of "Milan" will retain PCI-Express gen 4.0 and an eight-channel DDR4 memory interface.

"Milan" is expected to see a Q3-2020 debut with EPYC. Around the same time, AMD tapes out "Genoa," the company's next-generation processor that heralds an all-new enterprise socket dubbed SP5. A new socket gives AMD the opportunity to update and expand I/O, such as increasing the memory interface width, adding even more PCIe lanes, etc. The SP5 platform along with "Genoa" could see the light by 2021-22.
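
To put rough numbers on the topology change described above, here is a minimal Python sketch. It assumes the current "Zen 2" chiplet splits its 32 MB of L3 into two 16 MB pools, one per 4-core CCX (a figure not stated in this article), and compares that with the rumored unified 32 MB pool. The `Chiplet` class, the `package_totals` helper, and the `threads_per_core` default are hypothetical names and values chosen for illustration, not anything from AMD's presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chiplet:
    cores: int      # physical cores on the chiplet
    l3_mb: int      # total L3 on the chiplet, in MB
    l3_pools: int   # separate L3 pools (one per CCX); 1 means unified

    def l3_visible_per_core_mb(self) -> float:
        # A core can only allocate into the L3 pool of its own CCX.
        return self.l3_mb / self.l3_pools

    def cores_sharing_a_pool(self) -> int:
        return self.cores // self.l3_pools


def package_totals(chiplet: Chiplet, chiplets: int = 8, threads_per_core: int = 2) -> dict:
    # threads_per_core is an assumption; the article only says SMT is "updated".
    cores = chiplet.cores * chiplets
    return {
        "cores": cores,
        "threads": cores * threads_per_core,
        "l3_mb": chiplet.l3_mb * chiplets,
    }


# Assumption: Zen 2 style chiplet, 2 CCXs x 16 MB each.
split_ccx = Chiplet(cores=8, l3_mb=32, l3_pools=2)
# Rumored Zen 3 style chiplet: one unified 32 MB pool.
unified = Chiplet(cores=8, l3_mb=32, l3_pools=1)

if __name__ == "__main__":
    for name, c in (("split CCX", split_ccx), ("unified", unified)):
        print(name, c.l3_visible_per_core_mb(), "MB visible per core,",
              c.cores_sharing_a_pool(), "cores per pool")
    print(package_totals(unified))
```

Under these assumptions the total L3 per chiplet is unchanged; what changes is how much of it a single core can reach and how many cores share one pool.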
Source: Tom's Hardware

26 Comments on AMD Zen 3 Could Bid the CCX Farewell, Feature Updated SMT

#1
R0H1T
This gets weirder & weirder, there's like 2 train(wrecks) of leaking thoughts running in opposing directions with speeds exceeding the record-breaking TGV o_O

I wonder if all of this is a hoax just to keep AMD in the spotlight, not by AMD of course.
Posted on Reply
#2
AsRock
TPU addict
What's good for the goose.
Posted on Reply
#4
FordGT90Concept
"I go fast!1!11!1!"
I'm wondering why they're not putting an L4 in the IO die.
Posted on Reply
#5
LiviuTM
FordGT90Concept: I'm wondering why they're not putting an L4 in the IO die.
Maybe the gains are too small to be worth it.
Posted on Reply
#6
R0H1T
Or maybe they're just waiting for HBM3 to get cheap enough so that they can make a killer APU, the same could be used for regular L4 as well :pimp:
Posted on Reply
#7
GoldenX
R0H1T: Or maybe they're just waiting for HBM3 to get cheap enough so that they can make a killer APU, the same could be used for regular L4 as well :pimp:
That sounds too expensive.
Posted on Reply
#8
R0H1T
Right now, but as I said, perhaps waiting for HBM to come down in price? I'm sure when China starts mass producing this stuff we'll most likely have cheaper options, even though the demand for HBM is also exploding.
Posted on Reply
#9
john_
R0H1T: This gets weirder & weirder, there's like 2 train(wrecks) of leaking thoughts running in opposing directions with speeds exceeding the record-breaking TGV o_O

I wonder if all of this is a hoax just to keep AMD in the spotlight, not by AMD of course.
Yeah, because it is so difficult lately for AMD to be in the spotlight and for all the good reasons.
Posted on Reply
#10
londiste
Unified L3 cache does not necessarily mean they are doing away with CCX-s.
Posted on Reply
#11
hellrazor
GoldenX: That sounds too expensive.
I'd spend the money for a 3900X with 8GB of HBM2 or (preferably) HBM3 as an L4 cache.
Posted on Reply
#12
discopanda
Interesting. I suppose going to a unified 8-core chiplet lets them eliminate cross-CCX latency and allow greater flexibility in binning since it will no longer matter which cores are defective.
Posted on Reply
#13
Tomgang
The goodbye to CCX is a great move forward. That should mean less latency, and if it comes to Ryzen/Threadripper as well, that should mean improved performance in games, as now all 8 cores can work together in the same chiplet without having to go through the I/O chip first, and the now-shared L3 cache should also help in that case.

So that seems promising for Zen 3 or 7 nm+ when it comes to games.
Posted on Reply
#14
dont whant to set it"'
So AMD is doing away with the MCM on package for the mainstream consumer market (yes, I'm referring to the desktop and mobile market), and thus their CPUs won't be hit by the IPC penalty or, viewed another way, their CPUs will get even more IPC by having all the necessities in a monolithic package, as current Zens do more than just being fancily coupled to high-speed, low-CAS RAM. And my coffee kicks in as I finish reading the material. Good morning.
Maybe the internal testing showed that it can be done with achieved or better-than-projected yields.
This, from a still-sleepy, degree-less-in-computer-engineering armchair general.
Posted on Reply
#15
londiste
dont whant to set it"': So AMD is doing away with the MCM on package for the mainstream consumer market (yes, I'm referring to the desktop and mobile market), and thus their CPUs won't be hit by the IPC penalty or, viewed another way, their CPUs will get even more IPC by having all the necessities in a monolithic package, as current Zens do more than just being fancily coupled to high-speed, low-CAS RAM
No they won't. The core die consists of two logical CCXs. This is what they are suspected to try and get rid of. It has no influence on the MCM on package setup.
Posted on Reply
#16
Vya Domus
The interesting information here is that Zen 3 EPYC will be 64 cores max, it seems. That could indicate that they don't feel any pressure from whatever Intel plans to release.
Posted on Reply
#17
londiste
Zen2 > Zen3 seems to be largely same type of jump that Zen > Zen+ was. They will fix some more obvious problems and hopefully get the 10-15% from improved manufacturing process but no big changes.
Posted on Reply
#18
quadibloc
Splitting the L3 cache into pieces is done so that access to its content is faster. That, of course, doesn't entirely leave out the possibility that merging it might be advantageous, since doing so could be combined with enlarging the L2 cache.
Posted on Reply
#19
phill
I look forward to reading and hearing more about it :) Obviously seeing some Cinebench runs of it will make or break me buying something... (I jest) Reviews will let me know whether or not it's going to suck or be like most of AMDs recent reviews, to rave about :)

Exciting times :)
Posted on Reply
#20
qcmadness
londiste: Zen2 > Zen3 seems to be largely same type of jump that Zen > Zen+ was. They will fix some more obvious problems and hopefully get the 10-15% from improved manufacturing process but no big changes.
4c-ccx to 8c-ccx is a major change already.
Posted on Reply
#21
TheinsanegamerN
quadibloc: Splitting the L3 cache into pieces is done so that access to its content is faster. That, of course, doesn't entirely leave out the possibility that merging it might be advantageous, since doing so could be combined with enlarging the L2 cache.
The latency reduction from combining the cache and cores and eliminating CCX latency would likely more than offset the disadvantages of having 32 MB of cache in a single piece. Especially if they put slightly lower latency in that L3 cache as well.
Posted on Reply
#22
dogsbody
Combined (and consequently larger) cache (employing the same topology) inherently comes with higher latency, not lower latency.
Posted on Reply
#23
TheGuruStud
dogsbody: Combined (and consequently larger) cache (employing the same topology) inherently comes with higher latency, not lower latency.
I think he meant higher speed to offset that.
Posted on Reply
#24
TheinsanegamerN
TheGuruStud: I think he meant higher speed to offset that.
Yes. AMD cache typically ships with higher latency than Intel's cache does, and a common speed boost for Phenom processors was tweaking cache latencies.

I don't know if Ryzen has the same problem though; I assume AMD is still using larger, slower caches to be able to ship 32 MB so cheaply.
Posted on Reply