
AMD Zen 3 Could Bid the CCX Farewell, Feature Updated SMT

btarunr

With its next-generation "Zen 3" CPU microarchitecture designed for the 7 nm EUV silicon fabrication process, AMD could bid the "Zen" compute complex or CCX farewell, heralding chiplets with monolithic last-level caches (L3 caches) that are shared across all cores on the chiplet. AMD embraced a quad-core compute complex approach to building multi-core processors with "Zen." At the time, the 8-core "Zeppelin" die featured two CCXs with four cores each. With "Zen 2," AMD reduced the CPU chiplet to only containing CPU cores, L3 cache, and an Infinity Fabric interface, talking to an I/O controller die elsewhere on the processor package. This reduces the economic or technical utility of retaining the CCX topology, which limits the amount of L3 cache individual cores can access.

This and more juicy details about "Zen 3" came from a leaked (later deleted) technical presentation by company CTO Mark Papermaster. On the EPYC side of things, AMD's design efforts will be spearheaded by the "Milan" multi-chip module, featuring up to 64 cores spread across eight 8-core chiplets. Papermaster talked about how the individual chiplets will feature a "unified" 32 MB of last-level cache, which means a deprecation of the CCX topology. He also detailed an updated SMT implementation that doubles the number of logical processors per physical core. The I/O interface of "Milan" will retain PCI-Express gen 4.0 and an eight-channel DDR4 memory interface.
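
To make the cache arithmetic concrete, here is a rough Python sketch comparing how much L3 a single core can reach on a chiplet split into two CCXs versus one with a unified cache. The class and function names are made up for illustration. The 2 x 16 MB split for the CCX-style chiplet and the two-threads-per-core SMT figure are assumptions that are not stated in the article; the 8-core chiplet, the 32 MB unified L3 and the eight-chiplet "Milan" package are taken from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chiplet:
    """Hypothetical model of a CPU chiplet; not an AMD data structure."""
    cores: int        # physical cores on the chiplet
    l3_pools: int     # separate L3 pools (one per CCX, or 1 if unified)
    l3_total_mb: int  # total L3 on the chiplet, in MB

    def l3_reachable_per_core_mb(self) -> float:
        # A core can only use the L3 pool that belongs to its own group.
        return self.l3_total_mb / self.l3_pools

    def cores_sharing_each_pool(self) -> int:
        return self.cores // self.l3_pools


def logical_processors(cores: int, threads_per_core: int) -> int:
    """Logical processors exposed to the OS for a given SMT width."""
    return cores * threads_per_core


# Assumption: a CCX-style chiplet splits its 32 MB into two 16 MB pools.
ccx_style = Chiplet(cores=8, l3_pools=2, l3_total_mb=32)
# From the article: one unified 32 MB L3 shared by all 8 cores.
unified_style = Chiplet(cores=8, l3_pools=1, l3_total_mb=32)

for name, chiplet in (("two CCXs", ccx_style), ("unified", unified_style)):
    print(
        f"{name}: {chiplet.l3_reachable_per_core_mb():.0f} MB reachable per core, "
        f"{chiplet.cores_sharing_each_pool()} cores per pool"
    )

# From the article: "Milan" has up to eight 8-core chiplets.
milan_cores = unified_style.cores * 8
# Assumption: two threads per core, as with current SMT.
print(f"{milan_cores} cores, {logical_processors(milan_cores, 2)} logical processors")
```

The total L3 per chiplet is the same in both cases; what changes is how much of it any one core can reach and how many cores share each pool.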



"Milan" is expected to see a Q3-2020 debut with EPYC. Around the same time, AMD tapes out "Genoa," the company's next-generation processor that heralds an all-new enterprise socket dubbed SP5. A new socket gives AMD the opportunity to update and expand I/O, such as increasing the memory interface width, adding even more PCIe lanes, etc. The SP5 platform along with "Genoa" could see the light by 2021-22.

 
This gets weirder & weirder, there's like 2 train(wrecks) of leaking thoughts running in opposing directions with speeds exceeding the record-breaking TGV o_O

I wonder if all of this is a hoax just to keep AMD in the spotlight, not by AMD of course.
 
What's good for the goose.
 
[Attached image: AMD-EPYC-Milan-Zen-3-Server-CPU.png]
[Attached image: AMD-EPYC-Zen-CPU-Architecture-Roadmap.png]
 
I'm wondering why they're not putting an L4 in the IO die.
 
Or maybe they're just waiting for HBM3 to get cheap enough so that they can make a killer APU, the same could be used for regular L4 as well :pimp:
 
Or maybe they're just waiting for HBM3 to get cheap enough so that they can make a killer APU, the same could be used for regular L4 as well :pimp:
That sounds too expensive.
 
Right now, yes, but as I said, perhaps they're waiting for HBM to come down in price? I'm sure when China starts mass producing this stuff we'll most likely have cheaper options, even though the demand for HBM is also exploding.
 
This gets weirder & weirder, there's like 2 train(wrecks) of leaking thoughts running in opposing directions with speeds exceeding the record-breaking TGV o_O

I wonder if all of this is a hoax just to keep AMD in the spotlight, not by AMD of course.

Yeah, because it is so difficult lately for AMD to be in the spotlight and for all the good reasons.
 
Unified L3 cache does not necessarily mean they are doing away with CCX-s.
 
Interesting. I suppose going to a unified 8-core chiplet lets them eliminate cross-CCX latency and allow greater flexibility in binning since it will no longer matter which cores are defective.
 
Saying goodbye to the CCX is a great move forward. That should mean less latency, and if it comes to Ryzen/Threadripper as well, that should mean improved performance in games, as all 8 cores can now work together in the same chiplet without having to go through the I/O chip first. The now-shared L3 cache should also help in that case.

So that seems promising for Zen 3 or 7 nm+ when it comes to games.
 
So AMD is doing away with the MCM-on-package approach for the mainstream consumer market (yes, I'm referring to the desktop and mobile market), and thus their CPUs won't be hit by the IPC penalty. Viewed another way, their CPUs will gain even more IPC by having all the necessities in a monolithic package, since current Zen CPUs do more than just benefit from being paired with high-speed, low-CAS RAM. And my coffee kicks in as I finish reading the material. Good morning.
Maybe the internal testing showed that it can be done with yields that meet or beat projections.
This, from a still-sleepy, degree-less armchair general in computer engineering.
 
So AMD is doing away with the MCM-on-package approach for the mainstream consumer market (yes, I'm referring to the desktop and mobile market), and thus their CPUs won't be hit by the IPC penalty. Viewed another way, their CPUs will gain even more IPC by having all the necessities in a monolithic package, since current Zen CPUs do more than just benefit from being paired with high-speed, low-CAS RAM.
No, they won't. The core die consists of two logical CCXs. This is what they are suspected of trying to get rid of. It has no influence on the MCM-on-package setup.
 
The interesting information here is that Zen 3 EPYC will max out at 64 cores, it seems. That could indicate that they don't feel any pressure from whatever Intel plans to release.
 
Zen 2 > Zen 3 seems to be largely the same type of jump that Zen > Zen+ was. They will fix some of the more obvious problems and hopefully get the 10-15% from the improved manufacturing process, but no big changes.
 
Splitting the L3 cache into pieces is done so that access to its content is faster. That, of course, doesn't entirely leave out the possibility that merging it might be advantageous, since doing so could be combined with enlarging the L2 cache.
 
I look forward to reading and hearing more about it :) Obviously seeing some Cinebench runs of it will make or break me buying something... (I jest) Reviews will let me know whether it's going to suck or be, like most of AMD's recent releases, something to rave about :)

Exciting times :)
 
Splitting the L3 cache into pieces is done so that access to its content is faster. That, of course, doesn't entirely leave out the possibility that merging it might be advantageous, since doing so could be combined with enlarging the L2 cache.
The latency reduction from combining the cache and cores and eliminating CCX latency would likely more than offset the disadvantages of having 32 MB of cache in a single piece, especially if they lower the latency of that L3 cache slightly as well.
 
A combined (and consequently larger) cache employing the same topology inherently comes with higher latency, not lower latency.
 
A combined (and consequently larger) cache employing the same topology inherently comes with higher latency, not lower latency.

I think he meant higher speed to offset that.
 
I think he meant higher speed to offset that.
Yes. AMD's cache typically ships with higher latency than Intel's cache does, and a common speed boost for Phenom processors was tweaking cache latencies.

I don't know if Ryzen has the same problem though; I assume AMD is still using larger, slower caches to be able to ship 32 MB so cheaply.
 