Monday, May 31st 2021

AMD "Zen 3+" Microarchitecture is "Zen 3" with 3D Vertical Cache Technology, 15% Gaming Perf Gain

AMD CEO Dr Lisa Su, in her Computex 2021 keynote address, detailed what could very well be the "Zen 3+" microarchitecture that's been in the news lately, although the name "Zen 3+" was never used in the keynote. AMD has collaborated with TSMC on developing a new die-on-die 3D stacking technology using TSVs (through-silicon vias) and structural silicon, to place a 64 MB SRAM die on top of the "Zen 3" CCD, which it calls 3D Vertical Cache. This cache die sits directly over the region that has the CCD's own 32 MB L3 cache, and the difference in height between the two dies is leveled using structural silicon. At this point we don't know how the cache hierarchy is changed, whether the 64 MB add-on cache is contiguous with the on-die L3 cache, or whether it's an L4 victim cache to the L3$. With it, the total cache amount of the CCD jumps to 100 MB (4 MB of L2 caches + 32 MB L3 cache + 64 MB 3D Vertical Cache).
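The 100 MB figure can be checked with a quick tally. The sketch below is only illustrative: the names are made up for this example, and it assumes one CCD with 8 cores and 512 KB of L2 per core (which is where the 4 MB of L2 comes from).

```python
# Illustrative tally of cache capacity on one "Zen 3" CCD with 3D Vertical Cache.
# Assumptions (not stated in the keynote): 8 cores per CCD, 512 KB of L2 per core.
CORES_PER_CCD = 8
L2_PER_CORE_MB = 0.5   # 512 KB
L3_ON_DIE_MB = 32      # L3 cache on the CCD itself
VCACHE_MB = 64         # stacked 3D Vertical Cache die


def ccd_cache_total_mb(cores=CORES_PER_CCD, l2_per_core_mb=L2_PER_CORE_MB,
                       l3_mb=L3_ON_DIE_MB, vcache_mb=VCACHE_MB):
    """Return total L2 + L3 + stacked cache per CCD, in MB."""
    return cores * l2_per_core_mb + l3_mb + vcache_mb


if __name__ == "__main__":
    print(f"L2 total: {CORES_PER_CCD * L2_PER_CORE_MB} MB")
    print(f"CCD total: {ccd_cache_total_mb()} MB")
```

Whether the stacked 64 MB behaves as an extension of the L3 or as a separate level does not change this capacity sum, only how the hierarchy is counted.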

AMD has made some startling claims as to the performance impact of 3D Vertical Cache Technology. It claims that gaming performance improves by an average of 15%, which is akin to a generational performance impact in and of itself. With these gains, AMD hopes to make up whatever gaming performance deficit the "Zen 3" microarchitecture has against Intel's "Rocket Lake-S." The first processors implementing 3D Vertical Cache Technology will start arriving by the end of 2021, which means it could very well be the Ryzen 6000 series desktop processors, leaving the Ryzen 7000 series to be based on the 5 nm "Zen 4," on track to a 2022 release.
How AMD plans to release these updated dies on the client ecosystem remains a mystery. The prototype Dr Su showed in her keynote address clearly appears to be Socket AM4. If the new Socket AM5 is on course for a launch later this year, it's very likely that these "Zen 3 + 3D VC" CCDs could be paired with an updated cIOD (client I/O die) that supports DDR5 memory, and packaged for AM5.

56 Comments on AMD "Zen 3+" Microarchitecture is "Zen 3" with 3D Vertical Cache Technology, 15% Gaming Perf Gain

#26
Wirko
It looks like the cache die has the same surface area as the L3 cache that's part of the CCD, but twice the megabytes. How is that possible?
z1n0x: Cool stuff. That rock on the first screenshot though. ;)
What makes you think it's a rock and not a cut and polished part of an HBM stack?
#28
Punkenjoy
Wirko: It looks like the cache die has the same surface area as the L3 cache that's part of the CCD, but twice the megabytes. How is that possible?

What makes you think it's a rock and not a cut and polished part of an HBM stack?
It looks like they can stack multiple layers, so it's possible this 64 MB is 2 layers of 32 MB. But it's also quite possible that it's just denser memory and it spreads a bit over the cores' space. The 32 MB uses quite a bit of the central space, but not all of it.

If you look at only the SRAM, it's only a portion of the actual space, maybe they reuse the same L3 control and tags.


I suspect the SRAM on top spreads over the L2 cache of each core too.

As per Andreas Schilling on Twitter, who spoke with AMD:

- Zen 3 was made with that in mind. No modification needed.
- it's 1 layer of 64 MB of cache
- It actually expands the level 3 cache, with minimal latency increase.

Based on that and the picture above, I suspect the 64 MB is way denser than the 32 MB below. I suspect the L3 control and L3 tags also support the SRAM in the V-Cache.
#29
TheoneandonlyMrK
TumbleGeorge: Cache on top of the crystal? It will get all the heat from the other working transistors below it and we will have a roasted cache.
If you were paying attention, Lisa Su said it's on top of the cache already there. The CPU cores get silicon shims by the look of it; not great, but not the CPU baking the SRAM either.
#30
Minus Infinity
nguyen: Oh boy I hope Ryzen 6000 is still on AM4, that would be awesome.
Oh for sure it will be. AM5 is Raphael as is DDR5. I will probably upgrade my 3700X and put that in my older PC and grab the 6000 series equivalent, 6800X I presume, for my gaming rig.
#31
Wirko
TheoneandonlyMrK: If you were paying attention, Lisa Su said it's on top of the cache already there. The CPU cores get silicon shims by the look of it; not great, but not the CPU baking the SRAM either.
Hmm, I think we (including me) got fooled by those nice illustrations and forgot that CPUs come in flip-chip packages. The cache die and silicon shims are under the main die, away from the heat spreader. The shims don't need to conduct a lot of heat but they need to conduct power and signals, so they are actually interposers.
#32
Vya Domus
Fouquin: SRAM can be this dense because it doesn't get hot. No need to worry about thermals changing any significant amount. Go check out Fritz's videos with the bare die CPUs and watch the cache blocks barely change temp as cores fire up all around.
SRAM caches can be one of the hottest portions of a chip, since they can also consume the most power. It depends somewhat on the workload; something that generates a lot of cache hits will create a lot of heat.

Also, caches scale in frequency with the clock speed of the CPU, so running the cores at something like 800 MHz will of course result in dramatically less heat output from the caches.
#33
Fouquin
Vya Domus: SRAM caches can be one of the hottest portions of a chip, since they can also consume the most power. It depends somewhat on the workload; something that generates a lot of cache hits will create a lot of heat.

Also, caches scale in frequency with the clock speed of the CPU, so running the cores at something like 800 MHz will of course result in dramatically less heat output from the caches.
I believe you have confused produced heat with conducted heat. SRAM is a good conduction vector for heat distribution within an IC, hence dense arrangements with higher heat production logic blocks being avoided. The most power-intensive aspect of SRAM is the pStatic continual drive voltage, which is a few millivolts. Power density of SRAM can be as much as 70x lower than that of the SoC it's mated to.
#35
TheoneandonlyMrK
Wirko: Hmm, I think we (including me) got fooled by those nice illustrations and forgot that CPUs come in flip-chip packages. The cache die and silicon shims are under the main die, away from the heat spreader. The shims don't need to conduct a lot of heat but they need to conduct power and signals, so they are actually interposers.
I wouldn't normally argue against what you just said, but Lisa Su showed off a heatspreader-less CPU with those shims removed, and at the heat spreader side; the TSV connection could be on the back, for the cache to attach to.
#36
Wirko
TheoneandonlyMrK: I wouldn't normally argue against what you just said, but Lisa Su showed off a heatspreader-less CPU with those shims removed, and at the heat spreader side; the TSV connection could be on the back, for the cache to attach to.
Possibly. The animation in Lisa's video shows both dies with structures (transistors, metal) on top, which isn't helpful if you want to decode the true orientation of both. But I take it that the stacking order is shown correctly.

However, there's that interesting bit in the illustration, "direct copper-to-copper bond" between the two dies. "Copper" should be the top metal layer, so the dies are bonded top-to-top. Looks clever ... but it also means that, firstly, AMD would have to NOT use flip-chip die bonding, contrary to regular Ryzens and other modern CPUs. To make it work, they'd have to make a version of substrate with much different (mirrored) wire routing. Secondly, TSVs would not be needed for "silicon-to-silicon communication" but for CCD-to-package substrate connections.

I must be missing something here. @FritzchensFritz ... ah damn, he's not a TPU forum member.
#37
Vya Domus
Fouquin: I believe you have confused produced heat with conducted heat.
It's not that simple. If the SRAM cache heats up, it will no longer conduct heat that well. I am sure AMD figured this will work just fine, but caches are really problematic when it comes down to the heat they can generate. Their saving grace is that they're uniform and are in contact with a lot of surface area.
#39
Unregistered
Emily: Looks like my 3900X won't last as long as I thought.
It will. It will continue doing exactly what it's doing right now for the next 10 years. It's an outstanding CPU. :)

Chasing the absolute highest performance numbers only leads to disappointment. They're only the 'best' for six months to a year.

Until August of last year I was using a 2700k from 2012 - it was chuggin' along just fine.
#40
Punkenjoy
As per AMD, the cache chiplet is made with a different library than the Zen 3 chiplet. Since there is no logic, they can use a library dedicated to cache and can make it way denser than on the Zen chiplet, where they have to deal with logic circuits.
#41
Colddecked
weekendgeek: It will. It will continue doing exactly what it's doing right now for the next 10 years. It's an outstanding CPU. :)

Chasing the absolute highest performance numbers only leads to disappointment. They're only the 'best' for six months to a year.

Until August of last year I was using a 2700k from 2012 - it was chuggin' along just fine.
TBF the Sandy Bridge i7 is like the GOAT of CPUs on so many levels.
#42
Mussels
Freshwater Moderator
Colddecked: TBF the Sandy Bridge i7 is like the GOAT of CPUs on so many levels.
Oh yeah. I STILL have my 2500K kicking around, they aged so well.
#43
tussinman
Emily: Looks like my 3900X won't last as long as I thought.
The 3900X will be maybe 5% slower at 1440p and 8-10% slower at 1080p. AMD and Intel claim max performance gains, but in real-world scenarios there's not a ton of difference, to be honest, unless you have a low-end chip.
Colddecked: TBF the Sandy Bridge i7 is like the GOAT of CPUs on so many levels.
The only CPU I have above it is the i7 920. My brother still gets high settings on most games with it and he literally bought it when George Bush was still president (winter of 2008). Not only that, the 6-core, 12-thread Xeon CPUs from that generation work in that socket and are dirt cheap, so he could technically get a significant performance boost for basically free on a platform that's almost 13 years old.
#44
Mussels
Freshwater Moderator
I saw a huge difference upgrading between each gen, but that's because I have high-end GPUs.

With a 3080 (before it died and I got the 3090) there was a good 30-40 FPS gain every gen with Ryzen, as I tested it with all the systems in the house for the lulz.

That said, I game at 165 Hz. Those gains at max FPS are meaningless if you don't have the high-refresh monitor and high-powered GPU to benefit from them.
The 5600X is my absolute happy recommendation for a max FPS gaming chip this gen: so stupidly fast, without the 5800X heat issues at all-core load.
#45
Punkenjoy
GeForce GPUs don't perform well when they are CPU-limited.
#46
tussinman
Punkenjoy: GeForce GPUs don't perform well when they are CPU-limited.
Not only that, the first two Ryzen generations were relatively weak, so of course the jump will be big from those to 3rd gen and 4th gen. Do that same 3080 test on an 8th gen Intel 8600 all the way to an 11th gen Intel 11600 and you won't see anywhere close to 30-40 FPS per gen (heck, in some games you won't even see a 30 FPS increase from 8th gen straight to 11th).
#47
Mussels
Freshwater Moderator
tussinman: Not only that, the first two Ryzen generations were relatively weak, so of course the jump will be big from those to 3rd gen and 4th gen. Do that same 3080 test on an 8th gen Intel 8600 all the way to an 11th gen Intel 11600 and you won't see anywhere close to 30-40 FPS per gen (heck, in some games you won't even see a 30 FPS increase from 8th gen straight to 11th).
Depends on the game engine. As you go from DX9 through DX12, they get better at multi-threading.
#48
Guwapo77
If this truly performs like the good Dr. says, I'll sell my 5950X for this. I want the best of the AM4 platform, as I'm done with the early adopter crew. I'll see AM5 in five or six years.
#49
mtcn77
Punkenjoy: As per AMD, the cache chiplet is made with a different library than the Zen 3 chiplet. Since there is no logic, they can use a library dedicated to cache and can make it way denser than on the Zen chiplet, where they have to deal with logic circuits.
AMD won't be stopping in the near future, by the looks of it. Wow, impressive!
#50
Midland Dog
Legacy-ZA: In South Africa we always have stock of these CPUs; however, though supply is no longer an issue, their prices remain the same. These stores run purely on so-called "discounts", as everything is discounted all the time (not really, they just look discounted, but it is just the normal price to fool people into thinking they are getting a great deal).
Surely your 4670K has more than 4.2 GHz in the tank. South Africa can't be much hotter than Western Australia.