Monday, November 8th 2021
AMD Could Use Infinity Cache Branding for Chiplet 3D Vertical Cache
AMD in its Computex 2021 presentation showed off its upcoming "Zen 3" CCD (core complex die) featuring 64 MB of "3D vertical cache" memory stacked on top of the 32 MB L3 cache. The die-on-die stacked contraption, AMD claims, provides a gaming performance uplift of up to 15%, as well as significant improvements for enterprise applications that can benefit from the 96 MB of total last-level cache per chiplet. Ahead of its debut later today in the company's rumored EPYC "Milan-X" enterprise processor reveal, we're learning that AMD could brand 3D Vertical Cache as "3D Infinity Cache."
This came to light when Greymon55, a reliable source of AMD and NVIDIA leaks, used the term "3D IFC," and affirmed it to be "3D Infinity Cache." AMD realized that its GPUs and CPUs have a lot of untapped performance potential that large on-die caches can unlock, since they can make up for much of the memory-management optimization the hardware would otherwise need. The RDNA2 family of gaming GPUs features up to 128 MB of on-die Infinity Cache memory operating at bandwidths as high as 16 Tbps, allowing AMD to stick to narrower 256-bit wide GDDR6 memory interfaces even on its highest-end RX 6900 XT graphics cards. For CCDs, this could mean added cushioning for data transfers between the CPU cores and the centralized memory controllers located in the sIOD (server I/O die) or cIOD (client I/O die in case of Ryzen parts).
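To put the 16 Tbps figure next to the 256-bit GDDR6 bus, here is a rough sketch of the arithmetic. The 16 Gbps per-pin GDDR6 data rate is an assumption that does not come from the article, and both results are peak, theoretical numbers rather than measured throughput.

```python
# Back-of-the-envelope peak bandwidth comparison (theoretical figures only).

def bus_bandwidth_gb_per_s(bus_width_bits: int, gbps_per_pin: float) -> float:
    """Peak bus bandwidth in GB/s: one pin per bus bit, 8 bits per byte."""
    return bus_width_bits * gbps_per_pin / 8


def tbps_to_gb_per_s(tbps: float) -> float:
    """Convert terabits per second to gigabytes per second (decimal units)."""
    return tbps * 1000 / 8


# Assumption (not stated in the article): GDDR6 running at 16 Gbps per pin.
GDDR6_GBPS_PER_PIN = 16.0

gddr6_gb_per_s = bus_bandwidth_gb_per_s(256, GDDR6_GBPS_PER_PIN)
infinity_cache_gb_per_s = tbps_to_gb_per_s(16.0)  # "as high as 16 Tbps"
ratio = infinity_cache_gb_per_s / gddr6_gb_per_s

print(f"256-bit GDDR6:  {gddr6_gb_per_s:.0f} GB/s")
print(f"Infinity Cache: {infinity_cache_gb_per_s:.0f} GB/s")
print(f"Cache / memory bandwidth ratio: {ratio:.2f}x")
```

Under that assumption, the on-die cache offers roughly four times the peak bandwidth of the external memory bus, which is the gap that lets a narrower interface remain competitive whenever requests hit the cache.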
Source:
Greymon55 (Twitter)
11 Comments on AMD Could Use Infinity Cache Branding for Chiplet 3D Vertical Cache
In other words, it doesn't take a lot for next gen Zen to be faster in most metrics and more power efficient in all of them.
Either way, it's good to see the giants topple over each other every release, that's where they need to be!
How much will the extra 64 MB help? We will have to see. It is still a significant upgrade to the L3 cache (three times as much in total). Zen 3 is already one of the CPUs with the most L3 cache, so that is a very significant number.
I can't wait to see what it will look like, because caching is key for CPUs, since they can't really hide latency the way GPUs can.
For GPUs, caching allows them to stay on a narrower (cheaper) bus with cheaper memory while still being competitive. It also allows higher clocks. Lower clocks with a larger chip were one way to hide latency, but the higher the clock, the more cycles you lose waiting for your data. This is one of the reasons why RDNA 2 could be built for higher clocks and Ampere was not.
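The point about losing more cycles at higher clocks can be sketched with a quick calculation. The 100 ns memory latency and the two clock speeds below are made-up illustrative values, not measurements of any real GPU.

```python
# Cycles spent waiting on one memory access = latency (ns) * clock (GHz),
# because a clock of N GHz completes N cycles per nanosecond.

def stall_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Clock cycles that pass while waiting latency_ns for data."""
    return latency_ns * clock_ghz


# Hypothetical values, chosen only to show the trend.
MEMORY_LATENCY_NS = 100.0

for clock_ghz in (1.8, 2.5):
    cycles = stall_cycles(MEMORY_LATENCY_NS, clock_ghz)
    print(f"{clock_ghz} GHz: about {cycles:.0f} cycles lost per miss")
```

The latency in nanoseconds stays the same, but the faster chip burns more cycles on every miss, so it has more to gain from a large cache that avoids the trip to memory.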
I would say 2010-2020 was all about power efficiency. I think that for the next decades, the company that best handles data movement and access will take the lead.
I like what I see from AMD, as they already seem to be shifting to this new paradigm.
CPU power usage depends not only on the architecture, but also on how the maker decides to run it (the famous V/F curve).
In the case of ADL, it's clear that if pushed to the max, its power usage goes way overboard, but under lighter loads it can be similar to Zen 3 for a few reasons:
- The CPU could process the load faster, reducing the number of work cycles and therefore the power usage
- The CPU could run cooler due to the low load, reducing the need for higher voltage
- The CPU could have a V/F curve that does not push for high voltage when there is just an average load
- The CPU could more easily disable unused parts of the chip or downclock them when needed
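Why the V/F curve matters so much can be sketched with the usual first-order approximation for dynamic power, P ≈ C·V²·f. This is a simplification that ignores leakage and everything else going on in a real chip, and the voltages and clocks below are hypothetical, not figures for ADL or Zen 3.

```python
# First-order CMOS dynamic power model: P is proportional to C * V^2 * f.
# With the same chip (same C), only the voltage and frequency ratios matter.

def relative_dynamic_power(v: float, f_ghz: float,
                           v_ref: float, f_ref_ghz: float) -> float:
    """Dynamic power at (v, f_ghz) relative to a reference point (1.0 = same)."""
    return (v / v_ref) ** 2 * (f_ghz / f_ref_ghz)


# Hypothetical operating points, for illustration only.
V_REF, F_REF = 1.0, 4.0  # reference: 1.0 V at 4.0 GHz

pushed = relative_dynamic_power(1.2, 5.0, V_REF, F_REF)   # pushed to the max
relaxed = relative_dynamic_power(0.9, 3.0, V_REF, F_REF)  # average load

print(f"Pushed (1.2 V, 5.0 GHz):  {pushed:.2f}x reference power")
print(f"Relaxed (0.9 V, 3.0 GHz): {relaxed:.2f}x reference power")
```

Because voltage enters squared, a 25% clock increase that needs 20% more voltage costs about 80% more dynamic power in this model, while backing off a little on both cuts power sharply.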
AMD has an advantage right now when all cores are being used. It's quite possible that when GPU-limited in gaming, both CPUs are equivalent for these reasons. But on desktop, the I/O die draws about the same power almost all the time, and it takes more power to move data to the chiplets than to move data inside a monolithic die. It doesn't matter much at max load, but under lighter loads it can be significant.
Not necessarily a problem that will arise in the first few years of ADL out in the wild, but it's something to consider. We have a very limited view on what this CPU will draw.
Annotated Zen 3 die on Reddit
AMD could easily extend their cache die from 6x6 mm to 7x6 mm and put some additional L2 cache on it, like 1 MB per core, above the original 512 KB of L2 per core. Or maybe they've already done it but didn't tell anyone. Easier said than done of course, and it requires a lot of forward thinking because the Zen 3 die would have to be designed for that from the beginning, but not impossible.