Tuesday, October 6th 2020
AMD Big Navi GPU Features Infinity Cache?
As we near the launch of AMD's highly hyped, next-generation RDNA 2 GPU codenamed "Big Navi", more details are emerging and crawling their way to us. Rumors already suggest that this card will be called the AMD Radeon RX 6900 and that it will be AMD's top offering. Using a 256-bit bus with 16 GB of GDDR6 memory, the GPU will not use any type of HBM memory, which has historically been rather pricey. Instead, it looks like AMD will compensate for the narrower bus with a new technology it has developed. Thanks to new findings on the Justia Trademarks website by @momomo_us, we have information about the alleged "Infinity Cache" technology the new GPU uses.
VideoCardz reports that the internal name for this technology is not Infinity Cache; however, it seems that AMD could have changed it recently. What exactly does it do, you might wonder? Well, it is a bit of a mystery for now. It could be a new cache technology that allows L1 GPU cache sharing across the cores, or some connection between the caches found across the whole GPU. This information should be taken with a grain of salt, as we have yet to see what this technology does and how it works when AMD announces its new GPU on October 28th.
Source:
VideoCardz
141 Comments on AMD Big Navi GPU Features Infinity Cache?
wow
how about 6 GB of cache we don't even need
Fans are desperately searching for some argument that 256-bit GDDR6 will do anything more than hopefully get even with a 2080 Ti.
History repeats.
Bandwidth is bandwidth and cache is not new. Also... elephant in the room... Nvidia has needed expanded L2 cache since Turing to cater to its new shader setup with RT/tensor cores in it... yeah, I really wonder what magic Navi is going to have with a similar change in cache sizes... surely they won't copy over what Nvidia has done before them like they always have, right?! Surely this isn't history repeating, right? Right?!
:lovetpu: Let's revisit those assumptions post launch ;) That'll be fun, too. I'll take a bet... drivers will need hotfixing, which will likely come pretty late or create new issues along the way (note: Nvidia has fallen prey to this just as well, which alone should say enough); things will be out of stock shortly after launch, it's going to suck an easy 250-300 W just as well, and yes, you do have 16 GB on the top model.
If I'm wrong, I'll buy it :p
I am keeping my expectations really low after reading about that 256-bit data bus.
"But you don't need Ray Tracing" is not an excuse for a >$500 GPU.
Before you say that there are other API alternatives for ray tracing, not having dedicated RT cores will just hammer performance; just look at Crysis Remastered as an example (the game can leverage the RT cores).
A 2080 Ti has 134% the performance of a 5700 XT. The new flagship is said to have twice the shaders, likely higher clock speeds, and improved IPC. Only a pretty avid fanboy of a certain color would think that such a GPU could only muster some 30% higher performance with all that. GPUs scale very well; you can expect it to be between 170-190% the performance of a 5700 XT (rough math at the end of this post). Caches aren't new; caches as big as the ones rumored are. I should also point out that bandwidth and the memory hierarchy are completely hidden away from the GPU cores. In other words, whether it's reading at 100 GB/s from DRAM or at 1 TB/s from a cache, the GPU core doesn't care; as far as it is concerned, it's just operating on some memory at an address.
Rendering is also an iterative process where you need to go over the same data many times a second; if you can keep, for example, megabytes of vertex data in some fast memory close to the cores, that's a massive win.
GPUs hide memory bottlenecks very well by scheduling hundreds of threads. Another thing you might have missed is that, over time, the ratio of GB/s from DRAM per GPU core has been getting lower and lower. And somehow performance keeps increasing; how the hell does that work if "bandwidth is bandwidth"?
Clearly, there are ways of increasing the efficiency of these GPUs such that they need less DRAM bandwidth to achieve the same performance, and this is another one of those ways. By your logic, we must have had GPUs with tens of TB/s by now, because otherwise performance wouldn't have gone up.
They won't have much stock; most wafers are going to consoles. While performance/watt must have increased massively, perhaps even over Ampere, the highest-end card will still be north of 250 W.
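To put rough numbers on the estimate above (a minimal sketch; the doubled shader count is a rumor, while the clock, IPC, and scaling-efficiency figures are purely assumed):

```python
# Back-of-the-envelope scaling estimate for "Big Navi" vs. the 5700 XT.
# Only the doubled shader count comes from the rumor mill; everything else is an assumption.

shader_ratio = 80 / 40       # rumored 80 CUs vs. 40 CUs on the 5700 XT
clock_ratio = 1.10           # assumed ~10% higher sustained clocks
ipc_ratio = 1.05             # assumed modest per-clock improvement
scaling_efficiency = 0.80    # wide GPUs rarely scale linearly with shader count

estimate = shader_ratio * clock_ratio * ipc_ratio * scaling_efficiency
print(f"~{estimate:.0%} of 5700 XT performance")  # ~185% with these inputs
```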
Because it works and it does replace bandwidth: information that the GPU uses repeatedly is stored in and fetched from cache, and thus does not have to travel across the memory bus each time. The memory bandwidth saved by using cache can therefore be used for other data. So a 256-bit bus with a large, very effective cache equals MORE MEMORY BANDWIDTH. Nvidia already uses this approach on all its cards.
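A rough way to picture how a big cache "adds" bandwidth (a minimal sketch with made-up hit rates; the 16 Gbps GDDR6 speed is an assumption, and only the 256-bit bus comes from the rumor):

```python
# How much effective bandwidth a large on-die cache could provide on a 256-bit bus.
# Hit rates and memory speed below are illustrative assumptions, not AMD specifications.

bus_width_bits = 256                                 # rumored bus width
gddr6_gbps_per_pin = 16                              # assumed GDDR6 speed
dram_bw = bus_width_bits * gddr6_gbps_per_pin / 8    # 512 GB/s at the memory bus

for hit_rate in (0.0, 0.3, 0.5, 0.7):
    # Requests served from the cache never touch the bus, so the same DRAM
    # bandwidth covers more total traffic: effective = dram / (1 - hit_rate).
    effective_bw = dram_bw / (1 - hit_rate)
    print(f"hit rate {hit_rate:.0%}: ~{effective_bw:.0f} GB/s effective")
```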
More notably though, the 3080 is twice as fast.
Based on TPU review data: "Performance Summary" at 1920x1080, 4K for 2080 Ti and faster.
That said, let's not forget this GPU was pretty much made with the help of Sony and Microsoft because their consoles use RDNA2; that is a lot of (smart) people working on a product, so I do have faith that it will be good.
And personally I care little for "beating" Nvidia in "performance".
If it delivers good frames while going easy on the power consumption, and while costing, finally again, a reasonable amount of money and not the obscene prices being asked as of late, it's a winner in my book.
Heck, I would REALLY love it if we had a new RX 460/470/480 moment, where all games could be lifted up, where everyone could upgrade and get with the times.
This would also be really good for the evolution/implementation of Ray Tracing, the industry can only really make use of that if the world can use it.
Those RX cards were a godsend for me, it was a solid upgrade from my previous card w/o breaking the bank/my wallet.
Looking at the prices lately, most likely my only option will be the second-hand market again if I want the same performance uplift as last time (I went from a GTX 950 to an RX 570).
Based on Xbox Series X performance scaling over the X1X, it doesn't seem like RDNA2 has much in the way of IPC improvements over RDNA.
So with similar clocks, I expect the top-end 80 CU RDNA2 to be 55-65% faster than the 5700 XT depending on the resolution (assuming there is no bandwidth bottleneck).
But as we all know, RDNA2 will have noticeably higher clocks than RDNA1. I expect the average clocks of the 80 CU part to be in the 2-2.1 GHz range, which is a decent 10-13% above the 5700 XT. Assuming semi-linear scaling, this clock boost alone will put RDNA2 10-12% above RDNA1; now, with the addition of that massive shader count increase, it's probably reasonable to expect the top-end RDNA2 to be 75-85% faster than the 5700 XT, as Vya Domus predicted.
Expecting flagship RDNA2 to be only as fast as a 3070/2080Ti is not realistic, as it will probably beat them both comfortably.
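Following that line of reasoning with concrete numbers (a quick sketch; the CU-scaling and clock figures simply mirror the assumptions above and are not confirmed specs):

```python
# Combining the assumed CU scaling and clock uplift from the post above.
# None of these figures are confirmed; they are the poster's assumptions.

cu_scaling = (1.55, 1.65)   # assumed uplift from 80 CUs at equal clocks
clock_uplift = 1.12         # assumed ~12% higher clocks (roughly 2.0-2.1 GHz)

low, high = (s * clock_uplift for s in cu_scaling)
print(f"estimated {low - 1:.0%} to {high - 1:.0%} faster than the 5700 XT")
# -> roughly 74% to 85%, in line with the 75-85% figure above
```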
Also, don't extrapolate RDNA2 performance based on console numbers. They're not exactly comparable, it's more like comparing cashews to figs.
It's just that the real-world performance increase didn't suggest higher IPC than RDNA1 to me, based on how RDNA performs in comparison to the console.
Big Navi was a pipe dream of AMD loyalists left wanting a first-gen Navi high-end card.
Not to mention the issues with cross-platform benchmarking due to most console titles being very locked down in terms of settings etc. Digital Foundry does an excellent job of this, but their recent XSX back compat video went to great lengths to document how and why their comparisons were problematic.