Tuesday, October 6th 2020
AMD Big Navi GPU Features Infinity Cache?
As we near the launch of AMD's highly hyped, next-generation RDNA 2 GPU codenamed "Big Navi", more details continue to emerge. Rumors suggest that the card will be called the AMD Radeon RX 6900 and that it will be AMD's top offering. With 16 GB of GDDR6 memory on a 256-bit bus, the GPU will not use any type of HBM memory, which has historically been rather pricey. Instead, it looks like AMD will compensate for the narrower bus with a new technology it has developed. Thanks to new findings by @momomo_us on the Justia Trademarks website, we have information about the alleged "Infinity Cache" technology the new GPU uses.
VideoCardz reports that the internal name for this technology is not Infinity Cache; however, it seems that AMD could have changed the name recently. What exactly does it do, you might wonder? Well, that is a bit of a mystery for now. It could be a new cache technology that allows L1 cache sharing across GPU cores, or some interconnect between the caches found across the whole GPU. This information should be taken with a grain of salt, as we have yet to see what the technology does and how it works when AMD announces its new GPU on October 28th.
Source: VideoCardz
141 Comments on AMD Big Navi GPU Features Infinity Cache?
Especially considering who reigns on the console throne.
225 W to 300 W = 1.33×
1.33 × 1.5 ≈ 2×
So, yes: given the higher TDP and the claimed power efficiency increase, Big Navi could be twice as fast. Whether AMD really achieves 50% better power efficiency is another story, but I think they will at least come closer to their claim than Nvidia did: 1.9x on marketing slides, ~1.2x in reality. Rumors, rumors, rumors. There are also rumors about AMD giving false information to the AIBs and shipping cards with locked BIOSes. Nvidia gave false information to its AIBs too; for example, no AIB knew the real shader counts of Ampere before launch. The claim that AMD is only targeting GA104 also makes no sense. AMD has always said that Big Navi will target the high-performance segment, which definitely doesn't sound like a ...04 competitor.
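A quick back-of-the-envelope sketch of that scaling argument in Python; the 1.5× perf/W figure is AMD's marketing claim, not a measurement, and linear scaling of performance with power is itself a simplifying assumption:

```python
# Naive scaling model: performance ~ power budget x performance-per-watt.
tdp_old = 225             # W, RX 5700 XT board power
tdp_new = 300             # W, rumored Big Navi board power
perf_per_watt_gain = 1.5  # AMD's claimed +50% perf/W for RDNA 2

power_ratio = tdp_new / tdp_old                      # ~1.33x more power budget
perf_ratio = power_ratio * perf_per_watt_gain
print(f"Projected performance: {perf_ratio:.2f}x")   # ~2.00x
```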
Where did you say that before, btw? ;)
Spec-wise, Navi 21 and the 3070 are just freakishly similar:
256-bit bus - 448 GB/s bandwidth
20-22 TFLOPS FP32
220-230 W TGP
Around 2080 Ti performance, or +15% over it. Overall I think Navi 21 will be within spitting distance of the 3080 in 1080p/1440p gaming, but not so much at 4K or with ray tracing.
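For reference, that 448 GB/s figure follows directly from the bus width and the per-pin data rate; a minimal sketch, assuming 14 Gbps GDDR6 as used on Navi 10:

```python
# GDDR6 bandwidth = (bus width in bytes) x (per-pin data rate in Gbps).
bus_width_bits = 256
data_rate_gbps = 14  # per pin, as on the RX 5700 XT

bandwidth_gb_s = (bus_width_bits / 8) * data_rate_gbps
print(f"{bandwidth_gb_s:.0f} GB/s")  # 448 GB/s
```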
This thread is about a supposedly innovative memory architecture that would allow AMD to compete in the high end with only a 256-bit memory bus and GDDR6. This should allow AMD to build these cards cheaper. Whether they will sell them cheaper, and by how much, is a completely different story.
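Nobody outside AMD knows how such a cache would actually work, but a generic hit-rate model illustrates why a large on-die cache could offset a narrow bus. The cache bandwidth and hit rate below are purely illustrative assumptions, not leaked specs:

```python
# Illustrative only: effective bandwidth of a narrow DRAM bus fronted
# by a large on-die cache, under a simple hit-rate model.
dram_bw = 448.0    # GB/s, 256-bit GDDR6 @ 14 Gbps
cache_bw = 1600.0  # GB/s, hypothetical on-die cache bandwidth
hit_rate = 0.5     # fraction of traffic served from cache (a guess)

effective_bw = hit_rate * cache_bw + (1 - hit_rate) * dram_bw
print(f"Effective bandwidth: {effective_bw:.0f} GB/s")  # ~1024 GB/s
```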
TFLOPS can be compared across architectures as long as they are similar enough. Granted, it is never quite that easy: GCN had a utilization problem, Turing has the concurrent FP32+INT thing, and Ampere throws a huge wrench into any comparison with its 2xFP32 scheme.
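The headline number itself is just shaders × clock × 2, since one FMA counts as two FLOPs per clock. A quick sketch; the Navi 21 shader count and clock below are rumored figures, used only to show the arithmetic:

```python
# FP32 TFLOPS = shader count x clock (GHz) x 2 FLOPs/clock (FMA) / 1000.
def fp32_tflops(shaders: int, clock_ghz: float) -> float:
    return shaders * clock_ghz * 2 / 1000

print(fp32_tflops(2560, 1.905))  # RX 5700 XT at boost clock: ~9.75
print(fp32_tflops(5120, 2.05))   # rumored 80 CU Navi 21: ~21
```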
But the last bit of your post hits home with me: AMD wasn't jebaited. They've had this all along and that's all there is to it; it is completely plausible, as expected, and not YouTuber-madness territory, as I've been saying all along ever since RDNA2 became a term. They're still trailing by 1.5 generations, as they always have since Polaris and probably always will. The only improvement now is probably the time to market. Which is already a big thing - as you say, they don't NEED to fight the 3090. But they really did/should fight the 3080. That card is fast enough that most people can make do with 'a little less', but that really does relegate AMD to a competition over the 500-dollar price point, and not the 700-dollar one. Which means they're effectively busting the 3070 at best.
I'm also entirely with you when you say the best GPU isn't necessarily the fastest one. Yes. If AMD can pull off a more efficient, smaller die that performs 'just as well' but not in the top end, that is just fine. But that has yet to happen. They have a node advantage, but they're still not at featureset parity and already not as efficient as Nvidia's architecture. Now they need to add RT.
I'm not all that excited because all the stars aligned long ago. You just have to filter out the clutter, and those efficiency figures are among it. Reality shines through in the cold hard facts: limited to GDDR6, not power efficient enough to go for anything wider than 256-bit, and certainly not as small a die as they'd want. Plus, they've been aiming development at a console performance target, and we know where that is.
I reckon that VideoCardz Twitter blurb is pretty accurate.
And as for RDNA1 being "just GCN with a new logo"? How, then, does it manage to dramatically outperform all incarnations of GCN? Heck, the 7nm, 40 CU, 256-bit GDDR6, 225 W 5700 XT essentially matches the 7nm, 60 CU, 4096-bit HBM2, 295 W Radeon VII, at essentially the same clock speeds (~100 MHz more, stock vs. stock). So "essentially the same" to you means a 50% performance-per-CU uplift on a much narrower memory bus, at a lower power draw despite a more power-hungry memory technology, on the same node? Now, obviously the VII is by no means a perfect GPU - far from it! - but equating that development to "just GCN with a new logo" is downright ridiculous. RDNA1 is - from AMD's own presentation of it at launch, no less - a stepping stone between GCN and a fully new architecture, but with the most important changes included. And that is reflected in both performance and efficiency.
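The perf-per-CU claim there is simple arithmetic: if a 40 CU part matches a 60 CU part at similar clocks, each CU is doing roughly 50% more work. A minimal check:

```python
# Per-CU uplift when a smaller GPU matches a bigger one at like clocks.
radeon_vii_cus = 60  # Radeon VII (Vega 20, GCN)
navi_10_cus = 40     # RX 5700 XT (Navi 10, RDNA 1)

uplift = radeon_vii_cus / navi_10_cus - 1
print(f"Per-CU performance uplift: {uplift:.0%}")  # 50%
```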
There have been tons of situations where AMD have overpromised and underdelivered (Vega is probably the most egregious example), but the two you've picked out are arguably not that. I'm not going to go into speculation on these specific rumors just because there's too much contradictory stuff flying around at the moment, and no(!) concrete leaks despite launch being a few weeks away. (The same applies for Zen 3 btw, which suggests that AMD is in full lockdown mode pre-launch.) But again, your arguments here don't stand up. How have they been "aiming the development at a console performance target"? Yes, RDNA 2 is obviously developed in close collaboration with both major console makers, but how does that translate to their biggest PC GPU being "aimed at a console performance target"? The Xbox Series X has 52 CUs. Are you actually arguing that AMD made that design, then went and said "You know what we'll do for the PC? We'll deliver the same performance with a 50% larger die and 33% more CUs! That makes sense!"? Because if that's what you're arguing, you've gone way past reasonable skepticism.
You're also ignoring the indisputable fact that AMD's GPUs in recent years have been made on shoestring R&D budgets, with RDNA 1 being the first generation to even partially benefit from the Zen cash infusion. RDNA 2 is built almost entirely in the post-Zen period of "hey, we've suddenly got the money to pay for R&D!" at AMD. If you're arguing that having more R&D resources has no effect on the outcome of said R&D, you're effectively arguing that everyone at AMD is grossly incompetent. Which, again, is way past reasonable skepticism.
There are a few plausible explanations here:
- AMD is going wide and slow with Navi 21, aiming for efficiency rather than absolute performance. (This strongly implies there will be a bigger die later, though there obviously is no guarantee for that.)
- AMD has figured out some sort of mitigation for the bandwidth issue (though the degree of efficiency of such a mitigation is obviously entirely up in the air, as that would be an entirely new thing.)
- The bandwidth leaks are purposefully misleading.
The less plausible explanation that you're arguing:
- AMD has collectively lost its marbles and is making a huge, expensive die to compete with their own consoles despite those having much lower CU counts.
- Everyone at AMD is grossly incompetent, and can't make a high performance GPU no matter the resources.
If you ask me, there's reason to be wary of all three of the first points, but much more reason to disbelieve the latter two.
TBH, I really don't care Hu is on first.
Vega 64 vs GTX 1080
Vega 10 vs GP104
495 mm² vs 314 mm²
484 GB/s vs 320 GB/s
300 W vs 180 W TDP
And Vega 64 still lost to the GTX 1080. Yeah, Pascal kinda devastated AMD for the past 4 years. The 1080 Ti (and also the Titan Xp) still has no worthy competition from AMD. Ampere is here so that the massive number of Pascal owners have something to upgrade to :D.
Unless you want to argue that this [GCN CU block diagram] is radically different from this [RDNA workgroup processor block diagram]
Kinda? Maybe they scaled the 'demand' expectations on that as well :roll::roll::roll:
If we're going by those block diagrams - ignoring the fact that block diagrams are themselves extremely simplified representations of something far more complex, and assuming that they accurately represent the silicon layout - we see quite a few changes. Starting from the right, the L1 cache is now an L0 Vector cache (which begs the question of what is now L1, and where it is), the local data share is moved next to the texturing units rather than between the SPs, SPs and Vector Registers are in groups twice as large, the scheduler is dramatically shrunk, split up and distributed closer to the banks of SPs, the number of scalar units and registers is doubled, there are two entirely new caches in between the banks of SPs, also seemingly shared between the two CUs in the new Work Group Processor unit, and lastly there's no longer a branch & message unit in the diagram at all.
Sure, these look superficially similar, but expecting a complete ground-up redesign is unrealistic (there are only so many ways to make a GPU compatible with modern APIs, after all), and there are quite drastic changes even to the block layout here, let alone the actual makeup of the different parts of the diagram. These look the same only if you look from a distance and squint. Similar? Sure. But definitely not the same. I would think the change from Kepler to Maxwell is a much more fitting comparison than Maxwell to Pascal.

That's true. But then you have:
RX 5700 XT vs RTX 2070
Navi 10 vs TU106
251 mm² vs 445 mm²
448 GB/s vs 448 GB/s
225 W vs 175 W TDP
Of course this generation AMD has a node advantage, and the 5700 XT still loses out significantly in terms of efficiency in this comparison (though not at all if looking at versions of the same chip clocked more conservatively, like the 5600 XT, which beats every single RTX 20xx GPU in perf/W).
Ampere represents a significant density improvement for Nvidia, but it's nowhere near bringing them back to the advantage they had with Pascal vs. Vega.
Ampere only looks good because Turing was such a poor step over Pascal, performance- and price-wise.