
AMD Big Navi GPU Features Infinity Cache?

Oh but they did. By throwing us bits and pieces, they practically created the hype.

Show me! Give me a source to support your beyond-creative analysis of the upcoming Radeon lineup and its performance. The source must be official, from AMD.

:clap:

I mean, look at Ampere for comparison: we had good hints about a new power connector, an insane amount of VRAM, doubled ray tracing performance. Of course, no product leaked in its entirety, but we had enough to set most expectations.

Aw, come on dude, don't change the topic. I know NVIDIA is impeccable in your eyes.
 
No topic change. I was just giving you an objective comparison between the two launches.
 
It's a one-year-old patent; the 5700 series might have it.

"But you don't need Ray Tracing" is not an excuse for a >$500 GPU.
But "you need RT, because there's that handful of games" (most of which are either sponsored by the green team or outright developed by them) doesn't fly too far either.

Especially considering who reigns on the console throne.
 
History is good science; the problem with most TPU users is that they only go back two generations, which is not much history if you ask me.
True.


I personally don't think that the new Big Navi will only be a 256-bit card; it's just impossible.
Logic says, 256-bit isn't enough for such a card, especially at 4K. But there are also enough people who claim 10 GB VRAM is enough for 4K. ;) Needed bandwidth depends a lot on the cache system, data optimization and data compression. Let's see if AMD has done some "magic" with RDNA 2. If "Infinity Cache" is real and works as expected, why not 256-bit with GDDR6? It might still be a disadvantage in some cases at high resolutions. But it could give AMD an advantage with cost and power consumption that's worth it. Intel used an L4 cache on their Iris iGPUs before and it worked quite well.


So it seems AMD is playing the "2nd place but affordable" card once again.
Such thoughts are for people who think 1st and 2nd place are determined by performance only. I think the one with better performance/watt and performance/mm² wins. Raw performance isn't everything. Or do you think Nvidia can always increase TDP by 100W just to keep the performance crown? ;)


225W → 300W

That is not 80-90% in any sliver of reality I know of. Not even with a minor shrink and IPC / efficiency bump. Because 50%... yeah.
As I said before, simple math.
225W to 300W = 1.33x
1.33 x 1.5 = 2

So, yes. Given the higher TDP and the increased power efficiency, Big Navi could be twice as fast. Whether AMD really achieves 50% better power efficiency is another story. But I think they will at least come closer to their claim than Nvidia did: 1.9x on marketing slides, ~1.2x in reality.
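For anyone who wants to sanity-check that arithmetic, here's the same estimate as a tiny Python sketch. All inputs are rumored figures and marketing claims, not confirmed specs:

```python
# Back-of-the-envelope check of the "simple math" above. The Big Navi
# TDP and the +50% perf/W figure are rumors/marketing, not confirmed.
tdp_5700xt = 225          # W, Navi 10 board power
tdp_big_navi = 300        # W, rumored Big Navi board power
perf_per_watt_gain = 1.5  # AMD's claimed +50% perf/W for RDNA 2

power_scaling = tdp_big_navi / tdp_5700xt           # ~1.33x more power
total_scaling = power_scaling * perf_per_watt_gain  # ~2.0x performance

print(f"{power_scaling:.2f}x power x {perf_per_watt_gain}x perf/W "
      f"= {total_scaling:.2f}x potential performance")
```

Of course this assumes perf/W holds constant across the whole power range, which real silicon rarely does.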


Rumors, rumors, rumors. There are also rumors about AMD giving false information to the AIBs and cards with locked BIOS. Nvidia gave false information to the AIBs too. For example, all AIBs had no clue about the real shader counts of Ampere before launch. The claim about AMD only targeting GA104 also makes no sense. AMD always said that Big Navi will target the high performance segment. Which definitely doesn't sound like a ...04 competitor.
 
As I said before, simple math.
225W to 300W = 1.33x
1.33 x 1.5 = 2
Solid first posts, welcome to TPU.

Where did you say that before, btw? ;)
 

Here's simpler math: Navi 21 XT is 50% faster than the 5700 XT at the same power consumption, just as AMD promised (a 50% perf/watt improvement). The 5700 XT has a 220W TDP, same as the 3070, which makes Navi 21 XT compete directly with the 3070. For the next month or so AMD will try to increase core clocks just so that Navi 21 XT is a tad faster than the 3070, using more power as a result.

Spec-wise, Navi 21 and the 3070 are just freakishly similar:
256bit bus - 448GBps bandwidth
20-22 TFLOPS FP32
220-230W TGP

Around 2080 Ti performance, or +15%. Overall I think Navi 21 will be within spitting distance of the 3080 at 1080p/1440p gaming, but not so much at 4K or with ray tracing.
 
Got to laugh at all the deluded types thinking they're going to get 3090 perf for considerably less than the price of a 3080. AMD fans really are a special type of special.
 
Where did you get that? Nobody knows anything about the pricing and that is decided at the last moment, or even after that sometimes.
This thread is about a supposedly innovative memory architecture that would allow AMD to compete in the high end with only a 256bit memory bus and GDDR6. This should allow AMD to build these cards cheaper. Whether they will sell them cheaper, and by how much, is a completely different story.
 
The problem with that is that GPU TFLOPS cannot be compared across architectures for anything other than pure compute. Gaming performance per TFLOP is vastly different between architectures. This is doubly true with Ampere and its doubled (but only sometimes) FP32 count. An example: the 2080 performs at 66% of the 3080 at 1440p in TPU's test suite. The 3080 delivers 29.77 FP32 TFLOPS vs. 10.07 for the 2080. 100 / 29.77 = 3.36; 66 / 10.07 = 6.55. In other words, the 2080 delivers twice the performance per teraflop of FP32 compute of the 3080. Similarly, for the 5700 XT that calculation becomes 57 / 9.754 = 5.84, or 74% higher perf/TFLOP than the 3080. So please, for the love of all rational thinking, stop using FP32 TFLOPS as a way to estimate gaming performance across architectures. It is only somewhat valid within the same architecture.
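The perf-per-TFLOP arithmetic in that post is easy to reproduce; here it is as a short Python sketch (relative 1440p performance with the 3080 at 100 is from TPU's test suite, FP32 TFLOPS from the spec sheets quoted above):

```python
# Perf-per-TFLOP across architectures, using the numbers from the post.
cards = {
    # name: (relative 1440p performance, FP32 TFLOPS)
    "RTX 3080":   (100, 29.77),
    "RTX 2080":   (66, 10.07),
    "RX 5700 XT": (57, 9.754),
}

baseline = cards["RTX 3080"][0] / cards["RTX 3080"][1]  # ~3.36
for name, (perf, tflops) in cards.items():
    ratio = perf / tflops
    print(f"{name}: {ratio:.2f} perf/TFLOP ({ratio / baseline:.2f}x the 3080)")
```

The 2080 lands at ~1.95x and the 5700 XT at ~1.74x the 3080's perf/TFLOP, matching the "twice" and "74% higher" figures in the post.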
 
The problem with that is that GPU TFlops cannot whatsoever be compared across GPU architectures for any other use than pure compute. Gaming performance/Tflop is vastly different between architectures. This is doubly true with Ampere and its doubled (but only some times) FP32 count. An example: The 2080 performs at 66% of the 3080 at 1440p in TPU's test suite. The 3080 delivers 29.77 TFlops FP32 vs. 10.07 for the 2080. 100 / 29.77 = 3,36. 66 / 10.07 = 6,55. In other words, the 2080 delivers twice the performance per teraflop of FP32 compute of the 3080. Similarly, when comparing to the 5700 XT, that calculation becomes 57 / 9.754 = 5.84, or 74% higher perf/tflop than the 3080. So please, for the love of all rational thinking, stop using FP32 Tflops as a way to estimate gaming performance across architectures. It is only somewhat valid within the same architecture.
True. Going by Nvidia's promotional marketing (probably over-optimistic), the 3070 is equivalent in performance to the 2080 Ti, which means that 20 Ampere TFlops are roughly equivalent in gaming to 14 Turing TFlops.
 
You would be better off comparing the 2080 Ti with the 3080. Other than the 2x SP count and faster VRAM they are very similar. Based on TPU reviews, even clock speeds are close enough.

TFLOPs can be compared across architectures as long as they are similar enough. Granted, that is never quite so easy. GCN had a problem with utilization, Turing has the FP32+INT thing, Ampere throws a huge wrench into trying to compare it with the 2xFP32 scheme it has.
 
Here's the way I'm seeing it from an AMD angle: perhaps they've taken pretty much everything they've gleaned from the I/O chiplet and CCX issues with Ryzen, applied it to their GPU architecture, and gone a few steps further. Latency can make a pronounced difference, especially depending on which part of the spectrum it lands in; L1-level latency is the ideal end of the spectrum, the one that's most beneficial. If this Infinity Cache is indeed part of RDNA 2, there will probably be certain GPU workloads it does exceedingly well at.
 
Spec-wise, Navi 21 and the 3070 are just freakishly similar:
256bit bus - 448GBps bandwidth
20-22 TFLOPS FP32
220-230W TGP
Actually, they are not. I've heard rumors of a TDP between 250W and 300W, with the latter being more likely; I never heard rumors about 220W. That's maybe reasonable for a cut-down Navi 21, but not the full Navi 21. You also cannot directly compare TFLOPS. If you could, the 3070 should be waayyy faster than the 2080 Ti (~20.3 vs ~13.5 FP32 TFLOPS), but both are expected to have similar performance. Ampere's TFLOPS scale much worse than Turing's. The main reason is the changed shader architecture: FP and integer execution units were unified, and only one operation can be retired per clock, FP or integer. That means a lot of FP resources sit unused most of the time because they are reserved for integer operations. RDNA's and Turing's TFLOPS are much more comparable, and I don't think that will change much with RDNA 2. Which means Navi 21's rumored ~20.5 FP32 TFLOPS might be more like ~30 Ampere FP32 TFLOPS, which is in fact about the 3080's figure. The number of TMUs is also more comparable between Navi 21 and GA102; GA104 has far fewer TMUs.
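That "~20.5 RDNA TFLOPS ≈ ~30 Ampere TFLOPS" conversion can be sketched from numbers already in the thread. Nvidia itself pitched the 3070 (~20.3 Ampere TFLOPS) against the 2080 Ti (~13.5 Turing TFLOPS), and the Navi 21 TFLOPS figure is a rumor, not a confirmed spec:

```python
# Rough "Ampere-equivalent TFLOPS" implied by the 3070 ~= 2080 Ti claim:
# one Turing/RDNA-style TFLOP is worth roughly 1.5 Ampere TFLOPS in gaming.
ampere_per_turing = 20.3 / 13.5   # ~1.5

navi21_tflops = 20.5              # rumored Navi 21 FP32 TFLOPS
ampere_equivalent = navi21_tflops * ampere_per_turing

print(f"~{ampere_equivalent:.1f} Ampere-equivalent TFLOPS")  # ~30.8
```

Which puts the rumored Navi 21 in 3080 territory by this (very crude) yardstick, exactly as the post argues.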
 
Agreed on all points really, I'm just more of a pessimist when it comes to 'magic' because really... there's never been any. As for what AMD said for Big Navi... lol. They said that for everything they released on the top of the stack and since Fury X none of it really worked out well. Literally nothing. It was always too late, too little, and just hot and noisy. RDNA1 was just GCN with a new logo. This time it'll be different, really? They're just going to make the next iterative step towards 'recovery'. It won't be a major leap, if they knew how to, they'd have done it years ago.

But the last bit of your post hits home with me - AMD wasn't jebaited. They've had this all along and it's all they'll write; it is completely plausible, as expected, and not YouTuber-madness territory, as I've been saying ever since RDNA2 became a term. They're still trailing by 1.5 generations, as they always have and probably always will since Polaris. The only improvement now is probably the time to market. Which is already a big thing - as you say, they don't NEED to fight the 3090. But they really did/should fight the 3080. That card is fast enough that most people can make do with 'a little less', but that really does relegate AMD to a competition over the $500 price point, not the $700 one. Which means they're effectively busting the 3070 at best.

I'm also entirely with you when you say the best GPU isn't necessarily the fastest one. Yes. If AMD can pull off a more efficient, smaller die that performs 'just as well' but not in the top end, that is just fine. But that has yet to happen. They have a node advantage, but they're still not at featureset parity and already not as efficient as Nvidia's architecture. Now they need to add RT.

I'm not all that excited, because the stars aligned long ago. You just have to filter out the clutter, and those efficiency figures are part of it. Reality shines through in the cold hard facts: limited to GDDR6; not as power efficient if they had to go for anything bigger than 256-bit; certainly not as small as they'd want; plus they've been aiming development at a console performance target, and we know where that is.

I reckon that Videocardz twitter blurb is pretty accurate.
 
Last edited:
  • Like
Reactions: bug
Which means they're effectively busting the 3070 at best.
I have no idea how you can come to the conclusion that AMD will only compete with the 3070 with a 536mm² die. It's their biggest die ever, with pretty much the same amount of transistors as the 3080, only to compete with the GA104? If that were true, I think Radeon should just forget desktop graphics altogether for the future, but I'm pretty certain it's not, you should listen to Tom and Paul more.
 

They'll end up between 3070 and 3080, but they won't fight the 3080 with that bandwidth. Just not happening.
 
I'm marking up this post, you already said you'll buy this card if you are proved wrong, right? :D
 

You can mark all of my posts, I've been saying this all along bud
 
Sorry, I get that being skeptical is good, but what kind of alternate reality have you been living in? While the Fury X definitely wasn't a massive improvement over previous GCN GPUs beyond just being bigger, it certainly wasn't noisy - except for some early units with whiny pumps it's still one of the quietest reference GPUs ever released. As for perf/W, it mostly kept pace with the 980 Ti above 1080p, but it did consume a bit more power doing so, and had no overclocking headroom - the latter being the only area where AMD explicitly overpromised anything for the Fury X. Overall it was nonetheless competitive.

And as for RDNA1 being "just GCN with a new logo"? How, then, does it manage to dramatically outperform all incarnations of GCN? Heck, the 7nm, 40CU, 256-bit GDDR6, 225W 5700 XT essentially matches the 7nm, 60CU, 4096-bit HBM2, 295W Radeon VII. At essentially the same clock speeds (~100MHz more stock vs. stock). So "essentially the same" to you means a 50% performance/CU uplift on a much narrower memory bus, at a lower power draw despite a more power hungry memory technology, on the same node? Now, obviously the VII is by no means a perfect GPU - far from it! - but equating that development to being "just GCN with a new logo" is downright ridiculous. RDNA1 is - from AMD's own presentation of it at launch, no less - a stepping stone between GCN and a fully new architecture, but with the most important changes included. And that is reflected in both performance and efficiency.

There have been tons of situations where AMD have overpromised and underdelivered (Vega is probably the most egregious example), but the two you've picked out are arguably not that.

I'm not all that excited because all stars have already aligned long ago. You just gotta filter out the clutter, and those efficiency figures are among them. Reality shines through in the cold hard facts: limited to GDDR6, not as power efficient if they'd have to go for anything bigger than 256 bit, and certainly also not as small as they'd want, plus they've been aiming the development at a console performance target and we know where that is.
I'm not going to go into speculation on these specific rumors just because there's too much contradictory stuff flying around at the moment, and no(!) concrete leaks despite launch being a few weeks away. (The same applies for Zen 3 btw, which suggests that AMD is in full lockdown mode pre-launch.) But again, your arguments here don't stand up. How have they been "aiming the development at a console performance target"? Yes, RDNA 2 is obviously developed in close collaboration with both major console makers, but how does that translate to their biggest PC GPU being "aimed at a console performance target"? The Xbox Series X has 52 CUs. Are you actually arguing that AMD made that design, then went and said "You know what we'll do for the PC? We'll deliver the same performance with a 50% larger die and 33% more CUs! That makes sense!"? Because if that's what you're arguing, you've gone way past reasonable skepticism.

You're also ignoring the indisputable fact that AMD's GPUs in recent years have been made on shoestring R&D budgets, with RDNA 1 being the first generation to even partially benefit from the Zen cash infusion. RDNA 2 is built almost entirely in the post-Zen period of "hey, we've suddenly got the money to pay for R&D!" at AMD. If you're arguing that having more R&D resources has no effect on the outcome of said R&D, you're effectively arguing that everyone at AMD is grossly incompetent. Which, again, is way past reasonable skepticism.

There are a few plausible explanations here:
- AMD is going wide and slow with Navi 21, aiming for efficiency rather than absolute performance. (This strongly implies there will be a bigger die later, though there obviously is no guarantee for that.)
- AMD has figured out some sort of mitigation for the bandwidth issue (though the degree of efficiency of such a mitigation is obviously entirely up in the air, as that would be an entirely new thing.)
- The bandwidth leaks are purposefully misleading.

The less plausible explanation that you're arguing:
- AMD has collectively lost its marbles and is making a huge, expensive die to compete with their own consoles despite those having much lower CU counts.
- Everyone at AMD is grossly incompetent, and can't make a high performance GPU no matter the resources.

If you ask me, there's reason to be wary of all three of the first points, but much more reason to disbelieve the latter two.
 
I agree with you; however, I was pointing out a historical pattern by AMD, and yes, Nvidia will do just about anything to keep its profits up.

TBH, I really don't care who's on first.
 
RDNA1 was just GCN with a new logo. This time it'll be different, really?
Oh Really? From what part of you are you pulling this? If it’s your brain, it’s sad...
 

I have to remind you that
Vega 64 vs GTX 1080
Vega10 vs GP104
495mm² vs 314mm²
484GBps vs 320GBps
300W vs 180W TDP

And Vega 64 still lost to the GTX 1080. Yeah, Pascal kind of devastated AMD for the past 4 years. The 1080 Ti (and also the Titan XP) still has no worthy competition from AMD. Ampere is here so that the massive number of Pascal owners have something to upgrade to :D.
 
Right, so you're basically saying Ampere is for the last 0.1% fps chasing public :shadedshu:
 
Oh Really? From what part of you are you pulling this? If it’s your brain, it’s sad...

It's just another update to GCN, a good one, I won't deny that... but it's no different from Maxwell → Pascal, for example, and everyone agrees that's not a totally new arch either. They moved bits around, etc.

Unless you want to argue that this

[block diagram image]

is radically different from this

[block diagram image]

Right, so you're basically saying Ampere is for the last 0.1% fps chasing public :shadedshu:

Kinda? Maybe they scaled the 'demand' expectations on that as well :roll::roll::roll:
 