Tuesday, September 22nd 2020

AMD Radeon "Navy Flounder" Features 40CU, 192-bit GDDR6 Memory

AMD uses offbeat codenames such as "Great Horned Owl," "Sienna Cichlid" and "Navy Flounder" to identify sources of leaks internally. One such upcoming product, codenamed "Navy Flounder," is shaping up to be a possible successor to the RX 5500 XT, the company's segment-leading 1080p product. According to ROCm compute code fished out by stblr on Reddit, this GPU is configured with 40 compute units, a step up from the 22 of the Navi 14-based RX 5500 XT, and pairs them with a 192-bit wide GDDR6 memory interface, wider than the 128-bit bus of its predecessor.

Assuming the RDNA2 compute units on next-gen Radeon RX graphics processors retain the same 64 stream processors per CU as RDNA1, we're looking at 2,560 stream processors for "Navy Flounder," compared to the 5,120 of the 80-CU "Sienna Cichlid." The 192-bit wide memory interface also gives AMD's product managers plenty of room to segment graphics cards under the $250 mark.
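For reference, the arithmetic behind these figures is straightforward; here is a minimal Python sketch, assuming (as above) that RDNA2 keeps RDNA1's 64 stream processors per compute unit:

# Back-of-the-envelope stream-processor math, assuming 64 SPs per RDNA2 CU
SP_PER_CU = 64

def stream_processors(compute_units: int) -> int:
    return compute_units * SP_PER_CU

print(stream_processors(40))  # "Navy Flounder" rumor: 2560
print(stream_processors(80))  # "Sienna Cichlid" rumor: 5120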
Sources: VideoCardz, stblr (Reddit)

135 Comments on AMD Radeon "Navy Flounder" Features 40CU, 192-bit GDDR6 Memory

#101
BoboOOZ
AssimilatorThe thing is, with trash channels like MLID
ValantarAs for the ad hominems: really? Where? Seriously, please show some quotes. As far as I'm aware I haven't commented on you personally whatsoever. (And no, saying your approach to source criticism is poor is not an ad hominem.)
That is an ad hominem right there, I never said the ad hominem was directed at me, that would be simply rude and I really hope we're past this kind of poor behaviour.

Anyways, although I appreciate sincerely the fact that you are trying to educate me on how to choose my sources, there's no need for it really, I suggest you apply the methodology that you preach and share with us your solid sources and information about the architecture of the coming Navi21. And that would have the added quality of being on topic.
Posted on Reply
#102
Valantar
BoboOOZThat is an ad hominem right there, I never said the ad hominem was directed at me, that would be simply rude and I really hope we're past this kind of poor behaviour.

Anyways, although I appreciate sincerely the fact that you are trying to educate me on how to choose my sources, there's no need for it really, I suggest you apply the methodology that you preach and share with us your solid sources and information about the architecture of the coming Navi21. And that would have the added quality of being on topic.
I hate having to pull out the dictionary in a debate, but no, that is not at all an ad hominem.
Wordnik
  • adjective: Attacking a person's character or motivations rather than a position or argument.
Saying that someone's approach to source criticism is poor in a debate where source criticism is relevant is an entirely on-topic argument. Asking you to provide proof for making a claim about personal attacks isn't an on-topic response, but it is a valid response to an unsourced claim. Neither says anything about the motivations or character of the person in question. I also fail to understand how pointing out if someone was making a personal attack against you would somehow be rude? I mean, isn't that what you do when people behave badly - call them out and ask them to change their behaviour?

As for my sources: I don't have any, as I haven't been making any claims about the architecture of these GPUs. I've speculated loosely on the basis of existing chips and known data about the nodes, and I have engaged with the speculations of others by attempting to compare them with what we know about current products and generalizable knowledge about chip production and design, but I have made zero claims about how Navi 21 will be. Especially architecturally, as there is no way of knowing that without a trustworthy source. So I don't quite see how your objection applies. If I had been making any claims, I obviously would need to source them (which, for example, I did in rebutting the RTX 3080 launch being a paper launch).

As for pointing out ad hominem arguments: If you're alluding to @Assimilator's semi-rant against MLID, that post presents an entirely valid criticism of a specific sub-genre of tech youtubers. It does of course engage with their motivations - in this case, preferring clicks, ad-views and profit over trustworthy sourcing - but that is a relevant argument when looking into the trustworthiness of a source. One would have a very, very hard time judging the quality of a source if one wasn't allowed to account for their motivations in such a judgement, and there's nothing unfair or personal about that. For example, there's a long-running debate about open access vs. paywalled publication of research in academia, a situation in which the arguments presented by publishers and paywalled journals obviously need to be contrasted with their motivations for profit and continued operation, as such motivations can reasonably cause them to be biased. Just like the statements of most career politicians obviously need to be judged against their desire to get reelected.

Now can we please move past this silly bickering?
Posted on Reply
#103
InVasMani
dragontamer5788This has been going around Reddit, and basically sums up the rumors:



It's a bit of a meme and non-serious, but I think y'all get the gist. There are so many conflicting rumors on Navi it's hard to take any of these "leaks" seriously.
I want E + F!!!! ;)
Posted on Reply
#104
BoboOOZ
ValantarI also fail to understand how pointing out if someone was making a personal attack against you would somehow be rude? I mean, isn't that what you do when people behave badly - call them out and ask them to change their behaviour?
I really have a hard time with your semantics.
An ad hominem is attacking the person instead of their arguments. It's rude and it's a simple way of trying to win an argument without being right. If you attacked me instead of my ideas, that would be indeed both rude and a sophism, but you're not doing that, you're just a tad patronizing, but hey I've seen worse on fora.
I haven't alluded to anything, I've put in the quote that you asked for; the opening of that quote is a typical ad hominem.
Now, I already told you that you have a weird way of not putting enough effort to understand the other's ideas, but putting a lot of effort to argue with them.
Probably the fastest way to end the bickering would be to use the ignore button, but that would be a pity because from time to time you guys do say interesting things. But at the same time most discussions end up feeling like a waste of my time so, yeah, maybe that would be the better solution.

Now back to the topic of this discussion: there are some rumors about AMD having overhauled the memory architecture of Big Navi. The 2 guys talking about that are RGT and MLID. These are just rumors, although RGT said the rumors came with photos of the real cards and showed them. As always, they might be true, they might not.
If you have a source that says otherwise, or an argument as to why that is not true, please share. If you have nothing to contribute to the conversation other than personal attacks and free advice about how to check sources, there's really no need for it, and you already did that.
So, let's get on to the topic, please.
Posted on Reply
#105
Unregistered
ValantarUhm, that's the opposite of what you just said:
Brain fart, my brain (or what's left of it) and my cerebellum had troubles communicating with each other :banghead:
I meant the official prices by nVidia.
Posted on Edit | Reply
#106
TheoneandonlyMrK
AssimilatorThe thing is, with trash channels like MLID it's not even filtering good from bad, because there is no good: it's regurgitated from legitimate sources like Gamers Nexus in order to make MLID appear legitimate, and he then abuses that apparent legitimacy to peddle his half-baked bullshit. Result, people who aren't good at discerning trash from quality believe it all and fall back on "but he was right regarding XXX (that he copied from a real source) so he must be right on YYY (nonsense that he crapped out)".

Melding your lies with the mainstream's truth in order to make your lies appear truthful is the oldest trick in the book when it comes to manipulating discourse and public opinion (see: Russia and US elections), and unfortunately most people choose news sources based on whether that source agrees with their worldview, rather than how trustworthy said source is. They also have a penchant for doubling down and defending "their" news source when the credibility of said source is brought into question (instead of holding it accountable), or handwaving the source's inaccuracy away with excuses such as "everyone gets it wrong now and then". Except the dodgy sources get it wrong time and time again.

Make no mistake though, MLID is laughing all the way to the bank with every cent of ad revenue he gets from every chump who watches his reddit clickbait videos. Anyone who wants to reward a liar for his lies, that's your business - but don't expect me to do the same.
Total nonsense from a guy who hasn't watched MLID; his sources beat Gamers Nexus time and again.
I watch them all (tubers, websites, etc.) and extract small amounts of trend data personally, then apply salt.

Flounders, eh? Where's that salt?
Posted on Reply
#107
Caring1
BoboOOZAn ad hominem is attacking the person instead of their arguments. It's rude and it's a simple way of trying to win an argument without being right.
Sometimes it's telling the person the truth they don't like.
BoboOOZyou're just a tad patronizing, but hey I've seen worse on fora.
Pot, meet kettle, as for seeing worse, were you told you were the cause then too?
BoboOOZProbably the fastest way to end the bickering would be to use the ignore button, but that would be a pity because from time to time you guys do say interesting things. But at the same time most discussions end up feeling like a waste of my time so, yeah, maybe that would be the better solution.
No doubt people will use the ignore button, against you.
As for being a waste of your time, get over yourself and stop wasting our time with your pompous attitude.
Posted on Reply
#108
Unregistered
theoneandonlymrkTotal nonsense from a guy who hasn't watched MLID; his sources beat Gamers Nexus time and again.
I watch them all (tubers, websites, etc.) and extract small amounts of trend data personally, then apply salt.

Flounders, eh? Where's that salt?
I watched his videos; they are mostly speculation, poor analysis and very biased. I'm still waiting for the magical hardware upgrade for the PS5.
#109
sergionography
ValantarWell, I guess things change differently depending on your location. Here in the Nordics, prices have increased significantly across the board over the past decade. That's mostly due to the NOK/SEK to USD conversion rate, which made a big jump around 2015 or so, but as I said also due to knock-on effects from this. The same applies to prices in EUR though, as the same USD price jump can be seen there. This largely accounts for the change in practices where previously USD MSRP w/o tax ~= EU MSRP w/tax, simply because the EUR (and closely linked currencies) used to be worth more relative to the USD. That means that GPUs, consoles, phones, whatever - they've all become noticeably more expensive.

That is of course possible, but remember that power increases exponentially as clock speeds increase, so a 10-15% increase in clocks never results in a 10-15% increase in power draw - something more along the lines of 25-35% is far more likely. Which is part of why I'm skeptical of this. Sony's rated clocks are as you say peak boost clocks, but they have promised that the console will run at or near those clocks for the vast majority of use cases. That means that you're running a slightly overclocked 4900H or HS (the consoles have the same reduced cache sizes as Renoir IIRC, so let's be generous and say they manage 3.5GHz all-core at 40W) and an overclocked 5700 within the SoC TDP. That leaves X minus 40W for the GPU. Your numbers then mean they would be able to run an overclocked 5700 equivalent at just 135W. If this was approached through a wide-and-slow, more CUs but lower clocks approach (like the XSX), I would be inclined to agree with you that it would be possible given the promised efficiency improvements (up to 50%, though "up to" makes that a very loose promise) and node improvements. But for a chip of the same width with clocks pushed likely as high as they are able to get them? We have plenty of data to go on for AMD GPU implementations like that (5700 XT vs 5700, RX 590 vs 580, etc.), and what that data shows us is that power consumption makes a very significant jump to reach those higher clocks. And while Smart Shift will of course help some with power balancing, it won't have that much to work with given the likely 40-50W power envelope of the CPU. Even lightly threaded games are unlikely to drop below 30W of CPU power consumption after all, so even that gives the GPU just 155W to work with.

You're also too optimistic in thinking that 50% perf/W increase is across the board in all cases. The wording was - very deliberately, as this was an investor call - up to 50%. That likely means a worst case vs best case scenario comparison, so something like a 5700 XT compared to the 6000-series equivalent of a 5600 XT. The PS5 GPU with its high clocks does not meet the criteria for being a best case scenario for efficiency. Of course that was stated a while ago, and they might have managed more than 50% best-case-scenario improvements, but that still doesn't mean we're likely to get 50% improvement when clocks are pushed high.





All of those fish code names just made me think of Dr. Seuss.
r/Amd/comments/j06xcd - And here's a leak. Apparently it's data extracted from the kernel, so who knows. It is in line with my guesstimation, with the exception of that one part at 2500 MHz. We will see soon enough.
Posted on Reply
#110
Anymal
Both hbm and gddr6 is bollocks.
Posted on Reply
#111
InVasMani
AnymalBoth hbm and gddr6 is bollocks.
Unlikely perhaps, but very possible with a tiered cache approach: put the most important data in the HBM and less important data in the slower GDDR6. The plus side would be that GDDR6 is more affordable than HBM overall. Still, I don't know how cost effective it would be in the end doing it that way, because you've still got the additional cost of the interposer for the HBM. If that cost isn't much, it's very plausible - especially since with HBCC they could tier NVMe behind the GDDR6 to bring performance parity closer, and there is variable rate shading as well; if that could be used more heavily on the GDDR6 data, the drop-off in bandwidth versus HBM, combined with the other things mentioned, would be lessened.

I guess it boils down to performance, cost, and efficiency and how they all interrelate. It would be viewed a bit like the GTX 970 doing it in that manner, but the difference is they could have another cache tier in NVMe boosting performance parity, and variable rate shading is the other major factor - had that been a thing with the GTX 970, some of those performance complaints might not have been as pronounced or as big an issue. I only see it happening if it makes sense from a relative cost-to-performance perspective; otherwise it seems far more likely they use one or the other.

Something to speculate on is that if they did do Infinity Cache, they might scale it alongside the memory bus width - something like 192-bit with 96MB Infinity Cache and 6GB VRAM / 256-bit with 128MB Infinity Cache and 12GB VRAM / 320-bit with 160MB Infinity Cache and 18GB VRAM. It's anyone's guess what AMD's game plan is for RDNA2, but we'll know soon enough.
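To make the rumored scaling above concrete, here is a minimal Python sketch; the numbers are purely the speculated configurations from the post above, not confirmed specs, with channel counts derived from the bus width:

# Purely speculative Infinity Cache / VRAM scaling with bus width, as rumored above
rumored_configs = [
    # (bus width in bits, Infinity Cache in MB, VRAM in GB)
    (192, 96, 6),
    (256, 128, 12),
    (320, 160, 18),
]
for bus_bits, cache_mb, vram_gb in rumored_configs:
    channels = bus_bits // 32  # 32-bit GDDR6 channels: 6 / 8 / 10
    print(f"{bus_bits}-bit ({channels} ch): {cache_mb} MB cache, {vram_gb} GB GDDR6")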
Posted on Reply
#112
BoboOOZ
The idea of having one chip with 2 memory controllers is not bollocks; a GPU mask set is extremely expensive, so having one die that can satisfy multiple SKUs means a lot of money saved, especially if one of the variants is not destined to be sold in huge volumes.
Posted on Reply
#115
Valantar
BoboOOZ*snip*
I'll keep this in a spoiler tag as it's getting quite OT, but sadly it still necessitates a response.
BoboOOZI really have a hard time with your semantics.
An ad hominem is attacking the person instead of their arguments. It's rude and it's a simple way of trying to win an argument without being right. If you attacked me instead of my ideas, that would be indeed both rude and a sophism, but you're not doing that, you're just a tad patronizing, but hey I've seen worse on fora.
I haven't alluded to anything, I've put in the quote that you asked for; the opening of that quote is a typical ad hominem.
Now, I already told you that you have a weird way of not putting enough effort to understand the other's ideas, but putting a lot of effort to argue with them.
Probably the fastest way to end the bickering would be to use the ignore button, but that would be a pity because from time to time you guys do say interesting things. But at the same time most discussions end up feeling like a waste of my time so, yeah, maybe that would be the better solution.

Now back to the topic of this discussion: there are some rumors about AMD having overhauled the memory architecture of Big Navi. The 2 guys talking about that are RGT and MLID. These are just rumors, although RGT said the rumors came with photos of the real cards and showed them. As always, they might be true, they might not.
If you have a source that says otherwise, or an argument as to why that is not true, please share. If you have nothing to contribute to the conversation other than personal attacks and free advice about how to check sources, there's really no need for it, and you already did that.
So, let's get on to the topic, please.
Firstly, what you're doing here is what we in Norway call a hersketeknikk - the correct translation is master suppression technique, though that's a psychological term that probably isn't as well known as the original term is for us Norwegians. Anyhow, you are simultaneously claiming that I (and some of the people agreeing with me in this debate) did something wrong, then saying I actually didn't do this, but did something that might as well be that and is pretty much just as bad, then attempting to silence any counterarguments by claiming a moral high ground and saying discussion is meaningless, despite simultaneously continuing the debate yourself. I'm not saying you're doing this consciously, but techniques like these are explicitly shaped to limit actual debate and silence your opponents. I.e. it's a straightforward bad-faith line of arguing. I would appreciate if you could try to avoid that going forward, as I try to do the same.

Moving past that: I still haven't seen you actually explain how saying
AssimilatorThe thing is, with trash channels like MLID
is actually a personal attack. (I am assuming that's the quote you were alluding to - your wording is a bit unclear as you put two quotes after each other and then said
BoboOOZThat is an ad hominem right there
which means that the meaning of "that [...] right there" in your sentence is unclear - it could indicate either quote, or both.) If I'm right in thinking that was what you meant: again, please explain how that is an ad hominem. @Assimilator said that MLID falls into a category of "trash channels". MLID is not a person, but a YouTube channel, making it essentially impossible to level a personal attack against it. The channel is neither logically nor factually equivalent to the person in the channel's videos, regardless if that person is the only person involved in its production. That would just make the channel equivalent to (some of) their work, not them.

Attacking the channel, no matter how viciously and rudely, can still not be a personal attack - for that to be true, it would need to be directed at the person directly. The criteria for being "trash" must then also be related to the content of that channel - in this case, I would assume it relates to general content quality as well as reliability due to the channel's frequent content on rumors and speculation. Being "trash" in relation to any of this is still not personal - it just says the content is bad. For that descriptor to be personal, they would have had to say "the guy making MLID is trash", which isn't what that quote says. Criticizing the quality of someone's work - even by calling it trash - is not a personal attack, and it certainly doesn't reach the level of attacking a person's character or motivations. So no, you still haven't shown how this is an ad hominem. Also, you did originally address both of us ("you guys") and then said "your posts reek of bias and ad hominem arguments", strongly implying that posts from both of us did so. I'm still waiting for you to show me some actual examples of that.

This is also an example of where you (seemingly unintentionally) fall into a bad-faith argument: you are arguing as if calling MLID "trash" is the same as calling the guy making MLID trash. Not only is this a false equivalency, but by putting the line for what amounts to a personal attack there, you are essentially making criticizing the contents of the channel impossible, as there is no way for it to not be personal by the standard you've established.

I am at least glad we can agree that I haven't attacked you personally. That's a start. It's a bit weird to equate personal attacks with sophistry, though, as personal attacks are typically not "subtly deceptive".

Oh, and for being "a tad patronizing", I'll just leave this here:
BoboOOZfrom time to time you guys do say interesting things
As for this though:
BoboOOZNow, I already told you that you have a weird way of not putting enough effort to understand the other's ideas, but putting a lot of effort to argue with them.
That is an ad hominem. That sentence is directed solely at my character, motivations and intentions in this discussion. You're not criticizing the results of my work, and not even just my methods, but explicitly saying that I'm arguing just to argue and not actually interested in understanding you. You're very welcome to try to rephrase that into not being a personal attack, but that is very clearly one.

And I understand what you're saying just fine, I'm just asking you to show examples of what you're arguing where they are needed, and to clarify the parts that don't stand up to scrutiny. You seem to be treating that as a personal attack and lashing out instead of attempting to continue an actual debate, which is why this keeps escalating.

You're entirely welcome to ignore me if you want. I personally think forum ignore buttons should be reserved for harassment and other extreme cases, as willfully blocking out parts of a discussion regardless of its content is contrary to how I want a forum to work. But again, that's up to you. I'll be glad to end this if that's your choice, but if not, I'm looking forward to you actually addressing the questions I have raised to your posts (as well as the points above), as I'm genuinely interested in finding out what you meant by them.
sergionographyr/Amd/comments/j06xcd - And here's a leak. Apparently it's data extracted from the kernel, so who knows. It is in line with my guesstimation, with the exception of that one part at 2500 MHz. We will see soon enough.
That sure looks interesting. 2.5GHz for a GPU, even a relatively small one, is bonkers even if it's AMDs boost clock (=maximum boost, not sustained boost) spec. And if 40 CUs at 2.5GHz is also with a 170W TBP as some sites are reporting from this same data, that is downright insane. Also rather promising for overclocking of the 80CU SKU if that ends up being clocked lower. A lot of driver data like this is preliminary (especially clocks) but that also tends to mean that it's on the low end rather than overly optimistic. Which makes this all the more weird. I'm definitely not less interested in what they have to show in a month after this, that's for sure.

I'm pretty dubious about the chance of any dual concurrent VRAM config though. That would be a complete and utter mess on the driver side. How do you decide which data ought to live where? It also doesn't quite compute in terms of the configuration: if you have even a single stack of HBM2(e), adding a 192-bit GDDR6 bus to that ... doesn't do all that much. A single stack of HBM2e goes up to at least 12GB (though up to 24GB at the most), and does 460GB/s bandwidth if it's the top-end 3.6Gbps/pin type. Does adding another layer of GDDR6 below that actually help anything? I guess you could increase cumulative bandwidth to ~800GB/s, but that also means dealing with a complicated two-tier memory system, which would inevitably carry significant complications with it. Even accounting for the cost of a larger interposer, I would think adding a second HBM2e stack would be no more expensive and would perform better than doing a HBM2e+GDDR6 setup. If it's actually that the fully enabled SKU gets 2x HBM2e, cut-down gets 192-bit GDDR6, on the other hand? That I could believe. That way they could bake both into a single die rather than having to make the HBM SKU a separate piece of silicon like the (undoubtedly very expensive) Apple only Navi 12 last time around. It would still be expensive and waste silicon, but given the relatively limited amount of wafers available from TSMC, it's likely better to churn out lots of one adaptable chip than to tape out two similar ones.
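As a quick sanity check on the bandwidth figures above, here is a minimal Python sketch; the 14 Gbps GDDR6 speed is an assumption, and the HBM2e numbers use the 3.6 Gbps/pin, 1024-bit-per-stack figures mentioned in the post:

# Rough bandwidth math for the configurations discussed above (GB/s)
def gddr6_bw(bus_bits, gbps=14.0):
    # assumed 14 Gbps GDDR6; bandwidth = bus width * data rate / 8 bits per byte
    return bus_bits * gbps / 8

def hbm2e_bw(stacks, gbps_per_pin=3.6):
    # 1024-bit interface per HBM2e stack
    return stacks * 1024 * gbps_per_pin / 8

print(gddr6_bw(192))                # ~336 GB/s for a 192-bit GDDR6 bus
print(hbm2e_bw(1))                  # ~461 GB/s for one HBM2e stack
print(gddr6_bw(192) + hbm2e_bw(1))  # ~797 GB/s combined (the "~800GB/s" above)
print(hbm2e_bw(2))                  # ~922 GB/s for two HBM2e stacks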
Posted on Reply
#116
BoboOOZ
ValantarI'm pretty dubious about the chance of any dual concurrent VRAM config though. That would be a complete and utter mess on the driver side. How do you decide which data ought to live where?
The rumors that I heard (and which were also mentioned by MLID, by the way) are that both controllers will exist on-chip, but only one is active on any given SKU. So basically they will activate the HBM one for the prosumer cards and the GDDR6 for the cheaper versions. But this is from the "take it with a huge pinch of salt" category.

For the other thing, I'll try to give you a proper answer via PM later on.
Posted on Reply
#117
Assimilator
ValantarThat sure looks interesting. 2.5GHz for a GPU, even a relatively small one, is bonkers even if it's AMDs boost clock (=maximum boost, not sustained boost) spec. And if 40 CUs at 2.5GHz is also with a 170W TBP as some sites are reporting from this same data, that is downright insane. Also rather promising for overclocking of the 80CU SKU if that ends up being clocked lower. A lot of driver data like this is preliminary (especially clocks) but that also tends to mean that it's on the low end rather than overly optimistic. Which makes this all the more weird. I'm definitely not less interested in what they have to show in a month after this, that's for sure.
2.5GHz @ 170W TBP is impossible unless we are talking an incredibly tiny chip.
BoboOOZThe rumors that I heard (and which were also mentioned by MLID, by the way) are that both controllers will exist on-chip, but only one is active on any given SKU. So basically they will activate the HBM one for the prosumer cards and the GDDR6 for the cheaper versions. But this is from the "take it with a huge pinch of salt" category.
Not gonna happen. That's a massive amount of die space and transistors to be wasting for no good reason. No designer is going to (be allowed to) do that because it's essentially throwing money away. If they want to use different memory with the same GPU, they will make a derivative design with a different memory controller, and at that stage you might as well split that derivative design off entirely and cater it entirely for the prosumer market (e.g. GA100 vs GA102). AMD's long-running focus on keeping costs down makes this even less likely.

The additional thing that makes this a "not gonna happen" is the amount of die area that's going to be needed for ray-tracing hardware. Considering how large Turing and Ampere dies are, wasting space on inactive MCs would be an exceedingly poor, and therefore unlikely, decision on AMD's part.

As for the hybrid GDDR/HBM on a single card... that's pie-in-the-sky BS, always has been.
Posted on Reply
#118
BoboOOZ
AssimilatorNot gonna happen. That's a massive amount of die space and transistors to be wasting for no good reason. No designer is going to (be allowed to) do that because it's essentially throwing money away. If they want to use different memory with the same GPU, they will make a derivative design with a different memory controller, and at that stage you might as well split that derivative design off entirely and cater it entirely for the prosumer market (e.g. GA100 vs GA102). AMD's long-running focus on keeping costs down makes this even less likely.

The additional thing that makes this a "not gonna happen" is the amount of die area that's going to be needed for ray-tracing hardware. Considering how large Turing and Ampere dies are, wasting space on inactive MCs would be an exceedingly poor, and therefore unlikely, decision on AMD's part.

As for the hybrid GDDR/HBM on a single card... that's pie-in-the-sky BS, always has been.
I don't try to pretend that I understand what's gonna come, but a 128MB cache would also be absolutely huge... None of the rumors that I have seen makes complete sense to me.
Assimilator2.5GHz @ 170W TBP is impossible unless we are talking an incredibly tiny chip.
The Newegg leak puts this SKU at 150W :kookoo:
Posted on Reply
#119
Valantar
Assimilator2.5GHz @ 170W TBP is impossible unless we are talking an incredibly tiny chip.
I completely agree. As I said, absolutely insane if true - an unprecedented GPU clock speed in absolute numbers, a near unprecedented boost in clock speeds, and combined with a significant drop in power consumption compared to what ought to be a very close comparison (previous gen of same arch, slightly less mature node)? Something must be off about that.
AssimilatorNot gonna happen. That's a massive amount of die space and transistors to be wasting for no good reason. No designer is going to (be allowed to) do that because it's essentially throwing money away. If they want to use different memory with the same GPU, they will make a derivative design with a different memory controller, and at that stage you might as well split that derivative design off entirely and cater it entirely for the prosumer market (e.g. GA100 vs GA102). AMD's long-running focus on keeping costs down makes this even less likely.
The thing is, with a (rumored) 192-bit GDDR6 bus it's not a massive amount of die space - significant, yes, but possibly a cost it could make sense to swallow if the alternatives are a) leaving performance on the table for the top end SKU, or b) taping out two separate dice, one with HBM and one with GDDR6. And of course, HBM2 controllers are tiny, so the added cost to the cheaper (and thus more price sensitive) SKUs would be negligible. It could be that 40 CUs with HBM2 significantly outperforms 40 CUs with GDDR6, which could allow for them to get a new SKU out of this rather than, say, making a new 52 CU die - and that's a cost savings on the scale of hundreds of millions of dollars. "Wasting" a few percent die area might be cheap in comparison. I'm not saying this is happening, but there are at least somewhat reasonable arguments for it.
AssimilatorThe additional thing that makes this a "not gonna happen" is the amount of die area that's going to be needed for ray-tracing hardware. Considering how large Turing and Ampere dies are, wasting space on inactive MCs would be an exceedingly poor, and therefore unlikely, decision on AMD's part.
We don't yet know how AMD's RT hardware is implemented, so speculating about the die area required for it is difficult, but we do have one very well detailed piece of data: the Xbox Series X SoC. We know they got what is essentially a Renoir CPU layout + 56 fully RT-enabled RDNA2 CUs into 360.4mm² in the Xbox Series X. According to their Hot Chips presentation, the CUs account for around 47% of the die area, with the CPU and memory controllers each using ~11%. The rest is likely encode/decode, display processing, decompression, audio processing, I/O, and so on. Given that the non-CU parts of the core likely scale very little if at all as the GPU grows, we can make some napkin math estimates. Let's say display processing, 16x PCIe 4.0 and other things that a GPU needs account for 25% of that die size - that's 90.1mm². Ten memory channels are 11% or 39.7mm², or ~4mm² per channel (some of the XSX channels are double bandwidth, but let's try to keep this simple). That means a 192-bit GDDR6 bus (6 channels) needs 24mm², plus 90.1mm² for the other non-CU parts, plus 40 CUs at (360.4/100*47 = 169.4; 169.4/56*40 =) ~121mm². For a total of 121+90+24mm² = 235mm². Admittedly that's with a narrower memory bus than Navi 10, but it's also smaller overall - though the margin of error with napkin math like this is obviously enormous. The part where this can be relevant, rather than absolute numbers, is how it would scale: doubling the CU count to 80 and doubling the memory bus would just mean a 61% increase in die size, at 380mm². There is no doubt something missing from this calculation (for example, DF didn't mention whether the 11% number for the memory accounts for the controllers, the physical connections, or both - and judging by the die shot, it being both seems unlikely - and it's also likely these numbers omit interconnects, internal buses and so on). But nonetheless, it would seem that AMD has some leeway in terms of die sizes for RDNA2. Are they going to use that space for some weird dual memory layout? I have no idea. It's possible, but it would also be unprecedented. I'm not dismissing it outright, but I'm not saying I believe it either.
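For what it's worth, the napkin math above can be reproduced as a minimal Python sketch; the 25% allowance for display, PCIe, media and other non-CU blocks is the same assumption made in the post, and everything here inherits its (large) margin of error:

# Napkin die-area estimate based on the Xbox Series X breakdown cited above
XSX_DIE = 360.4                      # XSX SoC die size, mm^2
CU_AREA = XSX_DIE * 0.47 / 56        # ~3.0 mm^2 per RT-enabled RDNA2 CU
CHANNEL_AREA = XSX_DIE * 0.11 / 10   # ~4.0 mm^2 per 32-bit GDDR6 channel
OTHER = XSX_DIE * 0.25               # assumed display/PCIe/media/I/O area, ~90 mm^2

def napkin_die_size(cus, bus_bits):
    return cus * CU_AREA + (bus_bits // 32) * CHANNEL_AREA + OTHER

print(napkin_die_size(40, 192))      # ~235 mm^2 for a 40 CU / 192-bit part
print(napkin_die_size(80, 384))      # ~380 mm^2 for 80 CU / 384-bit, ~61% larger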
Posted on Reply
#120
sergionography
Valantar*snip*
I wonder if a new boost algorithm is in place to clock that high. Perhaps more like bursts or something.

As for the side memory thing: I don't believe they will use both configs at the same time; rather, it is believed to have two memory controllers, one for HBM and one for GDDR6.

RedGamingTech reported, and is adamant, that HBM is not for the gaming cards, and that the gaming cards will have some sort of side memory.
I'm curious to see if there is any truth behind that report. If side memory uses less energy than GDDR6 and is cheaper than HBM, then it's a win I suppose. I hope that is the case, honestly, because of how useful it would be on APUs, where bandwidth is limited.
It could also be a step before multi-chip gaming GPUs, where that side memory basically acts as an L4 cache to feed the chips, so beginning to move in that direction now perhaps makes the transition easier.
Posted on Reply
#121
Caring1
BoboOOZThe Newegg leak puts this SKU at 150W :kookoo:
Problem?
I don't see why that being much lower is an issue, efficiencies can improve.
Posted on Reply
#122
Nike_486DX
Why not optimize everything (crank up the efficiency at least) and also start using HBM memory from the mid-range onwards? Seems like an impossible step for AMD tho (but only AMD can do such a thing) :\
Ah, and invest more $$$ into driver quality control... yeah
Posted on Reply
#123
Valantar
Nike_486DXWhy not optimize everything (crank up the efficiency at least) and also start using HBM memory from the mid-range onwards? Seems like an impossible step for AMD tho (but only AMD can do such a thing) :\
Ah, and invest more $$$ into driver quality control... yeah
"Why not?" is a simple question to answer: cost. HBM is still very expensive, and even GDDR6 is cheaper and simpler to implement. If they could bring costs down to a level where this became feasible, they would be able to make some very interesting products - clearly demonstrated by the very impressive Radeon Pro 5600M. But the cost is likely still prohibitively high.
Caring1Problem?
I don't see why that being much lower is an issue, efficiencies can improve.
The problem is whether this is even remotely possible. A ~30-50% clock speed increase combined with a 33% drop in power consumption - all without a new production node to help reach this goal - would be completely unprecedented in the modern semiconductor industry. With most new nodes you're lucky if you get one of those two (clock speed gains or power consumption drop), and this isn't a new node, just a tweak of the existing one.
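To put rough numbers on that, here is a minimal Python sketch of the implied perf/W jump, assuming the leaked 40 CU / 2.5 GHz / 150 W figures, an RX 5700 XT (40 CU, ~1.9 GHz boost, 225 W TBP) as the baseline, and performance scaling linearly with clock at equal CU count and IPC - all simplifications:

# Implied perf/W change if the leaked figures were accurate (big assumptions)
rumored_clock_ghz, rumored_power_w = 2.5, 150   # leaked 40 CU part
xt_clock_ghz, xt_power_w = 1.905, 225           # RX 5700 XT boost clock, TBP

perf_ratio = rumored_clock_ghz / xt_clock_ghz   # ~1.31x (vs. boost; ~1.42x vs. the 1.755 GHz game clock)
power_ratio = rumored_power_w / xt_power_w      # ~0.67x power
ppw_gain = perf_ratio / power_ratio             # ~1.97x, i.e. roughly a doubling of perf/W
print(f"~{(ppw_gain - 1) * 100:.0f}% perf/W gain over the RX 5700 XT")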
Posted on Reply
#124
InVasMani
ValantarThe problem is whether this is even remotely possible. A ~30-50% clock speed increase combined with a 33% drop in power consumption - all without a new production node to help reach this goal - would be completely unprecedented in the modern semiconductor industry. With most new nodes you're lucky if you get one of those two (clock speed gains or power consumption drop), and this isn't a new node, just a tweak of the existing one.
It's not necessarily a clock speed increase; it could be better IPC at the same clock speeds, which would also drop power consumption. It's worth noting that AMD's transistor density is still quite a bit lower than Nvidia's, so I wouldn't at all say it's impossible or unprecedented. Also look at what Intel's done with 14nm+++++++++++++++ to counteract, hold its ground and retain the single-thread, high-frequency performance advantages it still carries. Sure, that's happened over a longer period of time, but there is no question AMD has had more R&D emphasis devoted to Ryzen in the last 5 years or so, while gradually shifting more back towards Radeon at the same time. I feel RDNA was the first major pushback from AMD on the graphics side, and RDNA2 could be a continuation of it. Nvidia's Ampere coinciding with a node shrink makes that more difficult, but let's face it, we know AMD didn't eke out all the performance that can be tapped from 7nm.

Nvidia has a higher transistor count on a larger node for starters, and we've seen what Intel's done with 14nm+++++++++++++ as well; the idea that 7nm isn't maturing and hasn't matured is just asinine. It has definitely improved from a year ago, and that AMD can definitely squeeze more transistors into the design - at least as many as Nvidia's previous designs, or more - is reasonable to see as entirely feasible. We can only wait and see what happens. Let's also not forget AMD also used to be a fabrication company and spun off GlobalFoundries; the same can't be said of Nvidia. They could certainly be working closely with TSMC on improvements to the node itself for their designs, and we saw some signs that they did in fact work alongside TSMC for Ryzen to incorporate some node tweaks and get more out of the chip designs on the manufacturing side.

It's just one of those things where everyone is going to have to wait and see what AMD did come up with for RDNA2 - will it underwhelm, overwhelm, or be about what you can expect from AMD, all things taken into consideration? Nvidia is transitioning to a smaller node, so the ball is more in their court in that sense; however, AMD's transistor count is lower, so it's definitely not that simple. If AMD incorporated something clever and cost effective they could certainly make big leaps in performance and efficiency, and we know that AMD's compression already trails Nvidia's, so they have room to improve there as well. Worth noting is that AMD is transitioning toward RTRT hardware, but we really don't know to what extent and how heavily they plan to invest in it on this initial push. I think if they match a non-SUPER RTX 2080 on the RTRT side they are honestly doing fine; RTRT isn't going to take off overnight, and the RDNA3 design can be more aggressive - things will have changed a lot by then, hopefully it'll be on 5nm by that point, and perhaps HBM costs will have improved.
Posted on Reply
#125
Valantar
InVasMani*snip*
... that's not how this works. Please actually read the post when you are responding, as you are getting both your facts and the data from these leaks completely mixed up. This needs addressing point by point:
  • The recent leaks specifically mention clock speeds. Whether IPC has changed is thus irrelevant. 2.5GHz is 2.5GHz unless AMD has redefined what clock speed means (which they haven't). That's a 30-50% increase in clock speed from the fastest RDNA 1 SKU. If the rumors are accurate about this and about the power requirements - 150W! - and assuming IPC or perf/Tflop is the same, that's a more than 100% increase in perf/W before any IPC increases.
  • An increase in both absolute performance and performance per watt without moving to a new node is unprecedented. We have not seen a change like that on the same node for at least the past decade of silicon manufacturing, and likely not even the decade before that. Both silicon production and chip design are enormously complex and highly mature processes, making revolutionary jumps like this extremely unlikely. Can it happen? Sure! Is it likely to? Not at all.
  • The relationship between clock speed and transistor density is far too complex to be used in an argument the way you are doing. Besides, I never made any comparison to Intel or Nvidia, only to AMD's own previous generation, which is made on the same node (though it is now improved) and is based on an earlier version of the same architecture. We don't know the tweaks made to the node, nor how changed RDNA 2 is from RDNA 1, but assuming the combination is capable of doubling perf/W is wildly optimistic.
  • Your example from Intel actually speaks against you: they spent literally four years improving their 14nm node, and what did they get from it? No real improvement in perf/W (outside of low power applications at least), but higher boost clocks and higher maximum power draws. They went from 4c/8t 4.2GHz boost/4GHz base at 91W (with max OC somewhere around 4.5-4.7GHz) to 10c/20t 3.7GHz base/various boost speeds up to 5.3GHz at 125W to sustain the base clock or ~250W for boost clocks (max OC around 5.3-5.4GHz). For a more apples to apples comparison, their current fastest 4c/8t chip is the 65W i3-10320 at 3.7GHz base/4.6GHz 1c/4.4GHz all-core. That's a lower TDP, but it still needs 90W for its boost clocks, and the base clock is lower. IPC has not budged. So, Intel, one of the historically best silicon manufacturing companies in the world, spent four years improving their node and got a massive boost in maximum power draw and thus maximum clocks, but essentially zero perf/W improvement. But you're expecting AMD to magically advise TSMC into massively improving their node in a single year?
  • There's no doubt AMD is putting much more R&D effort into Radeon now than 3-5 years ago - they have much more cash on hand and a much stronger CPU portfolio, so that stands to reason. That means things ought to be improving, absolutely, but it does not warrant this level of optimism.
  • I never said 7nm wasn't maturing. Stop putting words in my mouth.
You're arguing as if I'm being extremely pessimistic here or even saying I don't expect RDNA 2 to improve whatsoever, which is a very fundamental misreading of what I've been saying. I would be very, very happy if any of this turned out to be true, but this looks far too good to be true. It's wildly unrealistic. And yes, it would be an unprecedented jump in efficiency - bigger even than Kepler to Maxwell (which also happened on the same node). If AMD could pull that off? That would be amazing. But I'm not pinning my hopes on that.
Posted on Reply