Tuesday, October 25th 2022

AMD Radeon RX 7900 XTX to Lead the RDNA3 Pack?
AMD is reportedly bringing back the "XTX" brand extension to the marketing names of its upcoming Radeon RX 7000-series SKUs. Until now, the company had reserved the "XTX" moniker for internal use, to denote SKUs that max out all the hardware available on a given piece of silicon. The RX 7000 series introduces the company's next-generation RDNA3 graphics architecture, and will see AMD bring its chiplet packaging design to the client-graphics space. The next-generation "Navi 31" GPU will likely be the first of its kind: while multi-chip module (MCM) GPUs aren't new, this would be the first time that multiple logic chips sit on a single package in a client GPU. AMD has plenty of experience with MCM GPUs, but those have been single logic chips surrounded by memory stacks. "Navi 31" places multiple logic chips on one package, which is then wired to conventional discrete GDDR6 memory devices like any other client GPU.
The rumored Radeon RX 7900 XTX features 12,288 stream processors, likely spread across two logic tiles that contain the SIMD components. These tiles are, for now, rumored to be built on the TSMC N5 (5 nm EUV) foundry process. The Display CoreNext (DCN) and Video CoreNext (VCN) components, as well as the GDDR6 memory controllers, sit on separate chiplets that are likely built on TSMC N6 (6 nm). "Navi 31" has a 384-bit wide memory interface. It is 384-bit and not "2x 192-bit," because the logic tiles don't have memory interfaces of their own, but rely on memory-controller tiles shared between the two logic tiles, much in the same way as the dual-channel DDR4 memory interface is shared between the two 8-core CPU chiplets on a Ryzen 9 5950X processor.

The RX 7900 XTX features 24 GB of GDDR6 memory across that 384-bit memory interface. The memory ticks at 20 Gbps, which works out to a raw memory bandwidth of 960 GB/s. AMD is also expected to deploy large on-die caches, which it calls Infinity Cache, to further lubricate the GPU's memory sub-system. The most interesting aspect of this rumor is the card's typical board power value of 420 W. Technically, this is in the same league as the 450 W typical graphics power of the GeForce RTX 4090. Ever since the card's teaser earlier this year at the Ryzen 7000-series desktop processor launch event, speculation has been rife that AMD will not deploy the 12+4 pin ATX 12VHPWR power connector with its Radeon RX 7000-series GPUs, and that the reference-design board instead has up to three conventional 8-pin PCIe power connectors. You have to set aside four 8-pin connectors for an RTX 4090 anyway.
AMD's second-best SKU based on "Navi 31" is expected to be the RX 7900 XT, with fewer stream processors, likely 10,752. The memory size is reduced to 20 GB, and the memory interface narrowed to 320-bit, which at 20 Gbps memory speed works out to 800 GB/s of bandwidth. Keeping with the trend of AMD's second-largest GPU having half the stream processors of its largest (e.g. "Navi 22" with 2,560 against the 5,120 of "Navi 21"), the "Navi 32" chip will likely have just one of these 6,144-SP logic tiles, and a narrower memory interface.
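For reference, the raw-bandwidth figures above follow directly from the rumored bus widths and memory speed. A quick Python sketch of the arithmetic, using only the rumored numbers from this story (nothing confirmed):

# Raw GDDR6 bandwidth = per-pin data rate (Gbps) x bus width (bits) / 8 bits per byte.
def gddr6_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    return data_rate_gbps * bus_width_bits / 8

print(gddr6_bandwidth_gb_s(20, 384))  # rumored RX 7900 XTX: 960.0 GB/s
print(gddr6_bandwidth_gb_s(20, 320))  # rumored RX 7900 XT: 800.0 GB/s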
Source: VideoCardz
95 Comments on AMD Radeon RX 7900 XTX to Lead the RDNA3 Pack?
There are still plenty of unknowns regarding how a decoupled MCD with the IF cache on it will perform versus a monolithic die. I doubt that this SKU will only compete with the 4080, but it is very hard to estimate how it will perform versus the 4090. Only benchmarks will be able to tell.
There is also the question of RT performance. I mean, it won't be too much of an issue on a low-end SKU, but on a flagship SKU it should be there.
The challenge is how you exchange data between the two GPU tiles (e.g. you run a shader that needs to read pixels that were previously rendered on the other GPU). This is the main challenge. The master also needs to be aware of the state of the second tile's compute units to dispatch jobs effectively. Also, say all your MCDs are connected to the main tile: it means the secondary tile has to perform all of its memory accesses over the link between the chips. If they split it 50/50, each tile will have to perform a portion of its memory accesses on the other die. You also have to map your memory across the two dies.
No matter what you do, the connection between the two compute tiles will need to be beefy.
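To put a rough number on that, here is a toy Python estimate. It assumes the 50/50 split described above, an even address interleave across the memory-controller dies, and the rumored 960 GB/s total; none of this reflects anything AMD has actually disclosed:

# Toy model: share of a tile's memory traffic that must cross the die-to-die link
# when each tile "owns" only the memory controllers physically attached to it.
def remote_fraction(local_controllers: int, total_controllers: int) -> float:
    return 1.0 - local_controllers / total_controllers

total_mcds = 6              # assumed: 384-bit bus split as 6 x 64-bit controller dies
demand_per_tile = 960 / 2   # assumed: total bandwidth demand split evenly per tile

frac = remote_fraction(local_controllers=3, total_controllers=total_mcds)
print(f"remote accesses per tile: {frac:.0%}")                           # 50%
print(f"cross-die traffic per tile: {demand_per_tile * frac:.0f} GB/s")  # ~240 GB/s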
This is easy when you have a single tile, but the challenge increases if you have to do it across chips. Note that AMD has had a hardware scheduler for quite some time, and they might have improved it to be tile-aware and schedule the load accordingly.
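Nobody outside AMD knows whether the hardware scheduler is actually tile-aware, but the basic idea is easy to sketch. A hypothetical dispatcher in Python, which prefers the tile that owns a job's data unless that tile is badly overloaded (all names and thresholds are made up for illustration):

# Hypothetical tile-aware dispatcher: keep work next to its data when possible,
# fall back to the less-loaded tile when the imbalance gets too large.
def dispatch(jobs, num_tiles=2, imbalance_limit=1.5):
    load = [0.0] * num_tiles
    placement = []
    for cost, preferred_tile in jobs:        # (estimated cost, tile owning the data)
        least_loaded = min(range(num_tiles), key=lambda t: load[t])
        if load[preferred_tile] <= imbalance_limit * (load[least_loaded] + cost):
            target = preferred_tile          # data locality wins
        else:
            target = least_loaded            # load balance wins
        load[target] += cost
        placement.append(target)
    return placement, load

print(dispatch([(4, 0), (4, 0), (4, 0), (2, 1), (1, 1)]))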
I suspect it would be easier to load-balance two larger dies that can do a big portion of their work themselves than a lot of smaller dies that would need to exchange data frequently. But that may just be a theory with no value.
Alternate frame rendering could maybe be possible on multi-die, but the main issue remains frame pacing. How do you know when it's the best time to start rendering the next frame? For that you need to know how fast you will finish the current frame, but you don't always know until it's done. And if you wait until it's done, it's already time to start a new frame on the main GPU. They tried many tricks to fix frame pacing with alternate frame rendering without much success, and it's probably not worth the effort.
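A crude Python illustration of that pacing problem (not how any real driver works): all you can do is offset the second GPU by half of a predicted frame time, and the moment a frame costs more than predicted, the present intervals go uneven.

# Two GPUs alternate frames; each frame starts half a predicted frame time after
# the previous one started. The prediction is simply the last frame's cost.
def afr_present_gaps(frame_times):
    starts, finishes = [], []
    gpu_free = [0.0, 0.0]
    for i, ft in enumerate(frame_times):
        gpu = i % 2
        predicted = frame_times[i - 1] if i else ft
        desired_start = (starts[-1] + predicted / 2) if starts else 0.0
        start = max(desired_start, gpu_free[gpu])
        starts.append(start)
        finishes.append(start + ft)
        gpu_free[gpu] = start + ft
    return [round(b - a, 2) for a, b in zip(finishes, finishes[1:])]

print(afr_present_gaps([10, 10, 10, 10, 10, 10]))  # even pacing: [5.0, 5.0, 5.0, 5.0, 5.0]
print(afr_present_gaps([10, 10, 18, 10, 10, 10]))  # one slow frame: [5.0, 13.0, 1.0, 9.0, 5.0]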
For SLI (the original term, scan-line interleave, not Nvidia's rebranding of multi-GPU), the thing is that shaders can affect blocks of pixels. How do you handle that if you only render half the lines? You can't, so it's a dead tech that died with the coming of shaders.
You can't use the data of a previously rendered frame if that frame isn't rendered yet.
AFR is probably dead; the benefits of reusing temporal data really outweigh the benefits of AFR and multi-GPU. And AFR is just the brute-force way of doing things.
Watch out for bugs.
For FSR 2.0 and other TAAU, yes, it's more toward the end of the pipeline, but generally before the post-processing effects. You could, in theory, start a second frame and it would have the frame-buffer image when it needs it. But that starts to become really complicated. And that is just for the upscaler.
Let's say you create and move particles using a shader: the next frame needs the previous frame's data to continue. You would have to wait and sync every frame, making it very complicated. The same goes if you use shaders for terrain or object deformation, and those things happen way earlier in the pipeline. In the end, you just add multiple syncs and waits to your image generation, and those things will kill your efficiency.
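As a trivial Python illustration of that dependency (nothing to do with real GPU code): each frame's particle state is computed from the previous frame's, so with AFR the two GPUs would have to exchange that state every single frame.

# Each frame's particle positions depend on the previous frame's output. Under AFR,
# consecutive frames live on different GPUs, so this state would have to be copied
# across the link every frame -- a per-frame sync that eats the parallelism.
def simulate(num_frames, dt=0.016):
    positions = [0.0, 1.0, 2.0]       # state produced by the "previous frame"
    velocities = [1.0, -0.5, 0.25]
    for _ in range(num_frames):
        positions = [p + v * dt for p, v in zip(positions, velocities)]
    return positions

print(simulate(3))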
That is not worth the effort. AFR is just a stupid way of using two GPUs that was fine when games were easier to run and simpler, but it is no longer a good solution. What a multi-tile GPU needs is a way to send work intelligently across multiple dies, and a way to manage memory efficiently while doing it. Once you have that figured out, your solution becomes far more powerful, and you don't need to deal with frame-pacing issues, where things happen in the frame-rendering process, and so on.
And the hoopla around that is pretty questionable at this point. Oooh, interesting, if a bit silly. But it will comfort some, if nothing else. Yeah, if you rewire them. And then people plug them into the old ports and get fun fire and component hazards.
Yes, you can key them, but that's not always enough for some, as history has shown.
Are they all totally valid points for every possible buyer? Of course not. Just like what you're insinuating about their reputation for underhanded tactics, not caring about gamers, ripping people off, etc., isn't a consideration, or at least not a deal-breaker, for the vast majority of their buyers; it's just how a vocal minority views their reputation.
Now, despite everything I've said, if money were no object and I didn't consider anything beyond the raw performance numbers, I'd buy Nvidia in a heartbeat. The end result is incredibly impressive; the means of achieving it is just disappointing.
Hm, who was that, let me think... I'd wait, but there is a bit of gambling either way.
If this card is coming from AIBs, they DO KNOW what is coming.
I mean, have we ever had older AMD GPUs become more expensive after a new gen is released? (Planetary-level cryptobazinga doesn't count.)