Tuesday, October 25th 2022

AMD Radeon RX 7900 XTX to Lead the RDNA3 Pack?
AMD is reportedly bringing back the "XTX" brand extension to the marketing names of its upcoming Radeon RX 7000-series SKUs. Until now, the company had reserved the "XTX" moniker for internal use, to denote SKUs that max out all the hardware available on a given piece of silicon. The RX 7000 series introduces the company's next-generation RDNA3 graphics architecture, and will see AMD bring its chiplet packaging design to the client-graphics space. The next-generation "Navi 31" GPU will likely be the first of its kind: while multi-chip module (MCM) GPUs aren't new, this would be the first time that multiple logic chips sit on a single package in a client GPU. AMD has plenty of experience with MCM GPUs, but those have been single logic chips surrounded by memory stacks. "Navi 31" places multiple logic chips on one package, which is then wired to conventional discrete GDDR6 memory devices like any other client GPU.
The rumored Radeon RX 7900 XTX features 12,288 stream processors, likely spread across two logic tiles that contain the SIMD components. These tiles are, for now, rumored to be built on the TSMC N5 (5 nm EUV) foundry process. The Display CoreNext (DCN) and Video CoreNext (VCN) components, as well as the GDDR6 memory controllers, sit on separate chiplets that are likely built on TSMC N6 (6 nm). "Navi 31" has a 384-bit wide memory interface. It is 384-bit and not "2x 192-bit," because the logic tiles don't have memory interfaces of their own, but rely on memory-controller tiles shared between the two logic tiles, much in the same way as the dual-channel DDR4 memory interface is shared between the two 8-core CPU chiplets on a Ryzen 9 5950X processor.

The RX 7900 XTX features 24 GB of GDDR6 memory across that 384-bit memory interface. The memory ticks at 20 Gbps, which works out to a raw memory bandwidth of 960 GB/s. AMD is also expected to deploy large on-die caches, which it calls Infinity Cache, to further lubricate the GPU's memory sub-system. The most interesting aspect of this rumor is the card's typical board power value of 420 W. Technically, this is in the same league as the 450 W typical graphics power of the GeForce RTX 4090. Ever since the card's teaser earlier this year at the Ryzen 7000-series desktop processor launch event, speculation has been rife that AMD will not deploy the 12+4 pin ATX 12VHPWR power connector with its Radeon RX 7000-series GPUs, and that the reference-design board instead has up to three conventional 8-pin PCIe power connectors. You have to set aside four 8-pin connectors for an RTX 4090 anyway.
AMD's second-best SKU based on "Navi 31" is expected to be the RX 7900 XT, with fewer stream processors, likely 10,752. The memory size is reduced to 20 GB, and the memory interface narrowed to 320-bit, which at 20 Gbps memory speed works out to 800 GB/s of bandwidth. Keeping with the trend of AMD's second-largest GPU having half the stream processors of its largest (e.g. "Navi 22" with 2,560 against the 5,120 of "Navi 21"), the "Navi 32" chip will likely have just one of these 6,144-SP logic tiles, and a narrower memory interface.
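For reference, the raw-bandwidth figures above follow directly from the rumored bus widths and memory speed. A quick Python sketch of the arithmetic, using only the rumored numbers from this story (nothing confirmed):

# Raw GDDR6 bandwidth = per-pin data rate (Gbps) x bus width (bits) / 8 bits per byte.
def gddr6_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    return data_rate_gbps * bus_width_bits / 8

print(gddr6_bandwidth_gb_s(20, 384))  # rumored RX 7900 XTX: 960.0 GB/s
print(gddr6_bandwidth_gb_s(20, 320))  # rumored RX 7900 XT: 800.0 GB/s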
Source: VideoCardz
95 Comments on AMD Radeon RX 7900 XTX to Lead the RDNA3 Pack?
There are still plenty of unknowns regarding how a decoupled MCD with the IF cache on it will perform versus a monolithic die. I doubt that this SKU will only compete with the 4080, but it is very hard to estimate how it will perform versus the 4090. Only benchmarks will be able to tell.
There is also the question of RT performance. I mean, it won't be too much of an issue on a low-end SKU, but on a flagship SKU it should be there.
The challenge is how you exchange data between the two GPU tiles (e.g. you run a shader that needs to read pixels that were previously rendered on the other GPU). This is the main challenge. The master also needs to be aware of the state of the second tile's compute units to dispatch jobs effectively. Also, say all your MCDs are connected to the main tile: it means the secondary tile has to perform all of its memory accesses over the link between the chips. If they split it 50/50, each tile will have to perform a portion of its memory accesses on the other die. You also have to map your memory across the two dies.
No matter what you do, the connection between the two compute tiles will need to be beefy.
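To put a rough number on that, here is a toy Python estimate. It assumes the 50/50 split described above, an even address interleave across the memory-controller dies, and the rumored 960 GB/s total; none of this reflects anything AMD has actually disclosed:

# Toy model: share of a tile's memory traffic that must cross the die-to-die link
# when each tile "owns" only the memory controllers physically attached to it.
def remote_fraction(local_controllers: int, total_controllers: int) -> float:
    return 1.0 - local_controllers / total_controllers

total_mcds = 6              # assumed: 384-bit bus split as 6 x 64-bit controller dies
demand_per_tile = 960 / 2   # assumed: total bandwidth demand split evenly per tile

frac = remote_fraction(local_controllers=3, total_controllers=total_mcds)
print(f"remote accesses per tile: {frac:.0%}")                           # 50%
print(f"cross-die traffic per tile: {demand_per_tile * frac:.0f} GB/s")  # ~240 GB/s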
This is easy when you have a single tile, but the challenge increases if you have to do it across chips. Note that AMD has had a hardware scheduler for quite some time, and they might have improved it to be tile-aware and schedule the load accordingly.
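Nobody outside AMD knows whether the hardware scheduler is actually tile-aware, but the basic idea is easy to sketch. A hypothetical dispatcher in Python, which prefers the tile that owns a job's data unless that tile is badly overloaded (all names and thresholds are made up for illustration):

# Hypothetical tile-aware dispatcher: keep work next to its data when possible,
# fall back to the less-loaded tile when the imbalance gets too large.
def dispatch(jobs, num_tiles=2, imbalance_limit=1.5):
    load = [0.0] * num_tiles
    placement = []
    for cost, preferred_tile in jobs:        # (estimated cost, tile owning the data)
        least_loaded = min(range(num_tiles), key=lambda t: load[t])
        if load[preferred_tile] <= imbalance_limit * (load[least_loaded] + cost):
            target = preferred_tile          # data locality wins
        else:
            target = least_loaded            # load balance wins
        load[target] += cost
        placement.append(target)
    return placement, load

print(dispatch([(4, 0), (4, 0), (4, 0), (2, 1), (1, 1)]))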
I suspect it would be easier to load-balance two larger dies that can do a big portion of their work themselves than a lot of smaller dies that would need to exchange data frequently. But that may just be a theory with no value.
Alternate frame rendering could maybe be possible on multi-die, but the main issue remains frame pacing. How do you know when it's the best time to start rendering the next frame? For that you need to know how fast you will finish the current frame, but you don't always know until it's done. And if you wait until it's done, it's already time to start a new frame on the main GPU. They tried many tricks to fix frame pacing with alternate frame rendering without much success, and it's probably not worth the effort.
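A crude Python illustration of that pacing problem (not how any real driver works): all you can do is offset the second GPU by half of a predicted frame time, and the moment a frame costs more than predicted, the present intervals go uneven.

# Two GPUs alternate frames; each frame starts half a predicted frame time after
# the previous one started. The prediction is simply the last frame's cost.
def afr_present_gaps(frame_times):
    starts, finishes = [], []
    gpu_free = [0.0, 0.0]
    for i, ft in enumerate(frame_times):
        gpu = i % 2
        predicted = frame_times[i - 1] if i else ft
        desired_start = (starts[-1] + predicted / 2) if starts else 0.0
        start = max(desired_start, gpu_free[gpu])
        starts.append(start)
        finishes.append(start + ft)
        gpu_free[gpu] = start + ft
    return [round(b - a, 2) for a, b in zip(finishes, finishes[1:])]

print(afr_present_gaps([10, 10, 10, 10, 10, 10]))  # even pacing: [5.0, 5.0, 5.0, 5.0, 5.0]
print(afr_present_gaps([10, 10, 18, 10, 10, 10]))  # one slow frame: [5.0, 13.0, 1.0, 9.0, 5.0]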
For SLI (the original term, scan-line interleave, not Nvidia's rebranding of multi-GPU), the thing is that shaders can affect blocks of pixels. How do you handle that if you only render half the lines? You can't, so it's a dead tech that died with the coming of shaders.
You can't use the data of a previously rendered frame if that frame isn't rendered yet.
AFR is probably dead; the benefits of reusing temporal data really outweigh the benefits of AFR and multi-GPU. And AFR is just the brute-force way of doing things.
Watch out for bugs.
For FSR 2.0 and other TAAU, yes, it's more toward the end of the pipeline, but generally before the post-processing effects. You could, in theory, start a second frame and it would have the frame-buffer image when it needs it. But that starts to become really complicated. And that is just for the upscaler.
Let's say you create and move particles using a shader: the next frame needs the previous frame's data to continue. You would have to wait and sync every frame, making it very complicated. The same goes if you use shaders for terrain or object deformation, and those things happen way earlier in the pipeline. In the end, you just add multiple syncs and waits to your image generation, and those things will kill your efficiency.
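As a trivial Python illustration of that dependency (nothing to do with real GPU code): each frame's particle state is computed from the previous frame's, so with AFR the two GPUs would have to exchange that state every single frame.

# Each frame's particle positions depend on the previous frame's output. Under AFR,
# consecutive frames live on different GPUs, so this state would have to be copied
# across the link every frame -- a per-frame sync that eats the parallelism.
def simulate(num_frames, dt=0.016):
    positions = [0.0, 1.0, 2.0]       # state produced by the "previous frame"
    velocities = [1.0, -0.5, 0.25]
    for _ in range(num_frames):
        positions = [p + v * dt for p, v in zip(positions, velocities)]
    return positions

print(simulate(3))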
That is not worth the effort. AFR is just a stupid way of using two GPUs that was fine when games were easier to run and simpler, but it is no longer a good solution. What a multi-tile GPU needs is a way to send work intelligently across multiple dies, and a way to manage memory efficiently while doing it. Once you have that figured out, your solution becomes far more powerful, and you don't need to deal with frame-pacing issues, where things happen in the frame-rendering process, and so on.
And the hoopla around that is pretty questionable at this point. Oooh, interesting, if a bit silly. But it will comfort some, if nothing else. Yeah, if you rewire them. And then people plug them into the old ports and get fun fire and component hazards.
Yes, you can key them, but that's not always enough for some, as history has shown.
Are they all totally valid points for every possible buyer? Of course not. Just like what you're insinuating about their reputation for underhanded tactics, not caring about gamers, ripping people off, etc., isn't a consideration, or at least not a deal-breaker, for the vast majority of their buyers; it's just how a vocal minority views their reputation.
Now, despite everything I've said, if money were no object and I didn't consider anything beyond the raw performance numbers, I'd buy Nvidia in a heartbeat. The end result is incredibly impressive; the means of achieving it is just disappointing.
Hm, who was that, let me think... I'd wait, but there is a bit of gambling either way.
If this card is coming from AIBs, they DO KNOW what is coming.
I mean, have we ever had older AMD GPUs become more expensive after a new gen is released? (Planetary-level cryptobazinga doesn't count.)