
AMD Radeon RX 9070 XT Pricing Leak: More Affordable Than RTX 5070?

It does. That's why the RX 6000 series was able to catch up with the RTX 3000 series.

With the RX 7000 series, AMD nerfed (halved) the L3 cache, and that was a bad move IMHO. The RX 7000s could have been better with more L3 cache. My RX 7800 XT has about 17% fewer compute units than the RX 6800 XT (60 vs 72) and half the L3 cache. IPC improvement (higher clocks, "dual-issue" stream processor) of RDNA3 was able to partially compensate for the lack of those units, but it still sucks that the 7800 XT is beaten by the 6800 XT in some games even today, while in others it loses by only a single-digit %. I'd even dare to say that the RX 7800 XT is not a real successor to the RX 6800 XT.
That's not a cache problem; it's because the 6800XT has 20% more compute units (72 vs 60).
IPC improvements are basically zero between RDNA2 and RDNA3, proved quite conclusively by the 7600 having near-identical performance to the 6650XT when clocked at the same speed.

What you're seeing is the 7800XT with half the cache of the 6800XT making up the compute unit deficit with clockspeed.
72CU x 2.2GHz boost clock = 158 'CU GHz'
60CU x 2.6GHz boost clock = 156 'CU GHz'

i.e., if both were identical in architecture and IPC, the 6800XT would be only 1-2% faster than the 7800XT, which is often the case in real games, regardless of the cache sizes being dramatically different. It's also worth noting that you cannot call higher clocks an IPC gain, as you did here:
IPC improvement (higher clocks, "dual-issue" stream processor) of RDNA3
IPC literally means instructions per clock, so it's independent of clockspeed.
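
For anyone who wants to poke at that back-of-the-envelope maths, here's a minimal Python sketch - the CU counts and boost clocks are the figures above, and linear CU x clock scaling is of course only a rough approximation:

```python
# Naive shader throughput proxy: compute units x boost clock ('CU GHz').
# Figures are the ones quoted above; real games won't scale perfectly
# linearly with either, so treat this as a rough approximation only.
def cu_ghz(compute_units: int, boost_ghz: float) -> float:
    return compute_units * boost_ghz

rx_6800_xt = cu_ghz(72, 2.2)  # ~158 'CU GHz'
rx_7800_xt = cu_ghz(60, 2.6)  # ~156 'CU GHz'

print(f"6800 XT: {rx_6800_xt:.0f} CU*GHz")
print(f"7800 XT: {rx_7800_xt:.0f} CU*GHz")
print(f"6800 XT ahead by {(rx_6800_xt / rx_7800_xt - 1) * 100:.1f}%")  # ~1.5%
```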


It isn't the real successor to the 6800 XT; the name is a red herring. When you account for TDP, die size, and MSRP, it's an obvious successor to the 6700 XT.
To me the 7800XT is the obvious successor to the vanilla 6800, in that it's the same rough price, same bus width, same VRAM amount, and same core config - just clocked a solid 25% faster ;)
 
IPC improvements are basically zero between RDNA2 and RDNA3, proved quite conclusively by the 7600 having near-identical performance to the 6650XT when clocked at the same speed.

To me the 7800XT is the obvious successor to the vanilla 6800, in that it's the same rough price, same bus width, same VRAM amount, and same core config - just clocked a solid 25% faster ;)
There is a modest increase in IPC. The compiler often misses dual-issue opportunities, so the cases with an IPC increase rely on hand-optimized code. The 7800 XT clocks about 10% higher than the 6800 on average and is about 23% faster in those games. However, as this requires hand-optimized code, it will be limited to well-known, recent games.
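
If those two figures are right, the implied per-clock gain in those hand-optimized titles is easy to back out, since the 6800 and 7800 XT both run 60 CUs - a quick sketch (the 10%/23% inputs are the estimates above, not measurements of mine):

```python
# Back out the implied per-clock (IPC) gain from the clock and performance
# deltas quoted above. Both cards have 60 CUs, so CU count cancels out.
clock_ratio = 1.10  # 7800 XT clocks ~10% higher than the 6800 (estimate above)
perf_ratio = 1.23   # ~23% faster in the hand-optimized games (estimate above)

ipc_gain = perf_ratio / clock_ratio - 1
print(f"Implied IPC gain: {ipc_gain * 100:.0f}%")  # ~12% in those titles
```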
 
Oof, hand-optimised assembly code to replace the shaders every developer is going to be using?

I mean, I guess that's an IPC increase, but holy hell - it's only going to exist in situations where AMD's driver team directly intervenes, and it has no IPC impact on the thousands of demanding games in the pre-RDNA3 back-catalogue, or on any in-house stuff that hasn't made it to AMD's driver team yet :(

Hopefully RDNA4 doesn't need such hand-optimised shader replacements and they've learned RDNA3's lesson - presumably their driver team is wasting tons of time firefighting individual games' performance because of the dual-issue change, time they'd rather be spending on fixing bugs and improving features like FSR and framegen.
 
i.e., if both were identical in architecture and IPC, the 6800XT would be only 1-2% faster than the 7800XT, which is often the case in real games, regardless of the cache sizes being dramatically different. It's also worth noting that you cannot call higher clocks an IPC gain, as you did here:
Then why did AMD double the bandwidth of the L3 cache in the 7000 series? If it doesn't have a noticeable impact on games, they might just as well have only halved it and called it a day, no?

IPC literally means instructions per clock, so it's independent of clockspeed.
Good point, I should have written "performance improvement (higher clocks, "dual-issue" stream processor)" instead. Apologies for my wording.
I'd like to know how noticeable the performance impact of the "dual-issue" stream processors is in games.

To me the 7800XT is the obvious successor to the vanilla 6800, in that it's the same rough price, same bus width, same VRAM amount, and same core config - just clocked a solid 25% faster ;)
Makes sense.
 
Then why did AMD double the bandwidth of the L3 cache in the 7000 series?
They didn't intentionally double the bandwidth; it was just a side-effect of building a wider interface at higher clocks. The L3 (Infinity) Cache joins the cores and the memory controllers, so the bandwidth is a product of how wide the memory bus is and how fast the core is clocked at the cache interface.

If you're comparing the 256-bit, ~2.3GHz 6950XT with 1800GB/s to the 384-bit ~2.6GHz 7900XTX it's fairly easy maths, so long as you remember that one of RDNA3's advantages over RDNA2 was decoupling the shader clocks from the rest of the core to save power - so the cache is likely to be running at ~2.9-3GHz when the shader boost clocks are reporting ~2.6GHz. Meanwhile, an RDNA2 card reporting 2.3GHz boost clocks is running the cache at 2.3GHz too.

So here's the theoretical bandwidth calculations:

6950XT = 1800GB/s @ ~2.3GHz/256-bit
7900XTX = 3500GB/s @ ~2.9GHz/384-bit
So let's multiply 1800 by 1.5 to account for the 50% increase in bus width to get 2700, and then let's multiply 2700 by 2.9/2.3 to account for the 26% clock increase.
That gives 3400GB/s if we assume a 2.9GHz cache clock and 3522GB/s if we assume a 3.0GHz cache clock.
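
The same arithmetic as a quick Python sketch, in case anyone wants to play with the assumed cache clock:

```python
# Scale the 6950XT's Infinity Cache bandwidth to the 7900XTX by bus width
# and cache clock. The 2.9-3.0GHz cache clocks are assumptions, per above.
base_bandwidth = 1800  # GB/s, 6950XT @ ~2.3GHz on a 256-bit bus
bus_scale = 384 / 256  # 1.5x wider memory interface on the 7900XTX

for cache_clock_ghz in (2.9, 3.0):
    scaled = base_bandwidth * bus_scale * (cache_clock_ghz / 2.3)
    print(f"@{cache_clock_ghz}GHz cache clock: {scaled:.0f} GB/s")  # ~3404 / ~3522
```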

So there's your bandwidth doubling: the simple theory is right on the money, because it's an easily calculable side-effect of other, more significant changes, not an intentional "hey, let's double the bandwidth" choice. As for why the cache was halved for RDNA3, AMD's official stance was this:

"The Infinity Cache capacity was decreased due to RDNA 3 having wider a memory interface up to 384-bit whereas RDNA 2 used memory interfaces up to 256-bit. RDNA 3 having a wider 384-bit memory means that its cache hitrate does not have to be as high to still avoid bandwidth bottlenecks as there is higher memory bandwidth."
- https://chipsandcheese.com/2023/01/07/microbenchmarking-amds-rdna-3-graphics-architecture/
That's a fair enough argument for the 7900XT and XTX, but Navi32 and Navi33 received no such increase in memory bus width, so I'm not sure I agree with their statement entirely. The bandwidth did increase because of VRAM clock-speed bumps, but that's not what AMD said, and not by as much. The 7800XT got 19.5Gbps GDDR6 whilst the equivalent vanilla 6800 got 16Gbps GDDR6. Likewise down at the low end, the 6600XT launched with 16Gbps but the 7600/7600XT got 18Gbps.
 
If we could get 72CU at 3.0GHz boost that would be something like a (stacked) 38% performance gain.
That would fall under enthusiast territory. Probably the next target for TSMC 2nm and that should be some WILD performance.
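
That 38% checks out with the same naive CU x clock scaling used earlier in the thread, assuming the 7800 XT's 60 CU at 2.6GHz as the baseline:

```python
# 'Stacked' gain of a hypothetical 72 CU @ 3.0GHz part over the 7800 XT
# (60 CU @ 2.6GHz) - naive linear scaling, so an optimistic ceiling.
gain = (72 / 60) * (3.0 / 2.6) - 1
print(f"~{gain * 100:.0f}% faster")  # ~38%
```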
 
If we could get 72CU at 3.0GHz boost that would be something like a (stacked) 38% performance gain.
That would fall under enthusiast territory. Probably the next target for TSMC 2nm and that should be some WILD performance.
Expensive, though. Die costs increase exponentially with area, because wafer defects take out a higher percentage of the dies on each wafer, and you get fewer dies per wafer as well.

A 72CU die probably costs 50% more than a 64CU die, which would make it a high-end expensive part, and AMD have very clearly stated that they're not targeting the high end this generation and want to make affordable cards to capture the midrange and claw back marketshare.
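
For anyone curious about the shape of that curve, here's a toy Python sketch using the classic Poisson yield approximation - the defect density and die areas are made-up illustrative values, and it ignores edge loss, binning, and packaging costs, so it shows the trend rather than the exact 50% figure:

```python
import math

# Toy cost-per-good-die model: yield = exp(-defect_density * die_area).
# Defect density and die areas are illustrative assumptions, not TSMC data.
DEFECT_DENSITY = 0.2  # defects per cm^2 (assumed)
WAFER_AREA = math.pi * (30.0 / 2) ** 2  # 300mm wafer area in cm^2

def relative_cost(die_area_cm2: float) -> float:
    dies_per_wafer = WAFER_AREA / die_area_cm2             # fewer dies as area grows
    good_yield = math.exp(-DEFECT_DENSITY * die_area_cm2)  # more dies lost to defects
    return 1.0 / (dies_per_wafer * good_yield)             # wafer cost normalised to 1

for area in (2.0, 3.0, 4.0, 5.0):  # hypothetical die areas, cm^2
    ratio = relative_cost(area) / relative_cost(2.0)
    print(f"{area:.0f} cm^2 die costs {ratio:.2f}x the 2 cm^2 die")  # super-linear growth
```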
 