
AMD Radeon RX 9070 XT Pricing Leak: More Affordable Than RTX 5070?

It does. That's why the RX 6000 series was able to catch up with the RTX 3000 series.

With the RX 7000 series, AMD nerfed (halved) the L3 cache, and that was a bad move IMHO. The RX 7000s could have been better with more L3 cache. My RX 7800 XT has about 17% fewer compute units than the RX 6800 XT (60 vs 72) and half the L3 cache. IPC improvement (higher clocks, "dual-issue" stream processor) of RDNA3 was able to partially compensate for the lack of those units, but it still sucks that the 7800 XT is beaten by the 6800 XT in some games even today, while in others it loses by only a single-digit %. I'd even dare to say that the RX 7800 XT is not a real successor to the RX 6800 XT.
That's not a cache problem; it's because the 6800XT has 20% more compute units (72 vs 60).
IPC improvements are basically zero between RDNA2 and RDNA3, proved quite conclusively by the 7600 having near-identical performance to the 6650XT when clocked at the same speed.

What you're seeing is the 7800XT with half the cache of the 6800XT making up the compute unit deficit with clockspeed.
72CU x 2.2GHz boost clock = 158 'CU GHz'
60CU x 2.6GHz boost clock = 156 'CU GHz'

i.e., if both were identical in architecture and IPC, the 6800XT would be only 1-2% faster than the 7800XT, which is often the case in real games, regardless of the cache sizes being dramatically different. It's also worth noting that you cannot call higher clocks an IPC gain, as you did here:
IPC improvement (higher clocks, "dual-issue" stream processor) of RDNA3
IPC literally means instructions per clock, so it's independent of clockspeed.
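
For anyone who wants to poke at that back-of-the-envelope maths, here's a minimal Python sketch - the CU counts and boost clocks are the figures above, and linear CU x clock scaling is of course only a rough approximation:

```python
# Naive shader throughput proxy: compute units x boost clock ('CU GHz').
# Figures are the ones quoted above; real games won't scale perfectly
# linearly with either, so treat this as a rough approximation only.
def cu_ghz(compute_units: int, boost_ghz: float) -> float:
    return compute_units * boost_ghz

rx_6800_xt = cu_ghz(72, 2.2)  # ~158 'CU GHz'
rx_7800_xt = cu_ghz(60, 2.6)  # ~156 'CU GHz'

print(f"6800 XT: {rx_6800_xt:.0f} CU*GHz")
print(f"7800 XT: {rx_7800_xt:.0f} CU*GHz")
print(f"6800 XT ahead by {(rx_6800_xt / rx_7800_xt - 1) * 100:.1f}%")  # ~1.5%
```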


It isn't the real successor to the 6800 XT; the name is a red herring. When you account for TDP, die size, and MSRP, it's an obvious successor to the 6700 XT.
To me the 7800XT is the obvious successor to the vanilla 6800, in that it's the same rough price, same bus width, same VRAM amount, and same core config - just clocked a solid 25% faster ;)
 
IPC improvements are basically zero between RDNA2 and RDNA3, proved quite conclusively by the 7600 having near-identical performance to the 6650XT when clocked at the same speed.

To me the 7800XT is the obvious successor to the vanilla 6800, in that it's the same rough price, same bus width, same VRAM amount, and same core config - just clocked a solid 25% faster ;)
There is a modest increase in IPC. The compiler often misses dual-issue opportunities, so the cases with an IPC increase rely on hand-optimized code. The 7800 XT clocks about 10% higher than the 6800 on average and is about 23% faster in those games. However, as this requires hand-optimized code, it will be limited to well-known, recent games.
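
If those two figures are right, the implied per-clock gain in those hand-optimized titles is easy to back out, since the 6800 and 7800 XT both run 60 CUs - a quick sketch (the 10%/23% inputs are the estimates above, not measurements of mine):

```python
# Back out the implied per-clock (IPC) gain from the clock and performance
# deltas quoted above. Both cards have 60 CUs, so CU count cancels out.
clock_ratio = 1.10  # 7800 XT clocks ~10% higher than the 6800 (estimate above)
perf_ratio = 1.23   # ~23% faster in the hand-optimized games (estimate above)

ipc_gain = perf_ratio / clock_ratio - 1
print(f"Implied IPC gain: {ipc_gain * 100:.0f}%")  # ~12% in those titles
```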
 
Oof, hand-optimised assembly code to replace the shaders every developer is going to be using?

I mean, I guess that's an IPC increase, but holy hell - it's only going to exist in situations where AMD's driver team directly intervenes, and it has no IPC impact on the thousands of demanding games in the pre-RDNA3 back-catalogue, or on any in-house stuff that hasn't made it to AMD's driver team yet :(

Hopefully RDNA4 doesn't need such hand-optimised shader replacements and they've learned RDNA3's lesson - presumably their driver team is wasting tons of time firefighting individual games' performance because of the dual-issue change, time they'd rather be spending on fixing bugs and improving features like FSR and framegen.
 
i.e., if both were identical in architecture and IPC, the 6800XT would be only 1-2% faster than the 7800XT, which is often the case in real games, regardless of the cache sizes being dramatically different. It's also worth noting that you cannot call higher clocks an IPC gain, as you did here:
Then why did AMD double the bandwidth of the L3 cache in the 7000 series? If it doesn't have a noticeable impact on games, they might just as well have only halved it and called it a day, no?

IPC literally means instructions per clock, so it's independent of clockspeed.
Good point, I should have written "performance improvement (higher clocks, "dual-issue" stream processor)" instead. Apologies for my wording.
I'd like to know how noticeable the performance impact of the "dual-issue" stream processors is in games.

To me the 7800XT is the obvious successor to the vanilla 6800, in that it's the same rough price, same bus width, same VRAM amount, and same core config - just clocked a solid 25% faster ;)
Makes sense.
 
Then why did AMD double the bandwidth of the L3 cache in the 7000 series?
They didn't intentionally double the bandwidth; it was just a side-effect of building a wider interface at higher clocks. The L3 (Infinity) Cache joins the cores and the memory controllers, so the bandwidth is a product of how wide the memory bus is and how fast the core is clocked at the cache interface.

If you're comparing the 256-bit, ~2.3GHz 6950XT with 1800GB/s to the 384-bit ~2.6GHz 7900XTX it's fairly easy maths, so long as you remember that one of RDNA3's advantages over RDNA2 was decoupling the shader clocks from the rest of the core to save power - so the cache is likely to be running at ~2.9-3GHz when the shader boost clocks are reporting ~2.6GHz. Meanwhile, an RDNA2 card reporting 2.3GHz boost clocks is running the cache at 2.3GHz too.

So here's the theoretical bandwidth calculations:

6950XT = 1800GB/s @ ~2.3GHz/256-bit
7900XTX = 3500GB/s @ ~2.9GHz/384-bit
So let's multiply 1800 by 1.5 to account for the 50% increase in bus width to get 2700, and then let's multiply 2700 by 2.9/2.3 to account for the 26% clock increase.
That gives 3400GB/s if we assume a 2.9GHz cache clock and 3522GB/s if we assume a 3.0GHz cache clock.
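
The same arithmetic as a quick Python sketch, in case anyone wants to play with the assumed cache clock:

```python
# Scale the 6950XT's Infinity Cache bandwidth to the 7900XTX by bus width
# and cache clock. The 2.9-3.0GHz cache clocks are assumptions, per above.
base_bandwidth = 1800  # GB/s, 6950XT @ ~2.3GHz on a 256-bit bus
bus_scale = 384 / 256  # 1.5x wider memory interface on the 7900XTX

for cache_clock_ghz in (2.9, 3.0):
    scaled = base_bandwidth * bus_scale * (cache_clock_ghz / 2.3)
    print(f"@{cache_clock_ghz}GHz cache clock: {scaled:.0f} GB/s")  # ~3404 / ~3522
```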

So there's your bandwidth doubling: the simple theory is right on the money, because it's an easily calculable side-effect of other, more significant changes, not an intentional "hey, let's double the bandwidth" choice. As for why the cache was halved for RDNA3, AMD's official stance was this:

"The Infinity Cache capacity was decreased due to RDNA 3 having wider a memory interface up to 384-bit whereas RDNA 2 used memory interfaces up to 256-bit. RDNA 3 having a wider 384-bit memory means that its cache hitrate does not have to be as high to still avoid bandwidth bottlenecks as there is higher memory bandwidth."
- https://chipsandcheese.com/2023/01/07/microbenchmarking-amds-rdna-3-graphics-architecture/
That's a fair enough argument for the 7900XT and XTX, but Navi32 and Navi33 received no such increase in memory bus width, so I'm not sure I agree with their statement entirely. The bandwidth did increase because of VRAM clock-speed bumps, but that's not what AMD said, and not by as much. The 7800XT got 19.5Gbps GDDR6 whilst the equivalent vanilla 6800 got 16Gbps GDDR6. Likewise down at the low end, the 6600XT launched with 16Gbps but the 7600/7600XT got 18Gbps.
 
If we could get 72CU at 3.0GHz boost that would be something like a (stacked) 38% performance gain.
That would fall under enthusiast territory. Probably the next target for TSMC 2nm and that should be some WILD performance.
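
That 38% checks out with the same naive CU x clock scaling used earlier in the thread, assuming the 7800 XT's 60 CU at 2.6GHz as the baseline:

```python
# 'Stacked' gain of a hypothetical 72 CU @ 3.0GHz part over the 7800 XT
# (60 CU @ 2.6GHz) - naive linear scaling, so an optimistic ceiling.
gain = (72 / 60) * (3.0 / 2.6) - 1
print(f"~{gain * 100:.0f}% faster")  # ~38%
```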
 
If we could get 72CU at 3.0GHz boost that would be something like a (stacked) 38% performance gain.
That would fall under enthusiast territory. Probably the next target for TSMC 2nm and that should be some WILD performance.
Expensive, though. Die costs increase exponentially with area, because wafer defects take out a higher percentage of the dies on each wafer, and you get fewer dies per wafer as well.

A 72CU die probably costs 50% more than a 64CU die, which would make it a high-end expensive part, and AMD have very clearly stated that they're not targeting the high end this generation and want to make affordable cards to capture the midrange and claw back marketshare.
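
For anyone curious about the shape of that curve, here's a toy Python sketch using the classic Poisson yield approximation - the defect density and die areas are made-up illustrative values, and it ignores edge loss, binning, and packaging costs, so it shows the trend rather than the exact 50% figure:

```python
import math

# Toy cost-per-good-die model: yield = exp(-defect_density * die_area).
# Defect density and die areas are illustrative assumptions, not TSMC data.
DEFECT_DENSITY = 0.2  # defects per cm^2 (assumed)
WAFER_AREA = math.pi * (30.0 / 2) ** 2  # 300mm wafer area in cm^2

def relative_cost(die_area_cm2: float) -> float:
    dies_per_wafer = WAFER_AREA / die_area_cm2             # fewer dies as area grows
    good_yield = math.exp(-DEFECT_DENSITY * die_area_cm2)  # more dies lost to defects
    return 1.0 / (dies_per_wafer * good_yield)             # wafer cost normalised to 1

for area in (2.0, 3.0, 4.0, 5.0):  # hypothetical die areas, cm^2
    ratio = relative_cost(area) / relative_cost(2.0)
    print(f"{area:.0f} cm^2 die costs {ratio:.2f}x the 2 cm^2 die")  # super-linear growth
```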
 