
AMD Details DeepSeek R1 Performance on Radeon RX 7900 XTX, Confirms Ryzen AI Max Memory Sizes

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,448 (7.50/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
AMD today put out detailed guides on how to get DeepSeek R1 distilled reasoning models to run on Radeon RX graphics cards and Ryzen AI processors. The guide confirms that the new Ryzen AI Max "Strix Halo" processors come in hardwired to LPCAMM2 memory configurations of 32 GB, 64 GB, and 128 GB, and there won't be a 16 GB memory option for notebook manufacturers to cheap out with. The guide goes on to explain that "Strix Halo" will be able to locally accelerate DeepSeek-R1-Distill-Llama with 70 billion parameters on the 64 GB and 128 GB memory configurations of "Strix Halo" powered notebooks, while the 32 GB model should be able to run DeepSeek-R1-Distill-Qwen-32B. Ryzen AI "Strix Point" mobile processors should be capable of running DeepSeek-R1-Distill-Qwen-14B and DeepSeek-R1-Distill-Llama-14B on their RDNA 3.5 iGPUs and NPUs. Meanwhile, older-generation "Phoenix Point" and "Hawk Point" processors should be capable of running DeepSeek-R1-Distill-Llama-14B. The company recommends running all of the above distills in Q4_K_M quantization.
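As a rough sanity check on why these model sizes map to those memory tiers: a Q4_K_M GGUF averages roughly 4.8-4.9 bits per weight. A back-of-the-envelope estimate (this is a sketch using an assumed 4.85 bits/weight; KV cache and runtime overhead come on top of the file size):

```python
def q4km_footprint_gb(params_billion: float, bits_per_weight: float = 4.85) -> float:
    """Approximate GGUF file size for a Q4_K_M quant, in GB (decimal)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# The distills mentioned in AMD's guide
for name, p in [("Llama-70B", 70), ("Qwen-32B", 32), ("Qwen-14B", 14), ("Llama-8B", 8)]:
    print(f"{name}: ~{q4km_footprint_gb(p):.1f} GB")
```

The ~42 GB result for the 70B distill shows why it needs the 64 GB or 128 GB "Strix Halo" tiers, while the ~19 GB 32B quant squeezes into a 32 GB machine or a 24 GB graphics card.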

Switching gears to discrete graphics cards, AMD is recommending only its Radeon RX 7000 series for now, since the RDNA 3 graphics architecture introduces AI accelerators. The flagship Radeon RX 7900 XTX is recommended for the DeepSeek-R1-Distill-Qwen-32B distill, while all SKUs with 12 GB to 20 GB of memory (the RX 7600 XT, RX 7700 XT, RX 7800 XT, RX 7900 GRE, and RX 7900 XT) are recommended for models up to DeepSeek-R1-Distill-Qwen-14B. The mainstream RX 7600 with its 8 GB of memory is only recommended up to DeepSeek-R1-Distill-Llama-8B. You will need LM Studio 0.3.8 or later and Radeon Software Adrenalin 25.1.1 beta or later drivers. AMD put out first-party LM Studio 0.3.8 tokens-per-second performance numbers for the RX 7900 XTX, comparing it with the NVIDIA GeForce RTX 4080 SUPER and the RTX 4090.



When compared to the RTX 4080 SUPER, the RX 7900 XTX posts up to 34% higher performance with DeepSeek-R1-Distill-Qwen-7B, up to 27% higher performance with DeepSeek-R1-Distill-Llama-8B, and up to 22% higher performance with DeepSeek-R1-Distill-Qwen-14B. Next up is the big face-off between the RX 7900 XTX and the GeForce RTX 4090 with its 24 GB of memory. The RX 7900 XTX is shown to prevail in 3 out of 4 tests, posting up to 13% higher performance with DeepSeek-R1-Distill-Qwen-7B, up to 11% higher performance with DeepSeek-R1-Distill-Llama-8B, and up to 2% higher performance with DeepSeek-R1-Distill-Qwen-14B. It only falls behind the RTX 4090 by 4% with the larger DeepSeek-R1-Distill-Qwen-32B model.

Catch the step-by-step guide on getting DeepSeek R1 distilled reasoning models to run on AMD hardware in the source link below.
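Once a distill is downloaded and loaded, LM Studio can expose an OpenAI-compatible local server (on localhost:1234 by default). A minimal sketch of querying it from Python, stdlib only; the model identifier below is a placeholder and should match whatever name LM Studio reports for the loaded model:

```python
import json
import urllib.request

def build_request(prompt: str, model: str = "deepseek-r1-distill-qwen-7b"):
    """Build a chat-completion request for LM Studio's local OpenAI-compatible API."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,
    }
    return urllib.request.Request(
        "http://localhost:1234/v1/chat/completions",
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Requires LM Studio running with the server enabled and a model loaded
    with urllib.request.urlopen(build_request("Why is the sky blue?")) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```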

View at TechPowerUp Main Site | Source
 
Joined
Sep 26, 2022
Messages
2,285 (2.66/day)
Location
Brazil
System Name G-Station 2.0 "YGUAZU"
Processor AMD Ryzen 7 5700X3D
Motherboard Gigabyte X470 Aorus Gaming 7 WiFi
Cooling Freezemod: Pump, Reservoir, 360mm Radiator, Fittings / Bykski: Blocks / Barrow: Meters
Memory Asgard Bragi DDR4-3600CL14 2x16GB
Video Card(s) Sapphire PULSE RX 7900 XTX
Storage 240GB Samsung 840 Evo, 1TB Asgard AN2, 2TB Hiksemi FUTURE-LITE, 320GB+1TB 7200RPM HDD
Display(s) Samsung 34" Odyssey OLED G8
Case Lian Li Lancool 216
Audio Device(s) Astro A40 TR + MixAmp
Power Supply Cougar GEX X2 1000W
Mouse Razer Viper Ultimate
Keyboard Razer Huntsman Elite (Red)
Software Windows 11 Pro, Garuda Linux
What? AMD for once in their life getting their timing right to capitalize on something?
 
Joined
Dec 6, 2022
Messages
571 (0.73/day)
Location
NYC
System Name GameStation
Processor AMD R5 5600X
Motherboard Gigabyte B550
Cooling Artic Freezer II 120
Memory 16 GB
Video Card(s) Sapphire Pulse 7900 XTX
Storage 2 TB SSD
Case Cooler Master Elite 120
Hmm, been meaning to try this.

Thanks for the link.
 
Joined
May 19, 2011
Messages
147 (0.03/day)
The guide confirms that the new Ryzen AI Max "Strix Halo" processors come in hardwired to LPCAMM2 memory configurations of 32 GB, 64 GB, and 128 GB, and there won't be a 16 GB memory option for notebook manufacturers to cheap out with.

I combed the source page for this language or any clarification on the matter; saying it is "hardwired" to LPCAMM2 is a bit counterintuitive. Was it supposed to read LPDDR5 instead?

Either way, the 32 GB mandatory minimum is a welcome sight. I'm a bit surprised (hence the confusion above) that 48 and 96 GB capacities weren't also mentioned, as those capacities should be possible via LPCAMM2.
 
Joined
Sep 26, 2022
Messages
2,285 (2.66/day)
I combed the source page for this language or any clarification on the matter; saying it is "hardwired" to LPCAMM2 is a bit counterintuitive. Was it supposed to read LPDDR5 instead?

Either way, the 32 GB mandatory minimum is a welcome sight. I'm a bit surprised (hence the confusion above) that 48 and 96 GB capacities weren't also mentioned, as those capacities should be possible via LPCAMM2.
Afaik, 48 GB isn't achievable in a quad-channel configuration (4x12 GB?), but 96 GB should be (as 4x24 GB modules are available).
 
Joined
May 10, 2023
Messages
553 (0.88/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
I combed the source page for this language or any clarification on the matter; saying it is "hardwired" to LPCAMM2 is a bit counterintuitive. Was it supposed to read LPDDR5 instead?

Either way, the 32 GB mandatory minimum is a welcome sight. I'm a bit surprised (hence the confusion above) that 48 and 96 GB capacities weren't also mentioned, as those capacities should be possible via LPCAMM2.
LPCAMM2 uses LPDDR5(X) modules still.
Afaik, 48 GB isn't achievable in a quad-channel configuration (4x12 GB?), but 96 GB should be (as 4x24 GB modules are available).
IIRC each LPCAMM2 module is 128-bit, and Strix Halo needs two of them, so for 48 GB you could go for 24 GB modules.
However, Crucial only lists 32 and 64 GB modules on their page.

So it'd mean either 64 or 128 GB for Strix Halo. I'm too lazy to look into other manufacturers.
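The capacity math here is easy to enumerate. A toy sketch (module sizes per the Crucial listings mentioned; the 24 GB module is hypothetical):

```python
# Strix Halo's 256-bit bus = two 128-bit LPCAMM2 modules
listed = [32, 64]                   # GB per module, what Crucial currently lists
totals = [2 * m for m in listed]
print(totals)                       # 64 or 128 GB systems

# A 24 GB module, if anyone shipped one, would yield the missing 48 GB tier
print(2 * 24)
```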
 
Joined
Sep 17, 2014
Messages
23,098 (6.10/day)
Location
The Washing Machine
System Name Tiny the White Yeti
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling CPU: Thermalright Peerless Assassin / Case: Phanteks T30-120 x3
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
VR HMD HD 420 - Green Edition ;)
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
I think an upgrade path for my 7900XT has just opened up right here.

Thanks AMD I guess?
 
Joined
Mar 11, 2024
Messages
123 (0.38/day)
It's crazy because the price to performance of hardware capable of running decently sized models fast enough is WAY WAY lower than what it's sold for. It should become much more affordable to run interesting models in the next few decades.
 
Joined
Nov 26, 2021
Messages
1,768 (1.52/day)
Location
Mississauga, Canada
Processor Ryzen 7 5700X
Motherboard ASUS TUF Gaming X570-PRO (WiFi 6)
Cooling Noctua NH-C14S (two fans)
Memory 2x16GB DDR4 3200
Video Card(s) Reference Vega 64
Storage Intel 665p 1TB, WD Black SN850X 2TB, Crucial MX300 1TB SATA, Samsung 830 256 GB SATA
Display(s) Nixeus NX-EDG27, and Samsung S23A700
Case Fractal Design R5
Power Supply Seasonic PRIME TITANIUM 850W
Mouse Logitech
VR HMD Oculus Rift
Software Windows 11 Pro, and Ubuntu 20.04
LPCAMM2 uses LPDDR5(X) modules still.

IIRC each LPCAMM2 module is 128-bit, and Strix Halo needs two of them, so for 48 GB you could go for 24 GB modules.
However, Crucial only lists 32 and 64 GB modules on their page.

So it'd mean either 64 or 128 GB for Strix Halo. I'm too lazy to look into other manufacturers.
The 32 GB SKU might be using soldered LPDDR5X; that is the norm for laptops after all.
 
Joined
Jun 22, 2012
Messages
312 (0.07/day)
Processor Intel i7-12700K
Motherboard MSI PRO Z690-A WIFI
Cooling Noctua NH-D15S
Memory Corsair Vengeance 4x16 GB (64GB) DDR4-3600 C18
Video Card(s) MSI GeForce RTX 3090 GAMING X TRIO 24G
Storage Samsung 980 Pro 1TB, SK hynix Platinum P41 2TB
Case Fractal Define C
Power Supply Corsair RM850x
Mouse Logitech G203
Software openSUSE Tumbleweed
A small niche of enthusiasts has been asking for years for more VRAM on consumer GPUs to run bigger AI models; hopefully the current DeepSeek craze is going to make manufacturers reconsider their stance of just providing the bare minimum needed for running games at the resolution the GPUs are primarily intended to be used with.
 
Joined
Nov 26, 2021
Messages
1,768 (1.52/day)
A small niche of enthusiasts has been asking for years for more VRAM on consumer GPUs to run bigger AI models; hopefully the current DeepSeek craze is going to make manufacturers reconsider their stance of just providing the bare minimum needed for running games at the resolution the GPUs are primarily intended to be used with.
Honestly, I believe that for inference, Apple's approach is better; the unified DRAM pool allows memory capacities that consumer GPUs just can't match. A lot of people use laptops so a bigger Strix Halo with a 512-bit bus could have 256 GB of RAM with 76% of a desktop RTX 4080's bandwidth.
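The bandwidth figure quoted above can be reproduced from bus width and transfer rate. A quick sketch (the 512-bit LPDDR5X-8533 configuration is the hypothetical being discussed, and 22.4 Gbps is the desktop RTX 4080's GDDR6X pin speed):

```python
def bandwidth_gbps(bus_bits: int, mtps: int) -> float:
    """Peak memory bandwidth in GB/s from bus width (bits) and transfer rate (MT/s)."""
    return bus_bits / 8 * mtps / 1000

halo_512 = bandwidth_gbps(512, 8533)    # hypothetical 512-bit LPDDR5X-8533
rtx_4080 = bandwidth_gbps(256, 22400)   # 256-bit GDDR6X at 22.4 Gbps
print(f"{halo_512:.0f} GB/s vs {rtx_4080:.0f} GB/s = {halo_512 / rtx_4080:.0%}")
```

That works out to roughly 546 GB/s against 716.8 GB/s, i.e. the ~76% mentioned above.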
 
Joined
Jun 22, 2012
Messages
312 (0.07/day)
Honestly, I believe that for inference, Apple's approach is better; the unified DRAM pool allows memory capacities that consumer GPUs just can't match.
That could be a path forward too with mixture-of-expert (MoE) LLMs similar to DeepSeek V3/R1, but merely providing non-upgradable systems with relatively large amounts of RAM (e.g. 128GB) at mediocre-to-low bandwidth (~250-300 GB/s, still below the level of a low-end discrete GPU) isn't going to help a lot. Memory doesn't just have to be abundant, but fast too.
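To put numbers on that: dense-model decode speed is roughly capped by how fast the weights can be streamed, i.e. tokens/s ≲ bandwidth ÷ model size. A crude upper-bound sketch that ignores cache hits, compute limits, and MoE sparsity (bandwidth figures are the usual spec-sheet values):

```python
def max_tokens_per_sec(bandwidth_gbps: float, model_gb: float) -> float:
    """Memory-bound ceiling: each generated token reads every weight once."""
    return bandwidth_gbps / model_gb

# A ~20 GB Q4 32B model on various memory subsystems
systems = [("RTX 4090 (GDDR6X)", 1008),
           ("RX 7900 XTX (GDDR6)", 960),
           ("256-bit LPDDR5X-8000", 256),
           ("Dual-channel DDR5-5600", 89.6)]
for name, bw in systems:
    print(f"{name}: <= {max_tokens_per_sec(bw, 20):.0f} tok/s")
```

The spread illustrates the point: ~128 GB of RAM at ~250-300 GB/s caps a dense 20 GB model in the low teens of tokens per second, regardless of capacity.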
 
Joined
May 19, 2011
Messages
147 (0.03/day)
LPCAMM2 uses LPDDR5(X) modules still.

Right, and that's where the confusion comes from: the phrase "hardwired to LPCAMM2 configurations of [fixed sizes]". The word "hardwired" implies that it is in fact soldered.
 
Joined
Nov 26, 2021
Messages
1,768 (1.52/day)
That could be a path forward too with mixture-of-expert (MoE) LLMs similar to DeepSeek V3/R1, but merely providing non-upgradable systems with relatively large amounts of RAM (e.g. 128GB) at mediocre-to-low bandwidth (~250-300 GB/s, still below the level of a low-end discrete GPU) isn't going to help a lot. Memory doesn't just have to be abundant, but fast too.
Yes, there's a tradeoff, and for inference, memory bandwidth trumps all. The trend for GPUs is clear though. GDDR leads to low memory capacities; HBM allows exceeding that capacity at infeasible cost. Upgradeable RAM allows the most capacity, but that comes at the expense of bandwidth as well.
 
Joined
Oct 12, 2005
Messages
726 (0.10/day)
Yes, there's a tradeoff, and for inference, memory bandwidth trumps all. The trend for GPUs is clear though. GDDR leads to low memory capacities; HBM allows exceeding that capacity at infeasible cost. Upgradeable RAM allows the most capacity, but that comes at the expense of bandwidth as well.
The main issue with HBM is that it requires an interposer to sit on and communicate with the main die. That drastically increases the cost, as HBM needs to be on-package, on silicon.

But there is work underway on 3D DRAM that wouldn't necessarily be HBM, in order to increase capacities. From what I see, it's still a few years in the making.

Note that it looks like they are also working on stacked DRAM that would use the same bus size as GDDR* and would probably be a drop-in solution while we wait.
 
Joined
Dec 17, 2024
Messages
42 (0.93/day)
What? AMD for once in their life getting their timing right to capitalize on something?
Yeah, upon reading this I was lauding their reactivity... then I remembered that it's probably thanks to the marketing department not being in charge.
 
Joined
Nov 13, 2007
Messages
10,945 (1.74/day)
Location
Austin Texas
System Name stress-less
Processor 9800X3D @ 5.42GHZ
Motherboard MSI PRO B650M-A Wifi
Cooling Thermalright Phantom Spirit EVO
Memory 64GB DDR5 6000 1:1 CL30-36-36-96 FCLK 2000
Video Card(s) RTX 4090 FE
Storage 2TB WD SN850, 4TB WD SN850X
Display(s) Alienware 32" 4k 240hz OLED
Case Jonsbo Z20
Audio Device(s) Yes
Power Supply RIP Corsair SF750... Waiting for SF1000
Mouse DeathadderV2 X Hyperspeed
Keyboard 65% HE Keyboard
Software Windows 11
Benchmark Scores They're pretty good, nothing crazy.
This is awesome. YESSSS
 
Joined
Jan 3, 2021
Messages
3,758 (2.52/day)
Location
Slovenia
Processor i5-6600K
Motherboard Asus Z170A
Cooling some cheap Cooler Master Hyper 103 or similar
Memory 16GB DDR4-2400
Video Card(s) IGP
Storage Samsung 850 EVO 250GB
Display(s) 2x Oldell 24" 1920x1200
Case Bitfenix Nova white windowless non-mesh
Audio Device(s) E-mu 1212m PCI
Power Supply Seasonic G-360
Mouse Logitech Marble trackball, never had a mouse
Keyboard Key Tronic KT2000, no Win key because 1994
Software Oldwin
The main issue with HBM is that it requires an interposer to sit on and communicate with the main die. That drastically increases the cost, as HBM needs to be on-package, on silicon.
There are more issues. An HBM memory cell takes up twice as much space as a DDR cell. Then there's TSV stacking, which seems to be incredibly expensive, possibly because there's insufficient manufacturing capacity everywhere.
DRAM dies are also stacked in large capacity server DIMMs. That used to be the case for really, really expensive 128 GB DIMMs and up, but now as larger capacity dies exist, it's probably 256 GB and up. Going by the price, I assume it's TSV stacking.
LPDDR dies are also stacked in some designs, for example Apple's M chips. Probably TSV again because speed matters and cost doesn't.
A case of non-TSV stacked dies (with old style wire bonding instead) would be NAND, for several reasons: lower speed, small number of wires due to 8-bit bus, and requirement for low cost.

But there is work underway on 3D DRAM that wouldn't necessarily be HBM, in order to increase capacities. From what I see, it's still a few years in the making.
Thanks for the link. Semiengineering posted this nice overview of current tech in 2021 ... and later I occasionally checked and found nothing. Yes, we'll wait some more for 3D. Someone will eventually modify the NAND manufacturing tech so that those capacitors, well, quickly charge and discharge. And when they succeed, they will try everything to compress four bits into one cell.

Note that it looks like they are also working on stacked DRAM that would use the same bus size as GDDR* and would probably be a drop-in solution while we wait.
What sort of stacked DRAM do you mean here? Again, due to high speed, it would have to be TSV stacked, so in a different price category.
 
Joined
Oct 30, 2020
Messages
371 (0.24/day)
Location
Toronto
System Name GraniteXT
Processor Ryzen 9950X
Motherboard ASRock B650M-HDV
Cooling 2x360mm custom loop
Memory 2x24GB Team Xtreem DDR5-8000 [M die]
Video Card(s) RTX 3090 FE underwater
Storage Intel P5800X 800GB + Samsung 980 Pro 2TB
Display(s) MSI 342C 34" OLED
Case O11D Evo RGB
Audio Device(s) DCA Aeon 2 w/ SMSL M200/SP200
Power Supply Superflower Leadex VII XG 1300W
Mouse Razer Basilisk V3
Keyboard Steelseries Apex Pro V2 TKL
The 395 looks more and more interesting by the day, and I can see it replacing low/mid-end GPUs in the laptop space in the future. Please AMD, release one on the desktop. Or Turin Threadripper. These two are a lot more interesting than the shit these three companies have been spitting out the last couple of years, and I'd love to tweak them out.

Fast forward a few years, and 16 cores with V-Cache + UDNA + CAMM2 should be awesome. HBM remains a pipe dream because its prices rose pretty high and TSV stacking remains prohibitively expensive.
 
Joined
Aug 3, 2006
Messages
262 (0.04/day)
Location
Austin, TX
Processor Ryzen 6900HX
Memory 32 GB DDR4LP
Video Card(s) Radeon 6800m
Display(s) LG C3 42''
Software Windows 11 home premium
The 395 looks more and more interesting by the day, and I can see it replacing low/mid-end GPUs in the laptop space in the future. Please AMD, release one on the desktop. Or Turin Threadripper. These two are a lot more interesting than the shit these three companies have been spitting out the last couple of years, and I'd love to tweak them out.

Fast forward a few years, and 16 cores with V-Cache + UDNA + CAMM2 should be awesome. HBM remains a pipe dream because its prices rose pretty high and TSV stacking remains prohibitively expensive.

I'm positive that AMD and companies like Minisforum will release mini motherboards with the SoC embedded for system builders.
 
Joined
Jan 3, 2021
Messages
3,758 (2.52/day)
Please AMD, release one on the desktop.
And its name shall be 10980XG. It would only fit in a TR socket though, with its four channels.

We have yet to see what becomes of CAMM2 and LPCAMM. Either of these may become a commodity in a couple years. Or they may remain a rarity, with poor availability, mostly available through OEMs.
 
Joined
Jan 14, 2019
Messages
13,940 (6.31/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Overclocking is overrated
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case The smaller the better
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Mechanic
VR HMD Not yet
Software Linux gaming master race
The 7900 XTX being better at AI than the 4090? Good joke! :laugh: Wait... Seriously? :wtf:
 

hatyii

New Member
Joined
Jan 30, 2025
Messages
1 (1.00/day)
So if the 7900 XTX is faster for AI than the 4090, and AMD mentions that RDNA 3 specifically can run this model well because of hardware advantages over RDNA 2, explain to me why the new FSR version is supposed to be exclusive to their new GPUs? I mean, even an RTX 2000 series GPU can benefit from DLSS, so I'm just confused about this stuff.
 
Joined
Jan 14, 2019
Messages
13,940 (6.31/day)
So if the 7900 XTX is faster for AI than the 4090, and AMD mentions that RDNA 3 specifically can run this model well because of hardware advantages over RDNA 2, explain to me why the new FSR version is supposed to be exclusive to their new GPUs? I mean, even an RTX 2000 series GPU can benefit from DLSS, so I'm just confused about this stuff.
FSR 4 could be vastly different from DeepSeek in how it runs. RDNA 3's AI accelerators are part of the shader engine. RDNA 4 may be getting dedicated units. Who knows.

Also, DLSS hasn't changed much in its base operation, so it can run on anything with tensor cores. FSR hasn't needed AI cores so far, but FSR 4 does.

My other theory is that Nvidia hasn't touched the RT and tensor cores much since RTX 2000 (judging by performance data). We know very little about what an AI/tensor core actually is and how it works.
 