AMD Details DeepSeek R1 Performance on Radeon RX 7900 XTX, Confirms Ryzen AI Max Memory Sizes

Joined
May 13, 2008
Messages
779 (0.13/day)
System Name HTPC whhaaaat?
Processor 2600k @ 4500mhz
Motherboard Asus Maximus IV gene-z gen3
Cooling Noctua NH-C14
Memory Gskill Ripjaw 2x4gb
Video Card(s) EVGA 1080 FTW @ 2037/11016
Storage 2x512GB MX100/1x Agility 3 128gb ssds, Seagate 3TB HDD
Display(s) Vizio P 65'' 4k tv
Case Lian Li pc-c50b
Audio Device(s) Denon 3311
Power Supply Corsair 620HX
Great time to uncancel Navi 41 and 42 then ? Bring them to Market with 30 and 36GB of VRAM.

You mean 40/48GB of ram? I doubt it was ever GDDR7, but it's possible.

I think N41 (partially) got canned because they know once people have >80TF and 24GB (essentially a 4090) most ain't upgrading for a long-long time. Those that wanted that at >$1000+ bought a 4090.
Cutting the price of 4080 from $1200 to $1000 probably also had something to do with it, as I think that's where AMD wanted to compete.
Similar reason for the gap in nV products. Why GB203 limited to <80TF (1 less cluster than half GB202 + PL locks) and doesn't have a 24GB option. Gotta milk needing those upgrades as long as possible...
Hence both wanted to get one more cycle in before that happened...or maybe just able to make it for a larger margin given the move to 3nm and 3GB GDDR7 (256-bit instead of 384-bit for 24GB spec).
Something like a $500 BOM (~GB203/N48 size; 100+ KGD per 20k wafer + ~$300 of 3GB GDDR7) makes a lot more sense than making a slightly-slower 4090 for ~$1200 MSRP.
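For anyone who wants to check the memory and BOM arithmetic above, here is a rough Python sketch. It assumes one 32-bit interface per GDDR chip and no clamshell layout, and it reuses the post's own ballpark figures ($20k wafer, 100 known-good dies, ~$300 of memory). The function names are made up for illustration, and none of the dollar figures are confirmed costs.

```python
# Rough sketch of the VRAM-capacity and BOM arithmetic from the post.
# Assumptions (not from any vendor): one 32-bit interface per GDDR chip,
# no clamshell, and the post's ballpark wafer/memory prices.

CHIP_BUS_BITS = 32  # width of a single GDDR6/GDDR7 device interface


def vram_gb(bus_bits: int, gb_per_chip: int) -> int:
    """Total VRAM for a bus populated with one chip per 32-bit channel."""
    chips = bus_bits // CHIP_BUS_BITS
    return chips * gb_per_chip


def die_cost(wafer_cost: float, known_good_dies: int) -> float:
    """Cost per usable die, ignoring packaging and test."""
    return wafer_cost / known_good_dies


# 24GB the old way: 384-bit bus with 2GB chips (12 chips).
print(vram_gb(384, 2))  # 24
# 24GB the new way: 256-bit bus with 3GB GDDR7 chips (8 chips).
print(vram_gb(256, 3))  # 24

# Post's ballpark BOM: ~$200 die + ~$300 memory.
print(die_cost(20_000, 100) + 300)  # 500.0
```

So the narrower bus gets to the same 24GB with four fewer chips, which is where the smaller die and cheaper board come from.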
They would've needed 12288sp @ 3640mhz to match a 4090...That's probably impossible, if not close-to-impossible, to yield on 4/5nm for a gpu.
We may see w/ N48 that 3.4ghz is probably difficult enough to yield within decent power. I say that because if all N48 products can't hit 3.3ghz+ they've kinda failed; might as well buy a 6800xt/7800xt.
I'll be verrryyy curious if (binned) 3x8-pin designs will be able to hit anywhere around ~3.6ghz(+/-?), as that may have been the N4 goal, both (cancelled) large and (non-cancelled) smalls, with 24gbps ram.
Still think something like a 11264sp+ 3nm design is going to be a lot of people's last stop in this market for the most part. People with a 4090 (unless they have to have the best) probably already don't care.
Making ~1920sp*6/96 ROPs is just sooo much cheaper. It would only require 3900mhz to match 4090 which I think is very doable given how current 5nm GPU designs yield against 2.93/3.24 Apple products.
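The clock targets in this post follow from the usual peak-FP32 estimate (shaders x 2 FLOPs per clock x clock speed). A small Python sketch of that arithmetic is below. The function names are illustrative, and the "4090-class" target is simply what the post's own 12288sp @ 3640mhz figure works out to, not an official spec. As an aside, 16384 shaders (the 4090's count) at 2730mhz gives the same number, which may be where it comes from, but that is an inference.

```python
# Peak FP32 throughput estimate: shaders * 2 FLOPs/clock (one FMA) * clock.
# This is a theoretical ceiling, not measured performance.


def fp32_tflops(shaders: int, clock_mhz: float) -> float:
    """Theoretical peak FP32 TFLOPS for a shader count and clock in MHz."""
    return shaders * 2 * clock_mhz * 1e6 / 1e12


def clock_for_target(shaders: int, target_tflops: float) -> float:
    """Clock in MHz needed to reach a target peak TFLOPS."""
    return target_tflops * 1e12 / (shaders * 2) / 1e6


# Target implied by the post: 12288sp @ 3640 MHz.
target = fp32_tflops(12288, 3640)
print(round(target, 1))  # 89.5

# Same figure from 16384 shaders @ 2730 MHz.
print(round(fp32_tflops(16384, 2730), 1))  # 89.5

# 1920sp * 6 = 11520sp: clock needed to hit the same target.
print(round(clock_for_target(1920 * 6, target)))  # 3883
```

That ~3883mhz lines up with the rounded "3900mhz" above and the "~3.87" guess further down.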
We don't know how N48 yielded against the 3460-3700mhz Apple products yet, or how much power it uses, but it should be interesting. Both clock yields and the power usage for those clocks on the curve.
This could be telling who has the better idea on 3nm.
nVIDIA is probably shooting for 12288sp@3780mhz/36000 like Apple's efficient clock on N3B, while AMD could perhaps be shooting for 11520sp @ ~3.87/40000+, more-similar to Apple's 4050mhz N3P.
Whatever they do, it'll be a lot cheaper to make than a 4090 or whatever AMD wanted to do with N41...chiplet or monolithic.

At any rate, it's fascinating to see what's possible with this deepseek model; it's almost like pure hardware always wins out in the end versus software/marketing bullshit and artificial limitations!
It's amusing to see the hardware limitations exposed when not locked to their ecosystem.

Long-live the Fine Wine of actually well-matched hardware/vram that always prevails in the end.
 
Last edited:
Joined
Dec 29, 2021
Messages
74 (0.07/day)
Location
Colorado
Processor Ryzen 7 9800X3D
Motherboard Asus ROG Crosshair x870E Hero
Cooling Arctic Liquid Freezr II 420mm
Memory 64GB G.Skill DDR5 CAS30 fruity LED RAM
Video Card(s) Nvidia RTX 4080 (Gigabyte) or a Sapphire Nitro+ 7900XTX depending on planetary alignment.
Storage 3x WD 850whatever 4TB + 2 Spinny disks
Display(s) Alienware AW3423DWF
Case Thermaltake Level 20XT E-ATX
Audio Device(s) Onboard
Power Supply Super Flower Leadex VII 1000w
Mouse Logitech g502x
Keyboard Logitech g915x
Software Windows 11 Insider Preview
So if the 7900 XTX is faster for AI than the 4090 and AMD mentions that RDNA3 specifically can run this model well because of hardware advantages over RDNA2, explain to me why the new FSR version was supposed to be exclusive to their new GPUs? I mean even an RTX 2000 GPU can benefit from DLSS, so I'm just confused about this stuff.

IIRC RDNA3 has a lot higher throughput in some floating point formats (e.g. FP32) than Lovelace and vice versa for other formats.

I'm not sure what DLSS/FSR use and don't care.
 
Joined
Sep 17, 2014
Messages
23,095 (6.10/day)
Location
The Washing Machine
System Name Tiny the White Yeti
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling CPU: Thermalright Peerless Assassin / Case: Phanteks T30-120 x3
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
VR HMD HD 420 - Green Edition ;)
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
I think N41 (partially) got canned because they know once people have >80TF and 24GB (essentially a 4090) most ain't upgrading for a long-long time. Those that wanted that at >$1000+ bought a 4090.
Cutting the price of 4080 from $1200 to $1000 probably also had something to do with it, as I think that's where AMD wanted to compete.
Similar reason for the gap in nV products. Why GB203 limited to <80TF (1 less cluster than half GB202 + PL locks) and doesn't have a 24GB option. Gotta milk needing those upgrades as long as possible...
Very good points
 