
NVIDIA 2025 International CES Keynote: Liveblog

Joined
Jan 14, 2019
Messages
13,237 (6.05/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Anything from 4 to 48 GB
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case The smaller the better
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Wired
VR HMD Not yet
Software Linux gaming master race
We can safely assume that it's going to be similar to what the PS5 Pro has, so double the intersection perf and better BVH acceleration, also better handling of divergent rays so increased performance in harder RT workloads like GI and PT. Plus any other tech that Sony didn't disclose in detail.

The PS5 Pro has an RDNA 3(.5?) iGPU.
 
Joined
Jan 19, 2023
Messages
271 (0.38/day)
The PS5 Pro has an RDNA 3(.5?) iGPU.
It doesn't have RDNA3 or 3.5. Shaders are the same as PS5 so RDNA2, plus RT from RDNA4 and ML hardware that comes from Sony itself.
You can watch the whole presentation, Mark explains it pretty well.
RDNA3 didn't increase the intersection perf. I looked everywhere for that info some time ago and didn't find a single mention of it.

EDIT

Here you can find a good, deeper explanation of how RT works on RDNA3 and RDNA2:

And even from AMD's own slides:

1736339029543.png


No mention of any improvement in intersection perf. So basically the PS5 Pro has better RT than RDNA3. Still, even if they doubled the intersection perf, that alone doesn't mean RT perf will double. There are other improvements too, so who knows, we will see.
 
Joined
Sep 17, 2014
Messages
22,840 (6.06/day)
Location
The Washing Machine
System Name Tiny the White Yeti
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling CPU: Thermalright Peerless Assassin / Case: Phanteks T30-120 x3
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
VR HMD HD 420 - Green Edition ;)
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
I've seen 8 and I've seen 12... Let's hope 12 considering it should be 7700XT like performance... Rumors are all over the place with that.... The 5060 is even rumored to get 12GB via the 3GB GDDR7 chips and likely won't come out till they are available but rumors are whatever.... We won't know till Nvidia/AMD shows them off.
My crystal ball says 8GB. 12 is wishful thinking, but not Nvidia's M.O.
If they give it 12, they will cannibalize the 5070. This entire Blackwell stack is positioned such that it doesn't look terrible if you still have Ada, while still giving Ada owners an incentive to upgrade. They can, after all, sell their cards and buy a replacement at near cost neutrality.

This is precisely what is happening with the 4090s on the market right now. Nvidia's executing a perfect strategy here because AMD isn't even playing.

Look here. Perfect price parity with the 5000+ shader count 5090.
We will have the 4090 taking the slot between the 5080 and 5090 for the foreseeable future. Nvidia doesn't need anything in between.

1736343424252.png


The above is just under half the number of sellers on this site, now...

Here's another search just for the lulz. There are almost no (literally 3 in Netherlands!!) sellers of a 7900XTX. AMD ensured its own stagnation, these owners will sooner or later jump ship. Fantastic plan, going midrange!

1736343504982.png
 
Joined
Feb 20, 2019
Messages
8,442 (3.92/day)
System Name Bragging Rights
Processor Atom Z3735F 1.33GHz
Motherboard It has no markings but it's green
Cooling No, it's a 2.2W processor
Memory 2GB DDR3L-1333
Video Card(s) Gen7 Intel HD (4EU @ 311MHz)
Storage 32GB eMMC and 128GB Sandisk Extreme U3
Display(s) 10" IPS 1280x800 60Hz
Case Veddha T2
Audio Device(s) Apparently, yes
Power Supply Samsung 18W 5V fast-charger
Mouse MX Anywhere 2
Keyboard Logitech MX Keys (not Cherry MX at all)
VR HMD Samsung Oddyssey, not that I'd plug it into this though....
Software W10 21H1, barely
Benchmark Scores I once clocked a Celeron-300A to 564MHz on an Abit BE6 and it scored over 9000.
I agree though it is looking like apples to apples the 5080 probably isn't much faster than the 4080..
Far Cry 6 and Plague Tale Requiem are examples of the raw performance improvement because they clearly don't support DLSS4 MFG fakery.

1736343393021.png

That 30% improvement there is likely what we can really expect in the overwhelming majority of games. The 5080 has 15% more compute (cores*clocks) and sucks down more power despite being a newer, more efficient node, so the other 15% likely comes from the 4080 being sandbagged by power limits.
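As a rough check on that split, here is a minimal Python sketch. It uses only the two figures from the post (a 30% overall uplift and 15% more cores*clocks); the function name is made up for illustration. Speedups compound multiplicatively rather than add, so under those assumptions the part not explained by compute comes out closer to 13% than 15%.

```python
def unexplained_gain(total_speedup, compute_ratio):
    """Return the speedup factor left over after dividing out the compute ratio.

    Both arguments are ratios (1.30 means 30% faster). Speedups multiply,
    so the leftover is a quotient, not a difference.
    """
    return total_speedup / compute_ratio


# Figures taken from the post: ~30% faster overall, ~15% more cores*clocks.
leftover = unexplained_gain(1.30, 1.15)
print(f"{(leftover - 1) * 100:.1f}% not explained by cores*clocks")
```

Whether that remainder really comes from power limits is the poster's guess; the sketch only shows the size of the gap.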
 
Joined
Sep 17, 2014
Messages
22,840 (6.06/day)
Location
The Washing Machine
System Name Tiny the White Yeti
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling CPU: Thermalright Peerless Assassin / Case: Phanteks T30-120 x3
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
VR HMD HD 420 - Green Edition ;)
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
I view the 5-600 usd cards as 1080p cards as it is so 12GB is fine but I agree people should be buying 16GB cards in 2025 regardless of how fast this 12GB card is or isn't.
That's how far we've already moved into the Nvidia story, but honestly, 500-600 would get you a top end card not too long ago. I bought a GTX 1080 that runs 1440p at medium EVEN TODAY for 520... What you are saying here is we literally regressed over the course of 4 generations of new GPUs. That's utterly terrible.

Far Cry 6 and Plague Tale Requiem are examples of the raw performance improvement because they clearly don't support DLSS4 MFG fakery.


That 30% improvement there is likely what we can really expect in the overwhelming majority of games. The 5080 has 15% more compute (cores*clocks) and sucks down more power despite being a newer, more efficient node, so the other 15% likely comes from the 4080 being sandbagged by power limits.
You failed at interpreting this bar chart.

The leftmost bars indeed don't say DLSS. But they do say RT.
Going by this chart, raster performance might be at a complete standstill, with only RT ON improved. It does not say a thing about raster perf.
 
Joined
Sep 10, 2018
Messages
7,267 (3.14/day)
Location
California
System Name His & Hers
Processor R7 5800X/ R7 7950X3D Stock
Motherboard X670E Aorus Pro X/ROG Crosshair VIII Hero
Cooling Corsair h150 elite/ Corsair h115i Platinum
Memory Trident Z5 Neo 6000/ 32 GB 3200 CL14 @3800 CL16 Team T Force Nighthawk
Video Card(s) Evga FTW 3 Ultra 3080ti/ Gigabyte Gaming OC 4090
Storage lots of SSD.
Display(s) A whole bunch OLED, VA, IPS.....
Case 011 Dynamic XL/ Phanteks Evolv X
Audio Device(s) Arctis Pro + gaming Dac/ Corsair sp 2500/ Logitech G560/Samsung Q990B
Power Supply Seasonic Ultra Prime Titanium 1000w/850w
Mouse Logitech G502 Lightspeed/ Logitech G Pro Hero.
Keyboard Logitech - G915 LIGHTSPEED / Logitech G Pro
That's how far we've already moved into the Nvidia story, but honestly, 500-600 would get you a top end card not too long ago. I bought a GTX 1080 that runs 1440p at medium EVEN TODAY for 520... What you are saying here is we literally regressed over the course of 4 generations of new GPUs. That's utterly terrible.

Back when the 1080 launched it was 699, which would be over 900 USD in 2025 money.

Technically MSRP was 599, but that was when Nvidia started the FE BS. The price was reduced a year later, when the 1080 Ti released, to 499/549 FE.
 
Joined
Jan 8, 2017
Messages
9,580 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
The 5070 is a turd, it's only better at A.I workloads
Probably not even that, they've been lying in their marketing material even for the ML stuff :
1736351220526.png


They're running models at half the precision on 50 series cards and comparing them to FP8 on 40 series because presumably at the same precision they're not faster at all. Pretty much everything they've shown is a smokescreen, this might just be the most disingenuous marketing material they've ever released, there's not a single example of a performance claim where they haven't screwed with it in some way.

For those of you that don't know, lower precision quantized models are worse, often unusable for some applications, so even the "14638746728463287 gazillion AI TOPS" meme is a lie.
 
Joined
Jun 14, 2020
Messages
3,752 (2.25/day)
System Name Mean machine
Processor 12900k
Motherboard MSI Unify X
Cooling Noctua U12A
Memory 7600c34
Video Card(s) 4090 Gamerock oc
Storage 980 pro 2tb
Display(s) Samsung crg90
Case Fractal Torent
Audio Device(s) Hifiman Arya / a30 - d30 pro stack
Power Supply Be quiet dark power pro 1200
Mouse Viper ultimate
Keyboard Blackwidow 65%
Probably not even that, they've been lying in their marketing material even for the ML stuff :

They're running models at half the precision on 50 series cards and comparing them to FP8 on 40 series because presumably at the same precision they're not faster at all. Pretty much everything they've shown is a smokescreen, this might just be the most disingenuous marketing material they've ever released, there's not a single example of a performance claim where they haven't screwed with it in some way.
It's not lying when they give you the details. FP4 isn't supported on the 40 series. It's working smarter, not harder.
 
Joined
Jan 8, 2017
Messages
9,580 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
It's not lying when they give you the details. FP4 isn't supported on the 40 series. It's working smarter, not harder.
Yeah sure, they're not lying, just showing you a ginormous bar chart where the thing is 2X faster and then minuscule text below telling you that actually it's not.
 
Joined
Jan 8, 2017
Messages
9,580 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
FP4 is for AI and actual work. Quit thinking as a gamer.
Outputs from FP4 and FP8 models are not equivalent, quit thinking as an AI tourist. People supposedly using these for work would know this is a false comparison.
 
Joined
Jun 14, 2020
Messages
3,752 (2.25/day)
System Name Mean machine
Processor 12900k
Motherboard MSI Unify X
Cooling Noctua U12A
Memory 7600c34
Video Card(s) 4090 Gamerock oc
Storage 980 pro 2tb
Display(s) Samsung crg90
Case Fractal Torent
Audio Device(s) Hifiman Arya / a30 - d30 pro stack
Power Supply Be quiet dark power pro 1200
Mouse Viper ultimate
Keyboard Blackwidow 65%
Yeah sure, they're not lying, just showing you a ginormous bar chart where the thing is 2X faster and then minuscule text below telling you that actually it's not.
B200 used similar marketing slides with FP4. You think they are getting sued by the big AI corps?
 
Joined
Jan 8, 2017
Messages
9,580 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
B200 used similar marketing slides with FP4. You think they are getting sued by the big AI corps?
Haven't seen them but they've never gotten sued over stuff like this so no I don't expect them to, it's still a lie though.
 
Joined
Jun 14, 2020
Messages
3,752 (2.25/day)
System Name Mean machine
Processor 12900k
Motherboard MSI Unify X
Cooling Noctua U12A
Memory 7600c34
Video Card(s) 4090 Gamerock oc
Storage 980 pro 2tb
Display(s) Samsung crg90
Case Fractal Torent
Audio Device(s) Hifiman Arya / a30 - d30 pro stack
Power Supply Be quiet dark power pro 1200
Mouse Viper ultimate
Keyboard Blackwidow 65%
Haven't seen them but they've never gotten sued over stuff like this so no I don't expect them to, it's still a lie though.
They claimed a 5x over Hopper which was entirely due to FP4. Not an AI expert but supposedly the whole industry is trying to move to low precision. The tricky part is keeping the accuracy high, which is supposedly what Nvidia has achieved and why it's dominating that segment.
 
Joined
May 10, 2023
Messages
485 (0.79/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
I got my hands on the expected Brazilian pricing and launch dates, the ones that launch 3 February are pre-orders, the ones 12 and 15 Feb are standard orders. I can't vouch for the authenticity of this list with 100% certainty so take this with a grain of salt, but I think this math is mostly mathin'.
This has been confirmed to be fake, it was just the supposed values in USD converted to BRL. Do notice how some actually end up below the US MSRP prices.
Curious though, what do you think happened? They got some info from what nvidia is going to do at the last minute and bailed to get back to the drawing board? Maybe there wasn't any plan to announce the 9070 at all and people just assumed?
Ian Cutress wrote up a bit on that, AMD did a Q&A with some journalists trying to explain that:

TLDR; they said that the product was not yet finished, they wouldn't have enough time to showcase it during their overall presentation, and that Nvidia's announcement did play a part in the decision to not showcase RDNA4 (they want to undercut it).
Honestly I think they made a big bet on MCM for the gaming division hoping for a Zen moment and it did not pan out.... Sounds like they did have a high end RDNA4 chip planned but canceled it.

The reality is the low end stuff from Nvidia is getting worse and worse and AMD isn't offering alternatives that get people excited. If RDNA4 is a bust let's hope UDNA is the answer.
If they can get UDNA right, that would simplify a lot of things given they'll be able to do exactly what they did with Zen: have chiplets that provide great value in the enterprise (which brings big bucks), and that can also be used in the consumer market, all of those out of the same fabrication line. This is also exactly what Nvidia has been doing for quite a long time (albeit not with chiplets).
Their GPU division currently has both CDNA and RDNA, which compete not only for engineering time but also for fab allocation. Given how CDNA is bringing in more money than RDNA, it makes sense to focus on that.
Yeah MCM or not, even if they had a bigger chip, they should have also had RT performance laying on the shelf to go with it. Which they might not have after all; RDNA3 was supposed to perform better even regardless of MCM. I think there were mostly promises, hopes and dreams flying around but people simply did not (manage to...) deliver. And this seems to be a recurring thing, not exclusive to Raja.
MCM is more about the fab efficiency, but the architecture design is a bit different from that. See how Zen has both MCM and monolithic products, and also how their RDNA design exists in both MCM and also in monolithic designs in iGPUs.
It's actually great in iGPUs, I guess they're just lacking in resources to scale it up because it makes more sense to put efforts into CDNA as the "big product" instead.
And some people don't get why I spoke so harshly against the misleading MFG performance data in the keynote. This is why. People are very easy to manipulate, and that is exactly what Nvidia is doing.
Tbf most users here won't fall for that, and everyone will wait for the proper reviews nonetheless, so that's like preaching to a choir.
That's the real upgrade path here. 2nd hand last gen as all the n00bs upgrade to the latest greatest that didn't gain them anything.
I got my 3090s used for like 1/2 and then 1/4 of their launch prices here after the mining craze, can't beat such value :p

You mean NV is comparing apples to apples in this one? This would be nice (I guess since there is no fine print, it might be so). Then the AI TOPS would indeed be massively improved (+70% when adjusting for the power increase of +25% for the 5070 vs 4070: 988/(466*1.25)).
According to "nvidia-ada-gpu-architecture.pdf", the 4090 is:

So it's either 1321 INT8 sparse or 1321 INT4 dense? Anyway, what matters more, is that it's an apples to apples comparison.
Funnily enough, yes. The numbers are for INT4 dense; sparsity is an Nvidia-exclusive thing that's not that easy to use (you have to rearrange your tensors to make use of it).
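For anyone wanting to redo the power-adjusted figure from the quote, here is a minimal Python sketch. The 988 and 466 AI TOPS values and the +25% power increase are the ones quoted above; the function name is hypothetical.

```python
def power_adjusted_gain(new_tops, old_tops, power_ratio):
    """Throughput ratio after scaling the old card up by the power increase.

    power_ratio is new power / old power (1.25 means +25%).
    """
    return new_tops / (old_tops * power_ratio)


# Numbers as quoted in the thread for the 5070 vs the 4070.
raw_gain = 988 / 466
adjusted_gain = power_adjusted_gain(988, 466, 1.25)

print(f"raw: +{(raw_gain - 1) * 100:.0f}%")
print(f"power-adjusted: +{(adjusted_gain - 1) * 100:.0f}%")
```

This only reproduces the arithmetic; it says nothing about whether both TOPS figures were measured with the same data type.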

They're running models at half the precision on 50 series cards and comparing them to FP8 on 40 series because presumably at the same precision they're not faster at all. Pretty much everything they've shown is a smokescreen, this might just be the most disingenuous marketing material they've ever released, there's not a single example of a performance claim where they haven't screwed with it in some way.

For those of you that don't know, lower precision quantized models are worse, often unusable for some applications, so even the "14638746728463287 gazillion AI TOPS" meme is a lie.
I had explained it to someone else, but I'll write it up again:
Flux is often memory-bound, just like LLMs. The gains you see there are mostly from the extra 80% in memory bandwidth the 5090 has. Even running it in FP8 (which my 3090 doesn't even have support for) leads to a really minor perf difference, while using FP8 vs FP16 on a 4090 barely nets a perf gain, something around 5~10% in both scenarios. Same likely goes for this FP4 vs FP8 comparison.
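To illustrate what "memory-bound" means here, below is a minimal roofline-style sketch in Python. All workload and hardware numbers are hypothetical and chosen only so that memory traffic dominates; only the 1.8x bandwidth factor comes from the post. It also holds the bytes moved constant, which is a simplification, since a smaller data type shrinks the weights that have to be read too.

```python
def step_time(flops, bytes_moved, peak_flops, bandwidth):
    """Roofline estimate: the slower of compute and memory sets the step time."""
    compute_time = flops / peak_flops
    memory_time = bytes_moved / bandwidth
    return max(compute_time, memory_time)


# Hypothetical workload: memory time (0.02 s) exceeds compute time (0.01 s).
FLOPS = 1.0e12
BYTES_MOVED = 2.0e10

base = step_time(FLOPS, BYTES_MOVED, peak_flops=1.0e14, bandwidth=1.0e12)
# Doubling math throughput (e.g. a faster data type) with the same bandwidth.
faster_math = step_time(FLOPS, BYTES_MOVED, peak_flops=2.0e14, bandwidth=1.0e12)
# Keeping math throughput but adding 80% more bandwidth.
more_bandwidth = step_time(FLOPS, BYTES_MOVED, peak_flops=1.0e14, bandwidth=1.8e12)

print(f"faster math speedup:    {base / faster_math:.2f}x")
print(f"more bandwidth speedup: {base / more_bandwidth:.2f}x")
```

In this toy setup faster math changes nothing while extra bandwidth helps almost linearly, which is the shape of the argument above. Real workloads sit somewhere between the two limits.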

You are also forgetting that there are different types of quantization. Your Q4, gguf, ggml stuff is about compressing stuff for storage/memory, but you still do the maths in fp16, which leads to noticeably lower performance. Doing proper quantization on a model through some extra fine-tuning with precision-awareness leads to way better quality than just shoving the original weights in a smaller data type.
Just take a look for yourself at the results from their model vs the BF16 one:

BF16 on the left and FP4 on the right

Clearly not as good as the FP16, but way better than your usual Q8 quants.
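A minimal Python sketch of the "compress for storage, do the maths in higher precision" idea mentioned above. This is a simplified symmetric per-block scheme written for illustration; it is not the actual GGUF/GGML block layout, and every function name here is hypothetical.

```python
def quantize_block(weights, bits=4):
    """Symmetric quantization of one block: returns (scale, integer codes)."""
    qmax = 2 ** (bits - 1) - 1          # 7 for 4-bit, 127 for 8-bit
    amax = max(abs(w) for w in weights)
    scale = amax / qmax if amax > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return scale, codes


def dequantize_block(scale, codes):
    """Expand stored integer codes back to floats before any arithmetic."""
    return [scale * c for c in codes]


def dot_weight_only(scale, codes, activations):
    """Weights live in low precision, but the multiply-adds use floats."""
    weights = dequantize_block(scale, codes)
    return sum(w * a for w, a in zip(weights, activations))


def max_error(weights, bits):
    """Largest round-trip error for one block at the given bit width."""
    scale, codes = quantize_block(weights, bits)
    restored = dequantize_block(scale, codes)
    return max(abs(w - r) for w, r in zip(weights, restored))


block = [0.7, -0.3, 0.12, 0.0]
scale4, codes4 = quantize_block(block, bits=4)
print("4-bit codes:", codes4)
print("4-bit max error:", max_error(block, 4))
print("8-bit max error:", max_error(block, 8))
```

Naively rounding trained weights like this is the "shoving the original weights in a smaller data type" case; precision-aware fine-tuning, as described above, is a different and more involved process that this sketch does not cover.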

Yeah sure, they're not lying, just showing you a ginormous bar chart where the thing is 2X faster and then minuscule text below telling you that actually it's not.
Unlike games, for inference you often aim for the smallest supported data type for both vram savings and extra throughput. When tensor cores came out, everyone switched to FP16. When Ada/Hopper came out, everyone started doing FP8. The trend still goes this way.
 
Joined
Jan 8, 2017
Messages
9,580 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
They claimed a 5x over Hopper which was entirely due to FP4. Not an AI expert but supposedly the whole industry is trying to move to low precision. The tricky part is keeping the accuracy high, which is supposedly what Nvidia has achieved and why it's dominating that segment.
Like I explained, lower precision models are not equivalent to higher precision ones and never will be; if it was as simple as that then everything would be running in 1-bit precision by now. LLMs like ChatGPT seem to have settled on 16 bits, which seems to be the lower bound below which the output gets noticeably worse. With stuff like image generation you can go lower, but they're still not equivalent.

Unlike games, for inference you often aim for the smallest supported data type for both vram savings and extra throughput. When tensor cores came out, everyone switched to FP16. When Ada/Hopper came out, everyone started doing FP8. The trend still goes this way.
It does not matter, it's not an appropriate way to compare the two as they are running different models.
 
Joined
May 10, 2023
Messages
485 (0.79/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
LLMs like ChatGPT seem to have settled on 16 bits
No, it has been discussed multiple times that OpenAI has been quantizing their models without telling anyone over time.
Many LLMs are running in Q4, Q6 and Q8 out there in production by many different providers.

It does not matter, it's not an appropriate way to compare the two as they are running different models.
It does matter because that's how people are going to run it given the hardware support, period.
 
Joined
Jan 8, 2017
Messages
9,580 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
Many LLMs are running in Q4, Q6 and Q8 out there in production by many different providers.
Many don't disclose exactly what they're doing but 4bit quantization is markedly worse, that's for sure.

It does matter because that's how people are going to run it given the hardware support, period.
No, this is just more Nvidia marketing lying cope. Nobody is saying this isn't what people should be running, I don't care, just put the figures for each model side by side so you know exactly what you are looking at so that you don't have to read footnotes in the tiniest font possible to see that they're not comparing the same thing. This is like the bare minimum you can expect, that at least the software they're running is the same.

They could have had separate charts showcasing VRAM usage as well, making a point about being able to run these things on lesser GPUs with less memory but they're so hell bent on lying and being as disingenuous as possible they don't even know when to use this to their advantage.
 
Joined
Jan 14, 2019
Messages
13,237 (6.05/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Anything from 4 to 48 GB
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case The smaller the better
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Wired
VR HMD Not yet
Software Linux gaming master race
Tbf most users here won't fall for that, and everyone will wait for the proper reviews nonetheless, so that's like preaching to a choir.
Wanna bet?

Ian Cutress wrote up a bit on that, AMD did a Q&A with some journalists trying to explain that:

TLDR; they said that the product was not yet finished, they wouldn't have enough time to showcase it during their overall presentation, and that Nvidia's announcement did play a part in the decision to not showcase RDNA4 (they want to undercut it).
That was the most logical reason for it. Thanks for the article.
 
Joined
May 10, 2023
Messages
485 (0.79/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
Many don't disclose exactly what they're doing but 4bit quantization is markedly worse, that's for sure.
For LLMs with billions or close to trillions of parameters? Sure, the error propagation gets way worse.
But a 70B Q4 model is still way better than a 30B Q8 one, and a bigger model at bigger data types is useless if you can't get it to run to begin with, or if performance is not good enough.

For smaller models the quantization perf loss is not that significant.
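Part of the 70B Q4 vs 30B Q8 comparison above comes down to how much memory the weights need. A minimal Python sketch, assuming weights dominate and ignoring the KV cache, activations and the per-block scale overhead that real quantized formats add; the helper name is made up for illustration.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9


configs = {
    "70B @ 4-bit": (70_000_000_000, 4),
    "30B @ 8-bit": (30_000_000_000, 8),
    "70B @ 16-bit": (70_000_000_000, 16),
}

for name, (params, bits) in configs.items():
    print(f"{name}: {weight_memory_gb(params, bits):.0f} GB")
```

Under these assumptions the 4-bit 70B model needs roughly the same weight memory as the 8-bit 30B one, while the 16-bit 70B model needs four times as much as its 4-bit version. Which of the two smaller options gives better output is the poster's claim, not something this arithmetic shows.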

No, this is just more Nvidia marketing lying cope. Nobody is saying this isn't what people should be running, I don't care, just put the figures for each model side by side so you know exactly what you are looking at so that you don't have to read footnotes in the tiniest font possible to see that they're not comparing the same thing.
I get your point about "fairness", but I bet you performance would be pretty close to what was shown at FP8 because, as I've said before, this is mostly a memory bw issue. Doing it at FP4 is a showcase of a newly supported data type that we didn't have before (with only a minor perf uplift in this specific case).
You'd be better off going back to complaining about the FG comparisons.

Wanna bet?


That was the most logical reason for it. Thanks for the article.
Given how most people here are hoping that a 5080 matches or surpasses a 4090, I don't think people here bought into the 5070=4090 idea.
 
Joined
Jun 14, 2020
Messages
3,752 (2.25/day)
System Name Mean machine
Processor 12900k
Motherboard MSI Unify X
Cooling Noctua U12A
Memory 7600c34
Video Card(s) 4090 Gamerock oc
Storage 980 pro 2tb
Display(s) Samsung crg90
Case Fractal Torent
Audio Device(s) Hifiman Arya / a30 - d30 pro stack
Power Supply Be quiet dark power pro 1200
Mouse Viper ultimate
Keyboard Blackwidow 65%
Are we seriously suggesting that the whole ai industry bought into b200 cause they were misled - they didn't understand what FP4 is?
 
Joined
Jan 8, 2017
Messages
9,580 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
I get your point about "fairness", but I bet you performance would be pretty close to what was shown at FP8 because, as I've said before, this is mostly a memory bw issue. Doing it at FP4 is a showcase of a newly supported data type that we didn't have before (with only a minor perf uplift in this specific case).
You'd be better off going back to complaining about the FG comparisons.
At least with FG you know they're running the same game. I don't think it's even about fairness; it's just a shitty way of presenting those performance metrics for the tiny percentage of people who would even care. Wouldn't you want to know that this now supports a smaller data type and that you can now run smaller models? Be honest: when you saw that, did you assume it's the same data type, or did you magically understand there must be more to it before squinting at the footnotes (if you ever did before someone else pointed it out for you)? I for one admit I missed it before I saw someone else talk about it.
 
Joined
Jan 14, 2019
Messages
13,237 (6.05/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Anything from 4 to 48 GB
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case The smaller the better
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Wired
VR HMD Not yet
Software Linux gaming master race
Are we seriously suggesting that the whole ai industry bought into b200 cause they were misled - they didn't understand what FP4 is?
Does the AI industry even buy 5070/5080 level cards? I mean, home users getting their feet wet in AI, sure, but the wealthiest AI corps need a lot more oomph, don't they? That's who uber expensive professional cards are for. To them, everything you say about the 5070/5080 is meaningless.
 
Joined
Jun 14, 2020
Messages
3,752 (2.25/day)
System Name Mean machine
Processor 12900k
Motherboard MSI Unify X
Cooling Noctua U12A
Memory 7600c34
Video Card(s) 4090 Gamerock oc
Storage 980 pro 2tb
Display(s) Samsung crg90
Case Fractal Torent
Audio Device(s) Hifiman Arya / a30 - d30 pro stack
Power Supply Be quiet dark power pro 1200
Mouse Viper ultimate
Keyboard Blackwidow 65%
Does the AI industry even buy 5070/5080 level cards? I mean, home users getting their feet wet in AI, sure, but the wealthiest AI corps need a lot more oomph, don't they? That's who uber expensive professional cards are for. To them, everything you say about the 5070/5080 is meaningless.
No, they are buying B200, which also used FP4 claims (vs FP8 for Hopper) in their marketing slides.


Look, the thing is, there was another company at CES that compared their 120 W CPU vs the competition's 17 W chip. With no small print, btw. No one is talking about it being misleading, but we have 50 different threads 20 pages long complaining about Nvidia. Makes you wonder.
 
Joined
Jan 14, 2019
Messages
13,237 (6.05/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Anything from 4 to 48 GB
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case The smaller the better
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Wired
VR HMD Not yet
Software Linux gaming master race
No, they are buying B200, which also used FP4 claims (vs FP8 for Hopper) in their marketing slides.
Fair enough. Still wrong, imo, but as long as buyers are fine with it, who am I to argue.

Look, the thing is, there was another company at CES that compared their 120 W CPU vs the competition's 17 W chip.
Really? That's poor as well. I guess no one was really interested in that CPU. I don't even know which one you're talking about, it completely missed the spot with me (although I admit, I only looked for GPUs this time around).
 