Monday, January 6th 2025

NVIDIA 2025 International CES Keynote: Liveblog

NVIDIA kicks off the 2025 International CES with a bang. The company is expected to debut its GeForce "Blackwell" RTX 5000 generation of gaming graphics cards, alongside new technologies such as neural rendering and DLSS 4. It is also expected to highlight a new piece of silicon for Windows on Arm laptops, showcase the next generation of its DRIVE full self-driving hardware, probably talk about its next-generation "Blackwell Ultra" AI GPU, and, if we're lucky, even namedrop "Rubin." Join us as we liveblog CEO Jensen Huang's keynote address.

02:22 UTC: The show is finally underway!
02:35 UTC: CTA president Gary Shapiro kicks off the show and introduces Jensen Huang.
02:46 UTC: "Tokens are the building blocks of AI"

02:46 UTC: "Do you like my jacket?"
02:47 UTC: NVIDIA recounts its progress all the way back to NV1 and UDA.
02:48 UTC: "CUDA was difficult to explain, it took 6 years to get the industry to like it"
02:50 UTC: "AI is coming home to GeForce." NVIDIA teases neural materials and neural rendering, rendered on "Blackwell."
02:55 UTC: Every single pixel is ray traced, thanks to AI rendering.
02:55 UTC: Here it is, the GeForce RTX 5090.
03:20 UTC: At least someone is pushing the limits for GPUs.
03:22 UTC: Incredible board design.
03:22 UTC: RTX 5070 matches RTX 4090 at $550.
03:24 UTC: Here's the lineup, available from January.
03:24 UTC: RTX 5070 Laptop starts at $1299.
03:24 UTC: "The future of computer graphics is neural rendering"
03:25 UTC: Laptops powered by RTX Blackwell: starting prices:
03:26 UTC: AI has come back to power GeForce.
03:28 UTC: Supposedly the Grace Blackwell NVLink72.
03:28 UTC: 1.4 ExaFLOPS.
03:32 UTC: NVIDIA very sneakily teased a Windows AI PC chip.

03:35 UTC: NVIDIA is teaching generative AI basic physics. NVIDIA Cosmos, a world foundation model.
03:41 UTC: NVIDIA Cosmos is trained on 20 million hours of video.

03:43 UTC: Cosmos is open-licensed on GitHub.

03:52 UTC: NVIDIA onboards Toyota for its next-generation EVs with full self-driving.

03:53 UTC: NVIDIA unveils Thor Blackwell robotics processor.
03:53 UTC: Thor offers 20x the processing capability of Orin.

03:54 UTC: CUDA is now functionally safe, thanks to its automotive certifications.
04:01 UTC: NVIDIA brought a dozen humanoid robots to the stage.

04:07 UTC: Project DIGITS is a shrunken-down AI supercomputer.
04:08 UTC: NVIDIA's GB10 "Grace Blackwell" chip powers DIGITS.

446 Comments on NVIDIA 2025 International CES Keynote: Liveblog

#376
Vayra86
oxrufiioxo: I've seen 8 and I've seen 12... Let's hope for 12, considering it should offer 7700 XT-like performance... Rumors are all over the place with that... The 5060 is even rumored to get 12 GB via the 3 GB GDDR7 chips and likely won't come out till those are available, but rumors are whatever... We won't know till Nvidia/AMD shows them off.
My crystal ball says 8 GB. 12 is wishful thinking, and not Nvidia's M.O.
If they give it 12, they will cannibalize the 5070. This entire Blackwell stack is positioned so that it doesn't look terrible if you still have Ada, while still giving Ada owners an incentive to upgrade. They can, after all, sell their cards and buy a replacement at near cost neutrality.

This is precisely what is happening with the 4090s on the market right now. Nvidia's executing a perfect strategy here because AMD isn't even playing.

Look here. Perfect price parity with the 5000+ shader count 5090.
We will have the 4090 taking the slot between the 5080 and 5090 for the foreseeable future. Nvidia doesn't need anything in between.



The above is just under half the number of sellers on this site, now...

Here's another search, just for the lulz. There are almost no sellers of a 7900 XTX (literally 3 in the Netherlands!). AMD ensured its own stagnation; these owners will sooner or later jump ship. Fantastic plan, going midrange!

Posted on Reply
#377
Chrispy_
oxrufiioxo: I agree, though it's looking like, apples to apples, the 5080 probably isn't much faster than the 4080...
Far Cry 6 and A Plague Tale: Requiem are examples of the raw performance improvement, because they clearly don't support DLSS4 MFG fakery.



That 30% improvement is likely what we can really expect in the overwhelming majority of games. The 5080 has 15% more compute (cores × clocks) and draws more power despite being on a newer, more efficient node, so the other 15% likely comes from the 4080 being sandbagged by power limits.
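As a rough sanity check on that cores × clocks figure, using the commonly cited launch specs (treat the exact numbers as assumptions):

```python
# Back-of-the-envelope compute scaling: compute ~ shader cores x boost clock.
# Core counts and boost clocks are the commonly cited specs (assumptions).
cores_4080, clock_4080 = 9728, 2.51   # GHz
cores_5080, clock_5080 = 10752, 2.62  # GHz

ratio = (cores_5080 * clock_5080) / (cores_4080 * clock_4080)
print(f"Raw compute uplift: {ratio - 1:.1%}")  # ~15%
```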
Posted on Reply
#378
Vayra86
oxrufiioxo: I view the $500-600 cards as 1080p cards as it is, so 12 GB is fine, but I agree people should be buying 16 GB cards in 2025 regardless of how fast this 12 GB card is or isn't.
That's how far we've already moved into the Nvidia story, but honestly, $500-600 would have gotten you a top-end card not too long ago. I bought a GTX 1080 for 520 that runs 1440p at medium EVEN TODAY... What you are saying is that we literally regressed over four generations of new GPUs. That's utterly terrible.
Chrispy_: Far Cry 6 and A Plague Tale: Requiem are examples of the raw performance improvement, because they clearly don't support DLSS4 MFG fakery.



That 30% improvement is likely what we can really expect in the overwhelming majority of games. The 5080 has 15% more compute (cores × clocks) and draws more power despite being on a newer, more efficient node, so the other 15% likely comes from the 4080 being sandbagged by power limits.
You've misinterpreted that bar chart.

The leftmost bars indeed don't say DLSS, but they do say RT.
Going by this chart, raster performance might be at a complete standstill, with only RT ON improved. It says nothing about raster perf.
Posted on Reply
#379
oxrufiioxo
Vayra86: That's how far we've already moved into the Nvidia story, but honestly, $500-600 would have gotten you a top-end card not too long ago. I bought a GTX 1080 for 520 that runs 1440p at medium EVEN TODAY... What you are saying is that we literally regressed over four generations of new GPUs. That's utterly terrible.
Back when the 1080 launched, it was $699; that would be over $900 in 2025 money.

Technically, the MSRP was $599, but that was when Nvidia started the FE BS; the price was reduced to $499 ($549 FE) a year later when the 1080 Ti released.
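For what it's worth, the inflation adjustment roughly checks out (the cumulative CPI factor below is an approximation, not an exact figure):

```python
# Rough inflation check: GTX 1080 FE launch price in 2025 dollars.
launch_price = 699
inflation_factor = 1.31  # ~31% cumulative US CPI, mid-2016 to early 2025 (approx.)

print(f"${launch_price} in 2016 ~= ${launch_price * inflation_factor:.0f} in 2025")
# -> roughly $916, i.e. "over $900"
```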
Posted on Reply
#380
Vya Domus
Legacy-ZA: The 5070 is a turd; it's only better at A.I. workloads.
Probably not even that; they've been lying in their marketing material even for the ML stuff:


They're running models at half the precision on 50-series cards and comparing them to FP8 on the 40 series, because presumably at the same precision they're not faster at all. Pretty much everything they've shown is a smokescreen. This might just be the most disingenuous marketing material they've ever released; there's not a single performance claim they haven't screwed with in some way.

For those of you who don't know, lower-precision quantized models are worse, often unusable for some applications, so even the "14638746728463287 gazillion AI TOPS" meme is a lie.
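A toy illustration of why lower precision hurts: simulate uniform quantization of the same random weights at 8 and 4 bits and compare the rounding error (a simplified model, not a claim about any specific network):

```python
import numpy as np

def fake_quantize(x, bits):
    """Round x onto a symmetric uniform grid with 2**bits levels (toy model)."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=100_000).astype(np.float32)  # stand-in for model weights

for bits in (8, 4):
    mse = np.mean((w - fake_quantize(w, bits)) ** 2)
    print(f"{bits}-bit quantization MSE: {mse:.2e}")
# Each bit removed doubles the grid spacing, quadrupling the MSE:
# going from 8 to 4 bits grows the rounding error by roughly 256x.
```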
Posted on Reply
#381
JustBenching
Vya Domus: Probably not even that; they've been lying in their marketing material even for the ML stuff:


They're running models at half the precision on 50-series cards and comparing them to FP8 on the 40 series, because presumably at the same precision they're not faster at all. Pretty much everything they've shown is a smokescreen. This might just be the most disingenuous marketing material they've ever released; there's not a single performance claim they haven't screwed with in some way.
It's not lying when they give you the deets. FP4 isn't supported on the 40 series. It's working smarter, not harder.
Posted on Reply
#382
Vya Domus
JustBenching: It's not lying when they give you the deets. FP4 isn't supported on the 40 series. It's working smarter, not harder.
Yeah, sure, they're not lying, just showing you a ginormous bar chart where the thing is 2x faster, with minuscule text below telling you that actually, it's not.
Posted on Reply
#383
SOAREVERSOR
Vya Domus: Yeah, sure, they're not lying, just showing you a ginormous bar chart where the thing is 2x faster, with minuscule text below telling you that actually, it's not.
FP4 is for AI and actual work. Quit thinking like a gamer.
Posted on Reply
#384
Vya Domus
SOAREVERSOR: FP4 is for AI and actual work. Quit thinking like a gamer.
Outputs from FP4 and FP8 models are not equivalent; quit thinking like an AI tourist. Anyone actually using these for work would know this is a false comparison.
Posted on Reply
#385
JustBenching
Vya Domus: Yeah, sure, they're not lying, just showing you a ginormous bar chart where the thing is 2x faster, with minuscule text below telling you that actually, it's not.
The B200 used similar marketing slides with FP4. Do you think they're getting sued by the big AI corps?
Posted on Reply
#386
Vya Domus
JustBenching: The B200 used similar marketing slides with FP4. Do you think they're getting sued by the big AI corps?
Haven't seen them, but they've never gotten sued over stuff like this, so no, I don't expect them to. It's still a lie, though.
Posted on Reply
#387
JustBenching
Vya Domus: Haven't seen them, but they've never gotten sued over stuff like this, so no, I don't expect them to. It's still a lie, though.
They claimed a 5x uplift over Hopper, which was entirely due to FP4. I'm not an AI expert, but supposedly the whole industry is trying to move to low precision; the tricky part is keeping accuracy high, which is supposedly what Nvidia has achieved and why it dominates that segment.
Posted on Reply
#388
igormp
Dr. Dro: I got my hands on the expected Brazilian pricing and launch dates; the ones that launch 3 February are pre-orders, the ones on 12 and 15 Feb are standard orders. I can't vouch for the authenticity of this list with 100% certainty, so take it with a grain of salt, but I think this math is mostly mathin'.
This has been confirmed to be fake; it was just the supposed USD values converted to BRL. Do notice how some actually end up below the US MSRP.
JustBenching: Curious though, what do you think happened? Did they get some info about what Nvidia was going to do at the last minute and bail to go back to the drawing board? Maybe there was no plan to announce the 9070 at all and people just assumed?
Ian Cutress wrote a bit about that; AMD did a Q&A with some journalists trying to explain:
morethanmoore.substack.com/p/where-was-rdna4-at-amds-keynote

TL;DR: they said the product was not yet finished, they wouldn't have had enough time to showcase it in their overall presentation, and Nvidia's announcement played a part in the decision not to showcase RDNA4 (they want to undercut it).
oxrufiioxo: Honestly, I think they made a big bet on MCM for the gaming division hoping for a Zen moment, and it did not pan out... Sounds like they did have a high-end RDNA4 chip planned but canceled it.

The reality is the low-end stuff from Nvidia is getting worse and worse, and AMD isn't offering alternatives that get people excited. If RDNA4 is a bust, let's hope UDNA is the answer.
If they can get UDNA right, it would simplify a lot of things, since they'd be able to do exactly what they did with Zen: chiplets that provide great value in the enterprise (which brings in the big bucks) and can also be used in the consumer market, all out of the same fabrication line. This is also exactly what Nvidia has been doing for quite a long time (albeit not with chiplets).
Their GPU division currently has both CDNA and RDNA, which compete not only for engineering time but also for fab allocation. Given that CDNA brings in more money than RDNA, it makes sense to focus on it.
Vayra86: Yeah, MCM or not, even if they had a bigger chip, they should have also had RT performance lying on the shelf to go with it. Which they might not have after all; RDNA3 was supposed to perform better even regardless of MCM. I think there were mostly promises, hopes, and dreams flying around, but people simply did not (manage to...) deliver. And this seems to be a recurring thing, not exclusive to Raja.
MCM is more about fab efficiency; the architectural design is a somewhat separate matter. See how Zen has both MCM and monolithic products, and how RDNA exists in both MCM designs and monolithic iGPUs.
It's actually great in iGPUs; I guess they're just lacking the resources to scale it up, because it makes more sense to put the effort into CDNA as the "big product" instead.
AusWolf: And some people don't get why I spoke so harshly against the misleading MFG performance data in the keynote. This is why. People are very easy to manipulate, and that is exactly what Nvidia is doing.
To be fair, most users here won't fall for that, and everyone will wait for proper reviews anyway, so that's like preaching to the choir.
Vayra86: That's the real upgrade path here: second-hand last gen, as all the n00bs upgrade to the latest and greatest that didn't gain them anything.
I got my 3090s used for about 1/2 and then 1/4 of their launch prices after the mining craze; can't beat that value :p
10tothemin9volts: You mean NV is comparing apples to apples in this one? That would be nice (I guess since there is no fine print, it might be so). Then the AI TOPS would indeed be massively improved: +70% when adjusting for the +25% power increase of the 5070 vs. the 4070 (988/(466*1.25)).
According to "nvidia-ada-gpu-architecture.pdf", the 4090 is:

So it's either 1321 INT8 sparse or 1321 INT4 dense? Anyway, what matters more is that it's an apples-to-apples comparison.
Funnily enough, yes. The numbers are for INT4 dense; sparsity is an Nvidia-exclusive feature that's not that easy to use (you have to rearrange your tensors to take advantage of it).
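For reference, the power-adjusted arithmetic from the quote works out like this (TOPS and power figures taken from the quote itself, as given):

```python
# Power-adjusted AI TOPS comparison, 5070 vs. 4070 (figures from the quote above).
tops_5070, tops_4070 = 988, 466  # claimed INT4-dense AI TOPS
power_ratio = 1.25               # assumed +25% board power for the 5070

adjusted_gain = tops_5070 / (tops_4070 * power_ratio)
print(f"Power-adjusted uplift: {adjusted_gain - 1:.0%}")  # ~+70%
```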
Vya Domus: They're running models at half the precision on 50-series cards and comparing them to FP8 on the 40 series, because presumably at the same precision they're not faster at all. Pretty much everything they've shown is a smokescreen. This might just be the most disingenuous marketing material they've ever released; there's not a single performance claim they haven't screwed with in some way.

For those of you who don't know, lower-precision quantized models are worse, often unusable for some applications, so even the "14638746728463287 gazillion AI TOPS" meme is a lie.
I explained this to someone else already, but I'll write it up again:
Flux is often memory-bound, just like LLMs. The gains you see there are mostly from the extra 80% of memory bandwidth the 5090 has. Even running it in FP8 (which my 3090 doesn't even support) leads to a really minor perf difference, while using FP8 vs. FP16 on a 4090 barely nets a perf gain, around 5-10% in both scenarios. The same likely goes for this FP4 vs. FP8 comparison.
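If inference really is bandwidth-bound, the ceiling on the speedup is roughly the bandwidth ratio. A quick check using the spec-sheet numbers (treat the exact figures as assumptions):

```python
# Upper bound on a memory-bandwidth-bound speedup: the bandwidth ratio.
bw_4090, bw_5090 = 1008, 1792  # GB/s, spec-sheet figures (assumptions)
print(f"Max bandwidth-bound speedup: {bw_5090 / bw_4090:.2f}x")  # ~1.78x
```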

You are also forgetting that there are different types of quantization. Your Q4, gguf, ggml stuff is about compressing weights for storage/memory, but you still do the math in fp16, which leads to noticeably lower performance. Doing proper quantization on a model through extra precision-aware fine-tuning leads to way better quality than just shoving the original weights into a smaller data type.
Just take a look for yourself at the results from their model vs. the BF16 one:

BF16 on the left and FP4 on the right
blackforestlabs.ai/flux-nvidia-blackwell/

Clearly not as good as BF16, but way better than your usual Q8 quants.
Vya Domus: Yeah, sure, they're not lying, just showing you a ginormous bar chart where the thing is 2x faster, with minuscule text below telling you that actually, it's not.
Unlike games, for inference you often aim for the smallest supported data type, for both VRAM savings and extra throughput. When tensor cores came out, everyone switched to FP16. When Ada/Hopper came out, everyone started doing FP8. The trend is still going this way.
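To make the storage-vs-compute distinction from earlier in this post concrete, here is a minimal weight-only quantization sketch in NumPy (a toy model with a single per-tensor scale and int4 values held in int8; real Q4/gguf formats use grouped scales and bit-packing):

```python
import numpy as np

def quantize_int4(w):
    """Symmetric per-tensor int4 quantization (values in [-7, 7])."""
    scale = np.abs(w).max() / 7
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)  # stored small
    return q, scale

def matmul_weight_only(x, q, scale):
    """gguf/Q4-style inference: weights stored in 4 bits, math done in fp16."""
    w = q.astype(np.float16) * np.float16(scale)  # dequantize on the fly
    return x.astype(np.float16) @ w

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
x = rng.normal(size=(8, 256)).astype(np.float32)

q, s = quantize_int4(w)
err = np.linalg.norm(matmul_weight_only(x, q, s) - x @ w) / np.linalg.norm(x @ w)
print(f"Relative output error from 4-bit weights: {err:.3f}")
```

Precision-aware fine-tuning, as described above, recovers much of that error; this sketch only shows the naive post-hoc case.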
Posted on Reply
#389
Vya Domus
JustBenching: They claimed a 5x uplift over Hopper, which was entirely due to FP4. I'm not an AI expert, but supposedly the whole industry is trying to move to low precision; the tricky part is keeping accuracy high, which is supposedly what Nvidia has achieved and why it dominates that segment.
Like I explained, lower-precision models are not equivalent to higher-precision ones and never will be; if it were that simple, everything would be running at 1-bit precision by now. LLMs like ChatGPT seem to have settled on 16 bits; that seems to be the lower bound below which the output gets noticeably worse. With image generation you can go lower, but the results are still not equivalent.
igormp: Unlike games, for inference you often aim for the smallest supported data type, for both VRAM savings and extra throughput. When tensor cores came out, everyone switched to FP16. When Ada/Hopper came out, everyone started doing FP8. The trend is still going this way.
It does not matter; it's not an appropriate way to compare the two, as they are running different models.
Posted on Reply
#390
igormp
Vya Domus: LLMs like ChatGPT seem to have settled on 16 bits
No; it has been discussed multiple times that OpenAI has been quantizing its models over time without telling anyone.
Many LLMs are running at Q4, Q6, and Q8 out there in production by many different providers.
Vya Domus: It does not matter; it's not an appropriate way to compare the two, as they are running different models.
It does matter because that's how people are going to run it given the hardware support, period.
Posted on Reply
#391
Vya Domus
igormp: Many LLMs are running at Q4, Q6, and Q8 out there in production by many different providers.
Many don't disclose exactly what they're doing, but 4-bit quantization is markedly worse, that's for sure.
igormp: It does matter because that's how people are going to run it given the hardware support, period.
No, this is just more cope for Nvidia's marketing lies. Nobody is saying this isn't what people should be running; I don't care. Just put the figures for each model side by side so you know exactly what you are looking at, so you don't have to read footnotes in the tiniest font possible to see that they're not comparing the same thing. This is the bare minimum you can expect: that at least the software they're running is the same.

They could have had separate charts showcasing VRAM usage as well, making a point about being able to run these things on lesser GPUs with less memory, but they're so hell-bent on being as disingenuous as possible that they don't even know when to use this to their advantage.
Posted on Reply
#392
AusWolf
igormp: To be fair, most users here won't fall for that, and everyone will wait for proper reviews anyway, so that's like preaching to the choir.
Wanna bet?
igormp: Ian Cutress wrote a bit about that; AMD did a Q&A with some journalists trying to explain:
morethanmoore.substack.com/p/where-was-rdna4-at-amds-keynote

TL;DR: they said the product was not yet finished, they wouldn't have had enough time to showcase it in their overall presentation, and Nvidia's announcement played a part in the decision not to showcase RDNA4 (they want to undercut it).
That was the most logical reason for it. Thanks for the article.
Posted on Reply
#393
igormp
Vya Domus: Many don't disclose exactly what they're doing, but 4-bit quantization is markedly worse, that's for sure.
For LLMs with billions or close to a trillion parameters? Sure, the error propagation gets way worse.
But a 70B Q4 model is still way better than a 30B Q8 one, and a bigger model at a bigger data type is useless if you can't get it to run in the first place, or if its performance is not good enough.

For smaller models, the loss from quantization is not that significant.
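The memory math behind that trade-off is simple enough to sketch (round parameter counts, weights only, KV cache and other overheads ignored):

```python
def weight_gib(params_billion, bits_per_weight):
    """Approximate weight footprint in GiB, ignoring runtime overheads."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

print(f"70B @ Q4: {weight_gib(70, 4):.0f} GiB")  # ~33 GiB
print(f"30B @ Q8: {weight_gib(30, 8):.0f} GiB")  # ~28 GiB
```

So the much larger model at Q4 lands in roughly the same footprint class as the smaller Q8 one.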
Vya Domus: No, this is just more cope for Nvidia's marketing lies. Nobody is saying this isn't what people should be running; I don't care. Just put the figures for each model side by side so you know exactly what you are looking at, so you don't have to read footnotes in the tiniest font possible to see that they're not comparing the same thing.
I get your point about "fairness", but I bet performance would be pretty close to what was shown even at FP8 because, as I've said before, this is mostly a memory-bandwidth issue. Doing it at FP4 is a showcase of a newly supported data type that we didn't have before (with only a minor perf uplift in this specific case).
You'd be better off going back to complaining about the FG comparisons.
AusWolf: Wanna bet?


That was the most logical reason for it. Thanks for the article.
Given that most people here are hoping the 5080 matches or surpasses the 4090, I don't think anyone here bought into the 5070 = 4090 idea.
Posted on Reply
#394
JustBenching
Are we seriously suggesting that the whole AI industry bought into the B200 because they were misled and didn't understand what FP4 is?
Posted on Reply
#395
Vya Domus
igormpI get your point about "fairness", but I bet you performance would be pretty close to what was shown at FP8 because, as I've told before, this is mostly a memory bw issue. Doing it at FP4 is a showcase of a newly supported data type that we didn't have before (with only a minor perf uplift in this specific case).
You'd be better going back to complaining about the FG comparisons.
At least with FG you know they're running the same game, I don't think it's even about fairness it's just a shitty way of presenting those performance metrics for the tiny percentage of people who would even care, wouldn't you want to know that this now supports a smaller data type and that you can now run smaller models ? Be honest when you saw that did you assume it's the same data type or did you magically understand there must be more to it before squinting your eyes in the footnotes (if you ever did before someone else pointed it out for you), I for one admit I missed it before I saw someone else talk about it.
Posted on Reply
#396
AusWolf
JustBenching: Are we seriously suggesting that the whole AI industry bought into the B200 because they were misled and didn't understand what FP4 is?
Does the AI industry even buy 5070/5080-level cards? I mean, home users getting their feet wet in AI, sure, but the wealthiest AI corps need a lot more oomph, don't they? That's who the uber-expensive professional cards are for. To them, everything you say about the 5070/5080 is meaningless.
Posted on Reply
#397
JustBenching
AusWolf: Does the AI industry even buy 5070/5080-level cards? I mean, home users getting their feet wet in AI, sure, but the wealthiest AI corps need a lot more oomph, don't they? That's who the uber-expensive professional cards are for. To them, everything you say about the 5070/5080 is meaningless.
No, they are buying the B200, which also used FP4 claims (vs. FP8 for Hopper) in its marketing slides.


Look, the thing is, there was another company at CES that compared its 120 W CPU against the competition's 17 W chip, with no small print, btw. No one is talking about that being misleading, but we have 50 different threads, 20 pages long, complaining about Nvidia. Makes you wonder.
Posted on Reply
#398
AusWolf
JustBenching: No, they are buying the B200, which also used FP4 claims (vs. FP8 for Hopper) in its marketing slides.
Fair enough. Still wrong, imo, but as long as buyers are fine with it, who am I to argue?
JustBenching: Look, the thing is, there was another company at CES that compared its 120 W CPU against the competition's 17 W chip.
Really? That's poor as well. I guess no one was really interested in that CPU. I don't even know which one you're talking about; it completely passed me by (although I admit I only looked at GPUs this time around).
Posted on Reply
#399
JustBenching
AusWolf: Fair enough. Still wrong, imo, but as long as buyers are fine with it, who am I to argue?


Really? That's poor as well. I guess no one was really interested in that CPU. I don't even know which one you're talking about; it completely passed me by (although I admit I only looked at GPUs this time around).
Just an example from Nvidia's Computex presentation regarding the B200:




The CPU in question was Strix Point (the 390 AI). But you know, it's AMD, so it's not trying to mislead us :D
Posted on Reply
#400
AusWolf
JustBenching: Just an example from Nvidia's Computex presentation regarding the B200:


Oh, but that clearly states the precision level right below the number. You don't need to see the small print for that.
JustBenching: The CPU in question was Strix Point (the 390 AI). But you know, it's AMD, so it's not trying to mislead us :D
Ah, OK, fair point then. I'm happy to call out bullshit on any side (although I personally skipped that part entirely, as I only cared about GPUs this time around).
Posted on Reply