Monday, January 6th 2025

NVIDIA 2025 International CES Keynote: Liveblog

NVIDIA kicks off the 2025 International CES with a bang. The company is expected to debut its GeForce "Blackwell" RTX 5000 generation of gaming graphics cards, alongside new technologies such as neural rendering and DLSS 4. It is also expected to highlight a new piece of silicon for Windows on Arm laptops, showcase the next generation of its DRIVE full self-driving hardware, probably talk about its next-generation "Blackwell Ultra" AI GPU, and, if we're lucky, even namedrop "Rubin." Join us as we liveblog CEO Jensen Huang's keynote address.

02:22 UTC: The show is finally underway!
02:35 UTC: CTA president Gary Shapiro kicks off the show and introduces Jensen Huang.
02:46 UTC: "Tokens are the building blocks of AI"

02:46 UTC: "Do you like my jacket?"
02:47 UTC: NVIDIA recounts its progress all the way back to NV1 and UDA.
02:48 UTC: "CUDA was difficult to explain, it took 6 years to get the industry to like it"
02:50 UTC: "AI is coming home to GeForce." NVIDIA teases neural materials and neural rendering, rendered on "Blackwell."
02:55 UTC: Every single pixel is ray traced, thanks to AI rendering.
02:55 UTC: Here it is, the GeForce RTX 5090.
03:20 UTC: At least someone is pushing the limits for GPUs.
03:22 UTC: Incredible board design.
03:22 UTC: RTX 5070 matches RTX 4090 at $550.
03:24 UTC: Here's the lineup, available from January.
03:24 UTC: RTX 5070 Laptop starts at $1299.
03:24 UTC: "The future of computer graphics is neural rendering"
03:25 UTC: Laptops powered by RTX Blackwell: starting prices:
03:26 UTC: AI has come back to power GeForce.
03:28 UTC: Supposedly the Grace Blackwell NVLink72.
03:28 UTC: 1.4 ExaFLOPS.
03:32 UTC: NVIDIA very sneakily teased a Windows AI PC chip.

03:35 UTC: NVIDIA is teaching generative AI basic physics. NVIDIA Cosmos, a world foundation model.
03:41 UTC: NVIDIA Cosmos is trained on 20 million hours of video.

03:43 UTC: Cosmos is open-licensed on GitHub.

03:52 UTC: NVIDIA onboards Toyota for its next-generation EVs with full self-driving.

03:53 UTC: NVIDIA unveils Thor Blackwell robotics processor.
03:53 UTC: Thor offers 20x the processing capability of Orin.

03:54 UTC: CUDA is now functionally safe, thanks to its automotive certifications.
04:01 UTC: NVIDIA brought a dozen humanoid robots to the stage.

04:07 UTC: Project DIGITS is a shrunken-down AI supercomputer.
04:08 UTC: NVIDIA's GB10 "Grace Blackwell" chip powers DIGITS.

446 Comments on NVIDIA 2025 International CES Keynote: Liveblog

#376
Vayra86
oxrufiioxo: I've seen 8 and I've seen 12... Let's hope for 12, considering it should offer 7700 XT-like performance... Rumors are all over the place with that... The 5060 is even rumored to get 12 GB via the 3 GB GDDR7 chips and likely won't come out till those are available, but rumors are whatever... We won't know till Nvidia/AMD shows them off.
My crystal ball says 8 GB. 12 is wishful thinking, and not Nvidia's M.O.
If they give it 12, they will cannibalize the 5070. This entire Blackwell stack is positioned so that it doesn't look terrible if you still have Ada, while still giving Ada owners an incentive to upgrade. They can, after all, sell their cards and buy a replacement at near cost neutrality.

This is precisely what is happening with the 4090s on the market right now. Nvidia's executing a perfect strategy here because AMD isn't even playing.

Look here. Perfect price parity with the 5000+ shader count 5090.
We will have the 4090 taking the slot between the 5080 and 5090 for the foreseeable future. Nvidia doesn't need anything in between.



The above is just under half the number of sellers on this site, now...

Here's another search, just for the lulz. There are almost no sellers of a 7900 XTX (literally 3 in the Netherlands!). AMD ensured its own stagnation; these owners will sooner or later jump ship. Fantastic plan, going midrange!

Posted on Reply
#377
Chrispy_
oxrufiioxo: I agree, though it's looking like, apples to apples, the 5080 probably isn't much faster than the 4080...
Far Cry 6 and A Plague Tale: Requiem are examples of the raw performance improvement, because they clearly don't support DLSS4 MFG fakery.



That 30% improvement is likely what we can really expect in the overwhelming majority of games. The 5080 has 15% more compute (cores × clocks) and draws more power despite being on a newer, more efficient node, so the other 15% likely comes from the 4080 being sandbagged by power limits.
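As a rough sanity check on that cores × clocks figure, using the commonly cited launch specs (treat the exact numbers as assumptions):

```python
# Back-of-the-envelope compute scaling: compute ~ shader cores x boost clock.
# Core counts and boost clocks are the commonly cited specs (assumptions).
cores_4080, clock_4080 = 9728, 2.51   # GHz
cores_5080, clock_5080 = 10752, 2.62  # GHz

ratio = (cores_5080 * clock_5080) / (cores_4080 * clock_4080)
print(f"Raw compute uplift: {ratio - 1:.1%}")  # ~15%
```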
Posted on Reply
#378
Vayra86
oxrufiioxo: I view the $500-600 cards as 1080p cards as it is, so 12 GB is fine, but I agree people should be buying 16 GB cards in 2025 regardless of how fast this 12 GB card is or isn't.
That's how far we've already moved into the Nvidia story, but honestly, $500-600 would have gotten you a top-end card not too long ago. I bought a GTX 1080 for 520 that runs 1440p at medium EVEN TODAY... What you are saying is that we literally regressed over four generations of new GPUs. That's utterly terrible.
Chrispy_: Far Cry 6 and A Plague Tale: Requiem are examples of the raw performance improvement, because they clearly don't support DLSS4 MFG fakery.



That 30% improvement is likely what we can really expect in the overwhelming majority of games. The 5080 has 15% more compute (cores × clocks) and draws more power despite being on a newer, more efficient node, so the other 15% likely comes from the 4080 being sandbagged by power limits.
You've misinterpreted that bar chart.

The leftmost bars indeed don't say DLSS, but they do say RT.
Going by this chart, raster performance might be at a complete standstill, with only RT ON improved. It says nothing about raster perf.
Posted on Reply
#379
oxrufiioxo
Vayra86: That's how far we've already moved into the Nvidia story, but honestly, $500-600 would have gotten you a top-end card not too long ago. I bought a GTX 1080 for 520 that runs 1440p at medium EVEN TODAY... What you are saying is that we literally regressed over four generations of new GPUs. That's utterly terrible.
Back when the 1080 launched, it was $699; that would be over $900 in 2025 money.

Technically, the MSRP was $599, but that was when Nvidia started the FE BS; the price was reduced to $499 ($549 FE) a year later when the 1080 Ti released.
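For what it's worth, the inflation adjustment roughly checks out (the cumulative CPI factor below is an approximation, not an exact figure):

```python
# Rough inflation check: GTX 1080 FE launch price in 2025 dollars.
launch_price = 699
inflation_factor = 1.31  # ~31% cumulative US CPI, mid-2016 to early 2025 (approx.)

print(f"${launch_price} in 2016 ~= ${launch_price * inflation_factor:.0f} in 2025")
# -> roughly $916, i.e. "over $900"
```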
Posted on Reply
#380
Vya Domus
Legacy-ZA: The 5070 is a turd; it's only better at A.I. workloads.
Probably not even that; they've been lying in their marketing material even for the ML stuff:


They're running models at half the precision on 50-series cards and comparing them to FP8 on the 40 series, because presumably at the same precision they're not faster at all. Pretty much everything they've shown is a smokescreen. This might just be the most disingenuous marketing material they've ever released; there's not a single performance claim they haven't screwed with in some way.

For those of you who don't know, lower-precision quantized models are worse, often unusable for some applications, so even the "14638746728463287 gazillion AI TOPS" meme is a lie.
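A toy illustration of why lower precision hurts: simulate uniform quantization of the same random weights at 8 and 4 bits and compare the rounding error (a simplified model, not a claim about any specific network):

```python
import numpy as np

def fake_quantize(x, bits):
    """Round x onto a symmetric uniform grid with 2**bits levels (toy model)."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=100_000).astype(np.float32)  # stand-in for model weights

for bits in (8, 4):
    mse = np.mean((w - fake_quantize(w, bits)) ** 2)
    print(f"{bits}-bit quantization MSE: {mse:.2e}")
# Each bit removed doubles the grid spacing, quadrupling the MSE:
# going from 8 to 4 bits grows the rounding error by roughly 256x.
```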
Posted on Reply
#381
JustBenching
Vya Domus: Probably not even that; they've been lying in their marketing material even for the ML stuff:


They're running models at half the precision on 50-series cards and comparing them to FP8 on the 40 series, because presumably at the same precision they're not faster at all. Pretty much everything they've shown is a smokescreen. This might just be the most disingenuous marketing material they've ever released; there's not a single performance claim they haven't screwed with in some way.
It's not lying when they give you the deets. FP4 isn't supported on the 40 series. It's working smarter, not harder.
Posted on Reply
#382
Vya Domus
JustBenching: It's not lying when they give you the deets. FP4 isn't supported on the 40 series. It's working smarter, not harder.
Yeah, sure, they're not lying, just showing you a ginormous bar chart where the thing is 2x faster, with minuscule text below telling you that actually, it's not.
Posted on Reply
#383
SOAREVERSOR
Vya Domus: Yeah, sure, they're not lying, just showing you a ginormous bar chart where the thing is 2x faster, with minuscule text below telling you that actually, it's not.
FP4 is for AI and actual work. Quit thinking like a gamer.
Posted on Reply
#384
Vya Domus
SOAREVERSOR: FP4 is for AI and actual work. Quit thinking like a gamer.
Outputs from FP4 and FP8 models are not equivalent; quit thinking like an AI tourist. Anyone actually using these for work would know this is a false comparison.
Posted on Reply
#385
JustBenching
Vya Domus: Yeah, sure, they're not lying, just showing you a ginormous bar chart where the thing is 2x faster, with minuscule text below telling you that actually, it's not.
The B200 used similar marketing slides with FP4. Do you think they're getting sued by the big AI corps?
Posted on Reply
#386
Vya Domus
JustBenching: The B200 used similar marketing slides with FP4. Do you think they're getting sued by the big AI corps?
Haven't seen them, but they've never gotten sued over stuff like this, so no, I don't expect them to. It's still a lie, though.
Posted on Reply
#387
JustBenching
Vya Domus: Haven't seen them, but they've never gotten sued over stuff like this, so no, I don't expect them to. It's still a lie, though.
They claimed a 5x uplift over Hopper, which was entirely due to FP4. I'm not an AI expert, but supposedly the whole industry is trying to move to low precision; the tricky part is keeping accuracy high, which is supposedly what Nvidia has achieved and why it dominates that segment.
Posted on Reply
#388
igormp
Dr. Dro: I got my hands on the expected Brazilian pricing and launch dates; the ones that launch 3 February are pre-orders, the ones on 12 and 15 Feb are standard orders. I can't vouch for the authenticity of this list with 100% certainty, so take it with a grain of salt, but I think this math is mostly mathin'.
This has been confirmed to be fake; it was just the supposed USD values converted to BRL. Do notice how some actually end up below the US MSRP.
JustBenching: Curious though, what do you think happened? Did they get some info about what Nvidia was going to do at the last minute and bail to go back to the drawing board? Maybe there was no plan to announce the 9070 at all and people just assumed?
Ian Cutress wrote a bit about that; AMD did a Q&A with some journalists trying to explain:
morethanmoore.substack.com/p/where-was-rdna4-at-amds-keynote

TL;DR: they said the product was not yet finished, they wouldn't have had enough time to showcase it in their overall presentation, and Nvidia's announcement played a part in the decision not to showcase RDNA4 (they want to undercut it).
oxrufiioxo: Honestly, I think they made a big bet on MCM for the gaming division hoping for a Zen moment, and it did not pan out... Sounds like they did have a high-end RDNA4 chip planned but canceled it.

The reality is the low-end stuff from Nvidia is getting worse and worse, and AMD isn't offering alternatives that get people excited. If RDNA4 is a bust, let's hope UDNA is the answer.
If they can get UDNA right, it would simplify a lot of things, since they'd be able to do exactly what they did with Zen: chiplets that provide great value in the enterprise (which brings in the big bucks) and can also be used in the consumer market, all out of the same fabrication line. This is also exactly what Nvidia has been doing for quite a long time (albeit not with chiplets).
Their GPU division currently has both CDNA and RDNA, which compete not only for engineering time but also for fab allocation. Given that CDNA brings in more money than RDNA, it makes sense to focus on it.
Vayra86: Yeah, MCM or not, even if they had a bigger chip, they should have also had RT performance lying on the shelf to go with it. Which they might not have after all; RDNA3 was supposed to perform better even regardless of MCM. I think there were mostly promises, hopes, and dreams flying around, but people simply did not (manage to...) deliver. And this seems to be a recurring thing, not exclusive to Raja.
MCM is more about fab efficiency; the architectural design is a somewhat separate matter. See how Zen has both MCM and monolithic products, and how RDNA exists in both MCM designs and monolithic iGPUs.
It's actually great in iGPUs; I guess they're just lacking the resources to scale it up, because it makes more sense to put the effort into CDNA as the "big product" instead.
AusWolf: And some people don't get why I spoke so harshly against the misleading MFG performance data in the keynote. This is why. People are very easy to manipulate, and that is exactly what Nvidia is doing.
To be fair, most users here won't fall for that, and everyone will wait for proper reviews anyway, so that's like preaching to the choir.
Vayra86: That's the real upgrade path here: second-hand last gen, as all the n00bs upgrade to the latest and greatest that didn't gain them anything.
I got my 3090s used for about 1/2 and then 1/4 of their launch prices after the mining craze; can't beat that value :p
10tothemin9volts: You mean NV is comparing apples to apples in this one? That would be nice (I guess since there is no fine print, it might be so). Then the AI TOPS would indeed be massively improved: +70% when adjusting for the +25% power increase of the 5070 vs. the 4070 (988/(466*1.25)).
According to "nvidia-ada-gpu-architecture.pdf", the 4090 is:

So it's either 1321 INT8 sparse or 1321 INT4 dense? Anyway, what matters more is that it's an apples-to-apples comparison.
Funnily enough, yes. The numbers are for INT4 dense; sparsity is an Nvidia-exclusive feature that's not that easy to use (you have to rearrange your tensors to take advantage of it).
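For reference, the power-adjusted arithmetic from the quote works out like this (TOPS and power figures taken from the quote itself, as given):

```python
# Power-adjusted AI TOPS comparison, 5070 vs. 4070 (figures from the quote above).
tops_5070, tops_4070 = 988, 466  # claimed INT4-dense AI TOPS
power_ratio = 1.25               # assumed +25% board power for the 5070

adjusted_gain = tops_5070 / (tops_4070 * power_ratio)
print(f"Power-adjusted uplift: {adjusted_gain - 1:.0%}")  # ~+70%
```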
Vya Domus: They're running models at half the precision on 50-series cards and comparing them to FP8 on the 40 series, because presumably at the same precision they're not faster at all. Pretty much everything they've shown is a smokescreen. This might just be the most disingenuous marketing material they've ever released; there's not a single performance claim they haven't screwed with in some way.

For those of you who don't know, lower-precision quantized models are worse, often unusable for some applications, so even the "14638746728463287 gazillion AI TOPS" meme is a lie.
I explained this to someone else already, but I'll write it up again:
Flux is often memory-bound, just like LLMs. The gains you see there are mostly from the extra 80% of memory bandwidth the 5090 has. Even running it in FP8 (which my 3090 doesn't even support) leads to a really minor perf difference, while using FP8 vs. FP16 on a 4090 barely nets a perf gain, around 5-10% in both scenarios. The same likely goes for this FP4 vs. FP8 comparison.
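If inference really is bandwidth-bound, the ceiling on the speedup is roughly the bandwidth ratio. A quick check using the spec-sheet numbers (treat the exact figures as assumptions):

```python
# Upper bound on a memory-bandwidth-bound speedup: the bandwidth ratio.
bw_4090, bw_5090 = 1008, 1792  # GB/s, spec-sheet figures (assumptions)
print(f"Max bandwidth-bound speedup: {bw_5090 / bw_4090:.2f}x")  # ~1.78x
```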

You are also forgetting that there are different types of quantization. Your Q4, gguf, ggml stuff is about compressing weights for storage/memory, but you still do the math in fp16, which leads to noticeably lower performance. Doing proper quantization on a model through extra precision-aware fine-tuning leads to way better quality than just shoving the original weights into a smaller data type.
Just take a look for yourself at the results from their model vs. the BF16 one:

BF16 on the left and FP4 on the right
blackforestlabs.ai/flux-nvidia-blackwell/

Clearly not as good as BF16, but way better than your usual Q8 quants.
Vya Domus: Yeah, sure, they're not lying, just showing you a ginormous bar chart where the thing is 2x faster, with minuscule text below telling you that actually, it's not.
Unlike games, for inference you often aim for the smallest supported data type, for both VRAM savings and extra throughput. When tensor cores came out, everyone switched to FP16. When Ada/Hopper came out, everyone started doing FP8. The trend is still going this way.
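To make the storage-vs-compute distinction from earlier in this post concrete, here is a minimal weight-only quantization sketch in NumPy (a toy model with a single per-tensor scale and int4 values held in int8; real Q4/gguf formats use grouped scales and bit-packing):

```python
import numpy as np

def quantize_int4(w):
    """Symmetric per-tensor int4 quantization (values in [-7, 7])."""
    scale = np.abs(w).max() / 7
    q = np.clip(np.round(w / scale), -7, 7).astype(np.int8)  # stored small
    return q, scale

def matmul_weight_only(x, q, scale):
    """gguf/Q4-style inference: weights stored in 4 bits, math done in fp16."""
    w = q.astype(np.float16) * np.float16(scale)  # dequantize on the fly
    return x.astype(np.float16) @ w

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)
x = rng.normal(size=(8, 256)).astype(np.float32)

q, s = quantize_int4(w)
err = np.linalg.norm(matmul_weight_only(x, q, s) - x @ w) / np.linalg.norm(x @ w)
print(f"Relative output error from 4-bit weights: {err:.3f}")
```

Precision-aware fine-tuning, as described above, recovers much of that error; this sketch only shows the naive post-hoc case.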
Posted on Reply
#389
Vya Domus
JustBenching: They claimed a 5x uplift over Hopper, which was entirely due to FP4. I'm not an AI expert, but supposedly the whole industry is trying to move to low precision; the tricky part is keeping accuracy high, which is supposedly what Nvidia has achieved and why it dominates that segment.
Like I explained, lower-precision models are not equivalent to higher-precision ones and never will be; if it were that simple, everything would be running at 1-bit precision by now. LLMs like ChatGPT seem to have settled on 16 bits; that seems to be the lower bound below which the output gets noticeably worse. With image generation you can go lower, but the results are still not equivalent.
igormp: Unlike games, for inference you often aim for the smallest supported data type, for both VRAM savings and extra throughput. When tensor cores came out, everyone switched to FP16. When Ada/Hopper came out, everyone started doing FP8. The trend is still going this way.
It does not matter; it's not an appropriate way to compare the two, as they are running different models.
Posted on Reply
#390
igormp
Vya Domus: LLMs like ChatGPT seem to have settled on 16 bits
No; it has been discussed multiple times that OpenAI has been quantizing its models over time without telling anyone.
Many LLMs are running at Q4, Q6, and Q8 out there in production by many different providers.
Vya Domus: It does not matter; it's not an appropriate way to compare the two, as they are running different models.
It does matter because that's how people are going to run it given the hardware support, period.
Posted on Reply
#391
Vya Domus
igormp: Many LLMs are running at Q4, Q6, and Q8 out there in production by many different providers.
Many don't disclose exactly what they're doing, but 4-bit quantization is markedly worse, that's for sure.
igormp: It does matter because that's how people are going to run it given the hardware support, period.
No, this is just more cope for Nvidia's marketing lies. Nobody is saying this isn't what people should be running; I don't care. Just put the figures for each model side by side so you know exactly what you are looking at, so you don't have to read footnotes in the tiniest font possible to see that they're not comparing the same thing. This is the bare minimum you can expect: that at least the software they're running is the same.

They could have had separate charts showcasing VRAM usage as well, making a point about being able to run these things on lesser GPUs with less memory, but they're so hell-bent on being as disingenuous as possible that they don't even know when to use this to their advantage.
Posted on Reply
#392
AusWolf
igormp: To be fair, most users here won't fall for that, and everyone will wait for proper reviews anyway, so that's like preaching to the choir.
Wanna bet?
igormp: Ian Cutress wrote a bit about that; AMD did a Q&A with some journalists trying to explain:
morethanmoore.substack.com/p/where-was-rdna4-at-amds-keynote

TL;DR: they said the product was not yet finished, they wouldn't have had enough time to showcase it in their overall presentation, and Nvidia's announcement played a part in the decision not to showcase RDNA4 (they want to undercut it).
That was the most logical reason for it. Thanks for the article.
Posted on Reply
#393
igormp
Vya Domus: Many don't disclose exactly what they're doing, but 4-bit quantization is markedly worse, that's for sure.
For LLMs with billions or close to a trillion parameters? Sure, the error propagation gets way worse.
But a 70B Q4 model is still way better than a 30B Q8 one, and a bigger model at a bigger data type is useless if you can't get it to run in the first place, or if its performance is not good enough.

For smaller models, the loss from quantization is not that significant.
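The memory math behind that trade-off is simple enough to sketch (round parameter counts, weights only, KV cache and other overheads ignored):

```python
def weight_gib(params_billion, bits_per_weight):
    """Approximate weight footprint in GiB, ignoring runtime overheads."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

print(f"70B @ Q4: {weight_gib(70, 4):.0f} GiB")  # ~33 GiB
print(f"30B @ Q8: {weight_gib(30, 8):.0f} GiB")  # ~28 GiB
```

So the much larger model at Q4 lands in roughly the same footprint class as the smaller Q8 one.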
Vya Domus: No, this is just more cope for Nvidia's marketing lies. Nobody is saying this isn't what people should be running; I don't care. Just put the figures for each model side by side so you know exactly what you are looking at, so you don't have to read footnotes in the tiniest font possible to see that they're not comparing the same thing.
I get your point about "fairness", but I bet performance would be pretty close to what was shown even at FP8 because, as I've said before, this is mostly a memory-bandwidth issue. Doing it at FP4 is a showcase of a newly supported data type that we didn't have before (with only a minor perf uplift in this specific case).
You'd be better off going back to complaining about the FG comparisons.
AusWolf: Wanna bet?


That was the most logical reason for it. Thanks for the article.
Given that most people here are hoping the 5080 matches or surpasses the 4090, I don't think anyone here bought into the 5070 = 4090 idea.
Posted on Reply
#394
JustBenching
Are we seriously suggesting that the whole AI industry bought into the B200 because they were misled and didn't understand what FP4 is?
Posted on Reply
#395
Vya Domus
igormpI get your point about "fairness", but I bet you performance would be pretty close to what was shown at FP8 because, as I've told before, this is mostly a memory bw issue. Doing it at FP4 is a showcase of a newly supported data type that we didn't have before (with only a minor perf uplift in this specific case).
You'd be better going back to complaining about the FG comparisons.
At least with FG you know they're running the same game, I don't think it's even about fairness it's just a shitty way of presenting those performance metrics for the tiny percentage of people who would even care, wouldn't you want to know that this now supports a smaller data type and that you can now run smaller models ? Be honest when you saw that did you assume it's the same data type or did you magically understand there must be more to it before squinting your eyes in the footnotes (if you ever did before someone else pointed it out for you), I for one admit I missed it before I saw someone else talk about it.
Posted on Reply
#396
AusWolf
JustBenching: Are we seriously suggesting that the whole AI industry bought into the B200 because they were misled and didn't understand what FP4 is?
Does the AI industry even buy 5070/5080-level cards? I mean, home users getting their feet wet in AI, sure, but the wealthiest AI corps need a lot more oomph, don't they? That's who the uber-expensive professional cards are for. To them, everything you say about the 5070/5080 is meaningless.
Posted on Reply
#397
JustBenching
AusWolf: Does the AI industry even buy 5070/5080-level cards? I mean, home users getting their feet wet in AI, sure, but the wealthiest AI corps need a lot more oomph, don't they? That's who the uber-expensive professional cards are for. To them, everything you say about the 5070/5080 is meaningless.
No, they are buying the B200, which also used FP4 claims (vs. FP8 for Hopper) in its marketing slides.


Look, the thing is, there was another company at CES that compared its 120 W CPU against the competition's 17 W chip, with no small print, btw. No one is talking about that being misleading, but we have 50 different threads, 20 pages long, complaining about Nvidia. Makes you wonder.
Posted on Reply
#398
AusWolf
JustBenching: No, they are buying the B200, which also used FP4 claims (vs. FP8 for Hopper) in its marketing slides.
Fair enough. Still wrong, imo, but as long as buyers are fine with it, who am I to argue?
JustBenching: Look, the thing is, there was another company at CES that compared its 120 W CPU against the competition's 17 W chip.
Really? That's poor as well. I guess no one was really interested in that CPU. I don't even know which one you're talking about; it completely passed me by (although I admit I only looked at GPUs this time around).
Posted on Reply
#399
JustBenching
AusWolf: Fair enough. Still wrong, imo, but as long as buyers are fine with it, who am I to argue?


Really? That's poor as well. I guess no one was really interested in that CPU. I don't even know which one you're talking about; it completely passed me by (although I admit I only looked at GPUs this time around).
Just an example from Nvidia's Computex presentation regarding the B200:




The CPU in question was Strix Point (the 390 AI). But you know, it's AMD, so it's not trying to mislead us :D
Posted on Reply
#400
AusWolf
JustBenching: Just an example from Nvidia's Computex presentation regarding the B200:


Oh, but that clearly states the precision level right below the number. You don't need to see the small print for that.
JustBenching: The CPU in question was Strix Point (the 390 AI). But you know, it's AMD, so it's not trying to mislead us :D
Ah, OK, fair point then. I'm happy to call out bullshit on any side (although I personally skipped that part entirely, as I only cared about GPUs this time around).
Posted on Reply