Monday, January 6th 2025

NVIDIA 2025 International CES Keynote: Liveblog

NVIDIA kicks off the 2025 International CES with a bang. The company is expected to debut its new GeForce "Blackwell" RTX 5000 generation of gaming graphics cards. It is also expected to launch new technology such as neural rendering and DLSS 4. The company is also expected to highlight a new piece of silicon for Windows on Arm laptops, showcase the next generation of its Drive PX self-driving hardware, probably talk about its next-generation "Blackwell Ultra" AI GPU, and, if we're lucky, even namedrop "Rubin." Join us as we liveblog CEO Jensen Huang's keynote address.

02:22 UTC: The show is finally underway!
02:35 UTC: CTA president Gary Shapiro kicks off the show and introduces Jensen Huang.
02:46 UTC: "Tokens are the building blocks of AI"

02:46 UTC: "Do you like my jacket?"
02:47 UTC: NVIDIA recounts its progress all the way back to NV1 and UDA.
02:48 UTC: "CUDA was difficult to explain, it took 6 years to get the industry to like it"
02:50 UTC: "AI is coming home to GeForce". NVIDIA teases neural material and neural rendering. Rendered on "Blackwell"
02:55 UTC: Every single pixel is ray traced, thanks to AI rendering.
02:55 UTC: Here it is, the GeForce RTX 5090.
03:20 UTC: At least someone is pushing the limits for GPUs.
03:22 UTC: Incredible board design.
03:22 UTC: RTX 5070 matches RTX 4090 at $550.
03:24 UTC: Here's the lineup, available from January.
03:24 UTC: RTX 5070 Laptop starts at $1299.
03:24 UTC: "The future of computer graphics is neural rendering"
03:25 UTC: Starting prices for laptops powered by RTX Blackwell.
03:26 UTC: AI has come back to power GeForce.
03:28 UTC: Supposedly the Grace Blackwell NVLink72.
03:28 UTC: 1.4 ExaFLOPS.
03:32 UTC: NVIDIA very sneakily teased a Windows AI PC chip.

03:35 UTC: NVIDIA is teaching generative AI basic physics. NVIDIA Cosmos, a world foundation model.
03:41 UTC: NVIDIA Cosmos is trained on 20 million hours of video.

03:43 UTC: Cosmos is open-licensed on GitHub.

03:52 UTC: NVIDIA onboards Toyota for full self-driving in its next-generation EVs.

03:53 UTC: NVIDIA unveils Thor Blackwell robotics processor.
03:53 UTC: Thor is 20x the processing capability of Orin.

03:54 UTC: CUDA is now a functionally safe computer, thanks to its automotive certifications.
04:01 UTC: NVIDIA brought a dozen humanoid robots to the stage.

04:07 UTC: Project DIGITS is a shrunk-down AI supercomputer.
04:08 UTC: NVIDIA GB110 "Grace-Blackwell" chip powers DIGITS.

470 Comments on NVIDIA 2025 International CES Keynote: Liveblog

#451
10tothemin9volts
ModEl4: I agree that 12GB is a problem. I haven't seen usage in Indiana Jones with path tracing at FHD base resolution to check it, and the 5070 will also support neural texture compression, enabling similar-quality textures with a smaller memory footprint (or higher-quality textures in a given memory budget). But even if we suppose that what you said isn't true today (FHD base...), it certainly will be in the near future.
And tomorrow is here; see my Indiana Jones example. I don't think the neural texture compression (it was about 0.4-0.5 GB less VRAM demand, according to a video) is going to help with that, because the VRAM utilization in that Indiana Jones scenario rises to ~15 GB. Of course, there aren't many examples yet, but in the future more games are going to have similar VRAM requirements (16 GB of VRAM for 1440p with even the lowest path-tracing setting). Still, it's better than nothing, and it's free for all generations(?). Of course, we need to wait for independent reviews to see whether there's any catch with that.
The transformer-based upscaling/DLSS is also free for all generations, which is also very nice (again, waiting for independent reviews to see whether there's any catch with that too).

For all AI/LLM self-hosting enthusiasts: the GeForce 5090's 512-bit bus width theoretically means NV could release a clamshell RTX 6000 Blackwell workstation card with 64 GB of VRAM (in a few months), like they usually do (384-bit: GeForce 4090 with 24 GB of VRAM and "RTX 6000 Ada" with 48 GB).
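A quick back-of-the-envelope sketch of that bus-width math (assuming 32-bit-wide GDDR chips and the 2 GB-per-chip density used on the current cards; the helper is just for illustration, not any official spec):

```python
def vram_gb(bus_width_bits, gb_per_chip, clamshell=False):
    # Each GDDR chip occupies a 32-bit slice of the memory bus.
    chips = bus_width_bits // 32
    # Clamshell mode mounts a second chip on the back of the PCB for
    # each 32-bit channel, doubling capacity (but not bandwidth).
    if clamshell:
        chips *= 2
    return chips * gb_per_chip

print(vram_gb(384, 2))                  # 24 -> GeForce RTX 4090
print(vram_gb(384, 2, clamshell=True))  # 48 -> RTX 6000 Ada
print(vram_gb(512, 2))                  # 32 -> GeForce RTX 5090
print(vram_gb(512, 2, clamshell=True))  # 64 -> hypothetical RTX 6000 Blackwell
```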
#452
ModEl4
10tothemin9volts: And tomorrow is here; see my Indiana Jones example. I don't think the neural texture compression (it was about 0.4-0.5 GB less VRAM demand, according to a video) is going to help with that, because the VRAM utilization in that Indiana Jones scenario rises to ~15 GB. Of course, there aren't many examples yet, but in the future more games are going to have similar VRAM requirements (16 GB of VRAM for 1440p with even the lowest path-tracing setting). Still, it's better than nothing, and it's free for all generations(?). Of course, we need to wait for independent reviews to see whether there's any catch with that.
The transformer-based upscaling/DLSS is also free for all generations, which is also very nice (again, waiting for independent reviews to see whether there's any catch with that too).

For all AI/LLM self-hosting enthusiasts: the GeForce 5090's 512-bit bus width theoretically means NV could release a clamshell RTX 6000 Blackwell workstation card with 64 GB of VRAM (in a few months), like they usually do (384-bit: GeForce 4090 with 24 GB of VRAM and "RTX 6000 Ada" with 48 GB).
It's not only 0.4-0.5 GB; we will see the actual memory-requirement difference in the near future (I hope) with real implementations.
#453
igormp
10tothemin9volts: For all AI/LLM self-hosting enthusiasts: the GeForce 5090's 512-bit bus width theoretically means NV could release a clamshell RTX 6000 Blackwell workstation card with 64 GB of VRAM (in a few months), like they usually do (384-bit: GeForce 4090 with 24 GB of VRAM and "RTX 6000 Ada" with 48 GB).
Don't forget the 3GB modules that are coming really soon. That would mean a 48GB model without clamshell, and a 96GB one with.
#454
10tothemin9volts
ModEl4: It's not only 0.4-0.5 GB; we will see the actual memory-requirement difference in the near future (I hope) with real implementations.
It depends on what they mean. Right now it's all marketing: "...than traditional block compressed textures." What percentage do those kinds of textures make up vs. the other ones, etc.? Indeed, let's wait for independent reviews. But whatever it is, I don't believe game VRAM consumption will be reduced by 7x; obviously, nobody expects that. The video I'm talking about showed ~0.5 GB, which is actually more believable, because that is the believable part of the "...same visual quality" claim. It's all marketing so far (7x of a fraction is still not much).
igormp: Don't forget the 3GB modules that are coming really soon. That would mean a 48GB model without clamshell, and a 96GB one with.
3GB modules? True. But using a big and expensive 512-bit chip for 16 chips x 3 GB per chip, versus the already available RTX 6000 Ada with the same 48 GB of VRAM, would be kinda..? Maybe the bandwidth increase from 960 GB/s to 1792 GB/s plus the other Blackwell features are what customers (would) demand, but it's still 48 GB, so I'm not sure. I'm also not sure NV is going to release a 16 chips x 3 GB per chip x 2 [clamshell] = 96 GB version. I mean, it would be very nice, but I think they will stretch this one out to at least the next generation. And even then, in ~2.5 years, I'm not sure, because by then they can claim that 3 GB modules are widely available in quantity and they can do 384-bit ones (btw, 384-bit / 32-bit per chip = 12 chips, for anyone who doesn't know): 12 chips x 3 GB per chip x 2 [clamshell] = 72 GB. So 96 GB of VRAM maybe in the generation after next, in ~5 years?
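The same chip-count arithmetic, spelled out with the figures quoted above (nothing official, just the bus-width math):

```python
# Bus width / 32 bits per chip gives the chip count; clamshell doubles it.
print(512 // 32 * 3)      # 48 GB: 16 chips x 3 GB, single-sided
print(512 // 32 * 3 * 2)  # 96 GB: 32 chips in clamshell
print(384 // 32 * 3 * 2)  # 72 GB: 12 chips x 3 GB x 2 in clamshell
```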
Though I have to say that the 32 GB of VRAM on the 5090 came as a slight positive surprise, because I was thinking there's no way they would want a competitor to their expensive 32 GB workstation card and that they'd release a 28-30 GB card instead (of course, people who need the workstation features won't switch to the 5090, but others who need 32 GB of VRAM in one card and don't care about workstation features now don't need to buy the expensive workstation card either).
#455
notoperable
The next NVIDIA architecture will be called Gaslight, and the AI will design it itself. What could possibly go wrong.

It's barely 2025 and I already can't stand the marketing babble, and it will only get worse, since everyone seems to find AI hallucinations totally fine and a part of empirical reality. Can someone please reboot the universe?

Every GPU ad I've seen from CES so far sounds like a marketing manager on a Ritalin overdose.
#456
dragontamer5788
notoperable: Every GPU ad I've seen from CES so far sounds like a marketing manager on a Ritalin overdose.
GPU Makers know that AI isn't important, but it also doesn't "cost" very much to add FP16 or FP4 (or whatever) matrix-multiplication units to your GPUs.

Seriously, 4-bit numbers? Okay NVidia. I see what you're doing.
#457
Visible Noise
dragontamer5788: GPU Makers know that AI isn't important, but it also doesn't "cost" very much to add FP16 or FP4 (or whatever) matrix-multiplication units to your GPUs.

Seriously, 4-bit numbers? Okay NVidia. I see what you're doing.
What is Nvidia doing, besides optimizing instruction sets for inference?
Apparently you didn't know AMD is doing the same.
  • The first product in the AMD Instinct MI350 Series, the AMD Instinct MI350X accelerator, is based on the AMD CDNA 4 architecture and is expected to be available in 2025. It will use the same industry standard Universal Baseboard server design as other MI300 Series accelerators and will be built using advanced 3nm process technology, support the FP4 and FP6 AI datatypes and have up to 288 GB of HBM3E memory.
Maybe a little googling before posting will be helpful.
#459
dragontamer5788
Visible Noise: Maybe a little googling before posting will be helpful.
Or maybe I might be talking about something different.

64-bit multiplications are difficult, because multiplication scales as O(n^2) (assuming a Dadda multiplier architecture). This means that a 64-bit multiplier requires 4x as many adders as a 32-bit multiplier, 16x as many as a 16-bit multiplier, 64x as many as an 8-bit multiplier, and 256x as many as a 4-bit multiplier.

So having 16 x 4-bit multipliers still takes only about 6% of the area (!!!!) of a single 64-bit multiplier, because of this O(n^2) scaling.

Case in point: the extreme 1-bit multiplier is also known as an "AND" gate (0*0 == 0 AND 0, 1*0 == 1 AND 0, 1*1 == 1 AND 1). When we go all the way down to the smallest number of bits, multiplication gets ridiculously easy to calculate. I.e., it's very cheap to add 16x 4-bit multipliers, 8x 8-bit multipliers, or 4x 16-bit multipliers to a circuit, especially if that circuit already has the big honking 64-bit or 32-bit multiplier off to the side. And all GPUs have 32-bit multipliers, because 32 bits is the standard for video games.
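A rough sketch of that scaling argument, using n^2 partial-product bits as a stand-in for adder count (an illustration only, not actual die-area figures):

```python
# Area proxy: an n-bit array/Dadda multiplier reduces roughly n*n partial-product bits.
def area(n_bits):
    return n_bits * n_bits

baseline = area(64)
for n in (64, 32, 16, 8, 4):
    units = 64 // n          # pack this many n-bit multipliers into 64 lanes
    total = units * area(n)  # combined area of all the small multipliers
    print(f"{units:2d} x {n:2d}-bit: {100 * total / baseline:5.1f}% of one 64-bit multiplier")

# The degenerate 1-bit case: multiplication is literally an AND gate.
for a in (0, 1):
    for b in (0, 1):
        assert a * b == (a & b)
```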

All of this AI stuff is just gold to NVidia and AMD. They barely have to do any work (from a computer-design perspective); they just need to (lol) ship incredibly easy 4-bit designs and then sell them for more money.

------------

If these "AI Companies" can sell 2-bit or even 1-bit multiplication, I guarantee you that they will do so. Its not about efficacy, its about how little work they need to do yet still able to sell something for more money.
#461
dragontamer5788
Visible Noise: It's FP4. Floating point. A wee bit more complicated than you make it out to be. Perhaps you should look it up. Here's a starting point: papers.nips.cc/paper/2020/file/13b919438259814cd5be8cb45877d577-Paper.pdf
Oh right. Good point. It's 1-bit sign, 3-bit exponent, 0-bit mantissa. So the "multipliers" are actually 3-bit adders.

Thanks for making me look it up. It's even worse than I expected. ("Multiplication" of the exponents simply becomes addition: 2^4 * 2^3 == 2^7.)
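A minimal sketch of that observation, assuming the (1 sign, 3 exponent, 0 mantissa) layout described above; bias, zero encoding, and saturation are deliberately ignored:

```python
def fp4_mul(a, b):
    """Multiply two 4-bit values laid out as (1 sign bit, 3 exponent bits).

    Each value encodes +/- 2^e, so the "multiplier" is just an XOR for
    the sign and a 3-bit adder for the exponents. Bias, zero, and
    overflow/saturation handling are left out of this toy sketch."""
    sign = ((a >> 3) ^ (b >> 3)) & 0x1   # signs multiply via XOR
    exp = (a & 0x7) + (b & 0x7)          # exponents add (3-bit adder)
    return (sign << 3) | (exp & 0x7)     # wrap instead of saturating

# 2^4 * 2^3 == 2^7: the exponents simply add.
print(format(fp4_mul(0b0100, 0b0011), "04b"))  # 0111
```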
#462
Visible Noise
dragontamer5788: Oh right. Good point. It's 1-bit sign, 3-bit exponent, 0-bit mantissa. So the "multipliers" are actually 3-bit adders.

Thanks for making me look it up. It's even worse than I expected. ("Multiplication" of the exponents simply becomes addition: 2^4 * 2^3 == 2^7.)
If that's what you got out of the IBM research paper, then you're too blinded by hate to understand. Upset that your gaming GPUs are expensive because they found other uses for them, I presume.
#463
dragontamer5788
Visible Noise: If that's what you got out of the IBM research paper, then you're too blinded by hate to understand. Upset that your gaming GPUs are expensive because they found other uses for them, I presume.
Dude, it's elementary to design a 1-bit sign / 3-bit exponent / 0-bit mantissa multiplier. Seriously, look at the damn format and think about how the circuit would be made. This is stuff a 2nd-year computer engineer can do.

If you aren't into computer engineering, then I promise you, it's not very complex. The hard stuff is the larger multiplier circuits (aka Wallace trees and whatnot), which are closer to 4th-year or even master's-degree level in my experience.

Think like an NVidia or AMD engineer. Think about how to wire adders together so that they'd make a multiplication circuit. Some of these things are absurdly easy to do.
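As a toy version of the wiring exercise described here, a shift-and-add multiplier built from nothing but AND masks and adders (roughly what an array or Dadda multiplier does in parallel):

```python
def shift_and_add_mul(a, b, n_bits=4):
    """Toy n-bit unsigned multiplier: AND gates generate the partial
    products, and a chain of adders accumulates them."""
    product = 0
    for i in range(n_bits):
        bit = (b >> i) & 1          # one multiplier bit selects a row
        row = a & -bit              # row of AND gates: a if bit==1, else 0
        product += row << i         # adder row accumulates the shifted row
    return product

# Exhaustively check all 4-bit inputs against Python's own multiplication.
assert all(shift_and_add_mul(a, b) == a * b
           for a in range(16) for b in range(16))
```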
#464
notoperable
dragontamer5788: Dude, it's elementary to design a 1-bit sign / 3-bit exponent / 0-bit mantissa multiplier. Seriously, look at the damn format and think about how the circuit would be made. This is stuff a 2nd-year computer engineer can do.

If you aren't into computer engineering, then I promise you, it's not very complex. The hard stuff is the larger multiplier circuits (aka Wallace trees and whatnot), which are closer to 4th-year or even master's-degree level in my experience.

Think like an NVidia or AMD engineer. Think about how to wire adders together so that they'd make a multiplication circuit. Some of these things are absurdly easy to do.
But it's FP4! I'm gonna repeat the catchword over and over again and then I'll get it.
Visible Noise: If that's what you got out of the IBM research paper, then you're too blinded by hate to understand. Upset that your gaming GPUs are expensive because they found other uses for them, I presume.
And you are some kind of GPU Jedi, or how did you remotely get from FP4 to this? Back to the ocean! Now!
#465
10tothemin9volts
The GeForce 5070 Laptop still having only 8 GB of VRAM is also a disappointment... planned obsolescence ("PCGH demonstrates why 8GB GPUs are simply not good enough for 2025"; Hardware Unboxed demonstrated it 1-3 years ago). If NV had at least given it 10 GB of VRAM. But what can I say, if people still buy it...
#466
SRS
igormp: It is awful for sure. But at the same time, I don't feel like Nvidia has been bribing companies not to use its competitors.
What competitors?

There is zero competition happening right now in the 4090–5090 space (and even below).
#467
notoperable
Imaginary competition :) makes it seem less cartel-like.
