
NVIDIA 2025 International CES Keynote: Liveblog

SRS

New Member
Joined
Oct 12, 2024
Messages
8 (0.08/day)
Nvidia isn't calling the 5090 a Titan card, even though the pricing and tech gap between it and the 5080 is much larger than Titan cards were, from my recollection. I could be wrong but I do not remember Titan being 50% more powerful than the next step down. I vaguely recall Titan also being more appealing because of its prosumer use cases.

Why would Nvidia do this? It has most likely decided that it will make more money by having reviewers (perhaps unwittingly) pressure buyers into the 5090 purchase (because it's not an extravagant Titan... it's simply the best of the regular consumer cards and is therefore the standard for most every review's benchmarks). It seems to be mainly a matter of optics.

It also exposes the problem of Nvidia's monopoly over higher-end/enthusiast consumer GPUs. Nvidia would never be able to get away with such a large gap if something even approaching adequate competition were in place. (I really loathe the "minute difference = new product tier" strategy, but the fact is that consumers have shown they are willing to tolerate having so many products with tiny differences. 50% as a gap is excessive, however, even for me.)

I find it rather amusing to read so many "Oh... so inexpensive!" comments from the first two pages or so of this thread. $2500–$3000 is inexpensive? We all know how the game works by now, right? Only a tiny number of people will get the Founders card (or whatever they're calling it), and there will be Reddit pages with people hoping, tracking, and bragging about their Best Buy escapades. Everyone else will deal with scalpers, shortages, and third-party cards with minuscule overclocks and a higher price. We've also lived through at least two cycles of mining-driven shortages, compounded by AMD's higher-end cards that seemed more designed for mining than for gaming. (Now we don't even have those to be disappointed with.)

Instead of hoping for things to be different this time... what I'd like to see is competition. Not a giant void where even a duopolist could only half-heartedly compete with cards that have too-small dies at too-high clocks with too-small coolers, like the vaunted Radeon VII.

Most everything I've written above is debatable to some degree. However, the fact that the higher-end consumer GPU space (both for gaming and home AI, such as text-to-image) is occupied by a single corporate entity is not. That's not capitalism.
 
Joined
May 10, 2023
Messages
498 (0.81/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
Nvidia isn't calling the 5090 a Titan card, even though the pricing and tech gap between it and the 5080 is much larger than Titan cards were, from my recollection. I could be wrong but I do not remember Titan being 50% more powerful than the next step down. I vaguely recall Titan also being more appealing because of its prosumer use cases.
Titans never had much of a gap, if any, to begin with. I posted this in another thread:
I made a graph some time ago comparing the different generations and each card's percentage of the top die, similar to the one that has been floating around here, but mine compares against an assumed "top product" for each generation instead of the actual die Nvidia uses:

I decided to give it a go and update it today with the known values of the Blackwell gen:

Conclusions are up to each of you.
Not only is the gap between the top card and the second one bigger, it also seems like we have a new trend where consumers don't even get the full die (or close to it) anymore.
$2500–$3000 is inexpensive?
For games it's pretty expensive. But for productivity? It's still quite cheap.
I had heard a rumor that about 80% of 4090s are being used in AI farms. I don't have anything to back that up, but it would mean those cards aren't even being used for games anymore, and Nvidia is well aware of that.
Most everything I've written above is debatable to some degree. However, the fact that the higher-end consumer GPU space (both for gaming and home AI, such as text-to-image) is occupied by a single corporate entity is not. That's not capitalism.
It is awful for sure. But at the same time, I don't feel like Nvidia has been bribing companies not to use its competitors. They have just been doing a good job, and the others haven't managed to stack up to it, which is bad, but I'm not sure we could fine Nvidia for that.
IIRC, France did try to investigate Nvidia on monopolistic practices and didn't come up with anything.
 
Joined
Sep 19, 2014
Messages
114 (0.03/day)
Can be very good for 5070 12GB

  • RTX Neural Texture Compression
    uses AI to compress thousands of textures in less than a minute. Their neural representations are stored or accessed in real time or loaded directly into memory without further modification.
    The neurally compressed textures save up to 7x more VRAM or system memory than traditional block compressed textures at the same visual quality.
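For scale, a quick back-of-the-envelope check on the "up to 7x" figure (my own numbers, not from NVIDIA's material; BC7 stores 8 bits per texel):

```python
# Rough scale check on the "up to 7x" claim (my numbers, not NVIDIA's):
# BC7 is 8 bits per texel, so a 4096x4096 texture is ~16 MiB before mipmaps.
bc7_mib = 4096 * 4096 * 1 / 2**20
print(bc7_mib, bc7_mib / 7)   # 16.0 MiB -> ~2.3 MiB per texture if the 7x figure holds
```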
 
Joined
Jan 14, 2019
Messages
13,453 (6.13/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Overclocking is overrated
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case The smaller the better
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Mechanic
VR HMD Not yet
Software Linux gaming master race
Can be very good for 5070 12GB

  • RTX Neural Texture Compression
    uses AI to compress thousands of textures in less than a minute. Their neural representations are stored or accessed in real time or loaded directly into memory without further modification.
    The neurally compressed textures save up to 7x more VRAM or system memory than traditional block compressed textures at the same visual quality.
That's marketing. Marketing always sounds good. Let's see how it works in action.
 
Joined
Jun 26, 2023
Messages
31 (0.05/day)
I agree that 12GB is a problem. I haven't seen the usage in Indiana Jones with path tracing at FHD base resolution to check it, and also the 5070 will support neural texture compression, enabling similar-quality textures with a smaller memory footprint (or higher-quality textures within a given memory budget). But what you said, even if we suppose it isn't true today (FHD base...), certainly will be in the near future.
And tomorrow is here, see my Indiana Jones example, and I don't think the neural texture compression -- it was like 0.4-0.5 GB less VRAM demand, according to a vid -- is going to help with that, because the VRAM utilization in that IJ scenario rises to ~15GB. Of course there aren't many examples yet, but in the future more games are going to have similar VRAM requirements (16GB VRAM for 1440p with even the lowest path-tracing setting). Still, it's better than nothing, and it's free for all gens(?). Of course, we need to wait for independent reviews to see whether there's any catch with that.
The transformer-based upscaling/DLSS is also free for all gens, which is also very nice (again, waiting for independent reviews to see whether there's any catch with that too).

For all the AI/LLM self-hosting enthusiasts: the GeForce 5090's 512-bit bus width theoretically means NV could release a clamshell RTX 6000 Blackwell workstation card with 64GB of VRAM (in a few months), like they usually do (384-bit: GeForce 4090 with 24GB VRAM and "RTX 6000 Ada" with 48GB VRAM).
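To make that capacity math explicit, a quick sketch (the vram_gb helper is just my own illustration; it assumes one GDDR chip per 32-bit channel, doubled when the board runs clamshell):

```python
# GDDR capacity from bus width: one chip per 32-bit channel, times GB per chip,
# times 2 if the board runs clamshell (chips on both sides of the PCB).
def vram_gb(bus_width_bits: int, gb_per_chip: int, clamshell: bool = False) -> int:
    chips = bus_width_bits // 32
    return chips * gb_per_chip * (2 if clamshell else 1)

print(vram_gb(384, 2))                  # 24 -> GeForce 4090
print(vram_gb(384, 2, clamshell=True))  # 48 -> RTX 6000 Ada
print(vram_gb(512, 2))                  # 32 -> GeForce 5090
print(vram_gb(512, 2, clamshell=True))  # 64 -> the speculated RTX 6000 Blackwell
```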
 
Joined
Oct 27, 2020
Messages
818 (0.53/day)
And tomorrow is here, see my Indiana Jones example, and I don't think the neural texture compression -- it was like 0.4-0.5 GB less VRAM demand, according to a vid -- is going to help with that, because the VRAM utilization in that IJ scenario rises to ~15GB. Of course there aren't many examples yet, but in the future more games are going to have similar VRAM requirements (16GB VRAM for 1440p with even the lowest path-tracing setting). Still, it's better than nothing, and it's free for all gens(?). Of course, we need to wait for independent reviews to see whether there's any catch with that.
The transformer-based upscaling/DLSS is also free for all gens, which is also very nice (again, waiting for independent reviews to see whether there's any catch with that too).

For all the AI/LLM self-hosting enthusiasts: the GeForce 5090's 512-bit bus width theoretically means NV could release a clamshell RTX 6000 Blackwell workstation card with 64GB of VRAM (in a few months), like they usually do (384-bit: GeForce 4090 with 24GB VRAM and "RTX 6000 Ada" with 48GB VRAM).
It's not only 0.4-0.5GB, but we will see the memory requirement difference in the near future (I hope) with actual implementations.
 
Joined
May 10, 2023
Messages
498 (0.81/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
For all AI LLM selfhosters enthusiasts, the GeForce 5090' 512-bit bus width theoretically means NV could release a clamshell RTX 6000 Blackwell 64GB VRAM workstation card (in a few months), like they usually do (384-bit: GeForce 4090 24 GB VRAM and "RTX 6000 Ada" 48GB VRAM).
Don't forget the 3GB modules that are coming really soon. That would mean a 48GB model without clamshell, and a 96GB with.
 
Joined
Jun 26, 2023
Messages
31 (0.05/day)
Depends what they mean. Right now it's all marketing: ".. than traditional block compressed textures" -- what percentage do these kinds of textures make up vs. the other ones, etc.? Indeed, let's wait for independent reviews. But whatever it is, I don't believe total game VRAM consumption will be reduced by 7x; obviously nobody expects that. The vid I'm talking about showed ~0.5GB, which is actually more believable, because that's the believable part of the ".. same visual quality" claim. It's all marketing so far (7x of a fraction is still not much).
Don't forget the 3GB modules that are coming really soon. That would mean a 48GB model without clamshell, and a 96GB with.
3GB modules? True. But using a big, expensive 512-bit chip for 16 chips * 3GB per chip vs. the already available RTX 6000 Ada with the same 48GB VRAM would be kinda..? Maybe the bandwidth increase from 960 GB/s to 1792 GB/s plus the other Blackwell features are what customers (would) demand, but it's still 48GB, so I'm not sure. I'm also not sure NV is going to release a 16 chips * 3GB per chip * 2 [clamshell] = 96GB version. I mean, it would be very nice, but I think they will stretch this one out to at least the next generation. And even then, in ~2.5 years, I'm not sure, because by then they can claim that 3GB VRAM modules are widely available in quantity and do 384-bit ones (btw, 384 bit / 32 bit per chip = 12 chips, for anyone who doesn't know): 12 chips * 3GB per chip * 2 [clamshell] = 72GB. So 96GB VRAM maybe in the generation after next, in ~5 years?
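Putting the chip-count arithmetic from that paragraph in one place (same assumption as the sketch earlier in the thread: one chip per 32-bit channel; just my own illustration, not anything official):

```python
# Chips = bus width / 32 bits per chip; capacity = chips * GB per chip (* 2 for clamshell).
chips_512 = 512 // 32   # 16 chips
chips_384 = 384 // 32   # 12 chips

print(chips_512 * 3)        # 48 GB -> 512-bit with 3GB modules, single-sided
print(chips_512 * 3 * 2)    # 96 GB -> 512-bit clamshell (speculative)
print(chips_384 * 3 * 2)    # 72 GB -> 384-bit clamshell, as above
```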
Though I have to say the 32GB VRAM of the 5090 came as a slight positive surprise, because I was thinking there was no way they would want a competitor to their expensive 32GB workstation card and that they'd release a 28-30GB VRAM card instead (of course, people who need the workstation features won't switch to the 5090, but others who need 32GB of VRAM on one card and don't care about workstation features now don't need to buy the expensive workstation card either).
 
Joined
Jul 4, 2023
Messages
52 (0.09/day)
Location
You wish
Next Nvidia architecture will be called Gaslight and the AI will design it itself. What could possibly go wrong.

It's barely 2025 and I'm already sick of hearing that marketing babble, and it will only get worse since everyone seems to find AI hallucinations totally fine and part of empirical reality. Can someone please reboot the universe?

Every GPU ad I've seen from CES so far sounds like a marketing manager on Ritalin overdose
 
Joined
Apr 24, 2020
Messages
2,780 (1.61/day)
Every GPU ad I've seen from CES so far sounds like a marketing manager on Ritalin overdose

GPU Makers know that AI isn't important, but it also doesn't "cost" very much to add FP16 or FP4 (or whatever) matrix-multiplication units to your GPUs.

Seriously, 4-bit numbers? Okay NVidia. I see what you're doing.
 
Joined
Jun 19, 2024
Messages
235 (1.12/day)
System Name XPS, Lenovo and HP Laptops, HP Xeon Mobile Workstation, HP Servers, Dell Desktops
Processor Everything from Turion to 13900kf
Motherboard MSI - they own the OEM market
Cooling Air on laptops, lots of air on servers, AIO on desktops
Memory I think one of the laptops is 2GB, to 64GB on gamer, to 128GB on ZFS Filer
Video Card(s) A pile up to my knee, with a RTX 4090 teetering on top
Storage Rust in the closet, solid state everywhere else
Display(s) Laptop crap, LG UltraGear of various vintages
Case OEM and a 42U rack
Audio Device(s) Headphones
Power Supply Whole home UPS w/Generac Standby Generator
Software ZFS, UniFi Network Application, Entra, AWS IoT Core, Splunk
Benchmark Scores 1.21 GigaBungholioMarks
GPU Makers know that AI isn't important, but it also doesn't "cost" very much to add FP16 or FP4 (or whatever) matrix-multiplication units to your GPUs.

Seriously, 4-bit numbers? Okay NVidia. I see what you're doing.

What is Nvidia doing, besides optimizing instruction sets for inference?
Apparently you didn't know that AMD is doing the same.

  • The first product in the AMD Instinct MI350 Series, the AMD Instinct MI350X accelerator, is based on the AMD CDNA 4 architecture and is expected to be available in 2025. It will use the same industry standard Universal Baseboard server design as other MI300 Series accelerators and will be built using advanced 3nm process technology, support the FP4 and FP6 AI datatypes and have up to 288 GB of HBM3E memory.
Maybe a little googling before posting will be helpful.
 
Joined
Apr 24, 2020
Messages
2,780 (1.61/day)
Maybe a little googling before posting will be helpful.

Or maybe I might be talking about something different.

64-bit multiplications are difficult because multiplier area scales as O(n^2) (assuming a Dadda multiplier architecture). This means a 64-bit multiplier requires 4x as many adders as a 32-bit multiplier, 16x as many as a 16-bit multiplier, 64x as many as an 8-bit multiplier, and 256x as many as a 4-bit multiplier.

So having 16 x 4-bit multipliers still takes only about 6% of the area (!!!!) of a single 64-bit multiplier because of this O(n^2) scaling.

Case in point: the extreme 1-bit multiplier is also known as an AND gate (0*0 == 0 AND 0, 1*0 == 1 AND 0, 1*1 == 1 AND 1). When we go all the way down to the smallest number of bits, multiplication gets ridiculously easy to calculate. I.e., it's very cheap to add 16x 4-bit multipliers, 8x 8-bit multipliers, or 4x 16-bit multipliers to a circuit, especially if that circuit already has the big honking 64-bit or 32-bit multiplier off to the side. And all GPUs have 32-bit multipliers because 32 bits is the standard for video games.
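A quick numerical sketch of that scaling, using a rough area model that just counts one-bit adder cells as n^2 (it ignores partial-product encoding, wiring, and everything else, so only the ratio matters):

```python
# Rough area model: an n-bit array/Dadda multiplier needs on the order of n^2
# one-bit adder cells. Absolute numbers are meaningless; the ratio is the point.
def relative_area(bits: int) -> int:
    return bits * bits

one_64bit = relative_area(64)          # 4096 cells
sixteen_4bit = 16 * relative_area(4)   # 16 * 16 = 256 cells

print(sixteen_4bit / one_64bit)        # 0.0625 -> ~6% of the 64-bit multiplier's area
```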

All of this AI stuff is just gold to NVidia and AMD. They barely have to do any work (from a computer design perspective); they just need to (lol) deliver incredibly easy 4-bit designs and then sell them for more money.

------------

If these "AI Companies" can sell 2-bit or even 1-bit multiplication, I guarantee you that they will do so. Its not about efficacy, its about how little work they need to do yet still able to sell something for more money.
 
Joined
Jun 19, 2024
Messages
235 (1.12/day)
System Name XPS, Lenovo and HP Laptops, HP Xeon Mobile Workstation, HP Servers, Dell Desktops
Processor Everything from Turion to 13900kf
Motherboard MSI - they own the OEM market
Cooling Air on laptops, lots of air on servers, AIO on desktops
Memory I think one of the laptops is 2GB, to 64GB on gamer, to 128GB on ZFS Filer
Video Card(s) A pile up to my knee, with a RTX 4090 teetering on top
Storage Rust in the closet, solid state everywhere else
Display(s) Laptop crap, LG UltraGear of various vintages
Case OEM and a 42U rack
Audio Device(s) Headphones
Power Supply Whole home UPS w/Generac Standby Generator
Software ZFS, UniFi Network Application, Entra, AWS IoT Core, Splunk
Benchmark Scores 1.21 GigaBungholioMarks
Joined
Apr 24, 2020
Messages
2,780 (1.61/day)
It’s FP4. Floating point. A wee bit more complicated than you make it out to be. Perhaps you should look it up. Here’s a starting point https://papers.nips.cc/paper/2020/file/13b919438259814cd5be8cb45877d577-Paper.pdf

Oh right. Good point. It's 1-bit sign, 3-bit exponent, 0-bit mantissa. So the "multipliers" are actually 3-bit adders.

Thanks for making me look it up. It's even worse than I expected. ("Multiplication" of exponents simply becomes addition: 2^4 * 2^3 == 2^7.)
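To show how trivial that is, here's a toy model of the format as described (1 sign bit, 3 exponent bits, no mantissa; bias, overflow, and zero handling are ignored, and the helper names are mine):

```python
# Toy model of a 4-bit float: 1 sign bit, 3 exponent bits, no mantissa.
# Bias, overflow and zero/NaN handling deliberately ignored.
def fp4_decode(x: int) -> float:
    sign = -1.0 if (x >> 3) & 1 else 1.0
    return sign * 2.0 ** (x & 0b111)

def fp4_mul(a: int, b: int) -> float:
    sign = ((a >> 3) ^ (b >> 3)) & 1     # sign is a single XOR gate
    exp = (a & 0b111) + (b & 0b111)      # the "multiplier" is just a 3-bit adder
    return (-1.0 if sign else 1.0) * 2.0 ** exp

a, b = 0b0100, 0b0011                    # +2^4 and +2^3
assert fp4_mul(a, b) == fp4_decode(a) * fp4_decode(b) == 2.0 ** 7
```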
 
Joined
Jun 19, 2024
Messages
235 (1.12/day)
System Name XPS, Lenovo and HP Laptops, HP Xeon Mobile Workstation, HP Servers, Dell Desktops
Processor Everything from Turion to 13900kf
Motherboard MSI - they own the OEM market
Cooling Air on laptops, lots of air on servers, AIO on desktops
Memory I think one of the laptops is 2GB, to 64GB on gamer, to 128GB on ZFS Filer
Video Card(s) A pile up to my knee, with a RTX 4090 teetering on top
Storage Rust in the closet, solid state everywhere else
Display(s) Laptop crap, LG UltraGear of various vintages
Case OEM and a 42U rack
Audio Device(s) Headphones
Power Supply Whole home UPS w/Generac Standby Generator
Software ZFS, UniFi Network Application, Entra, AWS IoT Core, Splunk
Benchmark Scores 1.21 GigaBungholioMarks
Oh right. Good point. It's 1-bit sign, 3-bit exponent, 0-bit mantissa. So the "multipliers" are actually 3-bit adders.

Thanks for making me look it up. It's even worse than I expected. ("Multiplication" of exponents simply becomes addition: 2^4 * 2^3 == 2^7.)
If that's what you got out of the IBM research paper, then you're too blinded by hate to understand. Upset that your gaming GPUs are expensive because they found other uses for them, I presume.
 
Joined
Apr 24, 2020
Messages
2,780 (1.61/day)
If that's what you got out of the IBM research paper, then you're too blinded by hate to understand. Upset that your gaming GPUs are expensive because they found other uses for them, I presume.

Dude, it's elementary to design a 1-bit sign / 3-bit exponent / 0-bit mantissa multiplier. Seriously, look at the damn format and think about how the circuit would be made. This is stuff a 2nd-year computer engineer can do.

If you aren't into computer engineering, then I promise you, it's not very complex. The hard stuff is the larger multiplier circuits (aka Wallace trees and whatnot), which are closer to 4th-year or even master's-degree level in my experience.

Think like an NVidia or AMD engineer. Think about how to wire adders together so that they make a multiplication circuit. Some of these things are absurdly easy to do.
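If anyone wants the flavor of it, here's a rough software sketch of the shift-and-add idea (my own illustration, not any vendor's actual design):

```python
# Software sketch of a small unsigned array multiplier: each set bit of b
# contributes a shifted copy of a (a partial product), and a row of adders
# sums them. In hardware, more bits means quadratically more adder cells.
def mul_via_adders(a: int, b: int, bits: int = 4) -> int:
    acc = 0
    for i in range(bits):
        if (b >> i) & 1:       # partial product: a shifted left by i
            acc += a << i      # one adder row per partial product
    return acc

assert all(mul_via_adders(a, b) == a * b for a in range(16) for b in range(16))
```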
 
Joined
Jul 4, 2023
Messages
52 (0.09/day)
Location
You wish
Dude, it's elementary to design a 1-bit sign / 3-bit exponent / 0-bit mantissa multiplier. Seriously, look at the damn format and think about how the circuit would be made. This is stuff a 2nd-year computer engineer can do.

If you aren't into computer engineering, then I promise you, it's not very complex. The hard stuff is the larger multiplier circuits (aka Wallace trees and whatnot), which are closer to 4th-year or even master's-degree level in my experience.

Think like an NVidia or AMD engineer. Think about how to wire adders together so that they make a multiplication circuit. Some of these things are absurdly easy to do.
But it's FP4! I'm gonna repeat the catchword over and over again and then I'll get it.
If that's what you got out of the IBM research paper, then you're too blinded by hate to understand. Upset that your gaming GPUs are expensive because they found other uses for them, I presume.
And are you some kind of GPU Jedi, or how did you remotely get from FP4 to this? Back to the ocean! Now!
 