NVIDIA GeForce RTX 50 Series "Blackwell" Features Similar L1/L2 Cache Architecture to RTX 40 Series

AleksandarK · Jan 17, 2025

NVIDIA's upcoming RTX 5090 and 5080 graphics cards keep an L1 cache architecture similar to their predecessors' while making modest changes to L2 cache capacity, according to specifications reported by HardwareLuxx. The flagship RTX 5090 has the same 128 KB of L1 cache per SM as the RTX 4090 but reaches a higher total of 21.7 MB thanks to its increased SM count of 170. That is a notable improvement over the 16.3 MB total of the RTX 4090, which has 128 SMs. For L2 cache, the RTX 5090 sees a 33.3% increase over its predecessor, with 96 MB compared to the RTX 4090's 72 MB. The SM count rises by 32.8%, so L2 capacity per SM is almost unchanged.
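The totals above follow from simple multiplication. The sketch below reproduces them from the SM counts and per-SM L1 size quoted in the article; the function names are illustrative, and it assumes decimal megabytes (1 MB = 1000 KB), which is the convention that lands on the article's figures once truncated to one decimal place.

```python
# Figures taken from the article: 128 KB of L1 per SM,
# 170 SMs (RTX 5090) vs 128 SMs (RTX 4090), 96 MB vs 72 MB of L2.
L1_PER_SM_KB = 128


def total_l1_mb(sm_count, l1_per_sm_kb=L1_PER_SM_KB):
    """Total L1 in decimal megabytes (assumption: 1 MB = 1000 KB)."""
    return sm_count * l1_per_sm_kb / 1000


def percent_increase(new, old):
    """Relative increase of new over old, in percent."""
    return (new - old) / old * 100


print(f"RTX 5090 total L1: {total_l1_mb(170):.2f} MB")
print(f"RTX 4090 total L1: {total_l1_mb(128):.2f} MB")
print(f"L2 increase:       {percent_increase(96, 72):.1f}%")
print(f"SM increase:       {percent_increase(170, 128):.1f}%")
```

The unrounded totals are 21.76 MB and 16.384 MB, so the 21.7 MB and 16.3 MB in the text appear to be truncated rather than rounded.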

However, this improvement is relatively modest compared to the previous generation's leap, where the RTX 4090 featured twelve times as much L2 cache as the RTX 3090. The RTX 5080 shows more conservative improvements, with its total L1 cache capacity exceeding its predecessor's by only 1 MB (10.7 MB vs 9.7 MB). Its L2 cache stays at 64 MB, matching the RTX 4080 and 4080 Super. To compensate for these incremental cache improvements, NVIDIA is implementing faster GDDR7 memory across the RTX 50 series. Most models will feature 28 Gbps modules, with the RTX 5080 receiving special treatment with 30 Gbps memory. Additionally, some models are getting wider memory buses, with the RTX 5090 featuring a 512-bit bus and the RTX 5070 Ti upgrading to a 256-bit interface.
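To put the memory figures in perspective, peak bandwidth can be estimated as bus width times per-pin data rate, divided by eight to convert bits to bytes. This is a rough sketch with an illustrative function name. It assumes the RTX 5090 and RTX 5070 Ti are among the "most models" using 28 Gbps modules, which the article does not state explicitly for each card.

```python
def bandwidth_gb_per_s(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# Bus widths are from the article; the 28 Gbps rate per card is an assumption.
rtx_5090 = bandwidth_gb_per_s(512, 28)
rtx_5070_ti = bandwidth_gb_per_s(256, 28)

print(f"RTX 5090:    {rtx_5090:.0f} GB/s")
print(f"RTX 5070 Ti: {rtx_5070_ti:.0f} GB/s")
```

These are theoretical peaks; sustained bandwidth in real workloads will be lower.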

kondamin · Jan 17, 2025

I wonder why they still haven’t done what AMD just did with Zen 5:
put a slab of L3 under or above the die, reduce the L2, and replace it with more logic.

AleksandarK · Jan 17, 2025

kondamin said:
I wonder why they still haven’t done what AMD just did with Zen 5:
put a slab of L3 under or above the die, reduce the L2, and replace it with more logic.

Yeah, I was thinking that as well. But current packaging methods only allow lower-TDP logic below cache stacks. You can't easily cool a 600 W TDP GPU with a cache layer on top. NVIDIA has already looked into this (see here).

N/A · Jan 17, 2025

In RDNA, the L3 level is divided between the MCD chiplets, and each slice is confined to, and accessed only by, the section it belongs to.
L2 sits in the center, accessed by everything and interfacing with the memory.
Besides, didn't the 5090 have 88 MB of L2 overall? Where is the whitepaper showing this?

starfals · Jan 17, 2025

The 5080 is the biggest fail I have ever seen. The only worse cards in the last 10 years have got to be the 4060 or the 4080 Super... because that Super doesn't do anything, and the 4060... yeah, you know why that one is bad. So far, Nvidia is failing gen after gen. I wonder what will happen in 2027.

londiste · Jan 17, 2025

I have been wondering for a while: isn't Nvidia's L2 cache tied to some other architectural component? Memory controllers or ROPs (or shader arrays) would seem like a logical choice, but none actually match the amounts in the specs.

Daven · Jan 17, 2025

Blackwell is not going to be much faster than Ada unless a game uses a lot of RT and AI tech and you have DLSS enabled.

GeForce RTX 5090D reviewer says "this generation hardware improvements aren't massive" - VideoCardz.com

Notes from GeForce RTX 5090 reviewer make it online Another Chiphell rumor today, but this time about the GeForce series. The forums are particularly active ahead of the GeForce RTX 50 launch, as they attract members from various Chinese tech media. GeForce graphics cards are dominant in the...

videocardz.com

NVIDIA GeForce RTX 5090 appears in first Geekbench OpenCL & Vulkan leaks - VideoCardz.com

GeForce RTX 5090 tested in Geekbench OpenCL/Vulkan test And so begin the RTX 50 performance leaks. And of course, they involve Geekbench. One thing is certain: NVIDIA’s DLSS technology has no impact here. If you’re interested in raw performance improvements over the RTX 4090 AD102 GPU, the...

videocardz.com

I’m going to stick to the 1000’s of games of yesteryear that I still haven’t played. I can run them at 4K/Ultra/120 fps with my 7900XT.

Prima.Vera · Jan 17, 2025

Long gone are the times when an x070 GPU was as fast as, or faster than, the older (x-1)090 GPU in raw speed.
Now it's all about low-quality upscaling and fake frame generation... for double the price.
You gotta love those callous and greedy megacorporations.

tpuuser256 · Jan 17, 2025

AleksandarK said:
Yeah, I was thinking that as well. But current packaging methods only allow lower-TDP logic below cache stacks. You can't easily cool a 600 W TDP GPU with a cache layer on top. NVIDIA has already looked into this (see here).

IIRC the 9000X3D series is using cache below the logic, allowing the heat to be drawn out without going through the cache. The same could be used for NVIDIA GPUs; they are just holding back because there is no real need for more performance/competitiveness currently.
They are holding back a lot, actually.

TheinsanegamerN · Jan 17, 2025

tpuuser256 said:
IIRC the 9000X3D series is using cache below the logic, allowing the heat to be drawn out without going through the cache. The same could be used for NVIDIA GPUs; they are just holding back because there is no real need for more performance/competitiveness currently.
They are holding back a lot, actually.

AMD's method also doesn't scale very well with larger dies, which is why we haven't seen an APU with the big iGPU using X3D cache yet. Apparently, the bigger the die, the harder it is to pull off without breaking something.

Now scale that to the monster that is the 5090.

TechBuyingHavoc · Jan 17, 2025

Daven said:
Blackwell is not going to be much faster than Ada unless a game uses a lot of RT and AI tech and you have DLSS enabled.

I’m going to stick to the 1000’s of games of yesteryear that I still haven’t played. I can run them at 4K/Ultra/120 fps with my 7900XT.

The 7900XT is going to age very nicely, tons of rasterization performance, plenty of VRAM, and solid drivers, all at 300W power usage. As for games, I am plenty busy slowly churning through BG3 and Helldivers 2 alone.

Punkenjoy · Jan 17, 2025

N/A said:
In RDNA, the L3 level is divided between the MCD chiplets, and each slice is confined to, and accessed only by, the section it belongs to.
L2 sits in the center, accessed by everything and interfacing with the memory.
Besides, didn't the 5090 have 88 MB of L2 overall? Where is the whitepaper showing this?

The MALL (Memory Attached Last Level cache) on RDNA 3, aka Infinity Cache, is indeed tied to each memory region that it's connected to. But that is nothing new. It was the same on RDNA 2 and RDNA 3.5.

If RDNA uses a MALL cache, that would also be the case. The thing is, it's not as big a deal as you seem to say. Firstly, a cache doesn't cache data as such but memory lines. The data would naturally be spread across all memory controllers in order to benefit from the whole available bandwidth. That would also mean the cache load would naturally be spread across all of those chiplets.

N/A · Jan 17, 2025

Punkenjoy said:
The MALL (Memory Attached Last Level cache) on RDNA 3, aka Infinity Cache, is indeed tied to each memory region that it's connected to. But that is nothing new. It was the same on RDNA 2 and RDNA 3.5.

If RDNA uses a MALL cache, that would also be the case. The thing is, it's not as big a deal as you seem to say. Firstly, a cache doesn't cache data as such but memory lines.

You're right, it's continuous, and the memory lines that could benefit are mostly the front and back buffers, which at 4K come to 64 MB. So anything with less than 64 MB, the 5070 Ti included, could be out of luck here, as I can imagine.

Rightness_1 · Jan 19, 2025

It's the same chip. Just bolted on some extra cores and enabled FP4/INT4 on the consumer chips.

Condosaurus · Jan 20, 2025

N/A said:
You're right, it's continuous, and the memory lines that could benefit are mostly the front and back buffers, which at 4K come to 64 MB. So anything with less than 64 MB, the 5070 Ti included, could be out of luck here, as I can imagine.

Can you ELI5 this? I tried googling some of those terms, but I still don't understand how 64 MB of L2 cache would benefit 4K gaming specifically beyond simply "more cache means more hits".
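For reference, the 64 MB figure in the quoted post can be roughly reproduced with a back-of-the-envelope calculation. This is only a sketch under stated assumptions: a 3840x2160 render target, 4 bytes per pixel (32-bit color), and exactly two buffers (front and back). Real games use additional render targets and other formats, so treat it as an illustration of the arithmetic, not a statement about how any GPU actually allocates its L2.

```python
def framebuffer_bytes(width, height, bytes_per_pixel=4):
    """Size of one uncompressed color buffer (assumption: 32-bit color)."""
    return width * height * bytes_per_pixel


MIB = 1024 * 1024

one_buffer = framebuffer_bytes(3840, 2160)  # a single 4K buffer
front_and_back = 2 * one_buffer             # assumption: double buffering

print(f"One 4K buffer:  {one_buffer / MIB:.1f} MiB")
print(f"Front and back: {front_and_back / MIB:.1f} MiB")
```

Under these assumptions the two buffers together come to just under 64 MiB, which is where a "fits in 64 MB of cache at 4K" argument would come from.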
