Saturday, March 9th 2024
NVIDIA RTX 50-series "GB20X" GPU Memory Interface Details Leak Out
Earlier in the week it was revealed that NVIDIA had distributed next-gen AI GPUs to its most important ecosystem partners and customers—Dell's CEO expressed enthusiasm with his discussion of "Blackwell" B100 and B200 evaluation samples. Team Green's next-gen family of gaming GPUs has received less media attention in early 2024—a mid-February TPU report pointed to a rumored PCIe 6.0 CEM specification for upcoming RTX 50-series cards, but leaks have become uncommon since late last year. Top technology tipster kopite7kimi has broken the relative silence on Blackwell's gaming configurations—an early-hours tweet posits a slightly underwhelming scenario: "although I still have fantasies about 512 bit, the memory interface configuration of GB20x is not much different from that of AD10x."
Past disclosures have hinted about next-gen NVIDIA gaming GPUs sporting memory interface configurations comparable to the current crop of "Ada Lovelace" models. The latest batch of insider information suggests that Team Green's next flagship GeForce RTX GPU—GB202—will stick with a 384-bit memory bus. The beefiest current-gen GPU AD102—as featured in GeForce RTX 4090 graphics cards—is specced with a 384-bit interface. A significant upgrade for GeForce RTX 50xx cards could arrive with a step-up to next-gen GDDR7 memory—kopite7kimi reckons that top GPU designers will stick with 16 Gbit memory chip densities (2 GB). JEDEC officially announced its "GDDR7 Graphics Memory Standard" a couple of days ago. VideoCardz has kindly assembled the latest batch of insider info into a cross-generation comparison table (see below).
Sources:
kopite7kimi Tweet, VideoCardz, Tom's Hardware, Wccftech, Tweak Town
41 Comments on NVIDIA RTX 50-series "GB20X" GPU Memory Interface Details Leak Out
• Third world countries. Life there doesn't guarantee you the means for buying semi-decent computers, let alone next-gen hardware.
• Priorities. Not everyone treats their PC as their best toy. Many people only use their PC on sick leave or during extremely unfavourable weather.
• Health conditions. Some people have vision disorders which make 1080p to 2160p upgrade pointless.
• Just being broke. A dude wants to buy an RTX 4090 and a top-tier display but can't remotely afford that.
And no, "Homeless? Just buy a house" doesn't work.
Frankly, for occasional AI usage that needs more than 24GB on a personal workstation, I think the alternative would be to use a cloud provider and run your dataset in the cloud, if possible. On AWS, I think some regions already offer EC2 instances with Nvidia GPUs.
Regarding games: Games need to run on as many PCs as possible. If a game cannot run @4K on a 5090, I can't even imagine how it runs on the current consoles, and I would really question the developers rather than Nvidia's 24GB choice here.
In summary: If a game released in 2025 needs more than 24GB for 4K/High, I'd rather give my money to developers who actually know how to optimize games. For instance, TLOU Part 1 released in a horrendous state; although they fixed some issues, I'd never give my money for a port of that quality.
But I'm not really worried: the limiting factor for gaming today is the consoles, and they have a shared pool of 16GB, so I really don't see how a computer with 16/32GB of DDR5 and 24GB of VRAM can fail to max out a game.
They don't want batches of consumer-grade GPUs doing AI stuff. Solution: cripple the memory amount and bandwidth.
You all know their idea is just to run the games/computing in the cloud (their service) and stream them to your computer/console (your money).
[[ Add here economical, ecological and other bullshit the-verge-like marketing arguments and some hand-made scarfs. ]]
And this is how you get complete control over market prices: everything as a service.
But if we take Horizon Zero Dawn as an example - no, the assets are clearly not super high quality, and yet 8k makes an immense difference. At 4k the image is (comparatively) rather blurry and there is quite a bit of shimmering; at 8k the image gets completely cleaned up - crystal clear, with no shimmering etc.
As for your friend running a console on an 8k TV - I hope for your sake that you can see why it isn't in any way comparable.
There are problems, but the most glaring one is that no one advertises 4K gaming even today.
You have plenty of great graphics cards for 4K gaming, but no one tells potential users that it's worth it to buy a 4K monitor for $200.
8K can easily work with:
1. Texture compression;
2. Upscaling;
3. A DisplayPort standard faster than DP 2.1 UHBR20's 77.37 Gbit/s throughput.
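On point 3, a back-of-envelope calculation shows why the link is the bottleneck for uncompressed 8K. This is a rough sketch counting raw pixel data only (real links add blanking overhead, so these figures are lower bounds), assuming 10-bit RGB:

```python
# Raw uncompressed video data rate vs. DP 2.1 UHBR20 payload capacity.
# Blanking overhead is ignored, so real requirements are slightly higher.

UHBR20_GBPS = 77.37  # DP 2.1 UHBR20 max payload, Gbit/s

def raw_gbps(width: int, height: int, hz: int,
             bits_per_channel: int = 10, channels: int = 3) -> float:
    """Raw pixel data rate in Gbit/s for an uncompressed RGB signal."""
    return width * height * hz * bits_per_channel * channels / 1e9

for hz in (60, 120):
    need = raw_gbps(7680, 4320, hz)
    verdict = "fits" if need < UHBR20_GBPS else "needs DSC or a faster link"
    print(f"8K{hz} 10-bit: ~{need:.1f} Gbit/s -> {verdict}")
```

Roughly: 8K60 10-bit needs about 60 Gbit/s and squeezes into UHBR20, while 8K120 needs about 119 Gbit/s, which is why high-refresh 8K relies on DSC or a future, faster link.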
When I'm at my parents' home for holidays etc., I can use my mother's laptop and play whatever I want while I'm there.
Also, not everyone is capable of buying a full-fledged PC to play games. But a half-decent laptop is more than enough to stream from the cloud.
GPU prices are high because of demand, not necessarily because they are pushing the industry toward streaming services.
To achieve 24GB, the 3090 used 16Gb memory chips piggy-backed on both sides of the PCB. That could still happen in the first generation.
The 4080 can deliver 1440p144 on average, and if the 5080 is 50% faster, that's still not a 4K144 card. So forget it: 8K144 is impossible before an RTX 9090 on anything other than Fortnite or a 10-year-old game.
GB203 5080: 108 SMs (112 − 4), 14336 − 512 shaders
GB205 5070: 62 SMs (64 − 2), 5888 shaders, a virtual copy of the 3070/4070, keeping alive the trend of zero shaders added and 25% performance gained just by clocking it higher: 3.3 GHz, N3 node, ~200 mm². 12GB again.
And a 7680-shader 5070 Ti.
GB206 5060: 42 SMs (38-40 enabled), Samsung 4 node, with a 16GB option
This graph shows averages at maxed-out settings, but maxed out is overkill and the settings should be dialed down.
That would be interesting.
5060 with 12, 5070 with 18GB, 5080 with 24 and 5090 with 36GB of VRAM.
GDDR7 will also come in 24Gb modules, so that will change the math a bit: probably not on the launch models, other than maybe the 5090/5080, but certainly on the refreshes down the line.
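The "math" here follows from each GDDR chip occupying a 32-bit slice of the memory bus. A minimal sketch, assuming single-sided placement (one chip per 32-bit channel) and the rumored GB20x bus widths, which are speculation, not confirmed specs:

```python
# VRAM capacity from bus width and chip density.
# 16 Gbit chips = 2 GB each; the announced 24 Gbit GDDR7 chips = 3 GB each.

def vram_capacity_gb(bus_width_bits: int, chip_gbit: int) -> int:
    """Total VRAM in GB with one memory chip per 32-bit channel."""
    chips = bus_width_bits // 32
    return chips * chip_gbit // 8  # Gbit -> GB

for bus in (128, 192, 256, 384):
    print(f"{bus}-bit: {vram_capacity_gb(bus, 16)} GB (16Gb chips) / "
          f"{vram_capacity_gb(bus, 24)} GB (24Gb chips)")
```

With 24Gb chips, a 192-bit card jumps from 12 GB to 18 GB and a 384-bit card from 24 GB to 36 GB, which lines up with the 12/18/24/36 GB figures floated above.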
My expectations:
RTX 5050 8GB, 2560 cores (performance of the 4060, also 75W) for $169
RTX 5060 12GB (I hope) 3584 cores for $329
RTX 5060 Ti 16GB 5120 cores for $399
RTX 5070 16GB/18GB 7680 cores for $599
RTX 5080 24GB 12288 cores for $899
RTX 5090 36GB 22528-24576 cores for $1699
Among other reasons I don't know, prices are high because production is more profitable in sectors like workstations and especially big datacenters of all kinds. Sales and distribution are easier and profit per wafer is substantially higher. This includes consoles: one chip every few years, millions of replicas.
If they wanted to produce more RAM, or more NAND, or more GPUs, they could, but they just cartel together to keep prices high. The proof is simple:
Evaluate how performance, technologies, RAM, etc. have evolved in the datacenter market vs. the consumer market.
Another example is consumer NAND, which is getting "faster" but worse in terms of writes/endurance with each iteration, and almost stuck in price.
Another example is the amount of RAM available to consumers: stuck at 16-32 gigs for more than 10 years. --> If you need more, go to [130].
Another example: HBM, ECC, number of PCIe lanes, 10Gbit networking. --> Why do you need it? Go to [130].
Another new trend is E cores and P cores, AVX-512 deprecation, etc. More bullshit. --> Need more P cores? Go to [130].
I'm not saying the consumer market is not improving; I'm saying an "artificial gap"/"deliberate extreme segmentation" is being created between the consumer and the "pro*"/"datacenter" markets.
[130] I say that gap is created by the same companies providing on-cloud services (and a big bunch of freelance programmers and tech gurus [hand-made scarf here]).
And I say the low amount of RAM and the narrow memory buses on GPUs are part of that strategy as well: not BOM costs, not "limited resources", not "consumer needs", not "it's enough for 4K", [put more lame tweet rationalizations here].
It's like any other sector: drugs, Netflix, etc. It starts cheap, you get used to it, substitutes disappear, prices increase, service quality drops, prices increase, you prepare your next business. <-- an MBA reduced to one line.
Do you know parsec btw?
*That's another topic: the crumbling workstation market.