Tuesday, June 11th 2024
Possible Specs of NVIDIA GeForce "Blackwell" GPU Lineup Leaked
Possible specifications of the various NVIDIA GeForce "Blackwell" gaming GPUs were leaked to the web by Kopite7kimi, a source with a reliable track record for NVIDIA leaks. These are the specs of the maxed-out silicon; NVIDIA will carve out several GeForce RTX 50-series SKUs from these chips, which could end up with lower shader counts than those shown here. We've known from older reports that there will be five chips in all: the GB202 is the largest, followed by the GB203, the GB205, the GB206, and the GB207. Notably absent is a successor to the AD104, GA104, and TU104, because NVIDIA is taking a slightly different approach to the performance segment this generation.
The GB202 is the halo-segment chip that will drive the possible RTX 5090 (the RTX 4090 successor). This chip is endowed with 192 streaming multiprocessors (SM), or 96 texture processing clusters (TPCs). These 96 TPCs are spread across 12 graphics processing clusters (GPCs), each with 8 of them. Assuming "Blackwell" keeps the 256 CUDA cores per TPC that the past several generations of NVIDIA gaming GPUs have had, we end up with a total CUDA core count of 24,576. Another interesting aspect of this mega-chip is memory. The GPU implements next-generation GDDR7 memory over a mammoth 512-bit memory bus. Assuming the 28 Gbps memory speed rumored for NVIDIA's "Blackwell" generation, this chip has 1,792 GB/s of memory bandwidth on tap!
The GB203 is the next chip in the series, and poised to be the successor in name to the current AD103. It keeps shader counts nearly flat generationally (84 SM versus the AD103's 80), counting on the architecture and clock speeds to come through for performance, while retaining the 256-bit bus width of the AD103. The net result could be a chip of similar size to the AD103, with better performance. The GB203 is endowed with 10,752 CUDA cores, spread across 84 SM (42 TPCs). The chip has 7 GPCs, each with 6 TPCs. The memory bus, as we mentioned, is 256-bit, and at a memory speed of 28 Gbps would yield 896 GB/s of bandwidth.
The GB205 will power the lower half of the performance segment in the GeForce "Blackwell" generation. This chip has a rather surprising CUDA core count of just 6,400, spread across 50 SM, arranged in 5 GPCs of 5 TPCs each. The memory bus width is 192-bit; at 28 Gbps, this works out to 672 GB/s of memory bandwidth.
The GB206 drives the mid-range of the series. This chip is endowed with 4,608 CUDA cores, spread across 36 SM (18 TPCs). The 18 TPCs span 3 GPCs of 6 TPCs each. Besides the smaller shader count, the GB206 narrows the memory bus to 128-bit. With the same 28 Gbps memory speed being used here, such a chip would end up with 448 GB/s of memory bandwidth.
At the entry level there is the GB207, a significantly smaller chip with just 2,560 CUDA cores across 20 SM (10 TPCs), spanning two GPCs of 5 TPCs each. The memory bus width is unchanged at 128-bit, but the memory type is older-generation GDDR6. Assuming NVIDIA uses 18 Gbps memory speeds, the chip ends up with 288 GB/s on tap.
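Since all of these figures follow the same arithmetic, here is a minimal Python sketch that derives the numbers above from the leaked GPC and TPC configurations. The 256 CUDA cores per TPC and the 28 Gbps GDDR7 / 18 Gbps GDDR6 speeds are the assumptions discussed in this article, not confirmed specifications.

```python
# Derive CUDA core counts and memory bandwidth for the leaked "Blackwell"
# lineup. Assumes 256 CUDA cores per TPC (128 per SM, 2 SM per TPC) and
# the rumored memory speeds; none of this is confirmed by NVIDIA.
CORES_PER_TPC = 256

# chip: (GPCs, TPCs per GPC, memory bus width in bits, memory speed in Gbps)
chips = {
    "GB202": (12, 8, 512, 28),
    "GB203": (7, 6, 256, 28),
    "GB205": (5, 5, 192, 28),
    "GB206": (3, 6, 128, 28),
    "GB207": (2, 5, 128, 18),  # GDDR6 instead of GDDR7
}

for name, (gpcs, tpcs_per_gpc, bus_bits, gbps) in chips.items():
    tpcs = gpcs * tpcs_per_gpc
    cores = tpcs * CORES_PER_TPC
    # Each pin moves `gbps` gigabits per second, so divide the bus width
    # by 8 to get bytes: bandwidth (GB/s) = (bus bits / 8) * Gbps.
    bandwidth = bus_bits // 8 * gbps
    print(f"{name}: {tpcs * 2} SM, {cores:,} CUDA cores, {bandwidth:,} GB/s")
```

Running it reproduces the SM, core-count, and bandwidth figures quoted above.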
NVIDIA is expected to double down on large on-die caches across all of these chips to cushion their memory subsystems. We expect several other innovations in the areas of ray tracing performance, AI acceleration, and features exclusive to the architecture. The company is expected to debut the series sometime in Q4 2024.
Source: kopite7kimi (Twitter)
141 Comments on Possible Specs of NVIDIA GeForce "Blackwell" GPU Lineup Leaked
Let's make it clear here: AMD is staring down the barrel regarding GPUs. The last 7 quarters are their worst since Jon Peddie Research started tracking this metric a decade ago; they had never dropped under 18% until Q3 2022, and with the upcoming Blackwell launch and nothing new from AMD, we can expect NVIDIA to breach 90% of the desktop GPU market. That is annihilation territory for AMD GPUs: territory where they consider exiting the desktop consumer market and concentrating on consoles only, and territory where your company should be pulling out all the stops to recover. Yet what is AMD doing in response? Literally nothing.
And it all compounds. If NVIDIA believes they're going to outsell AMD by 9:1, NVIDIA is going to book 9x as much capacity at TSMC, which gives them a much larger volume discount than AMD will get, which means AMD's GPUs cost more. AIBs will have the same issue with all the other components they use, like memory chips, PCBs, and so on. Once you start losing economies of scale and the associated discounts, you get into an even worse position when it comes to being able to manipulate your prices to compete.
Is AMD staring down the barrel? Is this really worse here than the years they were getting by on very low cashflow/margin products, pre-Ryzen? Are we really thinking they will destroy the one division that makes them a unique, synergistic player in the market?
There are a few indicators of markets moving.
- APUs are getting strong enough to run games properly, as gaming requirements are actually plateauing; you said it yourself, that 4060 Ti can even run 1440p. Does the PC market truly need discrete GPUs for a large segment of its gaming soon? A key driver here is also the PC handheld market, which AMD has captured admirably and IS devoting resources to.
- Their custom chip business line floats entirely on the presence and continued development of RDNA
- Their console business floats on continued development of RDNA - notably, sub high end, as those are the chips consoles want
- The endgame in PC gaming still floats on console ports before PC-first games at this point, and with more cloud-based play and unification between platforms, that won't get less pronounced; it will get more so.
- AI will always move fastest on GPUs, another huge driver to keep RDNA.
Where is heavy RT in this outlook, I wonder? I'm not seeing it. So Nvidia will command its little mountain of 'RT aficionados' on the PC, a dwindling discrete PC gaming market with a high cost of entry, and I think AMD will be fine selling vastly reduced numbers of GPUs in that discrete segment, because it's just easy money alongside their other strategic business lines.
This whole thing isn't new and hasn't changed since, what, the first PS4.
AMD is fine, and I can totally see why they aren't moving. It would only introduce more risk for questionable gains; they can't just conjure up the technology to 'beat Nvidia', can they? Nvidia beats them at better integration of software and hardware.
Still, I see your other points about them and I understand why people are worried. But this isn't new to AMD. It's the story of their life, and they're still here, and their share price gained 400% over the last five years.
If they're going to keep it on a 128-bit bus, GDDR7 is maybe going to turn it into a 1440p card. At 448 GB/s it's still 12% less bandwidth on paper than a vanilla 4070, which is okay at 1440p, but that's with lower-latency GDDR6X. I'm not 100% sure you can just compare bandwidth between GDDR6 and GDDR7 because latency will have doubled, clock for clock - which means (only a guess here) that the 5060 Ti will have ~88% the bandwidth of a 4070 but ~50% higher latency. That's going to make it considerably handicapped compared to a 4070 overall, so I guess the rest of it is down to how well they've mitigated that shortcoming with better cache, more cache, and hopefully some lessons learned from the pointlessness of the 4060 Ti.
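For reference, that percentage checks out if you assume the stock RTX 4070 figure of 21 Gbps GDDR6X on a 192-bit bus; a quick sketch, nothing more:

```python
# Compare the rumored 128-bit GDDR7 card against a stock RTX 4070.
gb206_bw = 128 // 8 * 28    # 28 Gbps GDDR7 on 128-bit -> 448 GB/s
rtx4070_bw = 192 // 8 * 21  # 21 Gbps GDDR6X on 192-bit -> 504 GB/s
print(f"{gb206_bw} vs {rtx4070_bw} GB/s: "
      f"{gb206_bw / rtx4070_bw:.1%} of the 4070's bandwidth")
```

This prints 448 vs 504 GB/s, i.e. about 88.9% of the 4070's paper bandwidth; the latency claim above is the commenter's guess and isn't captured by this arithmetic.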
(The 4070 Ti Super is arguably a 4080 lite, a different tier than the 4070 Ti.) I'm hoping two generations plus the same tier, or 1-2 tiers up (5090/5090 Ti?), is enough to double performance.
Fingers crossed lol. If I do go 5090/Ti, I'll likely keep it three generations to recoup the extra cost. Maybe. Still, I think xx60-class cards will be native 1080p / DLSS 1440p for at least this next gen.
Important to bear in mind that 1080p on PC, or 1440p DLSS, arguably looks better than "native" 4K on console, which is realistically the competition at the entry level.
Native in quotes because consoles typically vary resolution and make heavy use of mediocre upscaling when playing at 4K, that or have a 30 FPS frame target, which is pathetic.
Should be:
5090 - 512-bit 32GB <-- Needed for 4K Max settings in all games with 64GB being overkill.
5080 - 384-bit 24GB <-- 16GB is too little for something that will be around the power of a 4090.
5070 - 256-bit 16GB <-- Sweet spot for mid range.
5060 Ti - 192-bit 12GB <-- Would sell really well.
5060 - 128-bit 8GB <-- 8GB is fine if priced right...
And for the people slating AMD I had the ASUS 7900XTX TUF Gaming OC and it was incredible! Sure the street lights would flicker when I was 4K gaming but hey ho...
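The capacities in that wishlist follow directly from bus width: GDDR6/GDDR7 chips use a 32-bit interface, so with today's common 2 GB modules, capacity works out to (bus width / 32) x 2 GB. A quick sketch; the bus-to-tier pairings are the commenter's wish, not a leak:

```python
# VRAM implied by bus width, assuming one 2 GB GDDR module per 32-bit
# memory channel (clamshell boards can double this).
MODULE_GB = 2
wishlist = [("5090", 512), ("5080", 384), ("5070", 256),
            ("5060 Ti", 192), ("5060", 128)]
for card, bus_bits in wishlist:
    channels = bus_bits // 32
    print(f"{card}: {bus_bits}-bit -> {channels * MODULE_GB} GB")
```

Running it yields exactly the 32/24/16/12/8 GB progression listed above.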
2560x1080 (21:9) is somewhere between 1080p and 1440p based on my own testing over the years, and most of the time I'm running out of raw GPU raster performance first when I crank up the settings at this resolution, so I wouldn't exactly mind 12 GB of VRAM either, but 16 is welcome if it's not too overpriced. I'm also a constant user of DLSS whenever it's available in a game, so that helps.
Tbh, if the ~mid-range 5000 series fails to deliver in my budget range, then I will just pick up a second-hand 4070 Super and call it a day. Plenty enough for my needs.
Maybe a combination of refinements to the cache that they got wrong with Ada and the switch to GDDR7 will be enough. As always, it'll really just come down to what they're charging for it - the 4060 Ti 16 GB would have been a fantastic $349 GPU, but that's not what we got... If the major benefits of the 50-series are for AI, the 40-series will remain perfectly good for this generation of games.
Which is so laughable, considering AMD has no problem competing with Nvidia's offerings outside of the RTX 4090:
The RX 7900 XTX trades blows with the RTX 4080 Super, mostly edging it out
The RX 7900 XT beats the RTX 4070 Ti Super
The RX 7900 GRE beats the RTX 4070 Super
The RX 7800 XT beats the RTX 4070
etc....
All while offering much better prices
- RT
- DLSS
- The internet myth that AMD has fundamental driver problems and Nvidia doesn't
Outside of those three things, the GPU market looks very even and competitive, with AMD doing slightly better in performance and price, as you pointed out. But even if all three of my points above didn't exist, these loyalists would still buy Nvidia. But I appreciate you and everyone else doing what they can to prevent the blind fealty to one company that threatens to ruin the DIY PC building market we love so much.
The 5090 probably won't be either. These 100% enabled die numbers aren't representative of consumer cards, but of Quadro ones.
DLSS is ok, but so is FSR
And yea, I hear that a lot. Which is funny, because I've used AMD since the HD 4000 days and haven't had driver issues since Hawaii, which was quite some time ago.
And if anyone thinks AMD are innocent in all this, don’t forget, they launched their 7900XTX at $1,000. So they aren’t gonna save you either.
Why sell 90-100% enabled dies to consumers when you can sell them for 2-3x the price as Quadro cards anyway?
The real facts are that no matter what AMD has done in the past, their PC discrete share is dropping. They're just not consistent enough, and this echoes in consumer sentiment. It's also clear they've adopted a different strategy and have been betting on different horses for quite a while now.
There is nothing new here with RDNA3 or RDNA4 in terms of market movement. Granted - RDNA3 didn't turn out as expected, but what if it did score higher on raster? Would that change the world?
Thanks for demonstrating exactly the same failure of understanding that I documented for AMD's marketing department in my post.
2015: 18%
2019: 18.8%
2020H2: 18%
2022: 10%
2023Q4: 19%
They've been 'rock bottom' many times before. And if you draw a line over this graph, isn't this just the continuation of the trend of the last decade?
Oh? I must have missed that statement after Cyberpunk ran at sub 30 FPS on a 4090.
I think it mostly sucks for people who expect Path Tracing to be the norm. They're gonna be waiting and getting disappointed for a loooong time. Game graphics haven't stopped moving forward despite Path Tracing. Gonna be fun :)
Please stand by.
What is it with that random line that isn't even a real line? Did you just fail at drawing a straight line from the first to the last shown quarter, or did you connect random quarters on purpose?