Tuesday, March 19th 2024
NVIDIA "Blackwell" GeForce RTX to Feature Same 5nm-based TSMC 4N Foundry Node as GB100 AI GPU
Following Monday's blockbuster announcements of the "Blackwell" architecture and NVIDIA's B100, B200, and GB200 AI GPUs, all eyes are now on its client graphics derivatives, the GeForce RTX GPUs that implement "Blackwell" as a graphics architecture. Leading the effort will be the new GB202 ASIC, a successor to the AD102 powering the current RTX 4090. This will be NVIDIA's biggest GPU with raster graphics and ray tracing capabilities. The GB202 is rumored to be followed by the GB203 in the premium segment, the GB205 a notch lower, and the GB206 further down the stack. Kopite7kimi, a reliable source of NVIDIA leaks, says that the GB202 silicon will be built on the same TSMC 4N foundry node as the GB100.
TSMC 4N is a derivative of the company's mainline N4P node; the "N" in 4N stands for NVIDIA. This is a nodelet that TSMC designed and optimized for NVIDIA SoCs. TSMC still considers 4N a derivative of its 5 nm EUV node. There is very little public information on the power and transistor-density improvements of TSMC 4N over TSMC N5. For reference, the N4P, which TSMC also regards as a 5 nm derivative, offers a 6% transistor-density improvement and a 22% power-efficiency improvement. In related news, Kopite7kimi says that with "Blackwell," NVIDIA is focusing on enlarging the L1 caches of the streaming multiprocessors (SM), which suggests a design focus on increasing performance at the SM level.
Sources:
Kopite7kimi (Twitter), #2, VideoCardz
60 Comments on NVIDIA "Blackwell" GeForce RTX to Feature Same 5nm-based TSMC 4N Foundry Node as GB100 AI GPU
Price-wise, they should be cheaper than Ada if the tick-tock pattern holds (cheap Pascal, expensive Turing, cheap Ampere, expensive Ada...)
What I don't want to see is bullshXt software tweaks, just for the charts.
Ok, they may introduce a DLSS 4+, available only on the 5000 series, but I don't want the performance jump to come from that alone.
- Increase the power limit to just a hair under 600 W so that clocks don't decrease
- Keep the power limit the same as the 4090, but increase die size to 2080 Ti proportions. This would allow more cores, perhaps up to the 192 in the rumours
Another option is to use Chip-On-Wafer-On-Substrate-L (CoWoS-L), which is used for GB200, but that seems unlikely for a gaming GPU.

Four figures for an 80-series card is still too high IMO; you shouldn't break that barrier until the 90-series.
In other words, AMD would have to specifically forget to include 3nm capacity for its GPUs in order for them to not get any allocation (assuming they are going 3nm, of course), which would be the highest level of incompetence possible, given that securing wafers is essential to operating a silicon design business. A company like AMD does not simply forget to secure wafers for its products and then ignore that deficiency over the multi-year design period of a product.
What could be a problem for AMD, though, is if they didn't purchase enough allocation in advance, as in their sales exceeding their expectations. In that instance they would be competing with other companies like Apple for any unearmarked capacity TSMC has. There's also the possibility that TSMC could not meet AMD's required 3nm wafer allotment, which would be another reason AMD could not go 3nm, although I doubt this potentiality given that TSMC's capacity has increased while demand has simultaneously decreased.
That’s how it usually works, yes, but Apple is a special case in terms of their relationship with TSMC. They are a VVIP customer and they get first dibs on any new node, no ifs or buts. Everyone else has to fight for scraps, and if there ARE no scraps… tough. The fact that NV didn’t manage to get any, or elected not to, speaks volumes to me. Even if AMD does get some allocation, there is absolutely no way they would have decided to spend it on the GPU part of the business and not what actually makes them money - CPUs and/or CDNA HPC accelerators. As such, while it’s theoretically possible that we will see a case of NV being on an older node while new AMD GPUs get their chiplets made on 3nm, I just don’t see it. Especially since in the Ada vs. RDNA 3 generation it was AMD who used an older (if only very marginally) node for their GPUs.
In addition, a good part of AMD's investment into GPUs is shared with their CPUs. Any improvements to infinity fabric and additional modularization that come with GPU chiplets undoubtedly help advance the packaging of their CPUs as well. Even if AMD only performs so-so in AI / Gaming GPUs, that technical expertise and knowledge is invaluable to the company as a whole. The MI300 and MI300X are excellent examples of that.
At the end of the day, regardless of how many Vs Apple has in front of its VIP status, AMD and TSMC would have known the wafer allotments years in advance. If TSMC told AMD straight up years back that they couldn't fill a theoretical 3nm allotment for GPUs, that would be a failure on TSMC's part for sure. Again, Apple can eat up all the uncontracted capacity, but a lack of ability for TSMC to build out capacity? I'm not seeing it, given the lowered demand and TSMC's massive investments to do just that: build out capacity.
Again, this all assumes that AMD wanted to go with 3nm for its GPUs, but I think if they wanted to, they could.
There are, however, bad nodes, like Samsung's 8nm. As a result, while Ampere wasn't bad in performance, it was absolutely stellar in TDP. Not in a good sense.
TSMC is targeting 80% yield for 3nm: www.androidheadlines.com/2024/02/tsmc-double-3nm-production-2024.html
They are also looking at doubling capacity with Qualcomm, MediaTek, and others having 3nm chips in the pipe.
A GPU in late 2024 would not be infeasible at an 80% yield rate, particularly when we are talking about a chiplet-based GPU, where the individual dies are smaller and thus fare better on lower-yield nodes. There are no official die-size numbers for the M3 Max, Apple's largest 3nm product, but it has 3.7 times the transistors of the base M3, which has an approximate die size of 150-170 mm². Based on the lowest estimate, a rough guess of the M3 Max die size would be 555 mm². Even if that estimate is significantly off, it's easy to see that if Apple can get high enough yields to make the M3 Max possible, it should definitely be possible to get a 304 mm² GPU die working.

I'm also thinking the rumors that AMD is ceding the high end could be people confusing a smaller GCD size with them giving up the high end. It could be that AMD shrinks the GCD to even smaller than 304 mm² to further increase yields and then simply uses multiples of it in its higher-end products. AMD could keep the MCDs on an older node, given those don't really benefit much, although they are extremely tiny, so they would yield well on newer nodes either way. Using different nodes would also let them leverage more capacity.
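The arithmetic above can be sketched quickly. Note the defect density below is an assumed illustrative value, back-solved so that a hypothetical ~100 mm² mobile die hits the quoted ~80% yield, and the Poisson model is just a common first-order approximation - none of this reflects Apple's or TSMC's actual numbers:

```python
import math

# Die-size estimate from the comment: the M3 Max has ~3.7x the
# transistors of the base M3, whose die is roughly 150-170 mm^2.
# Naive linear scaling (assumes identical density and block mix):
base_m3_mm2 = 150.0                 # low end of the estimate range
m3_max_mm2 = base_m3_mm2 * 3.7
print(f"M3 Max die estimate: {m3_max_mm2:.0f} mm^2")   # 555 mm^2

# First-order Poisson yield model: Y = exp(-A * D0), A in cm^2.
# D0 (defects/cm^2) is an ASSUMED value, chosen so a 100 mm^2
# die yields 80%, matching the quoted 3nm figure.
d0 = -math.log(0.80) / 1.0          # defects per cm^2

def poisson_yield(area_mm2: float, d0: float) -> float:
    """Estimated fraction of defect-free dies for a given area."""
    return math.exp(-(area_mm2 / 100.0) * d0)

for area in (555, 304, 150):
    print(f"{area} mm^2 die -> ~{poisson_yield(area, d0):.0%} yield")
```

Under these assumptions a 304 mm² monolithic die yields roughly half what a 150 mm² die does, which is exactly why smaller chiplets are attractive on a young node.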
I don't know the probability that AMD goes 3nm, without access to both TSMC's numbers and AMD's numbers it's very hard to say.
We have no idea what the yields are like on Apple silicon, true, but Apple did announce that they are buying 50% more 3nm capacity in 2024. I have no idea what this tells us about yields (do they expect more demand, or are the yields still poor even for the demand that is there), but it means they will gobble up a lot of extra capacity. Then there is Intel, who apparently also already booked quite a bit. MediaTek and Qualcomm are in line; I guess they will get what is left. Again, the fact that after all the rumors and leaks it turns out that NV was unable or unwilling to get some for themselves is, IMO, telling. It’s obviously not a question of money, not for NV. Whether the reason is lack of capacity or current 3nm being unsuitable for their needs is a separate question. I do want to note that even the M3 Max is a fairly low-power chip. It very well might be that, until N3E is off the ground, producing high-wattage parts - which desktop GPUs absolutely are - is just off the table. If AMD can design the chiplets in a way that would make them suitable, they MIGHT have a chance to use 3nm, but those parts will be quite limited in how capable they would be, I think. So far all the AMD-focused 3nm rumors were about Zen 5c, another low-power part. I think we will know more in approximately May, when Zen 5 proper is speculated to be revealed. If regular Zen 5 isn’t on 3nm (or at least initially isn’t, shrinks are possible), then I’d say no way in hell is RDNA 4 on 3nm.
It doesn’t really matter in the end. I have no faith that even with a hypothetical node advantage AMD could actually make NV stumble and present themselves as a peer competitor. It would take a miracle at this point. When was the last time AMD could claim to be one, Hawaii, arguably? Or hell, the HD5000 series, where they capitalized on the NV eff-up that was OG Fermi?
What could happen is that GB202 uses 2x GB203. :D That does make a bit of sense, if GB202 is 512-bit and GB203 is 256-bit... But I doubt it.
How much extra performance is there with a 30% density and shader increase? More VRAM bandwidth with GDDR7 also gives some performance, but +70% performance in total like some rumors said???
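For what it's worth, the rumor math is easy to sanity-check. All numbers below come from the rumors quoted above, not from any official spec:

```python
# Back-of-the-envelope check on the rumored +70% uplift.
shader_scale = 1.30     # rumored ~30% more shaders/density
rumored_total = 1.70    # rumored overall performance uplift

# If performance scaled linearly with shader count (it rarely does),
# the remaining factor would have to come from clocks, IPC gains,
# and GDDR7 bandwidth combined:
remaining = rumored_total / shader_scale   # ~1.31x
print(f"extra needed beyond shaders: {remaining:.2f}x (~{remaining - 1:.0%})")
```

So even with perfect shader scaling, the rumor implies another ~30% from clocks, architecture, and memory on the same node family - which is why the +70% figure looks optimistic.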
I won't buy a GPU that uses more power than the 4090; I even undervolted mine to below 300 W...
If the latest rumors are true, Blackwell might be dead for consumers... I doubt many enthusiasts would buy a 5090 at that point for over $1,700...
It would be pretty unfortunate for us gamers if these rumors are really true but it is kind of inevitable that gaming is taking a backseat for now if you look at the sheer numbers. nVidia made more than SIX times more revenue from AI/datacenter than from gaming ($18.404bn vs. $2.865bn). And yet another ~$0.75bn was made from Automotive and Professional Visualization.
nVidia have certainly almost exclusively been focusing on datacenter for quite a while by now and moved all of their top talent (engineering *and* software) to the datacenter segment. That is what makes this rumor pretty plausible. They are probably putting a pretty low effort into the next gaming generation. Compared to the previous gens that had lots of surprises on the feature/software side (DLSS3/FG/Remix etc.) the RTX 5000 series will probably be a disappointment but that's how it goes when the spotlight shifts in such a MASSIVE way...
But hey, people are paying 1000+ $ for the same phones every year, so there is no lack of idiots with more money than common sense...
3bn isn't peanuts
- "RTX 3080 was never really a $700 card; for most of its lifetime it was sold for over $1500, and people (not gamers, though) were buying it!"
Even the original TechPowerUp review told us to compare it to the crypto-inflated RTX 3090 Ti, which officially launched for an insane $2000! And then crypto collapsed.
And when the RTX 5080 hits at close to $2000, we'll hear that we can't compare it to the RTX 4080 SUPER's $1000 MSRP or other discounts, but to the original generation's $1200 launch price - plus two years' worth of inflation, which in this economy is whatever people think it is. So in the view of many, a $2000 card at the start of 2025 will be exactly the same value as a $1200 card in 2022, and they'll have charts to prove it to you! It's not even more expensive!
TSMC 150nm, NV20 to NV25: 44% aggregate increase.
TSMC 130nm, NV38 to NV40: 63% aggregate increase.
TSMC 90nm, G71 to G80: 88.7% aggregate increase.
TSMC 65nm, G92 to GT200: 49.8% aggregate increase.
TSMC 28nm, GK110 to GM200: 49.3% aggregate increase.