Tuesday, September 13th 2022

NVIDIA AD102 "Ada" Packs Over 75 Billion Transistors

Sep 13th, 2022 21:44

NVIDIA's next-generation AD102 "Ada" GPU is shaping up to be a monstrosity, with a rumored transistor count north of 75 billion. That would be over 2.6 times the 28.3 billion transistors of the current-gen GA102 silicon. NVIDIA is reportedly building the AD102 on the TSMC N5 (5 nm EUV) node, which offers a significant transistor-density uplift over the Samsung 8LPP (8 nm DUV) node on which the GA102 is built. The 8LPP offers 44.56 million transistors per mm² of die area (MTr/mm²), while the N5 offers a whopping 134 MTr/mm², roughly three times as much, which fits in with the transistor-count gain. At that density, 75 billion transistors would put the AD102's die area in the neighborhood of 560 mm². The AD102 is expected to power high-end RTX 40-series SKUs in the RTX 4090-series and RTX 4080-series.
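The arithmetic behind these figures can be checked in a few lines. The following is a rough sketch in Python that uses only the numbers quoted in the paragraph above; the variable and function names are illustrative, and it assumes the entire die reaches the quoted N5 density, which is a simplification.

```python
# Back-of-envelope check of the rumored figures quoted in the article.
# All inputs are quoted or rumored numbers, not measurements.
AD102_TRANSISTORS = 75e9   # rumored lower bound for AD102
GA102_TRANSISTORS = 28.3e9 # GA102 transistor count
DENSITY_8LPP = 44.56e6     # transistors per mm^2, as quoted for Samsung 8LPP
DENSITY_N5 = 134e6         # transistors per mm^2, as quoted for TSMC N5


def die_area_mm2(transistors, density_per_mm2):
    """Die area implied by a transistor count at a uniform density."""
    return transistors / density_per_mm2


ratio = AD102_TRANSISTORS / GA102_TRANSISTORS    # transistor-count gain
density_ratio = DENSITY_N5 / DENSITY_8LPP        # density uplift
ad102_area = die_area_mm2(AD102_TRANSISTORS, DENSITY_N5)

print(f"transistor-count gain: {ratio:.2f}x")
print(f"density uplift:        {density_ratio:.2f}x")
print(f"implied AD102 area:    {ad102_area:.0f} mm^2")
```

The count gain (about 2.65x) is somewhat smaller than the density uplift (about 3x), which is why the implied die area comes out below that of a chip scaled one-to-one.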

Source: kopite7kimi (Twitter)


31 Comments on NVIDIA AD102 "Ada" Packs Over 75 Billion Transistors

wolf

Better Than Native

IF that's true, along with the other very recent rumors, perhaps the 4090 really will be ~2x 3090/Ti-ish

Unregistered

Does anyone have an idea of how many transistors are used/wasted on the tensor cores?

wolf

Better Than Native

Xex360: Does anyone have an idea of how many transistors are used/~~wasted~~ on the tensor cores?

8-10% of die area for RT+Tensor cores in a Turing die, as per this sort of napkin math assessment.

Not sure about Ampere and Ada, but I'd happily take ~10% die space for RT acceleration and Tensor cores over a 10% increase in raster performance.

Zunexxx

wolf: IF that's true, along with the other very recent rumors, perhaps the 4090 really will be ~2x 3090/Ti-ish

It's the same AD102 die, so the transistors are there, but that doesn't mean all are active. The 4090 will be at most 2x the 3090. I went to chh just moments ago; the rumors are about 21k TSE for the 4090 and about 24.5k TSE for the 4090 Ti at stock.

DrCR

Will it cost as much as a used Corolla though?

bobsled

I suspect these will have a price to reflect that huge die size. Yields on 5nm won’t be great this early on, surely

Minus Infinity

Meanwhile AMD's largest core will be in the region of 350mm^2 or so, smaller than last gen 6950XT core. Can't recall transistor count.

Given a 7600XT is said to beat a 6950XT, I easily believe the 7900XT will be 2x the 6900XT, and that's what the inside information has been saying all along. So the 4090 Ti should similarly be 2x the 3090 Ti, again in the ballpark of the leaks.

Crackong

I am sorry but I can't stop laughing when I saw the shape of those

Shou Miko

Crackong: I am sorry but I can't stop laughing when I saw the shape of those

Who said cutting edge was better than rounding edge????? :roll:

#10

Richards

Easily 2.5x performance over a 3090 Ti... triple the transistors

#11

Unregistered

wolf: 8-10% of die area for RT+Tensor cores in a Turing die, as per this sort of napkin math assessment.

Not sure about Ampere and Ada, but I'd happily take ~10% die space for RT acceleration and Tensor cores over a 10% increase in raster performance.

10% is quite a lot of wasted transistors.
But I can understand nVidia's goal of accelerating real work rather than gaming.

#12

pavle

Old news, rumors really.

#13

vimsux

That image is AI generated, lol...

#14

ModEl4

@btarunr
The figures regarding million transistors per mm² die-area (MTr/mm²) that you are using are not comparable, and the result of the comparison is simply wrong.
If I understood correctly, you took something like the GA102 result for 8LPP (44.56 MTr/mm²) and tried to compare it with something like the N5 Apple A14 SoC (134 MTr/mm²).
You will get closer results if you calculate (separately for logic/SRAM/analog etc.) based on foundry tech sites like WikiChip. Their figures differ slightly from official TSMC claims: for example, TSMC claims 2X logic density scaling for N10 vs N16 while WikiChip gives 1.82X, and TSMC claims 1.84X for N5 vs N7 while WikiChip gives 1.87X, though in that case TSMC compared a whole CPU block.
According to WikiChip, Samsung's 8LPP is around 17% denser than TSMC's N10 for logic, and if you compare Apple's 10 nm and 5 nm SoCs the actual scaling is just 2.73X!
Logic scales very differently from caches and analog (e.g. N5 vs N7 logic scaling is 1.84X, SRAM 1.35X and analog only 1.2X!).
So if you take two completely different designs, the comparison will be completely wrong.

Anyway, if the 75b+ figure is true, this means at least 45b+ for AD103.
If the 96MB cache implementation is similar to AMD's Infinity Cache and Nvidia uses 6T SRAM, for example, its transistor count is inconsequential (4.6b transistors + redundancy/overhead).
So comparing 7-GPC designs (AD103 vs GA102, both with 10752 CUDA cores), the transistor increase per GPC is just insane. I wonder what extra features Ada will implement and what DX level it will end up at in the future.
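The 4.6b cache estimate follows from bits times transistors per bit. A minimal sketch in Python, assuming a plain 6T cell with no tags, redundancy or peripheral logic; the function name and the `bytes_per_mb` parameter are illustrative, not from any real tool:

```python
def sram_transistors(capacity_mb, transistors_per_bit=6, bytes_per_mb=1_000_000):
    """Transistors in the raw SRAM bit cells only (no tags, ECC or decoders)."""
    bits = capacity_mb * bytes_per_mb * 8
    return bits * transistors_per_bit


cache = sram_transistors(96)                          # decimal megabytes
cache_binary = sram_transistors(96, bytes_per_mb=1024 ** 2)  # binary megabytes
share = cache / 75e9                                  # fraction of the rumored 75b total

print(f"96 MB of 6T SRAM: {cache / 1e9:.2f}b transistors")
print(f"with binary MB:   {cache_binary / 1e9:.2f}b transistors")
print(f"share of 75b:     {share:.1%}")
```

The 4.6b figure matches decimal megabytes; counting binary megabytes gives about 4.8b. Either way the bit cells alone would be roughly 6% of a 75b total, before redundancy and overhead.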

TSMC logic density scaling by WikiChip:

Apple SOC density scaling example:

SOC	process	Transistors	die size	density
A11	N10	4.3b	87.66mm²	49MTr/mm²
A14	N5	11.8b	88mm²	134MTr/mm²
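The density column and the 2.73X scaling figure can be reproduced from the transistor counts and die sizes in the table. A short Python sketch using only those table values; the dictionary layout and names are illustrative:

```python
# Transistor counts and die sizes as listed in the table above.
socs = {
    "A11": {"process": "N10", "transistors": 4.3e9, "die_mm2": 87.66},
    "A14": {"process": "N5", "transistors": 11.8e9, "die_mm2": 88.0},
}


def density_mtr_per_mm2(transistors, die_mm2):
    """Whole-chip average density in million transistors per mm^2."""
    return transistors / die_mm2 / 1e6


densities = {
    name: density_mtr_per_mm2(soc["transistors"], soc["die_mm2"])
    for name, soc in socs.items()
}
scaling = densities["A14"] / densities["A11"]

for name, value in densities.items():
    print(f"{name} ({socs[name]['process']}): {value:.0f} MTr/mm^2")
print(f"N10 -> N5 whole-chip scaling: {scaling:.2f}x")
```

These are whole-chip averages that mix logic, SRAM and analog, which is the reason two different designs on different nodes cannot be compared directly.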

#15

Denver

It still has fewer transistors than Apple's inefficient aberration.

#16

Tomorrow

wolf: I'd happily take ~10% die space for RT acceleration and Tensor cores over a 10% increase in raster performance.

I would too, but not at this time, because none of the games I regularly play support RT or DLSS. So to me these transistors are useless at the moment.
At some point in the future, when RT perf is actually good and many more games support RT and DLSS, then sure.
I'm running a 2080 Ti right now at 1440p 165Hz.

#17

Richards

ModEl4: @btarunr
The figures regarding million transistors per mm² die-area (MTr/mm²) that you are using are not comparable, and the result of the comparison is simply wrong.
If I understood correctly, you took something like the GA102 result for 8LPP (44.56 MTr/mm²) and tried to compare it with something like the N5 Apple A14 SoC (134 MTr/mm²).
You will get closer results if you calculate (separately for logic/SRAM/analog etc.) based on foundry tech sites like WikiChip. Their figures differ slightly from official TSMC claims: for example, TSMC claims 2X logic density scaling for N10 vs N16 while WikiChip gives 1.82X, and TSMC claims 1.84X for N5 vs N7 while WikiChip gives 1.87X, though in that case TSMC compared a whole CPU block.
According to WikiChip, Samsung's 8LPP is around 17% denser than TSMC's N10 for logic, and if you compare Apple's 10 nm and 5 nm SoCs the actual scaling is just 2.73X!
Logic scales very differently from caches and analog (e.g. N5 vs N7 logic scaling is 1.84X, SRAM 1.35X and analog only 1.2X!).
So if you take two completely different designs, the comparison will be completely wrong.

Anyway, if the 75b+ figure is true, this means at least 45b+ for AD103.
If the 96MB cache implementation is similar to AMD's Infinity Cache and Nvidia uses 6T SRAM, for example, its transistor count is inconsequential (4.6b transistors + redundancy/overhead).
So comparing 7-GPC designs (AD103 vs GA102, both with 10752 CUDA cores), the transistor increase per GPC is just insane. I wonder what extra features Ada will implement and what DX level it will end up at in the future.

TSMC logic density scaling by WikiChip:

Apple SOC density scaling example:

SOC	process	Transistors	die size	density
A11	N10	4.3b	87.66mm²	49MTr/mm²
A14	N5	11.8b	88mm²	134MTr/mm²

They must be using the high density library cells for the cache and high performance cells for the cores

#18

TheoneandonlyMrK

wolf: IF that's true, along with the other very recent rumors, perhaps the 4090 really will be ~2x 3090/Ti-ish

The 4090 is rumoured to be using Ad103 though, perhaps the Ti.

Oh noes I r wrong it is 102.

#19

ModEl4

Richards: They must be using the high density library cells for the cache and high performance cells for the cores

Who knows, I wonder what die size it will have.
The previous rumor was around 600mm², which seems difficult since regular N4 is only around 6% denser than N5. Maybe we have a customized node like in Turing's case (TSMC 12nm "FFN").

#20

Steevo

Tomorrow: I would too, but not at this time, because none of the games I regularly play support RT or DLSS. So to me these transistors are useless at the moment.
At some point in the future, when RT perf is actually good and many more games support RT and DLSS, then sure.
I'm running a 2080 Ti right now at 1440p 165Hz.

How does the RT acceleration in the 2080 compare? Who would buy a 2080 over a new card today? New features on new cards need to work within a couple of years, or they become old features that newer cards have better versions of.

I say this after watching it happen numerous times from both brands.

#21

Tomorrow

Steevo: How does the RT acceleration in the 2080 compare? Who would buy a 2080 over a new card today? New features on new cards need to work within a couple of years, or they become old features that newer cards have better versions of.

I say this after watching it happen numerous times from both brands.

I bought this card for 700 before the mining craze in January 2021. And I bought it for raster performance. Not RT or DLSS.

#22

wolf

Better Than Native

Xex360: 10% is quite a lot of wasted transistors.

I suppose they are if personally you consider them to be a waste, I do not. I've used that 10% die area for the majority of my ownership of an RTX gpu.

Bear in mind too iirc of that 10% die area, roughly 2/3 of that is RT and 1/3 is Tensor, so ~3% die space for Tensor alone, easy yes for me and indeed many people like me.

Xex360: But I can understand nVidia's goal of accelerating real work rather than gaming.

Well I've used that die area for hundreds of hours of RT/DLSS enabled gaming, so I'd say they definitely are at least gaming features too.

#23

Kapone33

So the power draw rumours could be real. Imagine that many gates opening and closing in sequence. Aren't the transistors 5 nm too, which means a ridiculous amount of heat to dissipate? No wonder those look like 4-slot coolers with 120 mm fans. Good luck using a single rad with one of these puppies water-cooled.

#24

The Von Matrices

kapone32: So the power draw rumours could be real. Imagine that many gates opening and closing in sequence. Aren't the transistors 5 nm too, which means a ridiculous amount of heat to dissipate? No wonder those look like 4-slot coolers with 120 mm fans. Good luck using a single rad with one of these puppies water-cooled.

Yep. You don't just triple the number of transistors nowadays without nearly tripling the power.

People are thinking that this chip will be amazingly fast based solely upon the number of transistors. It won't come close to that because it will be severely power limited.

#25

R0H1T

The Von Matrices: Yep. You don't just triple the number of transistors nowadays without nearly tripling the power.

That's not necessarily true, you can clock them reasonably well & not trip your 110V circuit breaker in the States :D
