Wednesday, September 21st 2022

NVIDIA RTX 4090 Doesn't Max-Out AD102, Ample Room Left for Future RTX 4090 Ti

The AD102 silicon at the heart of NVIDIA's new flagship graphics card, the GeForce RTX 4090, is a marvel of semiconductor engineering. Built on the 4 nm EUV (TSMC 4N) silicon fabrication process, the chip packs a gargantuan 76.3 billion transistors, a nearly 170% increase over the previous-generation GA102, into a 608 mm² die that is in fact smaller than the 628 mm² die-area of the GA102. This is thanks to TSMC 4N offering nearly thrice the transistor-density of the Samsung 8LPP node on which the GA102 is built.
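Those density figures are easy to sanity-check with a quick back-of-the-envelope calculation. Note that the GA102 transistor count of 28.3 billion comes from NVIDIA's published Ampere specifications rather than from this article.

```python
# Rough transistor-density comparison, AD102 (TSMC 4N) vs GA102 (Samsung 8LPP).
ad102_transistors, ad102_area = 76.3e9, 608   # from the article, area in mm²
ga102_transistors, ga102_area = 28.3e9, 628   # GA102 count from NVIDIA's Ampere specs

ad102_density = ad102_transistors / ad102_area / 1e6   # million transistors per mm²
ga102_density = ga102_transistors / ga102_area / 1e6

print(f"AD102: {ad102_density:.1f} MTr/mm²")                  # ~125.5
print(f"GA102: {ga102_density:.1f} MTr/mm²")                  # ~45.1
print(f"Density ratio: {ad102_density/ga102_density:.2f}x")   # ~2.78x -- 'nearly thrice'
print(f"Transistor increase: {ad102_transistors/ga102_transistors - 1:.0%}")  # ~170%
```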

The AD102 physically features 18,432 CUDA cores, 576 fourth-generation Tensor cores, and 144 third-generation RT cores. The streaming multiprocessors (SM) come with special components that enable the Shader Execution Reordering optimization, which significantly improves both raster and ray-traced rendering performance. The silicon supports up to 24 GB of GDDR6X, or up to 48 GB of GDDR6 with ECC (the latter will be seen in the Ada-generation professional-visualization cards), across a 384-bit wide memory bus. There are 576 TMUs and a mammoth 192 ROPs on the silicon.
The RTX 4090 is carved out of this silicon by enabling 16,384 of the 18,432 CUDA cores, 512 of the 576 Tensor cores, 512 of the 576 TMUs, and 128 of the 144 RT cores; unless NVIDIA has touched the ROP count, it could remain at 192. The memory bus is maxed out, with 24 GB of 21 Gbps GDDR6X memory across the full 384-bit bus width. In creating the RTX 4090, NVIDIA has given itself roughly 11% of headroom in the number-crunching machinery from which to carve out future SKUs, such as a possible RTX 4090 Ti. Until that SKU is needed in the product stack, NVIDIA will use this margin to harvest imperfect AD102 dies.
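The headroom figure and the peak memory bandwidth both fall straight out of those unit counts; a quick check:

```python
# Enabled fraction of each AD102 resource on the RTX 4090, per the counts above.
units = {
    "CUDA cores":   (16384, 18432),
    "Tensor cores": (512, 576),
    "TMUs":         (512, 576),
    "RT cores":     (128, 144),
}
for name, (enabled, physical) in units.items():
    print(f"{name}: {enabled}/{physical} = {enabled/physical:.1%} enabled")
# Every ratio works out to 88.9% -- 128 of 144 SMs -- leaving roughly 11% headroom.

# Peak memory bandwidth: 21 Gbps per pin across a 384-bit bus.
print(f"Memory bandwidth: {21 * 384 / 8:.0f} GB/s")   # 1008 GB/s
```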

27 Comments on NVIDIA RTX 4090 Doesn't Max-Out AD102, Ample Room Left for Future RTX 4090 Ti

#1
Vayra86
'Samsung 8 was a good node'.
Mhm, but TSMC's are much better.
#2
Sabotaged_Enigma
RTX 4090 doesn't max-out Max TBP, ample room left for future 800 W power.
Good job, nVIDIA.
#3
ir_cow
Is anyone surprised by this? At one point the Ti model was the max-core GPU. Then it shifted to the Titan model. Now the Titan is gone, and the Ti is once again the highest core count.
#4
gasolina
I would want the RTX 4000 series to match 3000-series performance while consuming 1/2 to 1/3 the power
#6
Jimmy_
Vayra86: 'Samsung 8 was a good node'.
Mhm, but TSMC's are much better.
Agreed
Яid!culousOwO: RTX 4090 doesn't max-out Max TBP, ample room left for future 800 W power.
Good job, nVIDIA.
Damn dude - EVGA made a 3090 Ti liquid cooler, then they didn't have any room left for a future 4090 Ti :D - that's why they said bye bye to Nvidia
#7
watzupken
I feel that even with a fully enabled chip, it's not going to result in a significant improvement over a slightly gimped one; increasing the number of cores runs into diminishing returns. And knowing Nvidia, they will likely raise the power limit in their mid-cycle refresh, plus a further price increase.
Vayra86: 'Samsung 8 was a good node'.
Mhm, but TSMC's are much better.
I am not sure if it's a good node, i.e. Samsung 8 nm, which is essentially 10 nm. Compared to AMD, Nvidia seems to be doing well despite a node disadvantage. But frankly, this may be attributed to a better architecture than RDNA2. Considering the huge jump in specs and clock speeds on TSMC 4 nm (5 nm), I feel the Samsung node was actually holding Ampere's performance back.
#8
Vayra86
watzupken: I feel the Samsung node actually was holding Ampere's performance back.
Of course, every half-wit knows this, but there is a strong following of GPU owners that is adamant Samsung's nodes 'are not bad at all'. The latest argument in favor of that was shifting the blame to GDDR6X for the monumental power consumption, never mind the fact that clocks were lower than in 2016 on... TSMC... 16 nm ;)

And here we are, seeing the same GDDR6X on TSMC, with more memory alongside a smaller die with many more transistors, at a relatively small increase in power budget :)
#10
usiname
Looking at this 4080 12 GB, which is slower than or on par with the 3090 in rasterization at 285 W vs 350 W, makes me think the 8 nm Samsung was not that bad, but just that Nvidia can't produce an efficient card
#11
pavle
usiname: Looking at this 4080 12 GB, which is slower than or on par with the 3090 in rasterization at 285 W vs 350 W, makes me think the 8 nm Samsung was not that bad, but just that Nvidia can't produce an efficient card
I believe greed is the key word here, as in 'ngreedia', as we've seen nvidiot_central called in the past. Let's see the reviews...
#12
ratirt
usiname: Looking at this 4080 12 GB, which is slower than or on par with the 3090 in rasterization at 285 W vs 350 W, makes me think the 8 nm Samsung was not that bad, but just that Nvidia can't produce an efficient card
Maybe the node was not bad at all. The proper question is, is the architecture any good? Maybe what was not so great was the architecture itself, and the node change is not gonna change that either.
#13
Gungar
The 3090 didn't max out GA102 either, and we didn't get a 3090 Ti using a bigger die.
#14
ratirt
Gungar: The 3090 didn't max out GA102 either, and we didn't get a 3090 Ti using a bigger die.
It is not about a bigger die, since that would mean a different chip. The 3090 has 2 SM units disabled: 82 SMs vs 84 for the 3090 Ti. The die is the same; you simply have more resources enabled.
#15
Daven
I meant to post this comment here:

There seems to be some ambiguity around the ROP count. Is there an official number yet?
#16
fevgatos
Releasing the 4090 Ti a year from now, mere months before a 5xxx launch, is just... I don't know. I'd never buy an xx90 Ti if it doesn't come out near the launch day of the current gen
#17
Richards
3x the transistors; if it's not 2x the performance at all resolutions, it's a flop of an architecture and node
#18
Daven
fevgatos: Releasing the 4090 Ti a year from now, mere months before a 5xxx launch, is just... I don't know. I'd never buy an xx90 Ti if it doesn't come out near the launch day of the current gen
Don't get bogged down in model letters and numbers. Currently, Ti cards are the year-out refresh that graphics manufacturers have been doing for years. In the past, different model letters and numbers have been used, such as Super and xx50 XT.

Refreshes require more mature manufacturing nodes and a build-up of harvested dies in which more or less of the silicon can be activated. It's a way to sell as many chips as possible given the reality of defects and poor yields near the beginning of a new product series.

As an aside, it's also easier to make one complete chip and then lock otherwise-functioning parts to create lower SKUs. This only works up to the point where the 'dead' or 'locked' silicon exceeds the working portion of the chip, at which point you manufacture a smaller 'native' chip.

Edit: Oh, and sometimes later SKU refreshes are just added in response to competitors' product releases. A company might even hold such responses back from the beginning, on purpose, to see how the competition reacts.
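[Editor's note: to make the harvesting argument concrete, here is a minimal sketch using the classic Poisson defect-yield model. The defect density used is purely an illustrative assumption, not a TSMC figure.]

```python
import math

def poisson_yield(die_area_mm2: float, defects_per_mm2: float) -> float:
    """Fraction of dies that come out defect-free under a simple Poisson model."""
    return math.exp(-die_area_mm2 * defects_per_mm2)

AD102_AREA = 608    # mm², from the article
D0 = 0.001          # defects/mm² -- an illustrative assumption, not a TSMC figure

print(f"Defect-free AD102 dies: {poisson_yield(AD102_AREA, D0):.1%}")  # ~54.4%
# Dies with a defect in a redundant block (an SM, a TMU cluster, a memory
# controller) can still ship as cut-down SKUs -- exactly the harvesting
# described above.
```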
#19
JAB Creations
It's 5nm "marketed" as "4nm". The people writing articles should be filtering the BS, not echoing it.
#20
Unregistered
They need another card to milk us. The worst part is we used to have Titans that at least brought professional things to the mainstream.
#21
AdmiralThrawn
If performance is 10-20 percent higher than anything that AMD has, then it will be branded as a Titan GPU. The reason the 30 series didn't have one is partially due to how close AMD was in performance. They will not risk the headline, "Titan loses"
#22
BArms
Datacenters will get all the fully functional dies, gamers get the broken scraps.
#23
Chrispy_
Yields on something that big must be horrible.
#24
R-T-B
Vayra86: but there is a strong following of GPU owners that is adamant Samsung's nodes 'are not bad at all'.
They weren't bad at all...

...on launch day.

This is how tech advancement works.
#25
Unregistered
AdmiralThrawn: If performance is 10-20 percent higher than anything that AMD has, then it will be branded as a Titan GPU. The reason the 30 series didn't have one is partially due to how close AMD was in performance. They will not risk the headline, "Titan loses"
nVidia discovered that people are gullible, so they made them believe the 3090 was a Titan and priced it accordingly; even the so-called tech media (who for the most part are philistines with oversized egos praising their lords nVidia/AMD or Intel) fell for it.