Saturday, January 29th 2022

NVIDIA "Hopper" Might Have Huge 1000 mm² Die, Monolithic Design

Jan 29th, 2022 10:37

Renowned hardware leaker kopite7kimi revealed on Twitter some purported details on Hopper, NVIDIA's next-generation architecture for HPC (High Performance Computing). According to the leaker, Hopper is still sporting a classic monolithic die design despite previous rumors, and it appears that NVIDIA's performance targets have led to the creation of a monstrous, ~1000 mm² die package for the GH100 chip, the kind of flagship part that usually maxes out the complexity and performance that can be achieved on a particular manufacturing process. This is despite the fact that Hopper is also rumored to be manufactured on TSMC's 5 nm process, which should deliver higher transistor density and power efficiency than the 8 nm Samsung process that NVIDIA is currently contracting. At the very least, it means that the final die will be bigger than the already enormous 826 mm² of NVIDIA's GA100.

If this is indeed the case and NVIDIA isn't deploying an MCM (Multi-Chip Module) design on Hopper, which is designed for a market with increased profit margins, it likely means that less profitable consumer-oriented products from NVIDIA won't be featuring the technology either. MCM designs also make more sense in NVIDIA's HPC products, as they would enable higher theoretical performance when scaling, which is exactly what that market demands. Of course, NVIDIA could still be looking to develop an MCM version of the GH100; if that were to happen, the company could pair two of these chips together as another HPC product (rumored GH-102). ~2,000 mm² in a single GPU package, paired with increased density and architectural improvements, might actually be what NVIDIA requires to achieve the 3x performance jump from the Ampere-based A100 that the company is reportedly targeting.

Source: Videocardz

60 Comments on NVIDIA "Hopper" Might Have Huge 1000 mm² Die, Monolithic Design

#26

lexluthermiester

repman244: It's ~31.62mm.

TADA!

Let's review...

Raevenlord: and it appears that NVIDIA's performance targets have led to the creation of a monstrous, ~1000 mm² die package for the GH100 chip

... that's what the article stated....

oobymach: 1000mm is 100cm which is 1meter or around 40" for you Americans

... and this was stated (implying an insulting tone), to which I replied....

lexluthermiester: Thanks for the tip.... except you would be incorrect. The measurement is 1000mm squared, which is equal to 100mm wide by 100mm long. It's a function of area not total length of any one side.

TADA! Math is fun!

...with this.

So...

repman244: It's ~31.62mm.

TADA!

...where does YOUR comment come in? Hmmm? Your math skills seem about as good as mister oobymach.

#27

repman244

lexluthermiester: Let's review...

... that's what the article stated....

... and this was stated (implying an insulting tone), to which I replied....

...with this.

So...

...where does YOUR comment come in? Hmmm? Your math skills seem about as good as mister oobymach.

100mm wide and 100mm long does not get you a surface area of 1000mm2 but 10000mm2.
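
For anyone who wants to check the numbers in this exchange, here is a minimal Python sketch. The function names are made up for illustration, and the die is assumed to be a perfect square, which nothing in the leak confirms.

```python
import math


def square_side_mm(area_mm2: float) -> float:
    """Side length of a square with the given area (assumes a square die)."""
    return math.sqrt(area_mm2)


def rect_area_mm2(width_mm: float, height_mm: float) -> float:
    """Area of a rectangle from its two sides."""
    return width_mm * height_mm


# Side of a hypothetical square 1000 mm2 die
print(f"{square_side_mm(1000):.2f} mm")

# 100 mm x 100 mm, the case discussed above
print(rect_area_mm2(100, 100), "mm2")

# One of many rectangles that do give 1000 mm2
print(rect_area_mm2(10, 100), "mm2")
```

A square 1000 mm2 die comes out at roughly 31.62 mm per side, while 100 mm by 100 mm gives 10000 mm2.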

#28

Wirko

lexluthermiester: Thanks for the tip.... except you would be incorrect. The measurement is 1000mm squared, which is equal to 100mm wide by 100mm long. It's a function of area not total length of any one side.

TADA! Math is fun!

What if it is a one-dimensional chip, 1000 mm long and rolled into a chromosome-like shape to be more practical? Data goes in at one end, exits at the other.

RH92: Actually neither of those is a necessity, they can print beyond the reticle limit by making multiple passes; it's just that yields will be poor, but this is HPC so money is not really an issue as long as perf is there.

I too saw the possibility of multiple exposures mentioned somewhere, do you happen to have any links?

Regarding the yield, I'm sure such a chip can operate with a small percentage of bad compute units, so it shouldn't be horribly low.

#29

Vayra86

lynx29: TIE FIRST ROUND OF SALES TO STEAM ACCOUNTS WITH AT LEAST 1K+ GAMES AND 10 YEARS OLD IN LENGTH+

then lower the requirements for each round of graphics card sales after that.

THIS IS what nvidia/amd would do if they cared about gamers, yes i know not everyone plays on steam, but at least this is a starting point for the first say 7 rounds before you open it up to everyone.

but w.e

I thought Steam fans were against exclusive deals?!

;)

#30

JustBenching

lynx29: TIE FIRST ROUND OF SALES TO STEAM ACCOUNTS WITH AT LEAST 1K+ GAMES AND 10 YEARS OLD IN LENGTH+

then lower the requirements for each round of graphics card sales after that.

THIS IS what nvidia/amd would do if they cared about gamers, yes i know not everyone plays on steam, but at least this is a starting point for the first say 7 rounds before you open it up to everyone.

but w.e

What a great idea.... I wonder why no one else thought about it. Oh, never mind, I know, 'cause it's dumb

#31

WhoDecidedThat

RH92: Actually neither of those is a necessity, they can print beyond the reticle limit by making multiple passes; it's just that yields will be poor, but this is HPC so money is not really an issue as long as perf is there.

Didn't know. Do you have any links?
Also, if you can print beyond the reticle limit, why is it called the reticle "limit" in the first place? Like what is the principle used to decide that this particular size is the reticle limit?

Wirko: What if it is a one-dimensional chip, 1000 mm long and rolled into a chromosome-like shape to be more practical? Data goes in at one end, exits at the other.

What if we created such a chip but instead of using 0s and 1s we use As, Ts, Cs and Gs? Wouldn't that be best for ML?

Wirko: Regarding the yield, I'm sure such a chip can operate with a small percentage of bad compute units, so it shouldn't be horribly low.

Just like Cerebras' Wafer Scale Engine then?

#32

Wirko

blanarahul: Also, if you can print beyond the reticle limit, why is it called the reticle "limit" in the first place? Like what is the principle used to decide that this particular size is the reticle limit?

The limit is 33 mm x 26 mm = 858 mm2 for DUV and apparently it's the same for EUV. The whole optical system of the scanner machine is designed for this size so there's no trivial way to make it bigger. Some kind of stitching and using multiple exposures is required for larger chips, I too would like to know more.
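
To put the 33 mm x 26 mm figure next to the rumored die size, here is a rough Python sketch. It assumes a square 1000 mm2 die and that stitched exposures tile on a plain grid with no overlap; both are simplifications, and the function names are hypothetical.

```python
import math

# Maximum exposure field quoted above (33 mm x 26 mm).
FIELD_W_MM = 33.0
FIELD_H_MM = 26.0


def field_area_mm2() -> float:
    """Area of one full exposure field."""
    return FIELD_W_MM * FIELD_H_MM


def fits_in_one_exposure(die_w_mm: float, die_h_mm: float) -> bool:
    """True if the die fits inside the field in either orientation."""
    return (
        (die_w_mm <= FIELD_W_MM and die_h_mm <= FIELD_H_MM)
        or (die_w_mm <= FIELD_H_MM and die_h_mm <= FIELD_W_MM)
    )


def exposures_needed(die_w_mm: float, die_h_mm: float) -> int:
    """Fewest fields needed to cover the die on a plain grid.

    Assumption: stitched fields do not overlap, which real stitching
    schemes may not satisfy.
    """
    def grid(w: float, h: float) -> int:
        return math.ceil(w / FIELD_W_MM) * math.ceil(h / FIELD_H_MM)

    return min(grid(die_w_mm, die_h_mm), grid(die_h_mm, die_w_mm))


side = math.sqrt(1000)  # hypothetical square 1000 mm2 die
print(field_area_mm2())                  # field area in mm2
print(fits_in_one_exposure(side, side))  # does it fit in one shot?
print(exposures_needed(side, side))      # fields needed under the grid assumption
```

Under these assumptions a square 1000 mm2 die is too tall for the 26 mm side of the field, so it would take two exposures.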

blanarahul: Just like Cerebras' Wafer Scale Engine then?

Big chips with a large number of equal units are always designed with some redundancy: processors, RAM, NAND. Cerebras, I suppose, is more complex than all of those; its interconnect is similar to a network of network routers and it has the ability to route data around defective processors.
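
The effect of redundancy on yield can be sketched with a simple Poisson defect model. Everything numeric here is an assumption picked for illustration (defect density, number of units, how many bad units are tolerated), and the model ignores defects that land in non-redundant parts of the die.

```python
import math


def poisson_yield(area_mm2: float, defects_per_mm2: float) -> float:
    """Probability that an area has zero defects (simple Poisson model)."""
    return math.exp(-area_mm2 * defects_per_mm2)


def yield_with_spares(area_mm2: float, defects_per_mm2: float,
                      n_units: int, max_bad_units: int) -> float:
    """Probability that at most max_bad_units of n_units equal units are defective.

    Assumption: the whole die is split into n_units identical, independent
    units, and a die is sellable as long as few enough units are bad.
    """
    unit_yield = poisson_yield(area_mm2 / n_units, defects_per_mm2)
    p_bad = 1.0 - unit_yield
    return sum(
        math.comb(n_units, bad) * p_bad ** bad * unit_yield ** (n_units - bad)
        for bad in range(max_bad_units + 1)
    )


AREA_MM2 = 1000.0        # rumored die size
DEFECT_DENSITY = 0.001   # assumed: 0.1 defects per cm2, not a published figure
UNITS = 144              # assumed number of identical compute units

print(f"no redundancy:   {poisson_yield(AREA_MM2, DEFECT_DENSITY):.3f}")
print(f"up to 8 bad ok:  {yield_with_spares(AREA_MM2, DEFECT_DENSITY, UNITS, 8):.3f}")
```

With these made-up inputs, requiring a perfect die gives a low yield, while tolerating a handful of bad units pushes the usable fraction close to 1, which is the point being made above.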

#33

lexluthermiester

repman244: 100mm wide and 100mm long does not get you a surface area of 1000mm2 but 10000mm2.

Really? Are you sure? Damn, I's just gotta improve on my math skills... :rolleyes:

Wirko: What if it is a one-dimensional chip, 1000 mm long and rolled into a chromosome-like shape to be more practical? Data goes in at one end, exits at the other.

But what if we wrap it around like a pretzel and then plug it into the you-know-what?

#34

repman244

lexluthermiester: Really? Are you sure? Damn, I's just gotta improve on my math skills... :rolleyes:

I don't quite understand what you're trying here, but it's interesting you mention other people's insulting tone while doing the same.
Nothing bad happens if you make a mistake from time to time.

#35

HalfAHertz

yikes...

With the 5nm wafers costing $25-30k, just manufacturing the die would be $1200. That means that to break even, they'd need to sell this thing for at least $3k. As others mentioned, this would most definitely be meant only for HPC clients and probably start at $7-8k.
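
Here is a rough Python sketch of where a per-die cost like that could come from. It uses a common dies-per-wafer approximation (wafer area over die area, minus an edge-loss term), assumes a 300 mm wafer, and treats yield as a free parameter; none of these inputs come from NVIDIA or TSMC.

```python
import math

WAFER_DIAMETER_MM = 300.0  # assumption: standard 300 mm wafer


def gross_dies_per_wafer(die_area_mm2: float,
                         wafer_diameter_mm: float = WAFER_DIAMETER_MM) -> int:
    """Approximate die count: wafer area / die area minus an edge-loss term."""
    radius = wafer_diameter_mm / 2
    estimate = (
        math.pi * radius ** 2 / die_area_mm2
        - math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    )
    return math.floor(estimate)


def cost_per_good_die(wafer_cost_usd: float, die_area_mm2: float,
                      yield_fraction: float) -> float:
    """Wafer cost spread over the dies that are assumed to be usable."""
    good_dies = gross_dies_per_wafer(die_area_mm2) * yield_fraction
    return wafer_cost_usd / good_dies


DIE_AREA_MM2 = 1000.0  # rumored die size
for wafer_cost in (25_000, 30_000):   # wafer price range from the post above
    for assumed_yield in (1.0, 0.5):  # assumed values, for illustration only
        cost = cost_per_good_die(wafer_cost, DIE_AREA_MM2, assumed_yield)
        print(f"wafer ${wafer_cost}, yield {assumed_yield:.0%}: ~${cost:.0f} per die")
```

With this approximation a 300 mm wafer holds about 49 dies of 1000 mm2, so a figure around $1200 per die would roughly correspond to only about half of them being usable.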

#36

TheoneandonlyMrK

HalfAHertz: yikes...

With the 5nm wafers costing $25-30k, just manufacturing the die would be $1200. That means that to break even, they'd need to sell this thing for at least $3k. As others mentioned, this would most definitely be meant only for HPC clients and probably start at $7-8k.

One thing these calculations don't account for is the intelligently designed elements. Nvidia has led the world in the design of fault-tolerant circuits; what I mean by that is that it's very easy to disable elements that have a function-limiting fault, and as we have seen throughout the years, they can bin defective chips into lower-performance SKUs very effectively.
In reality, few would have defects so severe that the chip cannot be used in some way.

#37

Minus Infinity

Oh for god's sake, how bad are people at maths. If the chip is square then clearly the length of the side is sqrt(1000) = 31.62mm. It's not rocket science, it's primary school math. If the chip was 100mm on a side then the area would be 100000mm^2.

Make it even simpler, if you had a rectangular chip that was 40mm x 25mm, what is the area?

#38

Jism

oobymach: 1000mm is 100cm which is 1meter or around 40" for you Americans, that is a big honking gpu die. I think we took a leap backward here,

* imagines trying to fit that motherboard sized block along with a pcb and rams and such required to run it in a cpu case.

They actually designed that card pretty well lol.

#39

bug

Minus Infinity: Oh for god's sake, how bad are people at maths. If the chip is square then clearly the length of the side is sqrt(1000) = 31.62mm. It's not rocket science, it's primary school math. If the chip was 100mm on a side then the area would be 100000mm^2.

Make it even simpler, if you had a rectangular chip that was 40mm x 25mm, what is the area?

Kudos to you if you learned how to extract the square root in primary school. Spot on, otherwise.

#40

comtek

lexluthermiester: Thanks for the tip.... except you would be incorrect. The measurement is 1000mm squared, which is equal to 100mm wide by 100mm long. It's a function of area not total length of any one side.

TADA! Math is fun!

It is 1cm x 10cm = 10cm2 = 1000mm2

#41

bug

comtek: It is 1cm x 10cm = 10cm2 = 1000mm2

When in doubt, you can always google these things :P

#42

Punkenjoy

Minus Infinity: Oh for god's sake, how bad are people at maths. If the chip is square then clearly the length of the side is sqrt(1000) = 31.62mm. It's not rocket science, it's primary school math. If the chip was 100mm on a side then the area would be 100000mm^2.

Make it even simpler, if you had a rectangular chip that was 40mm x 25mm, what is the area?

well 1 too many zeros, it would be 10 000 mm^2, not 100 000.

#43

bug

Punkenjoy: well 1 too many zeros, it would be 10 000 mm^2, not 100 000.

Anyone else remember when MS-DOS introduced numerical separators to the dir command? In a stroke of genius, that feature was called "no more fingers on the screen".

#44

stimpy88

Unlikely, as they can't focus the UV light that wide. Unless this is on some older process.

#45

lexluthermiester

repman244: I don't quite understand what you're trying here, but it's interesting you mention other people's insulting tone while doing the same.
Nothing bad happens if you make a mistake from time to time.

comtek: It is 1cm x 10cm = 10cm2 = 1000mm2

Just let it go...

#46

Soul_

oobymach: 1000mm is 100cm which is 1meter or around 40" for you Americans, that is a big honking gpu die. I think we took a leap backward here,

* imagines trying to fit that motherboard sized block along with a pcb and rams and such required to run it in a cpu case.

Love the video. But it is 1000mm^2, not 1000mm; they are talking about area, not a single dimension.

1000mm^2 = 10cm^2 = 1.55inch^2, so around 1.24inch sides for the square.
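
The conversion above can be double-checked with a few lines of Python. The helper names are made up for this sketch, and the last step assumes a square die.

```python
import math

MM_PER_CM = 10.0
MM_PER_INCH = 25.4


def mm2_to_cm2(area_mm2: float) -> float:
    """Area conversion: divide by the square of the length factor."""
    return area_mm2 / MM_PER_CM ** 2


def mm2_to_in2(area_mm2: float) -> float:
    """Area conversion from square millimetres to square inches."""
    return area_mm2 / MM_PER_INCH ** 2


area_mm2 = 1000.0
area_in2 = mm2_to_in2(area_mm2)
print(f"{mm2_to_cm2(area_mm2):.0f} cm2")
print(f"{area_in2:.2f} in2")
print(f"{math.sqrt(area_in2):.2f} in per side (square die assumed)")
```

The key point is that area conversions use the squared length factor, so 1000 mm2 is 10 cm2 rather than 100 cm2.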

#47

ppn

EUV can't make them bigger than 429 unless there is some new development

For circuit designers, this means an effective field of 16.5 mm by 26 mm for a new maximum die size of 429 mm². Say goodbye to the massive dies we got used to from Intel and Nvidia. 2019-asml-euv

Or they could have blocks with 4 dies uncut and interconnected on the wafer itself.

#48

Wirko

stimpy88: Unlikely, as they can't focus the UV light that wide. Unless this is on some older process.

They can use some good old stitching (but maybe it's not old enough for patents to have expired ... Dr. Ian Cutress has his doubts). One of the issues is that an ASML scanner can process 170 wafers per hour, but that number is certainly reduced if it's used to draw the patterns for half of each chip in each pass, not the whole chip.

Here's what a reticle (photomask) looks like: TSMC

ppn: EUV can't make them bigger than 429 unless there is some new development

Ironically, 429 mm2 is THE new development, entering mass production in 2025 (if you believe it - I'd rather say 202y or 202z or 202α). That's high-numerical aperture EUV. The photomask size will remain the same. The optical system, however, will reduce the image to a surface area that's half as large as it is now.

Thanks for the link! I sometimes read stuff at Semi Engineering but it's usually over my head.

ppn: Or they could have blocks with 4 dies uncut and interconnected on the wafer itself.

Yes, that's the kind of stitching I mentioned. Not four equal dies but two different halves that together make one die.

#49

Spencer LeBlanc

Dishnetwork is doing a great job! :P

#50

ModEl4

So, around +20% regarding actual die size, +15% increased frequency potentially and +85% increased density regarding logic, - not so great density scaling for the analog parts,
fuse.wikichip.org/wp-content/uploads/2020/03/tsmc-5nm-density-q1-2020.png
+ hopper architectural improvements and it seems everything is in order, nothing surprising regarding this story
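
As a back-of-the-envelope check of those percentages, here is a short Python sketch. It assumes the three factors simply multiply, which is a naive simplification (performance rarely scales linearly with area, clocks and density), and the variable names are made up for illustration.

```python
import math

# Die-size ratio from the article: rumored ~1000 mm2 GH100 vs 826 mm2 GA100
die_area_ratio = 1000 / 826
print(f"die area ratio: {die_area_ratio:.2f}x")

# Rough gains listed in the post above
factors = {
    "die_size": 1.20,       # around +20% die size
    "frequency": 1.15,      # +15% frequency, potentially
    "logic_density": 1.85,  # +85% logic density
}


def combined_scaling(gains: dict) -> float:
    """Naive combined gain, assuming every factor multiplies independently."""
    return math.prod(gains.values())


print(f"combined: {combined_scaling(factors):.2f}x before architectural improvements")
```

Under this naive model the listed factors multiply out to roughly 2.55x, leaving the remainder of the reported 3x target to architectural improvements.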
