Saturday, January 29th 2022
NVIDIA "Hopper" Might Have Huge 1000 mm² Die, Monolithic Design
Renowned hardware leaker kopite7kimi on Twitter revealed some purported details on NVIDIA's next-generation architecture for HPC (High Performance Computing), Hopper. According to the leaker, Hopper still uses a classic monolithic die design despite previous rumors, and it appears that NVIDIA's performance targets have led to a monstrous, ~1000 mm² die for the GH100 chip, the kind of flagship part that usually maxes out the complexity and performance achievable on a particular manufacturing process. This is despite the fact that Hopper is also rumored to be manufactured on TSMC's 5 nm technology, which offers higher transistor density and power efficiency than the 8 nm Samsung process that NVIDIA is currently contracting. At the very least, it means that the final die will be bigger than the already enormous 826 mm² of NVIDIA's GA100.
If this is indeed the case and NVIDIA isn't deploying an MCM (Multi-Chip Module) design on Hopper, which targets a market with higher profit margins, it likely means that NVIDIA's less profitable consumer-oriented products won't feature the technology either. MCM designs also make more sense in NVIDIA's HPC products, as they would enable higher theoretical performance when scaling - exactly what that market demands. Of course, NVIDIA could still be looking to develop an MCM version of the GH100; if so, the company could pair two of these chips together as another HPC product (rumored GH-102). ~2,000 mm² in a single GPU package, paired with increased density and architectural improvements, might actually be what NVIDIA requires to achieve the 3x performance jump over the Ampere-based A100 that the company is reportedly targeting.
Source:
Videocardz
60 Comments on NVIDIA "Hopper" Might Have Huge 1000 mm² Die, Monolithic Design
So... ...where does YOUR comment come in? Hmmm? Your math skills seem about as good as mister oobymach.
Regarding the yield, I'm sure such a chip can operate with a small percentage of bad compute units, so it shouldn't be horribly low.
;)
Also, if you can print beyond the reticle limit, why is it called the reticle "limit" in the first place? Like what is the principle used to decide that this particular size is the reticle limit? What if we created such a chip but instead of using 0s and 1s we use As, Ts, Cs and Gs? Wouldn't that be best for ML? Just like Cerebras' Wafer Scale Engine then?
Nothing bad happens if you make a mistake from time to time.
With 5 nm wafers costing $25-30k, just manufacturing the die would cost about $1,200. That means that to break even, they'd need to sell this thing for at least $3k. As others mentioned, this would most definitely be meant only for HPC clients and probably start at $7-8k.
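For anyone who wants to check the arithmetic behind that estimate, here is a rough sketch. It assumes a 300 mm wafer and a common textbook dies-per-wafer approximation; neither comes from the article, and the function names are made up for illustration. The only inputs taken from the comment are the wafer price range and the $1,200 per-die figure.

```python
import math

def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=300.0):
    # Textbook approximation: wafer area / die area, minus an edge-loss term.
    # Ignores scribe lines, die aspect ratio and edge exclusion (assumption).
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

def implied_sellable_dies(wafer_cost_usd, cost_per_die_usd):
    # Number of sellable dies per wafer that a given per-die cost implies.
    return wafer_cost_usd / cost_per_die_usd

candidates = gross_dies_per_wafer(1000)  # ~1000 mm² die, per the rumor
for wafer_cost in (25_000, 30_000):      # wafer price range from the comment
    implied = implied_sellable_dies(wafer_cost, 1200)
    print(f"${wafer_cost} wafer: {candidates} candidate dies, "
          f"$1200/die implies ~{implied:.0f} sellable ({implied / candidates:.0%})")
```

Under these assumptions a wafer holds roughly 49 candidate dies, so $1,200 per die would correspond to somewhere around 21 to 25 sellable dies per wafer.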
Few would in reality have defects that cannot be used in some way.
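To put a rough number on the salvage argument, here is a minimal sketch using a simple Poisson defect model. The defect density of 0.1 defects/cm² is purely hypothetical (real figures for this process are not given anywhere in the thread), and the model assumes every defect lands in a unit that can be fused off, which is optimistic.

```python
import math

def prob_at_most_k_defects(die_area_mm2, defects_per_cm2, k):
    # Poisson model: the defect count on one die has
    # mean = die area (cm²) * defect density (defects/cm²).
    mean = (die_area_mm2 / 100.0) * defects_per_cm2
    return sum(math.exp(-mean) * mean ** i / math.factorial(i)
               for i in range(k + 1))

D0 = 0.1  # defects per cm², hypothetical value for illustration only
for k in (0, 1, 2, 3):
    share = prob_at_most_k_defects(1000, D0, k)
    print(f"dies with at most {k} defects: {share:.1%}")
```

With that made-up density, only about 37% of 1000 mm² dies would be flawless, but roughly 92% would have two defects or fewer, which is why shipping parts with a few compute units disabled matters so much at this size.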
Make it even simpler, if you had a rectangular chip that was 40mm x 25mm, what is the area?
1000 mm² = 10 cm² = 1.55 in², so around 1.24 in sides for the square.
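The same unit conversions, written out as a quick sketch so they can be re-run with other die sizes. The variable names are just for illustration; the 40 mm x 25 mm rectangle is the example from the comment above.

```python
import math

MM_PER_INCH = 25.4

area_mm2 = 40 * 25                       # 40 mm x 25 mm rectangle
area_cm2 = area_mm2 / 10 ** 2            # 1 cm² = 100 mm²
area_in2 = area_mm2 / MM_PER_INCH ** 2   # 1 in² = 645.16 mm²

side_mm = math.sqrt(area_mm2)            # side of a square with the same area
side_in = side_mm / MM_PER_INCH

print(f"{area_mm2} mm² = {area_cm2:.0f} cm² = {area_in2:.2f} in²")
print(f"equivalent square: {side_mm:.2f} mm ({side_in:.2f} in) per side")
```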
Here's what a reticle (photomask) looks like: TSMC. Ironically, 429 mm² is THE new development, entering mass production in 2025 (if you believe it - I'd rather say 202y or 202z or 202α). That's high-numerical aperture EUV. The photomask size will remain the same. The optical system, however, will reduce the image to a surface area half as large as it is now.
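A small sketch of where the 429 mm² figure comes from. The field dimensions are the commonly cited scanner numbers (26 mm x 33 mm for a full field, 26 mm x 16.5 mm for the high-NA half field), not values stated in this thread, and the exposure count is only a lower bound that ignores die shape and stitching overlap.

```python
import math

# Commonly cited exposure field sizes in mm (assumption, not from the thread).
FULL_FIELD_MM = (26.0, 33.0)
HIGH_NA_FIELD_MM = (26.0, 16.5)

def field_area(field_mm):
    width, height = field_mm
    return width * height

def min_exposures(die_area_mm2, field_mm):
    # Lower bound on stitched exposures needed to cover one die.
    return math.ceil(die_area_mm2 / field_area(field_mm))

for name, field in (("full field", FULL_FIELD_MM), ("high-NA", HIGH_NA_FIELD_MM)):
    print(f"{name}: {field_area(field):.0f} mm², "
          f"a 1000 mm² die needs at least {min_exposures(1000, field)} exposures")
```

By this simple count, a ~1000 mm² die already exceeds a single 858 mm² full field, so it would need stitching of some kind either way.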
Thanks for the link! I sometimes read stuff at Semi Engineering but it's usually over my head. Yes, that's the kind of stitching I mentioned. Not four equal dies but two different halves that together make one die.
fuse.wikichip.org/wp-content/uploads/2020/03/tsmc-5nm-density-q1-2020.png
+ hopper architectural improvements and it seems everything is in order, nothing surprising regarding this story