Saturday, January 29th 2022
NVIDIA "Hopper" Might Have Huge 1000 mm² Die, Monolithic Design
Renowned hardware leaker kopite7kimi on Twitter revealed some purported details on NVIDIA's next-generation architecture for HPC (High Performance Computing), Hopper. According to the leaker, Hopper still uses a classic monolithic die design despite previous rumors, and it appears that NVIDIA's performance targets have led to a monstrous, ~1000 mm² die for the GH100 chip, the class of flagship chip that usually maxes out the complexity and performance achievable on a particular manufacturing process. This is despite the fact that Hopper is also rumored to be manufactured on TSMC's 5 nm technology, which offers higher transistor density and power efficiency than the 8 nm Samsung process that NVIDIA is currently contracting. If the leak is accurate, the final die will be bigger than the already enormous 826 mm² of NVIDIA's GA100.
If this is indeed the case and NVIDIA isn't deploying an MCM (Multi-Chip Module) design on Hopper, which is designed for a market with higher profit margins, it likely means that less profitable consumer-oriented products from NVIDIA won't be featuring the technology either. MCM designs also make more sense in NVIDIA's HPC products, as they would enable higher theoretical performance when scaling, which is exactly what that market demands. Of course, NVIDIA could still be looking to develop an MCM version of the GH100; if that were to happen, the company could pair two of these chips together as another HPC product (rumored GH-102). ~2,000 mm² of silicon in a single GPU package, paired with increased density and architectural improvements, might actually be what NVIDIA requires to achieve the 3x performance jump over the Ampere-based A100 that the company is reportedly targeting.
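To put the rumored figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that performance scales linearly with total die area, which real GPUs do not strictly follow. The function name `required_per_area_gain` is hypothetical; the die sizes and the 3x target are the rumored figures quoted above, not confirmed specifications.

```python
# Back-of-the-envelope estimate using the rumored figures from the article.
# ASSUMPTION: performance scales linearly with total die area. This is a
# simplification for illustration only, not a model of real GPU behavior.

GA100_AREA_MM2 = 826.0    # GA100 die size quoted in the article
GH100_AREA_MM2 = 1000.0   # rumored GH100 die size (approximate)
TARGET_SPEEDUP = 3.0      # reported performance target versus A100


def required_per_area_gain(total_area_mm2: float,
                           baseline_area_mm2: float = GA100_AREA_MM2,
                           target: float = TARGET_SPEEDUP) -> float:
    """Return the performance-per-mm² improvement still needed to hit the
    target once the extra silicon area has been accounted for."""
    area_ratio = total_area_mm2 / baseline_area_mm2
    return target / area_ratio


# One monolithic ~1000 mm² die.
single = required_per_area_gain(GH100_AREA_MM2)
# Two such dies in one package (~2000 mm² total).
dual = required_per_area_gain(2 * GH100_AREA_MM2)

print(f"Single die: needs {single:.2f}x more performance per mm²")
print(f"Dual die:   needs {dual:.2f}x more performance per mm²")
```

Under that linear-scaling assumption, a single ~1000 mm² die would need roughly 2.5x the performance per unit area of GA100 to reach 3x, while a two-die package would need only about 1.24x, which is why density and architectural gains on top of ~2,000 mm² of silicon look like a more plausible route to the reported target.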
Source: Videocardz
60 Comments on NVIDIA "Hopper" Might Have Huge 1000 mm² Die, Monolithic Design
You will own nothing and be happy.
MCM in consumer gaming hardware is inevitable, 3090 is the biggest example of that. Look at how inefficient that monolithic design is, look at how difficult it is to get clock increases when every bump you give it has to be applied to 10k+ cores on the same die. Whether or not it works well will have to be determined.
I am not saying multi die GPU in consumer graphics is impossible. But it will take a lot more time and effort than we expect to get there.
This echoes in many things in the past 3-5 generations of GPUs:
www.techpowerup.com/gpu-specs/geforce-gtx-980-ti.c2724
The best overclocker (in % perf, not necessarily peak clock, but even then, 1600 MHz was a unicorn lower in the stack) in the whole Maxwell stack was a top-tier part. Its peak clocks equaled those of lower-tier parts with power within spec, and temperature started playing a greater role as GPU Boost was introduced. These were the last 28 nm GPUs with fantastic yields.
Pascal:
www.techpowerup.com/gpu-specs/geforce-gtx-1080-ti.c2877
Review OC clocks on a properly cooled Gaming X:
Versus a GP104 part on the same cooler (1080):
And this was on a smaller node, straight from the get-go. GPU Boost was further refined. The architecture was stripped of anything non-gaming.
Since Turing, we saw a step back in clocks on a highly similar node, as Nvidia added new components to CUDA, but the small gap between parts in the stack remained.
Since Ampere, we saw another step back, considering this was another shrink and no clock speed was gained. This can be attributed again to further focus on new CUDA components, but also to Samsung's 8 nm node, which is definitely worse than anything TSMC has.