Tuesday, March 22nd 2022

NVIDIA H100 is a Compute Monster with 80 Billion Transistors, New Compute Units and HBM3 Memory

During the GTC 2022 keynote, NVIDIA announced the newest addition to its family of accelerator cards. Called the NVIDIA H100 accelerator, it is the company's most powerful creation ever. Built on TSMC's custom 4N (4 nm) process and packing 80 billion transistors, the H100 delivers staggering performance figures, according to NVIDIA. Featuring a new fourth-generation Tensor Core design, it can deliver a six-fold performance increase over the A100's Tensor Cores and a two-fold improvement in MMA (Matrix Multiply-Accumulate) rates. Additionally, new DPX instructions accelerate dynamic programming algorithms by up to seven times compared to the previous A100 accelerator. Thanks to the new Hopper architecture, the Streaming Multiprocessor (SM) structure has also been optimized for better transfer of large data blocks.
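To illustrate the kind of workload the DPX instructions target, here is a minimal CUDA sketch of a dynamic-programming cell update (a simplified Smith-Waterman-style recurrence built from additions and max operations). The kernel and its parameter names are illustrative assumptions, not NVIDIA code; the point is that Hopper claims to run exactly this sort of fused add/max chain up to seven times faster.

// Minimal sketch (illustrative, not NVIDIA code): a simplified Smith-Waterman-style
// dynamic-programming row update, i.e. the add-then-max pattern DPX instructions target.
#include <cuda_runtime.h>

__global__ void dp_row_update(const int *prev_row,   // H[i-1][*] from the previous row
                              const int *scores,     // substitution score for each column
                              int *curr_row,         // H[i][*] being computed
                              int gap_penalty, int width)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (j == 0 || j >= width) return;

    // Classic recurrence: H[i][j] = max(0, H[i-1][j-1] + score, H[i-1][j] - gap).
    // The in-row dependency on H[i][j-1] is omitted to keep the sketch trivially parallel.
    int diag = prev_row[j - 1] + scores[j];
    int up   = prev_row[j] - gap_penalty;
    curr_row[j] = max(max(diag, up), 0);
}

On pre-Hopper GPUs this compiles to ordinary integer math; NVIDIA's claim is that Hopper executes such dynamic-programming inner loops up to seven times faster thanks to the DPX instructions.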

The full GH100 chip implementation features 144 SMs with 128 FP32 CUDA cores per SM, resulting in 18,432 CUDA cores in the maximum configuration. The NVIDIA H100 GPU in the SXM5 board form factor features 132 SMs, totaling 16,896 CUDA cores, while the PCIe 5.0 add-in card has 114 SMs, totaling 14,592 CUDA cores. Up to 80 GB of HBM3 memory surrounds the GPU, providing 3 TB/s of bandwidth. Interestingly, the SXM5 variant carries a very large TDP of 700 Watts, while the PCIe card is limited to 350 Watts; this is a result of the better cooling solutions available for the SXM form factor. As far as performance figures are concerned, the SXM and PCIe versions each come with their own set of numbers. You can check out the performance estimates in various precision modes below. You can read more about the Hopper architecture and what makes it special in the whitepaper published by NVIDIA.
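The CUDA-core totals quoted above follow directly from the SM counts at 128 FP32 cores per SM; here is a quick sanity check as host-side C++ (illustrative only, compiles with nvcc or any C++ compiler):

// Quick check of the CUDA-core totals quoted above (illustrative only).
#include <cstdio>

int main()
{
    const int cores_per_sm = 128;              // FP32 CUDA cores per Hopper SM
    const int sm_counts[]  = {144, 132, 114};  // full GH100, H100 SXM5, H100 PCIe
    const char *names[]    = {"Full GH100", "H100 SXM5 ", "H100 PCIe "};

    for (int i = 0; i < 3; ++i)
        std::printf("%s: %d SMs x %d = %d CUDA cores\n",
                    names[i], sm_counts[i], cores_per_sm,
                    sm_counts[i] * cores_per_sm);
    return 0;  // prints 18,432 / 16,896 / 14,592
}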

29 Comments on NVIDIA H100 is a Compute Monster with 80 Billion Transistors, New Compute Units and HBM3 Memory

#1
DeathtoGnomes
AleksandarK: features a very large TDP of 700 Watts
JFC! Can't imagine a rack full of these. Might as well run them submerged in LN2. o_O
#2
Dimitriman
So.. does this mean Ada Lovelace is 4nm as well?
#3
Sabotaged_Enigma
I've become so numb, I can't feel you (referring to new launches) there...
#4
ncrs
Dimitriman: So.. does this mean Ada Lovelace is 4nm as well?
Not really, the current generation had 7nm TSMC for A100 and 8nm Samsung for RTX 3000 series, so we don't know yet what will happen with RTX 4000 series.
#5
samum
Dimitriman: So.. does this mean Ada Lovelace is 4nm as well?
Pretty sure this is Hopper. "Fourth-generation Tensor Core"

1. Turing
2. Ampere
3. Lovelace
4. Hopper

Lovelace is 5nm.

Anyway, something something Crysis.
#6
ncrs
samum: Pretty sure this is Hopper. "Fourth-generation Tensor Core"

1. Turing
2. Ampere
3. Lovelace
4. Hopper

Lovelace is 5nm.

Anyway, something something Crysis.
Volta was the first generation with Tensor Cores.
#7
noel_fs
yet no GPU availability for the average consumer

is this a fucking joke at this point?????????????????
#8
N/A
N4 is an enhanced N5 with ~6% smaller die area via optical shrink and lower complexity via mask-cost reduction, so it is essentially a cheaper N5 plus.
#9
CyberCT
noel_fs: yet no GPU availability for the average consumer

is this a fucking joke at this point?????????????????
There's GPU availability now. The higher-end 3000 series are still inflated price-wise, but the 3070 Ti can now be had for under $900 and is in stock. What was the "supposed" launch price? $600, before all the OEMs said prices were going up regardless? We're almost there ...
#10
dj-electric
noel_fs: yet no GPU availability for the average consumer

is this a fucking joke at this point?????????????????
There's no "yet" here. GPUs for the server market are much more profitable than those for individual clients.
#11
R-T-B
noel_fs: yet no GPU availability for the average consumer

is this a fucking joke at this point?????????????????
No, you just aren't first priority and haven't been for some time.
#12
BArms
noel_fs: yet no GPU availability for the average consumer

is this a fucking joke at this point?????????????????
I blame all the idiot companies putting "smart features" into toasters, flip flops, fridges, toilets and fragrance dispensers as much for the chip shortages as anyone.
#13
Minus Infinity
ncrs: Not really, the current generation had 7nm TSMC for A100 and 8nm Samsung for RTX 3000 series, so we don't know yet what will happen with RTX 4000 series.
It's 5nm for Lovelace, at least the higher-end 4070-4090 range. RDNA3 is also 5nm for the higher end, 5nm and 6nm for mid-range, and 6nm for lower end.
#14
Denver
R-T-B: No, you just aren't first priority and haven't been for some time.
It's weird that the market that was for a long time the most profitable for Nvidia is not a priority. lol
#15
Tomorrow
noel_fs: yet no GPU availability for the average consumer

is this a fucking joke at this point?????????????????
30 series has been available for months. The problem has been the price.
#16
R-T-B
Denver: It's weird that the market that was for a long time the most profitable for Nvidia is not a priority. lol
Hasn't been first for nearly a decade. Profitability is a fickle thing.
#17
ncrs
Minus Infinity: It's 5nm for Lovelace, at least the higher-end 4070-4090 range. RDNA3 is also 5nm for the higher end, 5nm and 6nm for mid-range, and 6nm for lower end.
I've seen the leak claiming 4060-4090 being on 5nm as well, with the AMD stuff being split into 5nm and 6nm because the former was RDNA3 and the latter Navi 2, so it's a different situation to NVIDIA's. All in all, it's just a leak and we'll have to wait for official announcements.
#18
steen
Minus Infinity: RDNA3 is also 5nm for the higher end, 5nm and 6nm for mid-range, and 6nm for lower end.
Not quite correct.
ncrs: with the AMD stuff being split into 5nm and 6nm because the former was RDNA3 and the latter Navi 2
Not entirely correct, either.
#19
AusWolf
I really hope this is not indicative of desktop Ada Lovelace power consumption, though "I have a baaad feeling about this". :wtf:
#20
ModEl4
For -20% performance (48 vs 60 FP64 Tensor TFLOPS), the power goes from 700 W (SXM/NVLink) down to 350 W (PCI-Express 5.0).
I really hate the increase in TDP, but it isn't only Nvidia; it seems to be industry-wide due to process advancements.
Regarding Ada Lovelace, I expect the $499-$449 part (cut-down AD104, 5 nm) to have around +15% performance/W vs Navi 33 (6 nm).
I read somewhere that AD106 & AD107 are 6 nm, not 5 nm, but I don't know if it's true.
Edit: it seems too big for 4 nm (only -6% logic density scaling vs 5 nm) for only 80 billion transistors. I'm way off with my calculations, nearly 100 mm² off, enough to house 240 MB of L3 cache, so surely I'm doing something wrong.
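For reference, the FP64-Tensor performance-per-watt trade-off described above works out roughly as follows; a back-of-the-envelope calculation using the 60/48 TFLOPS and 700/350 W figures quoted in this thread, not official numbers:

// Back-of-the-envelope perf/W comparison using the figures quoted above (not official data).
#include <cstdio>

int main()
{
    const double sxm_tflops  = 60.0, sxm_watts  = 700.0;  // H100 SXM5 (NVLink), FP64 Tensor
    const double pcie_tflops = 48.0, pcie_watts = 350.0;  // H100 PCIe 5.0, FP64 Tensor

    std::printf("SXM5: %.3f TFLOPS/W\n", sxm_tflops / sxm_watts);    // ~0.086
    std::printf("PCIe: %.3f TFLOPS/W\n", pcie_tflops / pcie_watts);  // ~0.137
    // The PCIe card trades ~20% of the FP64 Tensor throughput for half the power,
    // i.e. roughly 1.6x the performance per watt.
    return 0;
}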
#21
noel_fs
Tomorrow30 series has been available for months. The problem has been the price.
have you ever heard of supply and demand
#22
caroline!
noel_fs: have you ever heard of supply and demand
Scalpers and massive mining operations hoarding cards? Yeah, heard about those.
#23
Tomorrow
noel_fs: have you ever heard of supply and demand
noel_fs: yet no GPU availability for the average consumer
My reply was to your quote about no availability. Availability has not been a big problem since last year.
#24
qubit
Overclocked quantum bit
"Featuring a new fourth-generation Tensor Core design, it can deliver a six-fold performance increase compared to A100 Tensor Cores and a two-fold MMA (Matrix Multiply Accumulate) improvement."

Damn, I'll bet this performance monster can do 8K with no problem. I'm sure that the high end cards will also be reassuringly unaffordable, making any reviews academic.
#25
medi01
Huang (greedia's CEO) himself, after AMD rolled out some serious shit on that market, admitted that such compute GPUs are fairly trivial.

Huge number crunchers built for massively parallel tasks. Not much to balance as with gaming GPUs. Just take the biggest die and cram as much as you can into it.