Tuesday, March 22nd 2022
NVIDIA H100 is a Compute Monster with 80 Billion Transistors, New Compute Units and HBM3 Memory
During the GTC 2022 keynote, NVIDIA announced the newest addition to its accelerator card family. Called the NVIDIA H100 accelerator, it is the company's most powerful creation ever. Built from 80 billion transistors on TSMC's 4N (4 nm-class) process, the H100 can output some insane performance, according to NVIDIA. Featuring a new fourth-generation Tensor Core design, it can deliver a six-fold performance increase compared to A100 Tensor Cores and a two-fold MMA (Matrix Multiply-Accumulate) improvement. Additionally, new DPX instructions accelerate dynamic-programming algorithms by up to seven times over the previous A100 accelerator. Thanks to the new Hopper architecture, the Streaming Multiprocessor (SM) structure has been optimized for better transfer of large data blocks.
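To make the DPX claim concrete, below is a minimal sketch of the kind of recurrence those instructions target: the fused add-then-min step of the Floyd-Warshall all-pairs shortest-path algorithm, a classic dynamic-programming workload. The kernel is plain CUDA and runs on any GPU; the graph size and weights are hypothetical, chosen only to keep the example self-contained.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// One Floyd-Warshall relaxation pass for intermediate vertex k:
//   dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])
// This add-then-min recurrence is the dynamic-programming pattern NVIDIA
// says DPX accelerates on Hopper; the kernel itself is ordinary CUDA and
// works on any architecture, just without the hardware speed-up.
__global__ void fw_relax(int *dist, int n, int k)
{
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    if (i < n && j < n) {
        int through_k = dist[i * n + k] + dist[k * n + j];
        if (through_k < dist[i * n + j])
            dist[i * n + j] = through_k;
    }
}

int main()
{
    const int n = 512;         // hypothetical graph size
    const int INF = 100000000; // "no edge" marker, safe to add twice in an int
    int *dist;
    cudaMallocManaged(&dist, (size_t)n * n * sizeof(int));

    // Simple chain graph 0 -> 1 -> ... -> n-1 with unit edge weights.
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            dist[i * n + j] = (i == j) ? 0 : (j == i + 1 ? 1 : INF);

    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
    for (int k = 0; k < n; ++k)  // one relaxation pass per intermediate vertex
        fw_relax<<<grid, block>>>(dist, n, k);
    cudaDeviceSynchronize();

    printf("shortest 0 -> %d: %d hops\n", n - 1, dist[n - 1]); // expect 511
    cudaFree(dist);
    return 0;
}
```

Launching one kernel per intermediate vertex k is the standard way to parallelize Floyd-Warshall, since row k and column k do not change during pass k.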
The full GH100 chip implementation features 144 SMs with 128 FP32 CUDA cores per SM, resulting in 18,432 CUDA cores in the maximum configuration. The NVIDIA H100 GPU in the SXM5 board form-factor features 132 SMs, totaling 16,896 CUDA cores, while the PCIe 5.0 add-in card has 114 SMs, totaling 14,592 CUDA cores. As much as 80 GB of HBM3 memory surrounds the GPU at 3 TB/s of bandwidth. Interestingly, the SXM5 variant carries a very large TDP of 700 Watts, while the PCIe card is limited to 350 Watts; this is possible because of the better cooling solutions available in the SXM form-factor. As far as performance figures are concerned, NVIDIA quotes separate numbers for the SXM and PCIe implementations. You can check out the performance estimates in various precision modes below. You can read more about the Hopper architecture and what makes it special in this whitepaper published by NVIDIA.
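As a sanity check on those core counts, here is a small sketch in CUDA. The 128-cores-per-SM constant and the three SM counts come from the article; the runtime query at the end assumes a CUDA-capable GPU is present at device 0.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main()
{
    const int cores_per_sm = 128;  // FP32 CUDA cores per Hopper SM (per the article)

    // Static check of the article's figures:
    // full GH100: 144 SMs, H100 SXM5: 132 SMs, H100 PCIe: 114 SMs.
    const int sm_counts[] = { 144, 132, 114 };
    for (int sms : sm_counts)
        printf("%3d SMs x %d = %5d CUDA cores\n",
               sms, cores_per_sm, sms * cores_per_sm);

    // Runtime check on whatever GPU is actually installed.
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) == cudaSuccess)
        printf("%s: %d SMs -> %d cores (assuming 128 FP32 cores/SM)\n",
               prop.name, prop.multiProcessorCount,
               prop.multiProcessorCount * cores_per_sm);
    return 0;
}
```

The static loop prints 18,432, 16,896, and 14,592, matching the figures above; on an actual H100 SXM5 the runtime query should report 132 SMs.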
29 Comments on NVIDIA H100 is a Compute Monster with 80 Billion Transistors, New Compute Units and HBM3 Memory
1. Turing
2. Ampere
3. Lovelace
4. Hopper
Lovelace is 5nm.
Anyway, something something Crysis.
is this a fucking joke at this point?????????????????
I really hate the increase in TDP, but it isn't only NVIDIA; it seems to be industry-wide due to process advancements.
Regarding Ada Lovelace, I expect the $449-$499 (cut-down AD104, 5 nm) part to have around +15% performance/W vs. Navi 33 (6 nm).
I read somewhere that AD106 & AD107 are 6 nm, not 5 nm, but I don't know if that's true.
Edit: it seems too big for 4 nm (only -6% logic density scaling vs. 5 nm) for only 80 billion transistors. I'm way off with my calculations, nearly 100 mm² off, enough to house 240 MB of L3 cache, so surely I'm doing something wrong.
Damn, I'll bet this performance monster can do 8K with no problem. I'm sure the high-end cards will also be reassuringly unaffordable, making any reviews academic.
Huge number crunchers built for massively parallel tasks. Not much to balance as with gaming GPUs. Just take the biggest die and cram as much as you can into it.