NVIDIA Blackwell's High Power Consumption Drives Cooling Demands; Liquid Cooling Penetration Expected to Reach 10% by Late 2024

TheLostSwede · Jul 30, 2024

With the growing demand for high-speed computing, more effective cooling solutions for AI servers are gaining significant attention. TrendForce's latest report on AI servers reveals that NVIDIA is set to launch its next-generation Blackwell platform by the end of 2024. Major CSPs are expected to start building AI server data centers based on this new platform, potentially driving the penetration rate of liquid cooling solutions to 10%.

Air and liquid cooling systems to meet higher cooling demands
TrendForce reports that the NVIDIA Blackwell platform will officially launch in 2025, replacing the current Hopper platform and becoming the dominant solution for NVIDIA's high-end GPUs, accounting for nearly 83% of all high-end products. High-performance AI server models like the B200 and GB200 are designed for maximum efficiency, with individual GPUs consuming over 1,000 W. HGX models will house 8 GPUs each, while NVL models will support 36 or 72 GPUs per rack, significantly boosting the growth of the liquid cooling supply chain for AI servers.

TrendForce highlights the increasing TDP of server chips, with the B200 chip's TDP reaching 1,000 W, making traditional air cooling solutions inadequate. The TDP of the GB200 NVL36 and NVL72 complete rack systems is projected to reach 70kW and nearly 140kW, respectively, necessitating advanced liquid cooling solutions for effective heat management.
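
As a rough sanity check on the rack figures above, the sketch below divides the quoted rack TDP evenly across the GPU count. The even split is an assumption for illustration only: the report gives no component breakdown, and a rack's total also covers CPUs, switches, and other hardware, so the per-GPU share is not a GPU TDP.

```python
# Figures quoted in the TrendForce summary above (approximate values).
RACK_TDP_KW = {"NVL36": 70, "NVL72": 140}
GPUS_PER_RACK = {"NVL36": 36, "NVL72": 72}
B200_TDP_W = 1000  # per-chip TDP quoted for the B200


def per_gpu_share_w(model: str) -> float:
    """Rack TDP split evenly across GPUs, in watts (illustrative only)."""
    return RACK_TDP_KW[model] * 1000 / GPUS_PER_RACK[model]


for model in RACK_TDP_KW:
    share = per_gpu_share_w(model)
    ratio = share / B200_TDP_W
    print(f"{model}: ~{share:.0f} W per GPU slot ({ratio:.2f}x the B200 figure)")
```

Both configurations work out to the same share per GPU slot, which is consistent with the NVL72 figure being roughly double the NVL36 figure.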

TrendForce observes that the GB200 NVL36 architecture will initially utilize a combination of air and liquid cooling solutions, while the NVL72, due to higher cooling demands, will primarily employ liquid cooling.

TrendForce identifies five major components in the current liquid cooling supply chain for GB200 rack systems: cold plates, coolant distribution units (CDUs), manifolds, quick disconnects (QDs), and rear door heat exchangers (RDHx).

The CDU is the critical system responsible for regulating coolant flow to maintain rack temperatures within the designated TDP range, preventing component damage. Vertiv is currently the main CDU supplier for NVIDIA AI solutions, with Chicony, Auras, Delta, and CoolIT undergoing continuous testing.

GB200 shipments expected to reach 60,000 units in 2025, making Blackwell the mainstream platform and accounting for over 80% of NVIDIA's high-end GPUs
In 2025, NVIDIA will target CSPs and enterprise customers with diverse AI server configurations, including the HGX, GB200 Rack, and MGX, with expected shipment ratios of 5:4:1. The HGX platform will seamlessly transition from the existing Hopper platform, enabling CSPs and large enterprise customers to adopt it quickly. The GB200 rack AI server solution will primarily target the hyperscale CSP market. TrendForce predicts NVIDIA will introduce the NVL36 configuration at the end of 2024 to quickly enter the market, with the more complex NVL72 expected to launch in 2025.

TrendForce forecasts that in 2025, GB200 NVL36 shipments will reach 60,000 racks, with Blackwell GPU usage between 2.1 and 2.2 million units.
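
The two forecast numbers can be cross-checked against each other. This is a minimal sketch that assumes every one of the 60,000 racks is counted as an NVL36 with 36 GPUs; the variable names are made up for the example.

```python
# Figures taken from the forecast above.
RACKS_2025 = 60_000
GPUS_PER_NVL36_RACK = 36
FORECAST_RANGE = (2_100_000, 2_200_000)  # quoted Blackwell GPU usage


def total_gpus(racks: int, gpus_per_rack: int) -> int:
    """GPUs implied by a rack count, assuming identical racks."""
    return racks * gpus_per_rack


implied = total_gpus(RACKS_2025, GPUS_PER_NVL36_RACK)
within = FORECAST_RANGE[0] <= implied <= FORECAST_RANGE[1]
print(f"Implied GPUs: {implied:,} (inside forecast range: {within})")
```

Under that assumption the rack count implies about 2.16 million GPUs, which falls inside the quoted range.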

However, there are several variables in the adoption of the GB200 Rack by end customers. TrendForce points out that the NVL72's power consumption of around 140kW per rack requires sophisticated liquid cooling solutions, which makes deployment challenging. Additionally, liquid-cooled rack designs are more suitable for new CSP data centers but involve complex planning processes. CSPs might also avoid being tied to a single supplier's specifications and opt for HGX or MGX models with x86 CPU architectures, or expand their self-developed ASIC AI server infrastructure for lower costs or specific AI applications.


Klemc · Jul 30, 2024

The global climate is getting hotter and hotter...

It's time for PCs to also participate in this phenomenon !

Daven · Jul 30, 2024

It’s really going to be up to us as consumers to reject these high power, expensive, obscenely high margin parts.

Our world will literally die the moment humans turn on millions of 1000W GPUs in order just to see better virtual water and light reflections in games and CGI films.

bonehead123 · Jul 30, 2024

Klemc said:
It's time for PCs to also participate in this phenomenon !

According to Ngreediya, this is already happening, and will only serve to increase prices on cooling parts overall, which will make them happy as this will give them yet ANUTHA reason to jack up GPU prices

R0H1T · Jul 30, 2024

Daven said:
It’s really going to be up to us as consumers to reject these high power, expensive, obscenely high margin parts.

If you look at the top parts in that leaked(?) image it shows a DC part, it's up to our corporate overlords to reject this "AI" BS now but that's obviously not happening :shadedshu:

Zazigalka · Jul 30, 2024

If there is one thing that I despise that company for, it's not the proprietary stuff that made its way into gaming, cause that's what usually drives the competition to develop the simpler but open version of it. It's the introduction of those high power GPUs for the purpose of crypto mining and data center.

Daven said:
It’s really going to be up to us as consumers to reject these high power, expensive, obscenely high margin parts.

Our world will literally die the moment humans turn on millions of 1000W GPUs in order just to see better virtual water and light reflections in games and CGI films.

well said. ain't happening though, people will buy 500w 5090s like crazy just to have the best. 4K RT on isn't enough apparently.

Klemc · Jul 30, 2024

Zazigalka said:
If there is one thing that I despise that company for, it's not the proprietary stuff that made its way into gaming, cause that's what usually drives the competition to develop the simpler but open version of it. It's the introduction of those high power GPUs for the purpose of crypto mining and data center.

If this uses RTX cores (tensor thingy) then it's getting worse and worse

Philaphlous · Jul 30, 2024

how the heck do they think you can get 1000W+ on 12V rails.... they're obviously going to have to scale up the voltage. I'm already concerned how they're able to pump that many watts into a GPU...imagine the amps from those MOSFETs. So what if it's like 1.4V but you're talking like 300-400 amps+. 2700W is nuts... that's 225 amps @ 12V. Do they realize you'd need like 0/1 gauge wire for the GPU... something seems fishy here...

Zazigalka · Jul 30, 2024

Klemc said:
If this uses RTX cores (tensor thingy) then it's getting worse and worse

I think tensor/rt cores are actually doing more good than bad, being able to accelerate very specific workloads at a fraction of the power that CUDA cores would require. The problem is nvidia going to downright stupid lengths in order to provide the best datacenter GPUs regardless of the crazy amount of power they require. I can't quote the exact number or article, but I remember reading something that said keeping ChatGPT and similar AI services operational requires more power daily than a few moderately sized countries need.

JustBenching · Jul 30, 2024

Zazigalka said:
If there is one thing that I despise that company for, it's not the proprietary stuff that made its way into gaming, cause that's what usually drives the competition to develop the simpler but open version of it. It's the introduction of those high power GPUs for the purpose of crypto mining and data center.

well said. ain't happening though, people will buy 500w 5090s like crazy just to have the best. 4K RT on isn't enough apparently.

Just limit it? At some point we need to realize the 5090, just like the 4090, is the most efficient chip, and thus the best at "protecting the environment".

Of course people will buy 5090s like crazy. Why wouldn't they? As I've said, even people that care about efficiency and power draw will buy 5090s cause they are the fastest at any power level.

phints · Jul 30, 2024

Eyes that GB200 with 384GB VRAM and 2.7kW power for PUBG... just need an industrial 480VAC feeder to power it.

TheinsanegamerN · Jul 30, 2024

Daven said:
It’s really going to be up to us as consumers to reject these high power, expensive, obscenely high margin parts.

If you don't want it, don't buy it? There's clearly plenty of us willing to pay for powerful GPUs. That's ALWAYS been the case. And people have whinged and moaned about high power expensive GPUs since the dawn of PCIe.

Daven said:
Our world will literally die the moment humans turn on millions of 1000W GPUs in order just to see better virtual water and light reflections in games and CGI films.

Klemc said:
The global climate is getting hotter and hotter...

It's time for PCs to also participate in this phenomenon !

It's time you DID something about it! Have you abandoned your PCs and returned to an agrarian society? Oh, no, you have not. Better get on it, you got a planet to save!

Jonny5isalivetm5 · Jul 30, 2024

Pretty sure the 2700 watt monster will not be for consumer PCs but let's see XD

TheinsanegamerN · Jul 30, 2024

Philaphlous said:
how the heck do they think you can get 1000W+ on 12V rails.... they're obviously going to have to scale up the voltage. I'm already concerned how they're able to pump that many watts into a GPU...imagine the amps from those MOSFETs. So what if it's like 1.4V but you're talking like 300-400 amps+. 2700W is nuts... that's 225 amps @ 12V. Do they realize you'd need like 0/1 gauge wire for the GPU... something seems fishy here...

They're not pushing 225 amps on one 12v line. Have you ever noticed that connectors have multiple wires? There are, in fact, SIX 12v lines on the ATX 3.0 connector. SIX. So divide that 225 amps by 6 and you get 37.5 per line. But wait, there's more!

A 1000w GPU is going to use 2 ATX 3.0 connectors. Split that power among the 12 12v lines, and that's just 6.94 amps per line. That's right around the power level of the old 8 pin connectors, per line.

400 amps at 1.4v is only 560 watts. We've had GPUs push more than that already. That 2700w card is going to be server only, likely using non-standard connectors to push that kind of power. It's not unusual either, high power cards have existed for a LONG time. Wouldn't surprise me if they went with 24v for those.
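
For anyone who wants to check the numbers in this post, here is a quick sketch of the arithmetic using I = P / V. The wire counts (six 12 V lines per connector, two connectors for a 1000 W card) are taken from the post itself, not from a spec sheet, and the function names are made up for the example.

```python
def amps(watts: float, volts: float) -> float:
    """Total current from power and voltage: I = P / V."""
    return watts / volts


def amps_per_wire(watts: float, volts: float, wires: int) -> float:
    """Current per wire, assuming the load is shared evenly."""
    return amps(watts, volts) / wires


# 2700 W at 12 V over one connector with six 12 V lines (count from the post).
total_2700 = amps(2700, 12)
per_wire_2700 = amps_per_wire(2700, 12, 6)

# 1000 W at 12 V over two connectors, i.e. twelve 12 V lines (from the post).
per_wire_1000 = amps_per_wire(1000, 12, 12)

# Core-side power for 400 A at 1.4 V: P = I * V.
core_watts = 400 * 1.4

print(f"2700 W @ 12 V: {total_2700:.1f} A total, {per_wire_2700:.1f} A per wire")
print(f"1000 W @ 12 V over 12 wires: {per_wire_1000:.2f} A per wire")
print(f"400 A @ 1.4 V: {core_watts:.0f} W")
```

Even load sharing across wires is an idealization; real connectors will not balance perfectly.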

Klemc · Jul 30, 2024

TheinsanegamerN said:
It's time you DID something about it! Have you abandoned your PCs and returned to an agrarian society? Oh, no, you have not. Better get on it, you got a planet to save!

I left France from 2010 to 2020... but living in Madagascar didn't make me any less of a polluter, and health goes downhill fast there too (didn't you know?).

Also, my late uncle was the main founder of the ECOLOGISTS (political movement) in France.

Bernard Charbonneau — Wikipédia

fr.wikipedia.org

Zazigalka · Jul 30, 2024

TheinsanegamerN said:
Have you abandoned your PCs and returned to an agrarian society? Oh, no, you have not. Better get on it, you got a planet to save!

No, but I've made a transition to using a zen4 laptop (7730U+Vega) as my main daily work/news/video PC, and switching the desktop on just for gaming sessions.
I also swapped the 3080 I had for 4070 Super just for desktop/gaming power reduction.

Daven · Jul 30, 2024

TheinsanegamerN said:
It's time you DID something about it! Have you abandoned your PCs and returned to an agrarian society? Oh, no, you have not. Better get on it, you got a planet to save!

Since you don’t know me, I’m 48 with no children, vegetarian for 20 years, walked to work for the last 15 years, my wife and I have mostly lived within 5 miles of work, own one small sedan that sits in the driveway, never owned a truck and built mostly SFF PCs with 65w CPUs and 150W GPUs.

My carbon foot print is extremely small for an American and I sleep fine at night. So I can easily say we need to reject such high power computer parts as much as we can. Oh and I was a political activist in my youth fighting for the environment.

londiste · Jul 30, 2024

Zazigalka said:
The problem is nvidia going to downright stupid lengths in order to provide the best datacenter GPUs regardless of the crazy amount of power they require.

It is pretty simple with DCs - efficiency is king. If that 2700W thing does more work per power unit than competitors, it gets used.

SIGSEGV · Jul 30, 2024

No problem, and nothing to worry about.
Your believers still want to buy even if your products draw 10 kilowatts.

TheDeeGee · Jul 30, 2024

bonehead123 said:
According to Ngreediya, this is already happening, and will only serve to increase prices on cooling parts overall, which will make them happy as this will give them yet ANUTHA reason to jack up GPU prices

Who is Ngreediya?

I only know AlwaysMuchDaft.

JustBenching · Jul 30, 2024

londiste said:
It is pretty simple with DCs - efficiency is king. If that 2700W thing does more work per power unit than competitors, it gets used.

Yeah, it's sad that on a tech forum people don't understand the difference between power draw and efficiency. They see nvidia in the title and go berserk.

napata · Jul 30, 2024

Daven said:
It’s really going to be up to us as consumers to reject these high power, expensive, obscenely high margin parts.

Our world will literally die the moment humans turn on millions of 1000W GPUs in order just to see better virtual water and light reflections in games and CGI films.

This is all B2B so it's not like consumers can do anything about it. These get sold by the rack for millions of dollars. It's even in the article how a single NVL72 rack draws 140 kW, and these most likely run 24/7, unlike consumer GPUs. That's significantly more power than I use for everything combined and I own a 4090.

evernessince · Jul 30, 2024

The sooner faster ASICs come to the market for AI the better. This kind of power consumption is nowhere near where it needs to be. The human brain uses a mere 20 w of power and crunches a theoretical exaflop of data per second according to the NIST.

TheinsanegamerN said:
It's time you DID something about it! Have you abandoned your PCs and returned to an agrarian society? Oh, no, you have not. Better get on it, you got a planet to save!

It would be counter-intuitive to return to an agrarian society given that single persons / families farming their own food would be very inefficient, especially compared to today's farms that use GPS and precision systems to vastly improve efficiency and reduce waste. Stopping climate change isn't about halting all carbon output, it's going to be a combination of efficiency improvements, green energy, and carbon capture. Most people will not have to make significant changes to their lifestyle, it will be a matter of more efficient cars, hot water tanks, etc. in combination with carbon capture tech. This is why people call on business to make changes, as they have a lot of carbon output from their activities and the products they produce have a downstream impact on customers as well.

There's no purity requirements to lodge a pro-environment argument or any argument for that matter. This is just a typical logical fallacy employed to discount a person's opinion without actually providing any substance against their argument.

R0H1T · Jul 30, 2024

evernessince said:
The human brain uses a mere 20 w of power and crunches a theoretical exaflop of data per second according to the NIST.

We should probably start growing more brains, as they say win-win :laugh:

MaMoo · Jul 30, 2024

evernessince said:
The sooner faster ASICs come to the market for AI the better. This kind of power consumption is nowhere near where it needs to be. The human brain uses a mere 20 w of power and crunches a theoretical exaflop of data per second according to the NIST.

It would be counter-intuitive to return to an agrarian society given that single persons / families farming their own food would be very inefficient, especially compared to today's farms that use GPS and precision systems to vastly improve efficiency and reduce waste. Stopping climate change isn't about halting all carbon output, it's going to be a combination of efficiency improvements, green energy, and carbon capture. Most people will not have to make significant changes to their lifestyle, it will be a matter of more efficient cars, hot water tanks, etc. in combination with carbon capture tech. This is why people call on business to make changes, as they have a lot of carbon output from their activities and the products they produce have a downstream impact on customers as well.

There's no purity requirements to lodge a pro-environment argument or any argument for that matter. This is just a typical logical fallacy employed to discount a person's opinion without actually providing any substance against their argument.

Good points. I will just point out that regardless of ASICs, the current generation of AI algorithms is very inefficient compared to human learning. We generalize and extrapolate far more efficiently: for example, we don't need so much training data to identify animals, and we can combine both reductive analysis and systems methods to make inferences. I think this current generation of AI, typical of the third renaissance of AI, is going to need a few more renaissances to become more efficient, aside from ASICs.
