Friday, January 19th 2024

Meta Will Acquire 350,000 H100 GPUs Worth More Than 10 Billion US Dollars

Mark Zuckerberg has shared some interesting insights about Meta's AI infrastructure buildout, which is on track to include an astonishing number of NVIDIA H100 Tensor Core GPUs. In a post on Instagram, Meta's CEO noted the following: "We're currently training our next-gen model Llama 3, and we're building massive compute infrastructure to support our future roadmap, including 350k H100s by the end of this year -- and overall almost 600k H100s equivalents of compute if you include other GPUs." In other words, the company will add 350,000 H100 GPUs on top of its existing accelerators, which deliver roughly another 250,000 H100s' worth of computing power, for a total of almost 600,000 H100-equivalent GPUs.

The raw number of GPUs installed comes at a steep price. With the average selling price of an H100 GPU nearing 30,000 US dollars, the investment will set the company back around $10.5 billion. Other GPUs will be part of the infrastructure as well, but most of it will comprise the NVIDIA Hopper family. Additionally, Meta is currently training the Llama 3 AI model, which will be much more capable than the existing Llama 2 family, with better reasoning, coding, and math-solving capabilities. These models will be open source. Further down the pipeline, as artificial general intelligence (AGI) comes into play, Zuckerberg noted that "Our long term vision is to build general intelligence, open source it responsibly, and make it widely available so everyone can benefit." So, expect to see these models in GitHub repositories in the future.
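The $10.5 billion figure is straightforward back-of-the-envelope math; both inputs are estimates rather than disclosed prices, as a quick sketch shows:

```python
# Rough cost check: 350,000 H100s at an assumed ~$30,000 average
# selling price per card (neither figure is officially disclosed).
h100_count = 350_000
avg_selling_price_usd = 30_000

total_cost_usd = h100_count * avg_selling_price_usd
print(f"${total_cost_usd / 1e9:.1f} billion")  # → $10.5 billion
```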
Source: Mark Zuckerberg (Instagram)

53 Comments on Meta Will Acquire 350,000 H100 GPUs Worth More Than 10 Billion US Dollars

#26
Easo
Six_Times: "We're currently training"

I wonder how many people know what that really means.
They probably imagine something similar to training a puppy or a small child. Well, that is about as accurate as calling this "AI," even though everything we have is nothing more than pure math.
P.S.
It was interesting to learn about GPT hallucinations; that stuff is going to make more headlines when people do not check the results. Like that lawyer, for example.
nguyen: when I can get Android girl, I will gladly trade human race for them
Well, Musk already failed us with the promised genetically engineered catgirls; this will too.
#27
trsttte
the54thvoid: "I'm the child of a monster!"
Daven: We know now that compute GPUs are eating x86 server CPUs' lunch
Not only that, but bad business practices and lack of improvement are motivating companies to simply roll out their own ARM CPUs designed to their precise requirements.
FeelinFroggy: This is why Nvidia's stock will split this year and why Intel wants in the GPU space.
Lol, Nvidia's stock is hilariously inflated by this AI trend; the bubble will burst soon enough.
Vya Domus: This sends two different messages to facebook and nvidia shareholders:
A third message/question would be "weren't you going all-in on the metaverse just a month ago!? What's the strategy?" but I guess the cat has been out of the bag for a while that that play was dead (still cool to see Apple jumping onto a sinking ship now :D)
#28
Daven
Prima.Vera: CPUs and GPUs work together; they don't exclude each other. Don't worry, high-end enterprise CPUs are alive and well. Now ARM has also entered the datacenter and supercomputing market.
That is incorrect from a CPU financial standpoint. Server farms used to be made of 2-, 4-, and 8-socket Intel CPUs, for a total of tens of thousands of processors. And that was it. No GPUs.

Now server farms are built with far fewer CPUs, usually only in dual sockets. The rest of the compute power comes from GPUs. While both exist simultaneously in the same farm, the number of CPUs has dropped by the thousands.

For example, a 2005 server farm could have 100,000 CPUs in, let's say, 12,500 nodes (8 CPUs per node). Each 8-socket-capable CPU would cost $30,000 due to the number of coherent interlinks needed to connect 8 sockets together. That's $3 billion in CPU revenue.

In 2023, that same 12,500-node server farm only needs two CPUs per node, at a much lower $10,000 per CPU due to fewer interlinks. That's now just $250 million in CPU revenue. The rest of the compute power is GPUs.

In this hypothetical, that's 12 times less CPU revenue!!!
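The hypothetical above works out; a quick sketch of the same numbers (all figures are the commenter's illustrative assumptions, not real pricing data):

```python
# Hypothetical CPU revenue from the same 12,500-node farm,
# 2005 (8-socket nodes at $30k/CPU) vs. 2023 (dual-socket at $10k/CPU).
nodes = 12_500

cpus_2005 = nodes * 8               # 100,000 CPUs
revenue_2005 = cpus_2005 * 30_000   # $3.0 billion

cpus_2023 = nodes * 2               # 25,000 CPUs
revenue_2023 = cpus_2023 * 10_000   # $250 million

print(revenue_2005 / revenue_2023)  # → 12.0
```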
#29
ThrashZone
Hi,
Yep, all we need now is an AI mining boom to drive GPU/... hardware prices up :slap:
#30
trsttte
ThrashZone: Hi,
Yep, all we need now is an AI mining boom to drive GPU/... hardware prices up :slap:
Just like crypto coin mining, it will be very temporary. GPUs are great for getting resources fast, but anyone serious about pursuing AI will develop their own dedicated hardware, like "late to the game" Google has been doing for more than a decade, like Tesla is doing for autonomous driving, like Microsoft started doing once it started seeing the server bills from OpenAI, etc., etc...
#31
mechtech
FeelinFroggy: This is why Nvidia's stock will split this year and why Intel wants in the GPU space.
www.nvidia.com/en-us/data-center/h100/

At 300 W-350 W each and up, I'd start investing in power companies lol

350 W x 350,000 = 122,500,000 W = 122.5 MW; 122.5 MW x $65/MWh = $7,962.50/hr; x 8,760 hr/yr = $69,751,500 electric bill/yr (assuming $65/MWh)
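The electricity estimate checks out step by step. Note that 350 W is roughly the H100 PCIe board power; SXM parts draw more, and the $65/MWh rate is an assumption, so this is a floor:

```python
# Annual electricity cost for 350,000 GPUs at 350 W each,
# assuming a flat $65/MWh and 24/7 operation.
watts_per_gpu = 350
gpu_count = 350_000
price_per_mwh = 65           # assumed wholesale rate, $/MWh
hours_per_year = 8_760

total_mw = watts_per_gpu * gpu_count / 1e6       # 122.5 MW
cost_per_hour = total_mw * price_per_mwh         # $7,962.50/hr
annual_cost = cost_per_hour * hours_per_year     # $69,751,500/yr
print(f"${annual_cost:,.0f}")  # → $69,751,500
```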
#32
kondamin
trsttte: Lol, Nvidia's stock is hilariously inflated by this AI trend; the bubble will burst soon enough.
Yes it is, but no it won't.
The only way that will crash is if some fools start a fight over an island and that one company that makes everything suffers damages.

Those top stocks serve as a refuge; they are considered safer than most nations' government bonds.
#33
Dr. Dro
Daven: That is incorrect from a CPU financial standpoint. Server farms used to be made of 2-, 4-, and 8-socket Intel CPUs, for a total of tens of thousands of processors. And that was it. No GPUs.

Now server farms are built with far fewer CPUs, usually only in dual sockets. The rest of the compute power comes from GPUs. While both exist simultaneously in the same farm, the number of CPUs has dropped by the thousands.

For example, a 2005 server farm could have 100,000 CPUs in, let's say, 12,500 nodes (8 CPUs per node). Each 8-socket-capable CPU would cost $30,000 due to the number of coherent interlinks needed to connect 8 sockets together. That's $3 billion in CPU revenue.

In 2023, that same 12,500-node server farm only needs two CPUs per node, at a much lower $10,000 per CPU due to fewer interlinks. That's now just $250 million in CPU revenue. The rest of the compute power is GPUs.

In this hypothetical, that's 12 times less CPU revenue!!!
At the same time, the consumer market has grown significantly; they're shifting chips, alright. Besides, if they were really desperate to get product moving, they wouldn't be trying their absolute best to keep server-class gear out of the hands of enthusiasts and power users.

I'd MUCH rather own a basic 8-core Ryzen Threadripper or Emerald Rapids Xeon W than this i9-13900KS. Even if it's actually slower in terms of absolute compute, it's got the quad-channel memory, the PCIe expansion... but nope, this isn't even a choice; they're not available at all in the DIY channel, and are priced very much in line with enterprise gear instead. I do not expect this to change; stock prices are comfy, as you can see. It looks like their business strategy is on point.
#34
RandallFlagg
Daven: That is incorrect from a CPU financial standpoint. Server farms used to be made of 2-, 4-, and 8-socket Intel CPUs, for a total of tens of thousands of processors. And that was it. No GPUs.

Now server farms are built with far fewer CPUs, usually only in dual sockets. The rest of the compute power comes from GPUs. While both exist simultaneously in the same farm, the number of CPUs has dropped by the thousands.

For example, a 2005 server farm could have 100,000 CPUs in, let's say, 12,500 nodes (8 CPUs per node). Each 8-socket-capable CPU would cost $30,000 due to the number of coherent interlinks needed to connect 8 sockets together. That's $3 billion in CPU revenue.

In 2023, that same 12,500-node server farm only needs two CPUs per node, at a much lower $10,000 per CPU due to fewer interlinks. That's now just $250 million in CPU revenue. The rest of the compute power is GPUs.

In this hypothetical, that's 12 times less CPU revenue!!!
This is nice in theory and for fans of AI narratives, but false in practice. It might become true, but it's not even close right now.

Server farms currently consist primarily of HCI (hyperconverged infrastructure).

This is where you have companies like VMware, HPE (HP Enterprise), Dell/EMC, Nutanix, and Cisco UCS.

A Cisco X-Series chassis, for example, can have 7 nodes (blades), each blade with 4 x 60-core Xeons (240 cores) and 16 TB of RAM, for a total of 1,680 cores in a single 7RU form factor.

This fits in a 12.25"-high rack-mount chassis.

There are bajillions of these things in server farms all over the country.
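The density figure quoted above is internally consistent; a quick sketch of the per-chassis core math (the blade and socket counts are the commenter's example, not an exhaustive spec):

```python
# Core count for one 7RU chassis: 7 blades, each with 4 x 60-core Xeons.
blades = 7
sockets_per_blade = 4
cores_per_socket = 60

cores_per_blade = sockets_per_blade * cores_per_socket  # 240 cores/blade
total_cores = blades * cores_per_blade                  # 1,680 cores/chassis
print(total_cores)  # → 1680
```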
#35
chodaboy19
Microsoft offered to buy FB for $15 billion in 2007 and everyone dismissed FB as a fad; the rest is history.

The computing industry is moving from a store-and-retrieve model to one of generated answers. It's a change that is happening as we speak. You can ignore it, but it is happening, and it's the future.
#36
Daven
RandallFlagg: This is nice in theory and for fans of AI narratives, but false in practice. It might become true, but it's not even close right now.

Server farms currently consist primarily of HCI (hyperconverged infrastructure).

This is where you have companies like VMware, HPE (HP Enterprise), Dell/EMC, Nutanix, and Cisco UCS.

A Cisco X-Series chassis, for example, can have 7 nodes (blades), each blade with 4 x 60-core Xeons (240 cores) and 16 TB of RAM, for a total of 1,680 cores in a single 7RU form factor.

This fits in a 12.25"-high rack-mount chassis.

There are bajillions of these things in server farms all over the country.
It's all in the financials. Intel's data center revenue has been falling.
#37
RandallFlagg
Daven: It's all in the financials. Intel's data center revenue has been falling.
Intel's data center offerings are currently weak. AMD had 21% quarter-over-quarter growth.

I'm telling you what is. I've worked at a multibillion-dollar company for 30 years, and I think we have like 3 or 4 dozen GPUs, used in some very specific areas like transportation. We literally have thousands of normal servers.
#38
remixedcat
AMD EPYC servers are outperforming a lot of Intel Xeons.
#39
mb194dc
Who knows what Zuck is up to? He thinks we'll all be wearing Metaverse AI Ray-Bans in 5 years? Maybe, I guess.

Meanwhile, Meta is still laying people off as well.
#40
Pumper
mb194dc: Meanwhile, Meta is still laying people off as well.
Makes sense. They are being replaced by the $10B worth of "AI".
#41
LazyGamer
FoulOnWhite: Nvidia must be celebrating this news.
Jensen immediately went out and bought a lorry load of new leather jackets.
#42
Denver
Stealing data requires a lot of computing power, but why invest in something already outdated? The MI300X beats it, and even Nvidia already has the H200.

I thought this Mark with the strange surname was more of a strategist.

The race to lower operating costs will begin at OpenAI, I guess.
#43
Dirt Chip
Just for reference, how many H100s is NV itself using?

Those numbers might seem like a lot, but I think it's just average at best for a company of that magnitude and ambition (Meta).

When you consider that this $10B investment is worthwhile to them based on the free data each Facebook user generates, expect to see life-changing computer-based processes emerge in the coming years.
#44
Daven
Denver: Stealing data requires a lot of computing power, but why invest in something already outdated? The MI300X beats it, and even Nvidia already has the H200.

I thought this Mark with the strange surname was more of a strategist.

The race to lower operating costs will begin at OpenAI, I guess.
The answer to your question: availability at volume.
#45
chrcoluk
Given the rapid evolution of AI chips, this seems kind of dumb to me, unless the investment program includes upgraded models in the package; otherwise they're spending $10 billion on something that will be obsolete in a few years.
#46
trsttte
chrcoluk: Given the rapid evolution of AI chips, this seems kind of dumb to me, unless the investment program includes upgraded models in the package; otherwise they're spending $10 billion on something that will be obsolete in a few years.
In a few years, the only way to stay competitive will be to roll out specialized hardware, designed in-house or not. Thing is, today Facebook is playing catch-up, and they need compute power by 9 AM Monday morning, so buying a boatload of GPUs is the way to get it.
#47
marios15
They are definitely not paying market prices for those H100s, probably half or less.

They are also getting another 250k "equivalent".
There wasn't a price disclosure, just numbers.
#48
kondamin
mechtech: www.nvidia.com/en-us/data-center/h100/

At 300 W-350 W each and up, I'd start investing in power companies lol

350 W x 350,000 = 122,500,000 W = 122.5 MW; 122.5 MW x $65/MWh = $7,962.50/hr; x 8,760 hr/yr = $69,751,500 electric bill/yr (assuming $65/MWh)
I doubt that is going to be a comfortable industry for long; regulations, price controls, and other political mumbo jumbo.
#49
Denver
trsttte: In a few years, the only way to stay competitive will be to roll out specialized hardware, designed in-house or not. Thing is, today Facebook is playing catch-up, and they need compute power by 9 AM Monday morning, so buying a boatload of GPUs is the way to get it.
And when competitors running more efficient hardware start offering the same service, better and cheaper... what's the point? Hm.
#50
Solaris17
Super Dainty Moderator
kondamin: I doubt that is going to be a comfortable industry for long; regulations, price controls, and other political mumbo jumbo.
And I am sure Meta is aware of it.