Thursday, December 19th 2024

Microsoft Acquired Nearly 500,000 NVIDIA "Hopper" GPUs This Year

Microsoft is investing heavily in its corporate and cloud infrastructure to support the massive expansion of AI. The Redmond giant has acquired nearly half a million GPUs from NVIDIA's "Hopper" family to support this effort. According to market research firm Omdia, Microsoft was the biggest hyperscale buyer, with data center CapEx and GPU expenditure reaching a record high. The company acquired an estimated 485,000 NVIDIA "Hopper" GPUs, including H100, H200, and H20, with more than $30 billion spent on servers alone. To put things into perspective, that is about double the next-biggest GPU purchaser, China's ByteDance, which acquired about 230,000 sanction-compliant H800 GPUs plus regular H100s sourced from third parties.

Among US-based companies, the only ones that have come close to Microsoft's acquisition rate are Meta, Tesla/xAI, Amazon, and Google, which have acquired around 200,000 GPUs on average while significantly boosting their in-house chip design efforts. "NVIDIA GPUs claimed a tremendously high share of the server capex," Vlad Galabov, director of cloud and data center research at Omdia, noted, adding, "We're close to the peak." Hyperscalers like Amazon, Google, and Meta have been working on their own custom solutions for AI training and inference: Google has its TPU, Amazon has its Trainium and Inferentia chips, and Meta has its MTIA. Hyperscalers are eager to develop in-house solutions, but NVIDIA's grip on the software stack, paired with timely product updates, seems hard to break. The latest "Blackwell" chips are projected to draw even bigger orders, so only the sky (and the local power plant) is the limit.
Source: Financial Times (Report and Charts)

16 Comments on Microsoft Acquired Nearly 500,000 NVIDIA "Hopper" GPUs This Year

#1
Vayra86
Madness. Utter madness
Posted on Reply
#2
Vincero
Vayra86: Madness. Utter madness
It's what the people want....

MS would be stupid not to provide a solution for customer demands.
They were slow initially with bringing GPU acceleration/compute into Azure ~10 years ago, with AWS being much quicker to implement it (admittedly a large part of that probably due to hypervisor being ready to work with different vGPU sharing/partitioning implementations) - I don't think they are gonna be in the same position again.
Posted on Reply
#3
Vayra86
Vincero: It's what the people want....
Give the people what they want. Bread and games. Give them death.

Some things never change, do they? :)

Posted on Reply
#4
kondamin
That’s a lot of O365 accounts
Posted on Reply
#5
Vincero
The problem is, you look at the list of companies spending big on this, and almost none of them are companies you necessarily want harvesting/processing data and then using it to essentially figure out how to get more out of you...


I guess with MS/AWS/Google there is the element that they are hosts for others' compute needs, so not all of that money is going on resources used by those companies themselves...

But still, it's not a pretty picture, especially when you consider Meta's massive spending, which is likely almost entirely internal usage, probably for making more AI slop to push into Facebook... not that ByteDance or xAI are any better, even if they spent lower amounts...
Posted on Reply
#6
mb194dc
500k GPUs purchased and all they ended up with is Clippy 2.0?
Posted on Reply
#8
SOAREVERSOR
Vayra86: Madness. Utter madness
It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
Posted on Reply
#9
eidairaman1
The Exiled Airman
SOAREVERSOR: It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
Bad idea
Posted on Reply
#10
close
SOAREVERSOR: It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
I would totally like to see a future where the "processing" (and I assume you mean the inference here, the training would of course be handled by someone else in the DC) is done locally. Not all models have to be bajillion parameter behemoths, just like not every phone, tablet, or computer is the fastest thing ever.
Posted on Reply
#11
notoperable
Imagine the SoCs get bricked firmware downstream and they need to reflash each one of those manually; that would be the ultimate DDoS for m$.
Posted on Reply
#12
TechBuyingHavoc
SOAREVERSOR: It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
To be clear, you are saying that all local processing will stop in the future?

If so, that does sound mad. Bandwidth will always be an issue here; moving data around is the biggest driver of energy consumption. I would predict MORE local or edge processing, AI or not.
Posted on Reply
#13
Easy Rhino
Linux Advocate
And they use all of your data to train the models. Then you pay for the privilege of it.
Posted on Reply
#14
notoperable
Easy Rhino: And they use all of your data to train the models. Then you pay for the privilege of it.
Now, just $9.99 a month, so you can train our models with your data even more!
Posted on Reply
#15
igormp
Vincero: admittedly a large part of that probably due to hypervisor being ready to work with different vGPU sharing/partitioning implementations
I don't think AWS, or any other cloud hyperscaler, has ever offered shared/partitioned GPUs. You can just get instances that have a full-blown GPU (or more than one), and then do your own work to set up vGPU/sharing on top of it (which is still a PITA, at least when it comes to K8s).
I guess the issue was more about just the passthrough to the rented VM and allocation of such resources per node.
close: I would totally like to see a future where the "processing" (and I assume you mean the inference here, the training would of course be handled by someone else in the DC) is done locally. Not all models have to be bajillion parameter behemoths, just like not every phone, tablet, or computer is the fastest thing ever.
There's a lot of that going on already, like all the photo tagging and search stuff on Android and iPhones, as well as things like transcription in WhatsApp. Models on the edge are always cool, albeit sometimes lackluster (WhatsApp's transcription model is pretty anemic).
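
As a rough sketch of the "whole GPU per instance" model described above (illustrative only: it assumes a reachable cluster, a local kubeconfig, and the NVIDIA device plugin exposing nvidia.com/gpu as an allocatable resource; the pod name and CUDA image are placeholders), requesting a full GPU for a pod via the Kubernetes Python client looks something like this:

# Illustrative sketch: schedule a pod that asks the cluster for one whole GPU.
# Assumes the NVIDIA device plugin is installed so "nvidia.com/gpu" is an
# allocatable resource; names and image tag are placeholders.
from kubernetes import client, config

config.load_kube_config()  # read credentials from the local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gpu-smoke-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda-check",
                image="nvidia/cuda:12.4.1-base-ubuntu22.04",
                command=["nvidia-smi"],  # just print the GPU the pod was given
                resources=client.V1ResourceRequirements(
                    # Whole GPUs only; fractional sharing (MIG, time-slicing,
                    # vGPU) is extra cluster-side configuration.
                    limits={"nvidia.com/gpu": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)

Anything finer-grained than a whole device (MIG, time-slicing, vGPU) is additional setup on top of that, which is the PITA part.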
Posted on Reply
#16
cal5582
For all that information Windows 11 siphons off of you.
Leeches.
SOAREVERSOR: It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
Hopefully I'll be dead by that point.
Posted on Reply