Thursday, December 19th 2024

Microsoft Acquired Nearly 500,000 NVIDIA "Hopper" GPUs This Year

Microsoft is heavily investing in enabling its company and cloud infrastructure to support the massive AI expansion. The Redmond giant has acquired nearly half a million of the NVIDIA "Hopper" family of GPUs to support this effort. According to market research company Omdia, Microsoft was the biggest hyperscaler, with data center CapEx and GPU expenditure reaching a record high. The company acquired precisely 485,000 NVIDIA "Hopper" GPUs, including H100, H200, and H20, resulting in more than $30 billion spent on servers alone. To put things into perspective, this is about double that of the next-biggest GPU purchaser, China's ByteDance, which acquired about 230,000 sanction-abiding H800 GPUs and regular H100s sourced from third parties.

Regarding US-based companies, the only ones that have come close to the GPU acquisition rate are Meta, Tesla/xAI, Amazon, and Google. They have acquired around 200,000 GPUs on average while significantly boosting their in-house chip design efforts. "NVIDIA GPUs claimed a tremendously high share of the server capex," Vlad Galabov, director of cloud and data center research at Omdia, noted, adding, "We're close to the peak." Hyperscalers like Amazon, Google, and Meta have been working on their custom solutions for AI training and inference. For example, Google has its TPU, Amazon has its Trainium and Inferentia chips, and Meta has its MTIA. Hyperscalers are eager to develop their in-house solutions, but NVIDIA's grip on the software stack paired with timely product updates seems hard to break. The latest "Blackwell" chips are projected to get even bigger orders, so only the sky (and the local power plant) is the limit.
Source: Financial Times (Report and Charts)

27 Comments on Microsoft Acquired Nearly 500,000 NVIDIA "Hopper" GPUs This Year

#1
Vayra86
Madness. Utter madness
#2
Vincero
Vayra86 wrote: Madness. Utter madness
It's what the people want....

MS would be stupid not to provide a solution for customer demands.
They were slow initially with bringing GPU acceleration/compute into Azure ~10 years ago, with AWS being much quicker to implement it (admittedly a large part of that probably due to hypervisor being ready to work with different vGPU sharing/partitioning implementations) - I don't think they are gonna be in the same position again.
#3
Vayra86
Vincero wrote: It's what the people want....
Give the people what they want. Bread and games. Give them death.

Some things never change do they :)

#4
kondamin
That’s a lot of O365 accounts
#5
Vincero
The problem is you look at the list of companies spending big on this and almost none of them are companies you necessarily want harvesting / processing data and using it to then essentially try and figure out how to get more out of you...


I guess with MS/AWS/Google there is the element that they are hosts for others' compute needs, so not all of that money is going on resources being used by those companies themselves...

But still, it's not a pretty picture, especially when you consider Meta's massive spending which is likely almost entirely internal usage, probably for making more AI slop to push into Facebook... not that Bytedance or xAI are any better even if lower amounts spent...
#6
mb194dc
500k GPUs purchased and all they ended up with is Clippy 2.0?
#8
SOAREVERSOR
Vayra86 wrote: Madness. Utter madness
It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
#9
eidairaman1
The Exiled Airman
SOAREVERSOR wrote: It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
Bad idea
#10
close
SOAREVERSOR wrote: It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
I would totally like to see a future where the "processing" (and I assume you mean the inference here, the training would of course be handled by someone else in the DC) is done locally. Not all models have to be bajillion parameter behemoths, just like not every phone, tablet, or computer is the fastest thing ever.
#11
notoperable
Imagine the SoC gets bricked firmware downstream and they need to reflash each one of those manually. That would be the ultimate DDoS for m$
#12
TechBuyingHavoc
SOAREVERSOR wrote: It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
To be clear, you are saying that all local processing will stop in the future?

If so, that does sound mad. Bandwidth will always be an issue here, moving data around is the biggest driver of energy consumption. I would predict MORE local or edge processing, AI or not.
#13
Easy Rhino
Linux Advocate
And they use all of your data to train the models. Then you pay for the privilege of it.
#14
notoperable
Easy Rhino wrote: And they use all of your data to train the models. Then you pay for the privilege of it.
Now just $9.99 a month, so you can train our models with your data even more!
#15
igormp
Vincero wrote: admittedly a large part of that probably due to hypervisor being ready to work with different vGPU sharing/partitioning implementations
I don't think AWS, nor any other cloud hyperscaler, has ever offered shared/partitioned GPUs. You can just get instances that have a full-blown GPU (or more than one), and then you can do your own work to set up vGPU/sharing on top of it (which is still a PITA, at least when it comes to K8s).
I guess the issue was more about just the passthrough to the rented VM and allocation of such resources per node.
close wrote: I would totally like to see a future where the "processing" (and I assume you mean the inference here, the training would of course be handled by someone else in the DC) is done locally. Not all models have to be bajillion parameter behemoths, just like not every phone, tablet, or computer is the fastest thing ever.
There's a lot of that going on, like all the photo tagging and search stuff in Androids and iPhones, as well as stuff like transcription in WhatsApp. Models on the edge are always cool, albeit sometimes lackluster (WhatsApp's transcription model is pretty anemic).
#16
cal5582
for all that information windows 11 siphons off of you.
leeches.
SOAREVERSOR wrote: It's just as mad as when people said a computer in every home was madness or the cloud was madness. AI and not doing any processing locally is the future.
hopefully I'll be dead by that point.
#17
Fatalfury
Damn!!! $30B for auto-generating images and a talk bot. They really bought the hype, didn't they..

Hope MS doesn't end up like FB with its Metaverse, where things looked like a virtual room simulator from a 2006 online game
#18
Vincero
igormp wrote: I don't think AWS, nor any other cloud hyperscaler, has ever offered shared/partitioned GPUs. You can just get instances that have a full-blown GPU (or more than one), and then you can do your own work to set up vGPU/sharing on top of it (which is still a PITA, at least when it comes to K8s).
I guess the issue was more about just the passthrough to the rented VM and allocation of such resources per node.
Directly, I don't think they did to normal people / customers... however on a partner / corporate level who knows....:

This is gonna sound a bit like a 'The Register' post, but I guess it's worth sharing.

Back in 2014/15, a company I worked for was considering using AWS Nvidia GRID GPU instances with Citrix XenDesktop handling the user session and the Xen Hypervisor handling vGPU duty - not an offering which would have been widely required by most people, but apparently was a realistic option when talking to Citrix about options - and before you ask, this was for specific engineering CAD software running a specific software version with bespoke customisations to the software for the company, so the GPU acceleration was required (although not top tier speed - we weren't looking for 16 people all playing GTA V at 60fps on each server types of performance).
Before you say it, yeah the cost wasn't going to be cheap (>$1m... per year), but there were certain logistical/legal/contractual/compliance issues at play with regards to making a specific software environment available to certain staff and 3rd party contractual staff.
The incumbent IT services provider (who already hosted several servers for the company LAN and WAN facing web servers/services) were also quoting silly numbers to run Xen servers with GRID K2 GPUs (iirc around $1m initial fee and around $500k ongoing costs for support, etc.) - around this same time AWS and Nvidia were demoing GPU accelerated options so enquiries were made.
We already used Citrix XenApp internally on Windows Servers so, whilst adding XenServer boxes and managing them would be much more work, we already had some of the infrastructure in place in terms of a Citrix Storefront, NetScaler gateways for access, etc.
We did find some smaller hosting companies (who were in the VPS/IaaS/SaaS space) who had the capability to offer the service running the servers and hypervisors, etc., with the caveat that we 'owned' rather than rented the hardware - which after 1 year would have been around 30% cheaper and hence offer an even better return over any further time frame as we wouldn't be renting the equipment and the ongoing costs were less than half the incumbent IT provider's - but the company ultimately wasn't willing to do that (which kind of annoys me still) despite having the budget for it.

Ultimately, the potential costs would have been nearly the same as paying out contractual compensation - the incumbent IT services provider did not cover themselves in glory in that sense and would be gone within a few years. In the end the classic IT "let's just do nothing" decision was taken, which ultimately meant lots of other charges to the company to provide access to the system by either VPN or facilitating the extra cost to other staff needing to work at our offices over time. I guess in the long run this probably worked out cheaper, but everyone hated it and it ultimately fuelled that fire of staff hating their company IT systems for being outdated / clunky / slow, etc.

The lesson here is always read contracts, and if you're the one making it be sure to always consider what might happen in the future in terms of software updates, availability and potential platform / OS changes.
#19
Vayra86
TechBuyingHavoc wrote: To be clear, you are saying that all local processing will stop in the future?

If so, that does sound mad. Bandwidth will always be an issue here, moving data around is the biggest driver of energy consumption. I would predict MORE local or edge processing, AI or not.
Exactly. All these wild ideas that something will be the be-all end-all thing for everything are just that: wild ideas. There is certainly a place for cloud - but there is also a place for localized.

What we see instead is that all these new things are just added on top of what we already have. That is why you own a mobile phone, a PC, and possibly a console too, and also a Smart TV - and you can watch TV on all of them, you can game on all of them, and if you connect a keyboard and have internet you can probably even do your productive tasks on all of them - or most.
#20
RandallFlagg
TechBuyingHavoc wrote: To be clear, you are saying that all local processing will stop in the future?

If so, that does sound mad. Bandwidth will always be an issue here, moving data around is the biggest driver of energy consumption. I would predict MORE local or edge processing, AI or not.
Every big company wants to centralize resources, because that's what fits their revenue model. Consumers usually wind up rejecting this though.

Look at security cams. A few years ago, most of what you could get were like Ring - where your security video goes into the cloud and you have no access to it unless you pay a monthly fee. It didn't take too long for this to fall apart, giving rise to alternatives like Eufy, Arlo, and Blink. I know many people who bought Ring early on, and have switched due to a combination of lack of privacy and being nickel-and-dimed.

Though few here are old enough to remember it, same thing was tried with internet access in the 80s and 90s. AOL and Prodigy are later examples of this, where you had a sort of pseudo-access to the internet but mostly only via using the products that they built for you - products which would constantly throw up ads and otherwise were intrusive. But that was never what the consumer wanted, and resulted in thousands of mom-n-pop ISPs shooting up everywhere that simply provided dial-up networking. It was a decade later that the big players begrudgingly gave consumers the simple direct access they wanted.

And going back even further, in the 1970s to early 80s we had time-share. Everyone was supposed to have a dumb terminal with a modem in their home, all compute resources were "in the cloud" aka on the mainframe or other big-iron centralized system you were dialing up. This began to fall apart the moment personal PCs appeared.

I see no reason to think that the current mania will end any differently.
#21
Vayra86
RandallFlagg wrote: I see no reason to think that the current mania will end any differently.
Perhaps there is one: social media: misinformation, and fear resulting from it, plus the overall degradation of common sense. We have to appreciate there are generations now growing up without those experiences you mention. But yes, I too, think that we'll eventually figure it out again. That movement and awareness is already happening. It'll be an ongoing battle.
#22
RandallFlagg
Vayra86 wrote: Perhaps there is one: social media: misinformation, and fear resulting from it, plus the overall degradation of common sense. We have to appreciate there are generations now growing up without those experiences you mention. But yes, I too, think that we'll eventually figure it out again. That movement and awareness is already happening. It'll be an ongoing battle.
I think the only thing that will stop it is if tech, specifically chip tech, stops advancing. I think these companies have at most 10 years to get a ROI, and probably a lot less.


"Willow’s performance on this benchmark is astonishing: It performed a computation in under five minutes that would take one of today’s fastest supercomputers 10^25 or 10 septillion years."

-Hartmut Neven
Founder and Lead, Google Quantum AI
#23
notoperable
cal5582 wrote: for all that information windows 11 siphons off of you.
leeches.

hopefully I'll be dead by that point.
It's called Telemetry™ nowadays in Redmond, which makes it sound more consumer friendly. On the other hand, Windows 11 is M$'s first full SaaS - Spyware as a System™
#24
TechBuyingHavoc
RandallFlagg wrote: Every big company wants to centralize resources, because that's what fits their revenue model. Consumers usually wind up rejecting this though.

Look at security cams. A few years ago, most of what you could get were like Ring - where your security video goes into the cloud and you have no access to it unless you pay a monthly fee. It didn't take too long for this to fall apart, giving rise to alternatives like Eufy, Arlo, and Blink. I know many people who bought Ring early on, and have switched due to a combination of lack of privacy and being nickel-and-dimed.

Though few here are old enough to remember it, same thing was tried with internet access in the 80s and 90s. AOL and Prodigy are later examples of this, where you had a sort of pseudo-access to the internet but mostly only via using the products that they built for you - products which would constantly throw up ads and otherwise were intrusive. But that was never what the consumer wanted, and resulted in thousands of mom-n-pop ISPs shooting up everywhere that simply provided dial-up networking. It was a decade later that the big players begrudgingly gave consumers the simple direct access they wanted.

And going back even further, in the 1970s to early 80s we had time-share. Everyone was supposed to have a dumb terminal with a modem in their home, all compute resources were "in the cloud" aka on the mainframe or other big-iron centralized system you were dialing up. This began to fall apart the moment personal PCs appeared.

I see no reason to think that the current mania will end any differently.
A big part of this is Enshittification. The centralized services always offer an alluring promise of low cost, high convenience, high reliability, plus some other benefits. It actually is a good deal to consumers at first.

Then Enshittification kicks in...

Corporate profits must continue to rise and if not profits, then revenue growth. At some point, the company has gotten all the consumers it was going to get and then the Squeeze starts. Quality drops, prices go up, and innovation stagnates.
#25
Dr. Dro
Vayra86 wrote: Madness. Utter madness
The pinnacle of AI technology. Thanks for investing trillions in this, Jensen, Satya, Elon and Co.
