Thursday, June 15th 2023

AMD Confirms that Instinct MI300X GPU Can Consume 750 W

AMD revealed its Instinct MI300X GPU at its Data Center and AI Technology Premiere event on Tuesday (June 13). The keynote presentation did not provide any details about the new accelerator model's power consumption, but that did not stop tipster Hoang Anh Phu from obtaining this information from Team Red's post-event footnotes. His comparative observation: "MI300X (192 GB HBM3, OAM Module) TBP is 750 W, compared to last gen, MI250X TBP is only 500-560 W." A leaked Giga Computing roadmap from last month already anticipated server-grade GPUs hitting the 700 W mark.

NVIDIA's Hopper H100, with its maximum demand of 700 W, held the crown as the most power-hungry data center enterprise GPU until now. The MI300X's OCP Accelerator Module-based design surpasses Team Green's flagship with a slightly higher rating. AMD's new "leadership generative AI accelerator" sports 304 CDNA 3 compute units, a clear upgrade over the MI250X's 220 (CDNA 2) CUs. Engineers have also introduced new 24 GB HBM3 stacks, so the MI300X can be specced with up to 192 GB of memory, whereas the MI250X is limited to 128 GB with its slower HBM2E stacks. We hope to see sample units producing benchmark results very soon, with the MI300X pitted against the H100.
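For reference, board power on accelerators like these can be read back from the vendor command-line tools. The following is a minimal sketch, assuming nvidia-smi and rocm-smi are installed and on the PATH; the exact fields and formatting they print vary by driver and ROCm release, and nothing here comes from AMD's footnotes.

import subprocess

def read_power(cmd):
    # Run a vendor SMI tool and return its raw output, or None if it is not available.
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout.strip()
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None

# NVIDIA: current draw and enforced limit per GPU, reported in watts.
nv = read_power(["nvidia-smi", "--query-gpu=power.draw,power.limit", "--format=csv,noheader"])
# AMD: average package power as reported by the ROCm SMI tool.
amd = read_power(["rocm-smi", "--showpower"])

for label, out in (("NVIDIA", nv), ("AMD", amd)):
    print(label + ":", out if out else "no supported GPU or tool found")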
Sources: VideoCardz, AnhPhuH Tweet

43 Comments on AMD Confirms that Instinct MI300X GPU Can Consume 750 W

#2
zo0lykas
192GB HBM3, future is here :)
Posted on Reply
#3
Denver
On paper this trumps the H100, how is the battle in real workloads?
Posted on Reply
#4
ZoneDymo
what CPUs save in power consumption, GPUs will make up for...
Posted on Reply
#6
fancucker
This is an immense feat of engineering and hopefully something that helps AMD garner well-deserved market share, but issues abound:

1 - HBM3 is costly and may negate the cost advantage AMD might have with the MI300. Nvidia is likely to ship HBM3 products in the same timeframe or earlier
2 - There is no apparent equivalent to the Transformer Engine, which can triple performance in LLM scenarios on Nvidia counterparts
3 - Nvidia's H100 is shipping in full volume today, with more researcher and technical support in its superior ecosystem
4 - AMD has yet to disclose benchmarks

I hope AMD can resolve or alleviate some of these issues because it seems like an excellent product overall.
Posted on Reply
#7
TumbleGeorge
All data here is old and partially incorrect. Must be updated.
Posted on Reply
#8
Tomgang
Oof, that's a lot of power going through a PCB and GPU.

And I think the 570 W peak I have seen on my RTX 4090 with OC is bad enough. Yes, I am aware this is meant for other things than gaming, but still, 700 W or more is a lot of power for one GPU.
Posted on Reply
#9
kapone32
Tomgang: Oof, that's a lot of power going through a PCB and GPU.

And I think the 570 W peak I have seen on my RTX 4090 with OC is bad enough. Yes, I am aware this is meant for other things than gaming, but still, 700 W or more is a lot of power for one GPU.
This monster has about 5 to 10 times the transistors of a 4090, and 192 GB of HBM is no joke, but it is actually not bad considering.
Posted on Reply
#10
R0H1T
Denver: On paper this trumps the H100, how is the battle in real workloads?
Depends on the software stack; this, alongside the MI300A, will beat nearly any combination of AI/CPU/GPU power Nvidia can muster, but then again there's CUDA, so!
Posted on Reply
#11
dgianstefani
TPU Proofreader
R0H1T: Depends on the software stack; this, alongside the MI300A, will beat nearly any combination of AI/CPU/GPU power Nvidia can muster, but then again there's CUDA, so!
The entire GPU ecosystem is based around software; the hardware is just a checkbox.

NVIDIA is a software company.
Posted on Reply
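The software-stack point can be made concrete. AMD's answer to CUDA is ROCm/HIP, and on ROCm builds of PyTorch AMD GPUs are driven through the same torch.cuda code path that CUDA hardware uses, so how well existing code runs is largely a question of how complete that layer is. A minimal sketch, assuming a PyTorch build with either CUDA or ROCm support installed:

import torch

# On ROCm builds of PyTorch, AMD GPUs answer through the torch.cuda namespace;
# torch.version.hip is set only on ROCm builds, so it distinguishes the backends.
if torch.cuda.is_available():
    backend = "ROCm/HIP" if getattr(torch.version, "hip", None) else "CUDA"
    device = torch.device("cuda")
    print("Running on", torch.cuda.get_device_name(0), "via", backend)
else:
    device = torch.device("cpu")
    print("No supported GPU found, falling back to CPU")

# The workload below is identical regardless of vendor.
x = torch.randn(4096, 4096, device=device)
y = x @ x
print(y.shape)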
#12
Wirko
Tomgang: Oof, that's a lot of power going through a PCB and GPU.

And I think the 570 W peak I have seen on my RTX 4090 with OC is bad enough. Yes, I am aware this is meant for other things than gaming, but still, 700 W or more is a lot of power for one GPU.
If the trend continues, those accelerators are going to consume one half-Xeon in a couple of years.
Posted on Reply
#13
R0H1T
dgianstefani: The entire GPU ecosystem is based around software; the hardware is just a checkbox.

NVIDIA is a software company.
AMD is getting there; if they can string a few consistent wins in a row, they should be able to challenge Nvidia, kinda like they did with Intel.
Posted on Reply
#14
kapone32
dgianstefani: The entire GPU ecosystem is based around software; the hardware is just a checkbox.

NVIDIA is a software company.
Well, a comparison of the Adrenalin software vs Nvidia's offerings would belie what you are saying. Ray tracing and DLSS do not apply in this space, so.
Posted on Reply
#15
dgianstefani
TPU Proofreader
kapone32: Well, a comparison of the Adrenalin software vs Nvidia's offerings would belie what you are saying. Ray tracing and DLSS do not apply in this space, so.
Maybe by the time AMD has figured out drivers that can be installed on any of their consumer cards (see: the several-month gap between the RDNA3 release and an RDNA2 driver update, or, more recently, the exclusive driver for the RX 7600 only), they can start to figure out the software and partner support they need to succeed in datacentre GPUs.

By then NVIDIA will have built its 1:1 virtual/real-world model in the Omniverse, which every major manufacturer has already signed onto, just as with CUDA dominance for the past decade.

This image is from 15 months ago (3/22).

www.nvidia.com/en-us/omniverse/ecosystem/ shows the current ecosystem.

Where is AMD?
R0H1T: AMD is getting there; if they can string a few consistent wins in a row, they should be able to challenge Nvidia, kinda like they did with Intel.
I hope so; a monopoly isn't a great situation, but the product needs to deliver. Zen did, but that was a hardware success, not a software success. The software (AGESA) for Zen has been buggy in each iteration for years now, which they hope to fix eventually with the open-source openSIL.
Posted on Reply
#16
kapone32
dgianstefani: Maybe by the time AMD has figured out drivers that can be installed on any of their consumer cards (see: the several-month gap between the RDNA3 release and an RDNA2 driver update, or, more recently, the exclusive driver for the RX 7600 only), they can start to figure out the software and partner support they need to succeed in datacentre GPUs.

By then NVIDIA will have built its 1:1 virtual/real-world model in the Omniverse, which every major manufacturer has already signed onto, just as with CUDA dominance for the past decade.

This image is from 15 months ago (3/22).

www.nvidia.com/en-us/omniverse/ecosystem/ shows the current ecosystem.

Where is AMD?
Let me see: I have had AMD since the original 6800, and yes, that was a Gigabyte board and it was a gremlin. Then I went with Sapphire and nothing since. Did you know that Sapphire had upscaling in their software package before FSR or DLSS were even conversation pieces? I could paste the AMD stack as well, and as much as people complained about the 3 months that AMD did not give them driver updates for cards that were working fine, that is also nothing but the narrative. There is also the fact that, while you and I can debate it, AMD will be selling plenty of these, as the companies on that placard are in some ways optimizing for AMD or depending on AWS and Microsoft for their network, but we will go on. This card is no joke at all, so as much as you give little import to the hardware, 192 GB of HBM3 (a spec we don't know yet) is nothing to sneeze at, and the transistor count is also crazy.
Posted on Reply
#17
dgianstefani
TPU Proofreader
kapone32: Let me see: I have had AMD since the original 6800, and yes, that was a Gigabyte board and it was a gremlin. Then I went with Sapphire and nothing since. Did you know that Sapphire had upscaling in their software package before FSR or DLSS were even conversation pieces? I could paste the AMD stack as well, and as much as people complained about the 3 months that AMD did not give them driver updates for cards that were working fine, that is also nothing but the narrative. There is also the fact that, while you and I can debate it, AMD will be selling plenty of these, as the companies on that placard are in some ways optimizing for AMD or depending on AWS and Microsoft for their network, but we will go on. This card is no joke at all, so as much as you give little import to the hardware, 192 GB of HBM3 (a spec we don't know yet) is nothing to sneeze at, and the transistor count is also crazy.
Hardware is nice, sure.

"Narrative" aside, we'll see how they do in 2023 for enterprise GPU, in 2021 they couldn't breach 9% and the trend isn't changing, that percentage went down in 2022.

www.nasdaq.com/articles/better-buy:-nvidia-vs.-amd-2
Posted on Reply
#18
kapone32
dgianstefani: Hardware is nice, sure.

"Narrative" aside, we'll see how they do in 2023 for enterprise GPUs; in 2021 they couldn't breach 9%, and the trend isn't changing, as that percentage went down in 2022.

www.nasdaq.com/articles/better-buy:-nvidia-vs.-amd-2
Why do you think they released this card? I am aware of AMD's position in the Data Centre when it comes to GPUs. My argument was strictly about software, and again, this is a monster of a GPU that could disrupt the stack.
Posted on Reply
#19
TheoneandonlyMrK
dgianstefani: Maybe by the time AMD has figured out drivers that can be installed on any of their consumer cards (see: the several-month gap between the RDNA3 release and an RDNA2 driver update, or, more recently, the exclusive driver for the RX 7600 only), they can start to figure out the software and partner support they need to succeed in datacentre GPUs.

By then NVIDIA will have built its 1:1 virtual/real-world model in the Omniverse, which every major manufacturer has already signed onto, just as with CUDA dominance for the past decade.

This image is from 15 months ago (3/22).

www.nvidia.com/en-us/omniverse/ecosystem/ shows the current ecosystem.

Where is AMD?


I hope so; a monopoly isn't a great situation, but the product needs to deliver. Zen did, but that was a hardware success, not a software success. The software (AGESA) for Zen has been buggy in each iteration for years now, which they hope to fix eventually with the open-source openSIL.
My word.

AMD drivers again? I have no issues, and I don't see / have not seen the masses abnormally arrayed against AMD drivers, just the odd hyperbolic statement.
Posted on Reply
#20
Tek-Check
natr0n: Pretty amazing. They remove the air and basically create a vacuum. The cold ocean water is like the ambient temp for the container.
Yes, but you can't cheat physics. Imagine hundreds of those warming up the local sea and disrupting flora and fauna.
Posted on Reply
#21
R0H1T
Yes & it's not like whatever they're putting them in wouldn't cause at least some sort of (chemical) reaction at the depths used for them. We're just dumping more of our problems in the oceans if it's not plastic it's excess heat :shadedshu:

If you really want to make them almost completely eco friendly just launch them into space!
Posted on Reply
#22
dgianstefani
TPU Proofreader
R0H1TYes & it's not like whatever they're putting them in wouldn't cause at least some sort of (chemical) reaction at the depths used for them. We're just dumping more of our problems in the oceans if it's not plastic it's excess heat :shadedshu:

If you really want to make them almost completely eco friendly just launch them into space!
Cooler chips consume less energy, due to lower voltage leakage.

If they're going to be run, they may as well be run more efficiently.
Posted on Reply
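As a rough illustration of why die temperature matters for power draw, a common rule of thumb is that static (leakage) power roughly doubles for every ~10 °C rise in silicon temperature. The sketch below only plays with that rule of thumb; the 60 W at 60 °C baseline and the 10 °C doubling interval are assumptions for illustration, not measured figures for any real accelerator.

def leakage_power(base_leakage_w, base_temp_c, temp_c, doubling_interval_c=10.0):
    # Scale a baseline leakage figure by the rule-of-thumb doubling per temperature interval.
    return base_leakage_w * 2 ** ((temp_c - base_temp_c) / doubling_interval_c)

# Hypothetical accelerator leaking 60 W of its budget at a 60 °C die temperature:
for t in (40, 60, 80, 95):
    print(f"{t:>3} °C -> ~{leakage_power(60, 60, t):.0f} W static power")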
#23
R0H1T
But they are disrupting the ecology of that place, even if a little bit. I'd rather they use the power from solar panels on a satellite & radiate whatever heat they're generating mostly outside our atmosphere. While this sounded like a fun experiment, I'm not sure how practical it is longer term, especially with oceans getting warmer every day; we're arguably accelerating this process with something like it. The data they're crunching/processing would need to travel longer distances, for instance, so is it really more (energy) efficient overall?
Posted on Reply
#24
dgianstefani
TPU Proofreader
R0H1T: But they are disrupting the ecology of that place, even if a little bit. I'd rather they use the power from solar panels on a satellite & radiate whatever heat they're generating mostly outside our atmosphere. While this sounded like a fun experiment, I'm not sure how practical it is longer term, especially with oceans getting warmer every day; we're arguably accelerating this process with something like it. The data they're crunching/processing would need to travel longer distances, for instance, so is it really more (energy) efficient overall?
Heat from server farms won't even register compared to the effects of atmospheric changes and pollution from rivers, fuel-oil container ships, manufacturing waste, farming run-off, reduced reflective white from shrinking ice surface area, etc.; the list goes on. I doubt you could even calculate the difference submerged computers would make.

Average ocean temperature is 0-20 °C; even if that doubled (projections are a couple of °C of increase over several centuries), it would still be an effective cooling medium.

I wouldn't be surprised to see local ecology find some way to benefit from the heat source, as with coral/bacteria ecosystems near underwater volcanoes or vents, etc.
Posted on Reply
#25
R-T-B
R0H1T: If you really want to make them almost completely eco-friendly, just launch them into space!
Space in low Earth orbit is way, way hotter than under the sea, at least in blackbody temperature. The vacuum acts more like an insulator there anyway.
Posted on Reply