
AMD MI300X Accelerators are Competitive with NVIDIA H100, Crunch MLPerf Inference v4.1

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,297 (7.53/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
The MLCommons consortium on Wednesday posted MLPerf Inference v4.1 benchmark results for popular AI inferencing accelerators available in the market, across brands that include NVIDIA, AMD, and Intel. AMD's Instinct MI300X accelerators emerged competitive with NVIDIA's "Hopper" H100 series AI GPUs. AMD also used the opportunity to showcase the kind of AI inferencing performance uplifts customers can expect from its next-generation EPYC "Turin" server processors powering these MI300X machines. "Turin" features "Zen 5" CPU cores, sporting a 512-bit FPU datapath and improved performance in AI-relevant 512-bit SIMD instruction sets, such as AVX-512 and VNNI. The MI300X, on the other hand, banks on the strengths of its memory sub-system, FP8 data format support, and efficient KV cache management.

AMD's MLPerf Inference v4.1 submissions focused on the 70 billion-parameter LLaMA2-70B model. They included machines featuring the Instinct MI300X, powered by the current EPYC "Genoa" (Zen 4), and next-gen EPYC "Turin" (Zen 5). The GPUs are backed by AMD's ROCm open-source software stack. The benchmark evaluated inference performance using 24,576 Q&A samples from the OpenOrca dataset, with each sample containing up to 1024 input and output tokens. Two scenarios were assessed: the offline scenario, focusing on batch processing to maximize throughput in tokens per second, and the server scenario, which simulates real-time queries with strict latency limits (TTFT ≤ 2 seconds, TPOT ≤ 200 ms). This lets you see the chip's mettle in both high-throughput and low-latency workloads.
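
To make the two scenarios concrete, here is a minimal sketch of how the latency limits and the throughput metric could be computed from per-token timestamps. The function names and the timestamp layout are illustrative assumptions, not part of the MLPerf LoadGen API, and TPOT is taken here as the mean gap between output tokens after the first one.

```python
# Illustrative only: names and data layout are assumptions, not MLPerf LoadGen code.
TTFT_LIMIT_S = 2.0    # time-to-first-token limit quoted above
TPOT_LIMIT_S = 0.200  # time-per-output-token limit quoted above


def latency_metrics(request_time, token_times):
    """Return (ttft, tpot) in seconds for one query.

    request_time: when the query was issued.
    token_times: arrival time of each output token, in order.
    """
    if not token_times:
        raise ValueError("query produced no tokens")
    ttft = token_times[0] - request_time
    if len(token_times) == 1:
        return ttft, 0.0
    # Mean gap between consecutive tokens after the first one.
    tpot = (token_times[-1] - token_times[0]) / (len(token_times) - 1)
    return ttft, tpot


def meets_server_limits(request_time, token_times):
    """Server scenario: a query counts only if both latency limits hold."""
    ttft, tpot = latency_metrics(request_time, token_times)
    return ttft <= TTFT_LIMIT_S and tpot <= TPOT_LIMIT_S


def offline_throughput(total_tokens, wall_seconds):
    """Offline scenario: tokens generated per second of wall-clock time."""
    return total_tokens / wall_seconds
```

In the offline scenario only `offline_throughput` matters, so large batches are fine; in the server scenario a bigger batch raises throughput only until `meets_server_limits` starts failing.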



AMD's first submission (4.1-0002) is a server featuring 2P EPYC 9374F "Genoa" processors and 8x Instinct MI300X accelerators. Here, the machine clocks 21,028 tokens/sec in the server test, compared to 21,605 tokens/sec scored by an NVIDIA DGX H100 machine combining 8x H100 GPUs with Xeon processors. In the offline test, the AMD machine scores 23,514 tokens/sec compared to 24,525 tokens/sec for the NVIDIA+Intel machine. AMD also tested the 8x MI300X with a pair of EPYC "Turin" (Zen 5) processors of comparable core-counts, which put it ahead of the NVIDIA machine in the server test at 22,021 tokens/sec, and closer to it in the offline test at 24,110 tokens/sec. AMD claims that it is achieving near-linear scaling in performance between 1x MI300X and 8x MI300X, which speaks for AMD's platform I/O and memory management chops.
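
The relative standing is easier to read as percentages. The short sketch below works only from the tokens/sec figures quoted above; the dictionary and function names are illustrative.

```python
# Tokens/sec figures as quoted in the article; names are illustrative.
NVIDIA_H100 = {"server": 21605, "offline": 24525}
AMD_GENOA = {"server": 21028, "offline": 23514}
AMD_TURIN = {"server": 22021, "offline": 24110}


def ratio_pct(a, b):
    """Return a as a percentage of b, rounded to one decimal place."""
    return round(100.0 * a / b, 1)


for scenario in ("server", "offline"):
    genoa = ratio_pct(AMD_GENOA[scenario], NVIDIA_H100[scenario])
    turin = ratio_pct(AMD_TURIN[scenario], NVIDIA_H100[scenario])
    uplift = ratio_pct(AMD_TURIN[scenario], AMD_GENOA[scenario]) - 100.0
    print(f"{scenario}: Genoa {genoa}% of H100, Turin {turin}% of H100, "
          f"Turin over Genoa +{uplift:.1f}%")
```

By these numbers the "Genoa" machine lands at roughly 97% (server) and 96% (offline) of the H100 system, while the "Turin" machine reaches roughly 102% and 98%. That is a CPU-side uplift of about 4.7% and 2.5%, with the same eight GPUs.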

AMD's results bode well for future versions of the model, such as LLaMA 3.1 with its gargantuan 405 billion parameters. Here, the MI300X's 192 GB of HBM3 with 5.3 TB/s of memory bandwidth comes in really handy. This earned AMD a partnership with Meta to power LLaMA 3.1 405B. An 8x MI300X blade packs 1.5 TB of memory with over 42 TB/s of memory bandwidth, with Infinity Fabric handling the interconnectivity. A single server is able to accommodate the entire LLaMA 3.1 405B model using the FP16 data type.
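
The capacity claim can be checked with back-of-the-envelope arithmetic. The sketch below assumes 2 bytes per FP16 weight and decimal units (1 GB = 1e9 bytes), and counts weights only; KV cache, activations, and runtime overhead are deliberately left out, so real headroom will be smaller.

```python
import math

# Figures from the article; everything else is a stated assumption.
HBM_PER_GPU_GB = 192        # MI300X HBM3 capacity
BW_PER_GPU_TBPS = 5.3       # MI300X memory bandwidth
GPUS_PER_SERVER = 8
PARAMS = 405e9              # LLaMA 3.1 405B parameter count
BYTES_PER_PARAM_FP16 = 2    # assumption: plain FP16 weights, no quantization

total_hbm_gb = HBM_PER_GPU_GB * GPUS_PER_SERVER        # pooled capacity
total_bw_tbps = BW_PER_GPU_TBPS * GPUS_PER_SERVER      # aggregate bandwidth
weights_gb = PARAMS * BYTES_PER_PARAM_FP16 / 1e9       # weight footprint
headroom_gb = total_hbm_gb - weights_gb                # left for KV cache etc.
min_gpus_for_weights = math.ceil(weights_gb / HBM_PER_GPU_GB)

print(f"Pooled HBM: {total_hbm_gb} GB, aggregate bandwidth: {total_bw_tbps:.1f} TB/s")
print(f"FP16 weights: {weights_gb:.0f} GB, headroom: {headroom_gb:.0f} GB")
print(f"GPUs needed for weights alone: {min_gpus_for_weights}")
```

The weights come to about 810 GB against 1,536 GB of pooled HBM, which is why a single 8-GPU server is enough under these assumptions.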

 
Joined
Jul 31, 2024
Messages
182 (1.26/day)
Processor AMD Ryzen 7 5700X
Motherboard ASUS ROG Strix B550-F Gaming Wifi II
Cooling Noctua NH-U12S Redux
Memory 4x8G Teamgroup Vulcan Z DDR4; 3600MHz @ CL18
Video Card(s) MSI Ventus 2X GeForce RTX 3060 12GB
Storage WD_Black SN770, Leven JPS600, Toshiba DT01ACA
Display(s) Samsung ViewFinity S6
Case Fractal Design Pop Air TG
Power Supply Corsair CX750M
Mouse Corsair Harpoon RGB
Keyboard Keychron C2 Pro
VR HMD Valve Index
I think the main selling point here is going to be deployment + running costs. If this can consistently be cheaper to deploy and run than Nvidia proportionally, then there's definitely something here. If not, they're still chasing coattails as far as I'm concerned.
 
Joined
Sep 15, 2011
Messages
6,760 (1.39/day)
Processor Intel® Core™ i7-13700K
Motherboard Gigabyte Z790 Aorus Elite AX
Cooling Noctua NH-D15
Memory 32GB(2x16) DDR5@6600MHz G-Skill Trident Z5
Video Card(s) ZOTAC GAMING GeForce RTX 3080 AMP Holo
Storage 2TB SK Platinum P41 SSD + 4TB SanDisk Ultra SSD + 500GB Samsung 840 EVO SSD
Display(s) Acer Predator X34 3440x1440@100Hz G-Sync
Case NZXT PHANTOM410-BK
Audio Device(s) Creative X-Fi Titanium PCIe
Power Supply Corsair 850W
Mouse Logitech Hero G502 SE
Software Windows 11 Pro - 64bit
Benchmark Scores 30FPS in NFS:Rivals
Good. nGreedia's monopoly must be challenged.
 

las

Joined
Nov 14, 2012
Messages
1,693 (0.38/day)
System Name Meh
Processor 7800X3D
Motherboard MSI X670E Tomahawk
Cooling Thermalright Phantom Spirit
Memory 32GB G.Skill @ 6000/CL30
Video Card(s) Gainward RTX 4090 Phantom / Undervolt + OC
Storage Samsung 990 Pro 2TB + WD SN850X 1TB + 64TB NAS/Server
Display(s) 27" 1440p IPS @ 360 Hz + 32" 4K/UHD QD-OLED @ 240 Hz + 77" 4K/UHD QD-OLED @ 144 Hz VRR
Case Fractal Design North XL
Audio Device(s) FiiO DAC
Power Supply Corsair RM1000x / Native 12VHPWR
Mouse Logitech G Pro Wireless Superlight + Razer Deathadder V3 Pro
Keyboard Corsair K60 Pro / MX Low Profile Speed
Software Windows 10 Pro x64
How does it fare vs Blackwell B200 tho? H100 is old news at this point
 
Joined
Sep 6, 2013
Messages
3,391 (0.82/day)
Location
Athens, Greece
System Name 3 desktop systems: Gaming / Internet / HTPC
Processor Ryzen 5 7600 / Ryzen 5 4600G / Ryzen 5 5500
Motherboard X670E Gaming Plus WiFi / MSI X470 Gaming Plus Max (1) / MSI X470 Gaming Plus Max (2)
Cooling Aigo ICE 400SE / Segotep T4 / Νoctua U12S
Memory Kingston FURY Beast 32GB DDR5 6000 / 16GB JUHOR / 32GB G.Skill RIPJAWS 3600 + Aegis 3200
Video Card(s) ASRock RX 6600 + GT 710 (PhysX) / Vega 7 integrated / Radeon RX 580
Storage NVMes, ONLY NVMes / NVMes, SATA Storage / NVMe, SATA, external storage
Display(s) Philips 43PUS8857/12 UHD TV (120Hz, HDR, FreeSync Premium) / 19'' HP monitor + BlitzWolf BW-V5
Case Sharkoon Rebel 12 / CoolerMaster Elite 361 / Xigmatek Midguard
Audio Device(s) onboard
Power Supply Chieftec 850W / Silver Power 400W / Sharkoon 650W
Mouse CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Keyboard CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Software Windows 10 / Windows 10&Windows 11 / Windows 10
[QUOTE="las"]How does it fare vs Blackwell B200 tho? H100 is old news at this point[/QUOTE]
From what I can understand, B200's advantage is FP4 support.
I have no idea about compute tasks, but I think this is the equivalent advantage to DLSS in gaming. I was reading that Nvidia says that their FP4 is very accurate thanks to their software.
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
27,963 (3.72/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
[QUOTE]I was reading that Nvidia says that their FP4 is very accurate thanks to their software.[/QUOTE]
I don't think any FP4 is better than the other one? but still, having hardware support for it can be useful.

In the press call for this news I asked AMD about Block Float 16 support on MI300X, but they acted like I asked for something else and answered it in a general way, which to me seems they evaded the question, which means "not supported"
 
Joined
Jan 18, 2020
Messages
834 (0.46/day)
Nvidia probably already sold enough ML hardware for the next 10 years or even longer. Given the lack of really decent use cases and fundamental flaws with the technology.

By the time AMD get them on the market it won't be there anymore?
 
Joined
Feb 18, 2005
Messages
5,847 (0.81/day)
Location
Ikenai borderline!
System Name Firelance.
Processor Threadripper 3960X
Motherboard ROG Strix TRX40-E Gaming
Cooling IceGem 360 + 6x Arctic Cooling P12
Memory 8x 16GB Patriot Viper DDR4-3200 CL16
Video Card(s) MSI GeForce RTX 4060 Ti Ventus 2X OC
Storage 2TB WD SN850X (boot), 4TB Crucial P3 (data)
Display(s) 3x AOC Q32E2N (32" 2560x1440 75Hz)
Case Enthoo Pro II Server Edition (Closed Panel) + 6 fans
Power Supply Fractal Design Ion+ 2 Platinum 760W
Mouse Logitech G602
Keyboard Razer Pro Type Ultra
Software Windows 10 Professional x64
Weird AMD, why didn't you show us H100 running with AMD CPUs? And why did you test with H100 when B200 is available? It's almost like you're trying to skew this to make you look better... AGAIN.

[QUOTE="W1zzard"]In the press call for this news I asked AMD about Block Float 16 support on MI300X, but they acted like I asked for something else and answered it in a general way, which to me seems they evaded the question, which means "not supported"[/QUOTE]
Oh boy...
 

TheToi

New Member
Joined
Nov 4, 2023
Messages
4 (0.01/day)
I wonder why they used Llama 2 in their benchmark. Llama 3 was released a while ago already, and for a month now we have been at Llama 3.1.
 
Joined
Nov 26, 2021
Messages
1,705 (1.52/day)
Location
Mississauga, Canada
Processor Ryzen 7 5700X
Motherboard ASUS TUF Gaming X570-PRO (WiFi 6)
Cooling Noctua NH-C14S (two fans)
Memory 2x16GB DDR4 3200
Video Card(s) Reference Vega 64
Storage Intel 665p 1TB, WD Black SN850X 2TB, Crucial MX300 1TB SATA, Samsung 830 256 GB SATA
Display(s) Nixeus NX-EDG27, and Samsung S23A700
Case Fractal Design R5
Power Supply Seasonic PRIME TITANIUM 850W
Mouse Logitech
VR HMD Oculus Rift
Software Windows 11 Pro, and Ubuntu 20.04
[QUOTE="W1zzard"]
I don't think any FP4 is better than the other one? but still, having hardware support for it can be useful.

In the press call for this news I asked AMD about Block Float 16 support on MI300X, but they acted like I asked for something else and answered it in a general way, which to me seems they evaded the question, which means "not supported"
[/QUOTE]
The ISA reference for MI300 includes instructions that operate on BF16 data.

[Attached image: 1724941344758.png]
 
Joined
Jul 13, 2016
Messages
3,329 (1.08/day)
Processor Ryzen 7800X3D
Motherboard ASRock X670E Taichi
Cooling Noctua NH-D15 Chromax
Memory 32GB DDR5 6000 CL30
Video Card(s) MSI RTX 4090 Trio
Storage Too much
Display(s) Acer Predator XB3 27" 240 Hz
Case Thermaltake Core X9
Audio Device(s) Topping DX5, DCA Aeon II
Power Supply Seasonic Prime Titanium 850w
Mouse G305
Keyboard Wooting HE60
VR HMD Valve Index
Software Win 10
[QUOTE]
Nvidia probably already sold enough ML hardware for the next 10 years or even longer. Given the lack of really decent use cases and fundamental flaws with the technology.

By the time AMD get them on the market it won't be there anymore?
[/QUOTE]

AI is used in the engineering, medical, and artistic fields and is already indispensable to them. TSMC and its customers themselves use AI to improve photolithography masks, and chip design is aided by AI.

The AI bubble may "pop" at some point, similar to the dotcom bubble, but what's left behind will still be significant, just as it was after the dotcom bubble.
 
Joined
Aug 21, 2013
Messages
1,936 (0.47/day)
[QUOTE="las"]How does it fare vs Blackwell B200 tho? H100 is old news at this point[/QUOTE]
From Nvidia's own benchmarks the difference is like 20k vs 30k but B200 also uses 1000W compared to 700W for H100 (that MI300X matches according to Nvidia's slides) and 750W for MI300X itself.
[QUOTE]Weird AMD, why didn't you show us H100 running with AMD CPUs? And why did you test with H100 when B200 is available? It's almost like you're trying to skew this to make you look better... AGAIN.[/QUOTE]
But, but why did Nvidia in their benchmarks use Xeon and not Epyc?
Could it be that they're NOT obliged to use competitors hardware, just like AMD?
It makes sense for AMD to test with their own CPU if they have the solution.
It's the same reason B200 has ARM and Nvidia fused together. Not Xeon and Nvidia.
 

Joined
Oct 27, 2009
Messages
1,190 (0.21/day)
Location
Republic of Texas
System Name [H]arbringer
Processor 4x 61XX ES @3.5Ghz (48cores)
Motherboard SM GL
Cooling 3x xspc rx360, rx240, 4x DT G34 snipers, D5 pump.
Memory 16x gskill DDR3 1600 cas6 2gb
Video Card(s) blah bigadv folder no gfx needed
Storage 32GB Sammy SSD
Display(s) headless
Case Xigmatek Elysium (whats left of it)
Audio Device(s) yawn
Power Supply Antec 1200w HCP
Software Ubuntu 10.10
Benchmark Scores http://valid.canardpc.com/show_oc.php?id=1780855 http://www.hwbot.org/submission/2158678 http://ww
[QUOTE="las"]How does it fare vs Blackwell B200 tho? H100 is old news at this point[/QUOTE]
B100/B200 should be faster than MI300... but this is old news versus old news. MI300 has been deployed into El Capitan since June '23. MI325X will be going against the B100/B200, and both should show up this fall. I still expect B100/B200 to win on FP4 inference workloads, but MI325X will still be competitive overall, given how much faster the MI300 was. Also, Nvidia essentially gave up competing on FP64 workloads.
 

W1zzard

Administrator
Staff member
Joined
May 14, 2004
Messages
27,963 (3.72/day)
Processor Ryzen 7 5700X
Memory 48 GB
Video Card(s) RTX 4080
Storage 2x HDD RAID 1, 3x M.2 NVMe
Display(s) 30" 2560x1600 + 19" 1280x1024
Software Windows 10 64-bit
[QUOTE]Thanks for explaining the difference between these three formats. I believe you're correct about block float 16 being unsupported; there are no references to it in the MI300's ISA documentation.[/QUOTE]
It is quite exotic, but has interesting properties, and it was also an opportunity for AMD to talk more about formats, relevance, maybe some other innovations they've added .. but nope
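
For anyone following along, here is a rough Python sketch of why BF16 (bfloat16) and block floating point are not the same thing. It is only an illustration under assumptions: the block layout, the 8-bit mantissa width, and both function names are made up for the example and are not a description of any vendor's actual format. bfloat16 keeps a private exponent for every value; block floating point stores one shared exponent for a whole block plus a small integer mantissa per value.

```python
import math
import struct


def to_bfloat16(x):
    """Round a float to bfloat16 precision (1 sign, 8 exponent, 7 mantissa bits).

    Keeps the top 16 bits of the float32 encoding, rounding to nearest even
    on the dropped bits. NaN and infinity are not handled in this sketch.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # drop the low 16 bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def block_float_quantize(values, mantissa_bits=8):
    """Quantize a block so that all values share one exponent.

    Returns (shared_exponent, integer_mantissas); each value is approximated
    by mantissa * 2**shared_exponent.
    """
    largest = max(abs(v) for v in values)
    if largest == 0.0:
        return 0, [0] * len(values)
    limit = 2 ** (mantissa_bits - 1) - 1
    # Pick the exponent so the largest magnitude fills the signed mantissa range.
    shared_exp = math.frexp(largest)[1] - (mantissa_bits - 1)
    scale = 2.0 ** shared_exp
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return shared_exp, mantissas


def block_float_dequantize(shared_exp, mantissas):
    """Rebuild approximate values from a shared exponent and mantissas."""
    return [m * 2.0 ** shared_exp for m in mantissas]
```

With a block like `[1.0, 0.5, 0.001]`, the shared exponent is set by the 1.0, so the 0.001 has no mantissa bits left and comes back as zero, whereas `to_bfloat16(0.001)` stays within about 1% of the original. The upside of the shared exponent is that the per-value storage and the multiply hardware get cheaper.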
 

las

Joined
Nov 14, 2012
Messages
1,693 (0.38/day)
System Name Meh
Processor 7800X3D
Motherboard MSI X670E Tomahawk
Cooling Thermalright Phantom Spirit
Memory 32GB G.Skill @ 6000/CL30
Video Card(s) Gainward RTX 4090 Phantom / Undervolt + OC
Storage Samsung 990 Pro 2TB + WD SN850X 1TB + 64TB NAS/Server
Display(s) 27" 1440p IPS @ 360 Hz + 32" 4K/UHD QD-OLED @ 240 Hz + 77" 4K/UHD QD-OLED @ 144 Hz VRR
Case Fractal Design North XL
Audio Device(s) FiiO DAC
Power Supply Corsair RM1000x / Native 12VHPWR
Mouse Logitech G Pro Wireless Superlight + Razer Deathadder V3 Pro
Keyboard Corsair K60 Pro / MX Low Profile Speed
Software Windows 10 Pro x64
[QUOTE]AMD will have MI350 to compete against those soon enough. Performance to price ratio is far higher though for AMD and Intel.[/QUOTE]
Except that companies need a complete solution like Nvidia is providing, not just a GPU that performs well in a cherrypicked benchmark.

This is why AMD bought up ZT Systems for 5 billion: they want to provide a complete solution. Right now they are just providing a GPU.

And this is why Nvidia is king of AI. Let's see if AMD gets on the train before it leaves.
 
Joined
Oct 27, 2009
Messages
1,190 (0.21/day)
Location
Republic of Texas
System Name [H]arbringer
Processor 4x 61XX ES @3.5Ghz (48cores)
Motherboard SM GL
Cooling 3x xspc rx360, rx240, 4x DT G34 snipers, D5 pump.
Memory 16x gskill DDR3 1600 cas6 2gb
Video Card(s) blah bigadv folder no gfx needed
Storage 32GB Sammy SSD
Display(s) headless
Case Xigmatek Elysium (whats left of it)
Audio Device(s) yawn
Power Supply Antec 1200w HCP
Software Ubuntu 10.10
Benchmark Scores http://valid.canardpc.com/show_oc.php?id=1780855 http://www.hwbot.org/submission/2158678 http://ww
El Capitan my Capitan. Yes, Nvidia leads in software development and in pushing new non-industry standards that lock you into their ecosystem. Thankfully AMD has been fighting back with consortiums and uses standards like OAM, so that you can use their GPUs in future systems they develop to compete against the DGX, or in any partner system that uses OAM, à la HPE, Dell, Supermicro, etc.

MLPerf is a bit cherry-picked; it heavily favors Nvidia as they have hundreds of engineers tuning for it. Most workloads do not use FP8 or FP4, yet that is what Nvidia pushes. Blackwell decimates these MI300X results and will allegedly be shipping by year's end. But again, supertuned. The MI325X will not win in throughput; it is expected to bring a 20-30% perf uplift, but has a memory density advantage (288 GB HBM3e) which will allow it to run more on single GPUs and at higher precisions. MI350X may be out by year's end but is more likely shipping next year, and will bring FP4 support to AMD. I don't see how their claim of a 35x inference improvement over MI300 will be true, but I am guessing it has to do with memory-constrained models.

Nvidia is king because they have a trapped ecosystem, but the industry is rebelling. There is very little from Hugging Face that you cannot run natively on AMD MI300Xs. Almost all new models can be run natively without a hipify conversion. The memory advantage AMD has is pretty extreme, to the point that Meta has worked with AMD for day-zero support of their insane model sizes.

So, why build a server that supports SXM when Nvidia wants to take your customers and sell them DGXs?
When you can build an OAM server that supports Intel's Gaudi and Max GPUs, or AMD GPUs, or all the banned Chinese accelerators, lol.
AMD is on the train; the limit is TSMC fab time. For everyone, really.
 

las

Joined
Nov 14, 2012
Messages
1,693 (0.38/day)
System Name Meh
Processor 7800X3D
Motherboard MSI X670E Tomahawk
Cooling Thermalright Phantom Spirit
Memory 32GB G.Skill @ 6000/CL30
Video Card(s) Gainward RTX 4090 Phantom / Undervolt + OC
Storage Samsung 990 Pro 2TB + WD SN850X 1TB + 64TB NAS/Server
Display(s) 27" 1440p IPS @ 360 Hz + 32" 4K/UHD QD-OLED @ 240 Hz + 77" 4K/UHD QD-OLED @ 144 Hz VRR
Case Fractal Design North XL
Audio Device(s) FiiO DAC
Power Supply Corsair RM1000x / Native 12VHPWR
Mouse Logitech G Pro Wireless Superlight + Razer Deathadder V3 Pro
Keyboard Corsair K60 Pro / MX Low Profile Speed
Software Windows 10 Pro x64
Yeah, AMD likes to play the good guy, till they don't.

Nvidia is king because they deliver what companies actually look for. AMD don't; they just provide a GPU, with no CUDA support, as Nvidia invented that. AMD has AI GPUs on paper, but in reality Nvidia accounts for 90% of AI GPU shipments.

If AMD were actually competitive in AI, their valuation would have exploded like Nvidia's.
 
Joined
Oct 27, 2009
Messages
1,190 (0.21/day)
Location
Republic of Texas
System Name [H]arbringer
Processor 4x 61XX ES @3.5Ghz (48cores)
Motherboard SM GL
Cooling 3x xspc rx360, rx240, 4x DT G34 snipers, D5 pump.
Memory 16x gskill DDR3 1600 cas6 2gb
Video Card(s) blah bigadv folder no gfx needed
Storage 32GB Sammy SSD
Display(s) headless
Case Xigmatek Elysium (whats left of it)
Audio Device(s) yawn
Power Supply Antec 1200w HCP
Software Ubuntu 10.10
Benchmark Scores http://valid.canardpc.com/show_oc.php?id=1780855 http://www.hwbot.org/submission/2158678 http://ww
This may shock you, but you don't need CUDA to run workloads on a GPU. I snuck a little joke in the first line and it cleared the treetops, it was so far over your head. El Capitan is set to be the first 2+ exaflop supercomputer, running on MI300A APUs. The current top supercomputer is Frontier, running on MI250Xs. AMD is selling as many as they can make; the limit is TSMC, not demand. In the past few years there has been a shift to hardware-agnostic software rather than CUDA-first; for those that still put CUDA first, hipify exists to convert the code.
 

las

Joined
Nov 14, 2012
Messages
1,693 (0.38/day)
System Name Meh
Processor 7800X3D
Motherboard MSI X670E Tomahawk
Cooling Thermalright Phantom Spirit
Memory 32GB G.Skill @ 6000/CL30
Video Card(s) Gainward RTX 4090 Phantom / Undervolt + OC
Storage Samsung 990 Pro 2TB + WD SN850X 1TB + 64TB NAS/Server
Display(s) 27" 1440p IPS @ 360 Hz + 32" 4K/UHD QD-OLED @ 240 Hz + 77" 4K/UHD QD-OLED @ 144 Hz VRR
Case Fractal Design North XL
Audio Device(s) FiiO DAC
Power Supply Corsair RM1000x / Native 12VHPWR
Mouse Logitech G Pro Wireless Superlight + Razer Deathadder V3 Pro
Keyboard Corsair K60 Pro / MX Low Profile Speed
Software Windows 10 Pro x64
Keep believing that, meanwhile Nvidia sits on 98% of the AI market

Let's see if AMD releases something good before AI hype dies out

If AMD actually had something truly competitive in the AI and Enterprise market, their stock value would reflect it - Hint: Look at Nvidia stock


Even AMD know they are way behind and H100 is old news
 