AMD's Pain Point is ROCm Software, NVIDIA's CUDA Software is Still Superior for AI Development: Report

Visible Noise · Dec 24, 2024

Patriot said:
Snip...

I don’t get into pointless arguments with apologists of any brand.

Later.

Patriot · Dec 24, 2024

Visible Noise said:
I don’t get into pointless arguments with apologists of any brand.

Later.

I am sorry that people with relevant knowledge and expertise intimidate you, that is a sad way to live.

Visible Noise · Dec 24, 2024

Patriot said:
I am sorry that people with relevant knowledge and expertise intimidate you, that is a sad way to live.

Nah, I just don’t deal with people that pull an appeal to authority, especially when they claim they are the authority and their post history shows it’s evident they are on a team.

Merry Christmas!

Neo_Morpheus · Dec 24, 2024

Not sure if related but bumped into this today on X:

igormp · Dec 24, 2024

Patriot said:
Negative. The article focused only on training models using predefined containers not pretrained models for inference.
It is super super easy to run models on mi100/250x/300x 7900gre/xt/xtx

I read as far as the paywall goes.
I also... have used ROCm since vega64/mi25
I also... have used Cuda since K80/GTX690

I currently run a hive of mi100s, and a sxm v100 box.

When it comes to inference, MI300x gets day0 support. Training is very lacking and Nvidia's deep bench of software engineers shows.
I expect part 2 of the article to be a bit different.

I am fully aware of the lacking's of AMDs ecosystem, but I am also aware of its strengths.
And the ability to just grab containers and go exists... hugging face is full of native containers for ROCm, Hipify can convert most* things that are cuda native, abet at performance penalty.
But when it comes to inference AMD is not a 2nd class citizen. It has full support with triton, and flash attention...
And Llama 405b fp16 launched exclusively on mi300x, most likely due to the ram requirements.
As it was quantized down to fp8, then it could fit on 8x h100 80gb, but as it was announced, Meta and AMD announced together that all Meta 405b live instances were run on mi300x.
If that is still true or was just a limited exclusivity while it was quantized down... idk...

But claiming things like a mi300x cant run OOB models is just... ignorant af, and not even what the article claims.
It claims bad training performance and strange bugs and as a user of the ecosystem... yup. AMD has strange bugs.
They have known lockups for multi gpu instances... and the solution is to run additional grub parameters, perfectly stable with iommu=pt, randomly hangs without.
But all this information is in the tuning guides. The install process is easy, and hugging face is full of models to run.

I mean, FA2 only got supported on AMD GPUs recently. Even though pytorch does include ROCm support OOB nowadays, you often face issues not found with CUDA.
ROCm's performance is way subpar still, achieving like a fraction of its theoretical performance (both in terms of memory bandwidth and also FLOPs).

It is still clearly a second class citizen, but it's the second class citizen. As soon as something comes out (defaulting to CUDA, of course), then people immediately get their hands trying to port it to ROCm.
The strides it has made in the past years is really impressive. I remember trying it out with an rx480 back then, and immediately buying a 1050ti to replace it, nowadays it's not 100% (nor that close), but you sure can get your hands dirt and at least get something working out of it.

As for lockups and hangs, eh, I've heard this quite a lot from some folks that do work with many AMD GPUs, but it's also not that uncommon in the Nvidia world either (albeit to a lesser degree). Just get a GH200 (lambdalabs even has those with a discount for now) and have some fun locking up your machine trying to use their so called "unified" memory haha

Visible Noise · Dec 24, 2024

x.com

You get an hour and a half with the CEO. Then she spends the next hour and a half tearing someone a new one for getting surprised by the media.

Heads need to roll in AMDs software group.

tommo1982 · Dec 26, 2024

Visible Noise said:
x.com

x.com

You get an hour and a half with the CEO. Then she spends the next hour and a half tearing someone a new one for getting surprised by the media.

Heads need to roll in AMDs software group.

Do you have a link? I don't have X or Facebook account.

Visible Noise · Dec 26, 2024

tommo1982 said:
Do you have a link? I don't have X or Facebook account.

Patriot · Dec 26, 2024

I have mixed feelings on Geohotz. They announced, built, and started shipping something before testing it... And then demanded AMD to opensource FW on desktop cards that they were shipping for enterprise workloads. They weren't even using the workstation cards. Looks like they also demanded $1M in test boxes.
Kinda nuts demanding enterprise support for desktop cards. That said, I would like AMD to follow through on promises. All they have done so far is the first half of what they promised and handed out a guide on how to talk to the fw better.

System Name	XPS, Lenovo and HP Laptops, HP Xeon Mobile Workstation, HP Servers, Dell Desktops
Processor	Everything from Turion to 13900kf
Motherboard	MSI - they own the OEM market
Cooling	Air on laptops, lots of air on servers, AIO on desktops
Memory	I think one of the laptops is 2GB, to 64GB on gamer, to 128GB on ZFS Filer
Video Card(s)	A pile up to my knee, with a RTX 4090 teetering on top
Storage	Rust in the closet, solid state everywhere else
Display(s)	Laptop crap, LG UltraGear of various vintages
Case	OEM and a 42U rack
Audio Device(s)	Headphones
Power Supply	Whole home UPS w/Generac Standby Generator
Software	ZFS, UniFi Network Application, Entra, AWS IoT Core, Splunk
Benchmark Scores	1.21 GigaBungholioMarks

System Name	[H]arbringer
Processor	4x 61XX ES @3.5Ghz (48cores)
Motherboard	SM GL
Cooling	3x xspc rx360, rx240, 4x DT G34 snipers, D5 pump.
Memory	16x gskill DDR3 1600 cas6 2gb
Video Card(s)	blah bigadv folder no gfx needed
Storage	32GB Sammy SSD
Display(s)	headless
Case	Xigmatek Elysium (whats left of it)
Audio Device(s)	yawn
Power Supply	Antec 1200w HCP
Software	Ubuntu 10.10
Benchmark Scores	http://valid.canardpc.com/show_oc.php?id=1780855 http://www.hwbot.org/submission/2158678 http://ww

System Name	XPS, Lenovo and HP Laptops, HP Xeon Mobile Workstation, HP Servers, Dell Desktops
Processor	Everything from Turion to 13900kf
Motherboard	MSI - they own the OEM market
Cooling	Air on laptops, lots of air on servers, AIO on desktops
Memory	I think one of the laptops is 2GB, to 64GB on gamer, to 128GB on ZFS Filer
Video Card(s)	A pile up to my knee, with a RTX 4090 teetering on top
Storage	Rust in the closet, solid state everywhere else
Display(s)	Laptop crap, LG UltraGear of various vintages
Case	OEM and a 42U rack
Audio Device(s)	Headphones
Power Supply	Whole home UPS w/Generac Standby Generator
Software	ZFS, UniFi Network Application, Entra, AWS IoT Core, Splunk
Benchmark Scores	1.21 GigaBungholioMarks

System Name	GameStation
Processor	AMD R5 5600X
Motherboard	Gigabyte B550
Cooling	Artic Freezer II 120
Memory	16 GB
Video Card(s)	Sapphire Pulse 7900 XTX
Storage	2 TB SSD
Case	Cooler Master Elite 120

Processor	5950x
Motherboard	B550 ProArt
Cooling	Fuma 2
Memory	4x32GB 3200MHz Corsair LPX
Video Card(s)	2x RTX 3090
Display(s)	LG 42" C2 4k OLED
Power Supply	XPG Core Reactor 850W
Software	I use Arch btw

Processor	Ryzen 5 5600
Motherboard	ASRock B450M Steel Legend
Cooling	bequiet! Pure Rock Slim (BK008)
Memory	16GB DDR4 GoodRAM
Video Card(s)	PowerColor Fighter RX7600 8GB
Storage	WD Blue 500GB SSD
Display(s)	iiyama ProLite T2252MTS
Case	CoolerMaster Silencio 352
Power Supply	bequiet! Pure Power 12M 650W
Mouse	Logitech M590
Keyboard	Logitech K270
Software	Linux Mint