AMD Releases ROCm 6.3 with SGLang, Fortran Compiler, Multi-Node FFT, Vision Libraries, and More

GFreeman · 2024-11-26T11:48:16+0000

AMD has released ROCm 6.3, which introduces several new features and optimizations, including SGLang integration for accelerated AI inference, a re-engineered FlashAttention-2 for optimized AI training and inference, multi-node Fast Fourier Transform (FFT) support, a new Fortran compiler, and enhanced computer vision libraries such as rocDecode, rocJPEG, and rocAL.

According to AMD, SGLang, a runtime now supported by ROCm 6.3, is purpose-built for optimizing inference on models such as LLMs and VLMs on AMD Instinct GPUs. It promises 6x higher throughput and much easier usage thanks to Python integration and pre-configured ROCm Docker containers.

ROCm 6.3 also brings further transformer optimizations with FlashAttention-2, which should significantly improve forward and backward pass performance compared to FlashAttention-1; a new AMD Fortran compiler with direct GPU offloading, backward compatibility, and integration with HIP kernels and ROCm libraries; new multi-node FFT support in rocFFT, which simplifies multi-node scaling and improves scalability; and enhanced computer vision libraries (rocDecode, rocJPEG, and rocAL) with AV1 codec support, GPU-accelerated JPEG decoding, and better audio augmentation.

AMD was keen to note that ROCm 6.3 continues to "deliver cutting-edge tools to simplify development while driving better performance and scalability for AI and HPC workloads", and that it keeps embracing the open-source ethos and evolving to meet developer needs. You can check out more details over at the ROCm Documentation Hub or the AMD ROCm Blogs.

Space Lynx · 2024-11-26T12:22:03+0000

I saw the small m next to the 6.3 and just immediately thought headphone cable. :slap:

GoldenX · 2024-11-26T14:25:14+0000

System requirements (Linux) — ROCm installation (Linux)

System requirements for AMD ROCm

rocm.docs.amd.com

And still works on only 3 consumer cards.

Onasi · 2024-11-26T14:29:08+0000

@GoldenX
I like the nice artificial limitation of it supporting W6800, but not consumer Navi 21 cards. Seems bizarrely random.

GoldenX · 2024-11-26T14:36:21+0000

Onasi said:
@GoldenX
I like the nice artificial limitation of it supporting W6800, but not consumer Navi 21 cards. Seems bizarrely random.

Knowing how stingy AMD is now with their "AI first" focus, they most likely won't add support for any other arch until UDNA is out.
Maaaybe the top end RDNA4 with some luck.

Great CUDA competitor, eh.

igormp · 2024-11-26T15:26:46+0000

GoldenX said:
Knowing how stingy AMD is now with their "AI first" focus, they most likely won't add support for any other arch until UDNA is out.
Maaaybe the top end RDNA4 with some luck.

Great CUDA competitor, eh.

Even Intel managed to get their PyTorch extensions merged upstream, and getting it to work is even easier than CUDA.
ROCm, on the other hand, is still a pain, even worse than CUDA, with all its shenanigans.

dont whant to set it"' · 2024-11-26T16:31:08+0000

Space Lynx said:
I saw the small m next to the 6.3 and just immediately thought headphone cable.

For a split second I had about the same thought.

For me it's the keyboard one, "EPO", which takes me back to the heyday of televised World Tour pro cycling, when the pelotons were full of EPO-carrying mules and gregarios (EPO the drug/medicine). Later edit: Allegedly, some were caught, many got caught.
So confusing.

GoldenX · 2024-11-26T17:54:42+0000

igormp said:
Even Intel managed to get their PyTorch extensions merged upstream, and getting it to work is even easier than CUDA.
ROCm, on the other hand, is still a pain, even worse than CUDA, with all its shenanigans.

Heh yeah, Intel coming out of nowhere with a real alternative.
Excuse me while I go get an 8400GS from 2007 to run CUDA.

Neo_Morpheus · 2024-11-26T19:16:51+0000

As usual, the normal negative posts, every time that ROCm is mentioned.

Anyways, about their compatibility, AMD needs to do better to clarify this mess.

The link above shows that only the 3 top-tier RDNA 3 GPUs are supported, yet these links show way more, including many RDNA 2 cards.

Accelerator and GPU hardware specifications — ROCm Documentation

AMD Instinct™ accelerator, AMD Radeon PRO™, and AMD Radeon™ GPU architecture information

rocm.docs.amd.com

Compatibility matrix — ROCm Documentation

ROCm compatibility matrix

rocm.docs.amd.com

So which ones are really supported, AMD?

That said, I have read about other people being able to use other GPUs besides the 3 RDNA 3 GPUs mentioned before.

Patriot · 2024-11-26T20:02:22+0000

Neo_Morpheus said:
As usual, the normal negative posts, every time that ROCm is mentioned.

Anyways, about their compatibility, AMD needs to do better to clarify this mess.

The link above shows that only the 3 top-tier RDNA 3 GPUs are supported, yet these links show way more, including many RDNA 2 cards.

Accelerator and GPU hardware specifications — ROCm Documentation

AMD Instinct™ accelerator, AMD Radeon PRO™, and AMD Radeon™ GPU architecture information

rocm.docs.amd.com

Compatibility matrix — ROCm Documentation

ROCm compatibility matrix

rocm.docs.amd.com

So which ones are really supported, AMD?

That said, I have read about other people being able to use other GPUs besides the 3 RDNA 3 GPUs mentioned before.

Support is a strong word. RDNA 2/3 work, but what cards do they test on? The top 3 RDNA 3 cards, the MI250X, and the MI300X.
MI100 support is already waning.

But yes, you can use ROCm on your 6700 XT and other 6000-series cards just fine. Broader support is coming slowly but surely.

GoldenX · 2024-11-26T22:04:18+0000

Patriot said:
Support is a strong word. RDNA 2/3 work, but what cards do they test on? The top 3 RDNA 3 cards, the MI250X, and the MI300X.
MI100 support is already waning.

But yes, you can use ROCm on your 6700 XT and other 6000-series cards just fine. Broader support is coming slowly but surely.

One could argue broad support MUST be the first thing you do, else the bar gets harder with every new feature added.

That, on top of the stability issues on consumer hardware when running ROCm on Linux and the even worse Windows support, is a major drawback that should be addressed immediately.
It's been like this for years now; "will be better soon" is meaningless when the entire ecosystem is 17 years behind.

Makaveli · 2024-11-26T23:23:54+0000

GoldenX said:
System requirements (Linux) — ROCm installation (Linux)

System requirements for AMD ROCm

rocm.docs.amd.com

And still works on only 3 consumer cards.

Windows supports more GPUs.

System requirements (Windows) — HIP SDK installation (Windows)

Windows GPU and OS support

rocm.docs.amd.com

I use ROCm with LM Studio for LLMs.

GoldenX · 2024-11-27T00:13:46+0000

Makaveli said:
Windows supports more GPUs.

System requirements (Windows) — HIP SDK installation (Windows)

Windows GPU and OS support

rocm.docs.amd.com

I use ROCm with LM Studio for LLMs.

Last time I tested LM Studio on my 6600, it was far slower than equivalent Ampere cards. Has that improved?

ROCm on Windows is not fully supported. It only works sporadically, as with LM Studio.
Real full support is only available on Linux and WSL.

Makaveli · 2024-11-27T00:17:16+0000

GoldenX said:
Last time I tested LM Studio on my 6600, it was far slower than equivalent Ampere cards. Has that improved?

ROCm on Windows is not fully supported. It only works sporadically, as with LM Studio.
Real full support is only available on Linux and WSL.

You would have to use the Vulkan runtime in LM Studio; I don't think the 6600 is supported by ROCm. I would install the newest version of LM Studio and test it again.

With my current GPU, a 7900 XTX, I saw similar performance for most models when I was testing with some guys in the LM Studio Discord who were running 4090s.

GoldenX · 2024-11-27T00:50:25+0000

The AI extensions RDNA3 added to compute do their work then.

Makaveli · 2024-11-27T00:53:21+0000

GoldenX said:
The AI extensions RDNA3 added to compute do their work then.

Yes, the WMMA instructions work well on RDNA 3.

igormp · 2024-11-27T05:22:46+0000

Makaveli said:
You would have to use the Vulkan runtime in LM Studio; I don't think the 6600 is supported by ROCm. I would install the newest version of LM Studio and test it again.

With my current GPU, a 7900 XTX, I saw similar performance for most models when I was testing with some guys in the LM Studio Discord who were running 4090s.

FWIW, overall performance on Windows is usually way slower than on Linux.
When I compared my performance with 2x 3090 on Linux across different models against some folks who had 4090s and 4080s on Windows, their performance was way slower than mine.

Not sure if that's still the case, but it used to be 30~70% slower on Windows.
