AMD Releases ROCm 6.3 with SGLang, Fortran Compiler, Multi-Node FFT, Vision Libraries, and More

GFreeman

News Editor
Staff member
Joined
Mar 6, 2023
Messages
1,587 (2.40/day)
AMD has released ROCm 6.3, which introduces several new features and optimizations, including SGLang integration for accelerated AI inferencing, a re-engineered FlashAttention-2 for optimized AI training and inference, multi-node Fast Fourier Transform (FFT) support, a new Fortran compiler, and enhanced computer vision libraries such as rocDecode, rocJPEG, and rocAL.

According to AMD, SGLang, a runtime now supported by ROCm 6.3, is purpose-built for optimizing inference on models such as LLMs and VLMs on AMD Instinct GPUs, and promises 6x higher throughput and much easier usage thanks to Python integration and pre-configured ROCm Docker containers. ROCm 6.3 also brings further transformer optimizations with FlashAttention-2, which should significantly improve forward and backward passes compared to FlashAttention-1; a new AMD Fortran compiler with direct GPU offloading, backward compatibility, and integration with HIP kernels and ROCm libraries; new multi-node FFT support in rocFFT, which simplifies multi-node scaling and improves scalability; and enhanced computer vision libraries (rocDecode, rocJPEG, and rocAL) that add AV1 codec support, GPU-accelerated JPEG decoding, and better audio augmentation.



AMD was keen to note that ROCm 6.3 continues to "deliver cutting-edge tools to simplify development while driving better performance and scalability for AI and HPC workloads", while still embracing the open-source ethos and evolving to meet developer needs. You can find more details at the ROCm Documentation Hub or the AMD ROCm Blogs.


Space Lynx

Astronaut
Joined
Oct 17, 2014
Messages
17,437 (4.68/day)
Location
Kepler-186f
Processor 7800X3D -25 all core
Motherboard B650 Steel Legend
Cooling Frost Commander 140
Video Card(s) Merc 310 7900 XT @3100 core -.75v
Display(s) Agon 27" QD-OLED Glossy 240hz 1440p
Case NZXT H710 (Red/Black)
Audio Device(s) Asgard 2, Modi 3, HD58X
Power Supply Corsair RM850x Gold
I saw the small m next to the 6.3 and just immediately thought headphone cable. :slap:
 
Joined
Oct 2, 2015
Messages
3,152 (0.93/day)
Location
Argentina
System Name Ciel / Akane
Processor AMD Ryzen R5 5600X / Intel Core i3 12100F
Motherboard Asus Tuf Gaming B550 Plus / Biostar H610MHP
Cooling ID-Cooling 224-XT Basic / Stock
Memory 2x 16GB Kingston Fury 3600MHz / 2x 8GB Patriot 3200MHz
Video Card(s) Gainward Ghost RTX 3060 Ti / Dell GTX 1660 SUPER
Storage NVMe Kingston KC3000 2TB + NVMe Toshiba KBG40ZNT256G + HDD WD 4TB / NVMe WD Blue SN550 512GB
Display(s) AOC Q27G3XMN / Samsung S22F350
Case Cougar MX410 Mesh-G / Generic
Audio Device(s) Kingston HyperX Cloud Stinger Core 7.1 Wireless PC
Power Supply Aerocool KCAS-500W / Gigabyte P450B
Mouse EVGA X15 / Logitech G203
Keyboard VSG Alnilam / Dell
Software Windows 11
Joined
Nov 27, 2023
Messages
2,518 (6.37/day)
System Name The Workhorse
Processor AMD Ryzen R9 5900X
Motherboard Gigabyte Aorus B550 Pro
Cooling CPU - Noctua NH-D15S Case - 3 Noctua NF-A14 PWM at the bottom, 2 Fractal Design 180mm at the front
Memory GSkill Trident Z 3200CL14
Video Card(s) NVidia GTX 1070 MSI QuickSilver
Storage Adata SX8200Pro
Display(s) LG 32GK850G
Case Fractal Design Torrent (Solid)
Audio Device(s) FiiO E-10K DAC/Amp, Samson Meteorite USB Microphone
Power Supply Corsair RMx850 (2018)
Mouse Razer Viper (Original) on a X-Raypad Equate Plus V2
Keyboard Cooler Master QuickFire Rapid TKL keyboard (Cherry MX Black)
Software Windows 11 Pro (24H2)
@GoldenX
I like the nice artificial limitation of it supporting W6800, but not consumer Navi 21 cards. Seems bizarrely random.
 
Joined
Oct 2, 2015
Messages
3,152 (0.93/day)
Location
Argentina
Knowing how stingy AMD is now with their "AI first" focus, they most likely won't add support for any other arch until UDNA is out.
Maaaybe the top end RDNA4 with some luck.

Great CUDA competitor, eh.
 
Joined
May 10, 2023
Messages
369 (0.62/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
Even Intel managed to get their PyTorch extensions merged upstream, and getting them to work is even easier than CUDA.
ROCm, on the other hand, is still a pain, even worse than CUDA with all its shenanigans.
 
Joined
Oct 8, 2015
Messages
774 (0.23/day)
Location
Earth's Troposphere
System Name 3 "rigs"-gaming/spare pc/cruncher
Processor R7-5800X3D/i7-7700K/R9-7950X
Motherboard Asus ROG Crosshair VI Extreme/Asus Ranger Z170/Asus ROG Crosshair X670E-GENE
Cooling Bitspower monoblock ,custom open loop,both passive and active/air tower cooler/air tower cooler
Memory 32GB DDR4/32GB DDR4/64GB DDR5
Video Card(s) Gigabyte RX6900XT Alphacooled/AMD RX5700XT 50th Aniv./SOC(onboard)
Storage mix of sata ssds/m.2 ssds/mix of sata ssds+an m.2 ssd
Display(s) Dell UltraSharp U2410 , HP 24x
Case mb box/Silverstone Raven RV-05/CoolerMaster Q300L
Audio Device(s) onboard/onboard/onboard
Power Supply 3 Seasonics, a DeltaElectronics, a FractalDesing
Mouse various/various/various
Keyboard various wired and wireless
VR HMD -
Software W10.someting or another,all 3
I saw the small m next to the 6.3 and just immediately thought headphone cable. :slap:
For a split second I had about the same thought.

For me it's the keyboard one, "EPO", which takes me back to the heyday of televised World Tour pro cycling, when the pelotons were full of EPO-carrying mules and gregarios (the drug/medicine). LE: Allegedly, some were caught, many got caught.
So confusing.
 
Joined
Oct 2, 2015
Messages
3,152 (0.93/day)
Location
Argentina
Even Intel managed to get their PyTorch extensions merged upstream, and getting them to work is even easier than CUDA.
ROCm, on the other hand, is still a pain, even worse than CUDA with all its shenanigans.
Heh yeah, Intel coming out of nowhere with a real alternative.
Excuse me while I go get an 8400GS from 2007 to run CUDA.
 
Joined
Dec 6, 2022
Messages
478 (0.64/day)
Location
NYC
System Name GameStation
Processor AMD R5 5600X
Motherboard Gigabyte B550
Cooling Artic Freezer II 120
Memory 16 GB
Video Card(s) Sapphire Pulse 7900 XTX
Storage 2 TB SSD
Case Cooler Master Elite 120
As usual, the normal negative posts, every time that ROCm is mentioned.

Anyways, about their compatibility, AMD needs to do better to clarify this mess.

The link above shows that only the 3 top-tier RDNA 3 GPUs are supported, yet on these links, they show way more, including many RDNA2.

So which ones are really supported, AMD?

That said, I've read about other people being able to use GPUs besides the 3 RDNA3 GPUs mentioned before.
 
Joined
Oct 27, 2009
Messages
1,194 (0.22/day)
Location
Republic of Texas
System Name [H]arbringer
Processor 4x 61XX ES @3.5Ghz (48cores)
Motherboard SM GL
Cooling 3x xspc rx360, rx240, 4x DT G34 snipers, D5 pump.
Memory 16x gskill DDR3 1600 cas6 2gb
Video Card(s) blah bigadv folder no gfx needed
Storage 32GB Sammy SSD
Display(s) headless
Case Xigmatek Elysium (whats left of it)
Audio Device(s) yawn
Power Supply Antec 1200w HCP
Software Ubuntu 10.10
Benchmark Scores http://valid.canardpc.com/show_oc.php?id=1780855 http://www.hwbot.org/submission/2158678 http://ww
Support is a strong word. RDNA2/3 work, but what cards do they test on? The top 3 RDNA3 cards, MI250X, and MI300X.
MI100 support is already waning.

But yes, you can use ROCm on your 6700 XT and other 6000-gen cards just fine. Broader support is coming slowly but surely.
 
Joined
Oct 2, 2015
Messages
3,152 (0.93/day)
Location
Argentina
One could argue broad support MUST be the first thing you do, else the bar gets higher with every new feature added.

That, on top of stability issues on consumer hardware when running ROCm on Linux and the worse Windows support, is a serious drawback that should be addressed immediately.
It's been like this for years now; "will be better soon" is meaningless when the entire ecosystem is 17 years behind.
 
Joined
Feb 21, 2006
Messages
2,240 (0.33/day)
Location
Toronto, Ontario
System Name The Expanse
Processor AMD Ryzen 7 5800X3D
Motherboard Asus Prime X570-Pro BIOS 5013 AM4 AGESA V2 PI 1.2.0.Cc.
Cooling Corsair H150i Pro
Memory 32GB GSkill Trident RGB DDR4-3200 14-14-14-34-1T (B-Die)
Video Card(s) XFX Radeon RX 7900 XTX Magnetic Air (24.12.1)
Storage WD SN850X 2TB / Corsair MP600 1TB / Samsung 860Evo 1TB x2 Raid 0 / Asus NAS AS1004T V2 20TB
Display(s) LG 34GP83A-B 34 Inch 21: 9 UltraGear Curved QHD (3440 x 1440) 1ms Nano IPS 160Hz
Case Fractal Design Meshify S2
Audio Device(s) Creative X-Fi + Logitech Z-5500 + HS80 Wireless
Power Supply Corsair AX850 Titanium
Mouse Corsair Dark Core RGB SE
Keyboard Corsair K100
Software Windows 10 Pro x64 22H2
Benchmark Scores 3800X https://valid.x86.fr/1zr4a5 5800X https://valid.x86.fr/2dey9c 5800X3D https://valid.x86.fr/b7d
Joined
Oct 2, 2015
Messages
3,152 (0.93/day)
Location
Argentina
Joined
Feb 21, 2006
Messages
2,240 (0.33/day)
Location
Toronto, Ontario
Last time I tested LM Studio on my 6600, it was far slower than equivalent Ampere cards. Has that improved?

ROCm on Windows is not fully supported. Only sporadically, like with LM Studio.
Real full support is only available on Linux and WSL.
You would have to use the Vulkan runtime in LM Studio; I don't think the 6600 is supported by ROCm. I would install the newest version of LM Studio and test it again.

With my current GPU, a 7900 XTX, I saw similar performance for most models when I was testing with some guys in the LM Studio Discord running 4090s.
 
Joined
Oct 2, 2015
Messages
3,152 (0.93/day)
Location
Argentina
The AI extensions RDNA3 added to compute do their work then.
 
Joined
Feb 21, 2006
Messages
2,240 (0.33/day)
Location
Toronto, Ontario
Joined
May 10, 2023
Messages
369 (0.62/day)
Location
Brazil
You would have to use the Vulkan runtime in LM Studio; I don't think the 6600 is supported by ROCm. I would install the newest version of LM Studio and test it again.

With my current GPU, a 7900 XTX, I saw similar performance for most models when I was testing with some guys in the LM Studio Discord running 4090s.
FWIW, overall performance on Windows is usually way slower compared to Linux.
When I compared my performance with 2x 3090 on Linux across different models against some folks who had 4090s and 4080s on Windows, their performance was way slower than mine.

Not sure if that's still the case, but it used to be 30~70% slower on Windows.
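
To make a figure like "30~70% slower" concrete, here is a minimal sketch of how such a percentage could be computed from two throughput readings. It assumes "slower" means lower tokens-per-second relative to the faster setup, which the post does not actually specify. The function name `percent_slower` and the numbers are hypothetical, for illustration only, and are not measurements from this thread.

```python
def percent_slower(baseline_tps: float, other_tps: float) -> float:
    """Return how much slower other_tps is than baseline_tps, in percent.

    Both arguments are throughputs (e.g. tokens per second); the
    baseline is the faster setup the other one is compared against.
    """
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (baseline_tps - other_tps) / baseline_tps * 100.0


# Hypothetical numbers, for illustration only (not measurements).
linux_tps = 20.0    # tokens/s on the faster setup
windows_tps = 10.0  # tokens/s on the slower setup

slowdown = percent_slower(linux_tps, windows_tps)
print(f"{slowdown:.0f}% slower")
```

Under this definition, "70% slower" would mean the slower setup reaches only about 30% of the baseline throughput, so it helps to state which definition is in use when comparing numbers from different people.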
 
Joined
Feb 21, 2006
Messages
2,240 (0.33/day)
Location
Toronto, Ontario
I'm not a Linux user, so I will take your word for it.

All of the comparisons I've done have been on Windows.
 
Joined
May 10, 2023
Messages
369 (0.62/day)
Location
Brazil
No worries. You don't really need to take my word for it, though; a quick Google search shows people doing such comparisons (especially on Reddit).
(To be clear, this is not aimed directly at you, but at anyone who may wonder about this claim.)

If a lot of your work involves running LLMs locally, then it might be worth switching for the extra performance (and easier tooling overall, IMO).
But if you just use it sporadically or as a minor assistant thingie, then there's no point in changing your entire workflow.
 