
ROCm OS?

Joined
Jan 18, 2014
Messages
13 (0.00/day)
System Name Ti-99/4a
I picked up 8 GPUs to run a ROCm-accelerated LLM, but I have no clue what OS dependencies are required. The CPU on the 8-slot board is an old soldered Skylake Celeron, so I want to keep things as lightweight as possible. Can I get away with something like PCLinuxOS, or will I need a tiny kernel?
I'm obviously avoiding bloated systemd distros.
I haven't found any good sources of info on straight-up ROCm setups.

Also, does anyone have a clue what impact running fewer PCIe lanes will have? I may have to move to a board with more lanes, but I'll see whether the ~1 GB/s restriction is manageable or not. I remember that on things like Folding@home back in the day it didn't matter.
 

Solaris17

Super Dainty Moderator
Staff member
Joined
Aug 16, 2005
Messages
27,431 (3.84/day)
Location
Alabama
System Name RogueOne
Processor Xeon W9-3495x
Motherboard ASUS w790E Sage SE
Cooling SilverStone XE360-4677
Memory 128gb Gskill Zeta R5 DDR5 RDIMMs
Video Card(s) MSI SUPRIM Liquid X 4090
Storage 1x 2TB WD SN850X | 2x 8TB GAMMIX S70
Display(s) 49" Philips Evnia OLED (49M2C8900)
Case Thermaltake Core P3 Pro Snow
Audio Device(s) Moondrop S8's on schitt Gunnr
Power Supply Seasonic Prime TX-1600
Mouse Razer Viper mini signature edition (mercury white)
Keyboard Monsgeek M3 Lavender, Moondrop Luna lights
VR HMD Quest 3
Software Windows 11 Pro Workstation
Benchmark Scores I don't have time for that.
Joined
Oct 6, 2021
Messages
1,767 (1.42/day)
System Name Raspberry Pi 7 Quantum @ Overclocked.
I honestly think you'd get more help posting on Reddit's r/LocalLLaMA. Also, Vulkan is now faster than ROCm and dangerously close to CUDA, at least for LLMs:
 
Joined
May 10, 2023
Messages
662 (1.00/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
Your OS of choice shouldn't make much of a difference, especially CPU-wise.
Avoiding systemd might give you more headaches for almost no benefit. No reason to go with a tiny kernel either.

As said above, r/LocalLLaMA on Reddit is a good place to start.
 
Joined
Nov 23, 2023
Messages
131 (0.28/day)
Headless Arch Linux, use archinstall and choose your init system there. Afterwards check the GPGPU page on the wiki and install the ROCm packages.
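A rough sketch of that route, assuming the ROCm package names currently in the Arch repos (check the wiki's GPGPU page for the up-to-date set):

```shell
# Install the ROCm HIP stack from the official Arch repos
sudo pacman -S rocm-hip-sdk rocminfo

# Verify the runtime can see the GPUs (look for your gfx target IDs)
rocminfo | grep -i gfx

# Optional: OpenCL support on top of ROCm
sudo pacman -S rocm-opencl-sdk
```

Package names and availability change between ROCm releases, so treat this as a starting point rather than a definitive recipe.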

Running fewer or slower lanes doesn't impact inference speed, but loading and unloading the model will be slow.
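To put rough numbers on that load-time hit (the weight sizes and effective bandwidths below are illustrative assumptions, not measurements of any specific setup):

```python
# Back-of-envelope: time to push model weights to a GPU over its PCIe link.

def load_time_s(weights_gb: float, link_gb_per_s: float) -> float:
    """Seconds to transfer the weights at the given effective bandwidth."""
    return weights_gb / link_gb_per_s

weights_per_gpu_gb = 16.0   # hypothetical shard of a quantized model per GPU
x1_bw = 1.0                 # ~PCIe 3.0 x1 effective bandwidth, GB/s
x16_bw = 16.0               # ~PCIe 3.0 x16 effective bandwidth, GB/s

print(f"x1:  {load_time_s(weights_per_gpu_gb, x1_bw):.0f} s per GPU")   # 16 s
print(f"x16: {load_time_s(weights_per_gpu_gb, x16_bw):.0f} s per GPU")  # 1 s
```

That's a one-time cost per model load; once the weights are resident, token generation is bound by GPU compute and VRAM bandwidth, which is why narrow links barely matter for inference itself.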
As for Vulkan being faster than ROCm: it's awful. Maybe there's a --lowvram option that kobold doesn't expose in their GUI, but my processing speeds take a nosedive with Vulkan even with the extra VRAM, and you can't isolate the model itself onto the GPUs. ROCm is much better.
 
Joined
Jan 18, 2014
Messages
13 (0.00/day)
System Name Ti-99/4a

Thanks for the detailed first-hand info!
 
Joined
Nov 23, 2023
Messages
131 (0.28/day)
By the way, NV_coopmat2 is NVIDIA-exclusive; Mesa supports KHR_coopmat.
You're very welcome! If you have any extra questions or problems, feel free to ask! :toast:

No sane man should be forced to talk to plebbitors.
 