
What local LLMs do you use?

johnspack

Here For Good!
Joined
Oct 6, 2007
Messages
6,063 (0.95/day)
Location
Nelson B.C. Canada
System Name System2 Blacknet , System1 Blacknet2
Processor System2 Threadripper 1920x, System1 2699 v3
Motherboard System2 Asrock Fatality x399 Professional Gaming, System1 Asus X99-A
Cooling System2 Noctua NH-U14 TR4-SP3 Dual 140mm fans, System1 AIO
Memory System2 64GBS DDR4 3000, System1 32gbs DDR4 2400
Video Card(s) System2 GTX 980Ti System1 GTX 970
Storage System2 4x SSDs + NVme= 2.250TB 2xStorage Drives=8TB System1 3x SSDs=2TB
Display(s) 1x27" 1440 display 1x 24" 1080 display
Case System2 Some Nzxt case with soundproofing...
Audio Device(s) Asus Xonar U7 MKII
Power Supply System2 EVGA 750 Watt, System1 XFX XTR 750 Watt
Mouse Logitech G900 Chaos Spectrum
Keyboard Ducky
Software Archlinux, Manjaro, Win11 Ent 24h2
Benchmark Scores It's linux baby!
I actually have that model, but would like to go up a bit, maybe Q8? I also see Llama 70B, but I don't see any download links.
I have to find models that will fit in 64 GB of RAM.
 
Joined
Jan 12, 2023
Messages
293 (0.37/day)
System Name IZALITH (or just "Lith")
Processor AMD Ryzen 7 7800X3D (4.2Ghz base, 5.0Ghz boost, -30 PBO offset)
Motherboard Gigabyte X670E Aorus Master Rev 1.0
Cooling Deepcool Gammaxx AG400 Single Tower
Memory Corsair Vengeance 64GB (2x32GB) 6000MHz CL40 DDR5 XMP (XMP enabled)
Video Card(s) PowerColor Radeon RX 7900 XTX Red Devil OC 24GB (2.39Ghz base, 2.99Ghz boost, -30 core offset)
Storage 2x1TB SSD, 2x2TB SSD, 2x 8TB HDD
Display(s) Samsung Odyssey G51C 27" QHD (1440p 165Hz) + Samsung Odyssey G3 24" FHD (1080p 165Hz)
Case Corsair 7000D Airflow Full Tower
Audio Device(s) Corsair HS55 Surround Wired Headset/LG Z407 Speaker Set
Power Supply Corsair HX1000 Platinum Modular (1000W)
Mouse Logitech G502 X LIGHTSPEED Wireless Gaming Mouse
Keyboard Keychron K4 Wireless Mechanical Keyboard
Software Arch Linux
Get your own data center cards and leave my gaming GPUs alone!
Never fear friend, my card is primarily for gaming. The AI stuff is just for experimenting from time-to-time :)

Anyway, I decided to download llama3.3, but unfortunately I don't have the VRAM to run it. It maxed out my VRAM and any responses were INCREDIBLY slow. So I suspect I'll need to stick to smaller models.
 
Joined
Mar 11, 2008
Messages
1,149 (0.19/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
I actually have that model, but would like to go up a bit, maybe Q8? I also see Llama 70B, but I don't see any download links.
I have to find models that will fit in 64 GB of RAM.
It is all there:
DeepSeek-R1-Distill-Qwen-32B-Q8_0
Llama-3.3-70B-Instruct-GGUF

Alternatively you can download it with LM Studio like this:
1740048755321.png

Super convenient :toast:
 

johnspack

Here For Good!
Joined
Oct 6, 2007
Messages
6,063 (0.95/day)
Location
Nelson B.C. Canada
Thanks, yeah, I finally found more downloads. Right now I have to use KoboldCpp, and it doesn't have the download feature. LM Studio was failing on me, so I switched.
After some time the model f's up, but in Kobold I just start a new session and it clears up.
Yep, I now have DeepSeek-R1-Distill-Qwen-32B-Q8_0 running just fine. Not bad for an ancient computer!
Oh, and Q8 is using around 35 GB of RAM.
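For anyone sizing models against 64 GB of RAM, here is a rough sketch of where a figure like 35 GB comes from. It assumes the nominal parameter counts in the model names (32B, 70B) and the llama.cpp bits-per-weight for each quant type; KV cache and context buffers come on top, so treat the results as lower bounds rather than exact numbers.

```python
# Rough weight-only size estimate for a quantized GGUF model.
# Bits per weight follow the llama.cpp block layouts; the parameter
# counts are nominal figures read off the model names (assumption).
BITS_PER_WEIGHT = {
    "Q4_0": 4.5,     # 32 four-bit weights + one fp16 scale per block
    "Q6_K": 6.5625,  # 256-weight super-block stored in 210 bytes
    "Q8_0": 8.5,     # 32 int8 weights + one fp16 scale per block
}


def weight_gb(n_params: float, quant: str) -> float:
    """Approximate size of the weights in GB (10**9 bytes)."""
    return n_params * BITS_PER_WEIGHT[quant] / 8 / 1e9


for name, params in [("32B", 32e9), ("70B", 70e9)]:
    for quant in BITS_PER_WEIGHT:
        print(f"{name} {quant}: ~{weight_gb(params, quant):.0f} GB")
```

By this estimate a 32B model at Q8_0 lands in the mid-30s of GB, which lines up with the usage reported above, while a 70B model at Q8_0 would not fit in 64 GB but a Q4 quant would.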

It's a bit slow... not really using my GPU as much as I'd like:
1740111467983.png
 
Joined
Nov 23, 2023
Messages
167 (0.36/day)
Thanks, yeah, I finally found more downloads. Right now I have to use KoboldCpp, and it doesn't have the download feature. LM Studio was failing on me, so I switched.
After some time the model f's up, but in Kobold I just start a new session and it clears up.
Yep, I now have DeepSeek-R1-Distill-Qwen-32B-Q8_0 running just fine. Not bad for an ancient computer!
Oh, and Q8 is using around 35 GB of RAM.

It's a bit slow... not really using my GPU as much as I'd like:
View attachment 385850
The more layers you can put in VRAM, the faster it'll perform. Use the Q4 quants or check how much of your VRAM is being used.
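A minimal sketch of why a smaller quant helps here: with a fixed amount of VRAM, smaller layers mean more of them fit on the card. The function below assumes all layers are the same size and reserves a guessed amount of VRAM for the KV cache, compute buffers and the desktop; the model sizes, layer count and 6 GB card are hypothetical example values.

```python
def layers_that_fit(model_gb: float, n_layers: int,
                    vram_gb: float, reserve_gb: float = 1.5) -> int:
    """Estimate how many equally sized layers fit in VRAM.

    reserve_gb is a guessed allowance for KV cache, compute buffers and
    the desktop; it is an assumption, not a measured value.
    """
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    return min(n_layers, int(usable_gb // per_layer_gb))


# Hypothetical example: a 34 GB Q8 model vs. an 18 GB Q4 model,
# both with 64 layers, on a 6 GB card.
print(layers_that_fit(34.0, 64, 6.0))
print(layers_that_fit(18.0, 64, 6.0))
```

The real number depends on context size and backend overhead, so it is worth checking actual VRAM usage after loading, as suggested above.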
 
Joined
Aug 20, 2007
Messages
21,893 (3.42/day)
Location
Olympia, WA
System Name Pioneer
Processor Ryzen 9 9950X
Motherboard MSI MAG X670E Tomahawk Wifi
Cooling Noctua NH-D15 + A whole lotta Sunon, Phanteks and Corsair Maglev blower fans...
Memory 128GB (4x 32GB) G.Skill Flare X5 @ DDR5-4000(Running 1:1:1 w/ FCLK)
Video Card(s) XFX RX 7900 XTX Speedster Merc 310
Storage Intel 5800X Optane 800GB boot, +2x Crucial P5 Plus 2TB PCIe 4.0 NVMe SSDs, 1x 2TB Seagate Exos 3.5"
Display(s) 55" LG 55" B9 OLED 4K Display
Case Thermaltake Core X31
Audio Device(s) TOSLINK->Schiit Modi MB->Asgard 2 DAC Amp->AKG Pro K712 Headphones or HDMI->B9 OLED
Power Supply FSP Hydro Ti Pro 850W
Mouse Logitech G305 Lightspeed Wireless
Keyboard WASD Code v3 with Cherry Green keyswitches + PBT DS keycaps
Software Gentoo Linux x64
Get your own data center cards and leave my gaming GPUs alone!

Why not both? These guys likely game too, given the audience here. You are mad at the wrong group.
 
Joined
Mar 11, 2008
Messages
1,149 (0.19/day)
Location
Hungary / Budapest
The more layers you can put in VRAM, the faster it'll perform. Use the Q4 quants or check how much of your VRAM is being used.
Or, put the other way:
It gets more and more painful the more layers you put into your system RAM :roll:
Why not both? These guys likely game too, given the audience here. You are mad at the wrong group.
Yeah!
A PC is a general-purpose computer, it can do it all.
You can load and unload those programs on demand! :toast:

I wanted to try it out yesterday but did not have the [...]. smirki/UIGEN-T1-Qwen-7b is doing a good job with math problems; with language, not that great.
And it is quite fast, at 74 tokens/s for me.
 

johnspack

Here For Good!
Joined
Oct 6, 2007
Messages
6,063 (0.95/day)
Location
Nelson B.C. Canada
Well, I can't seem to run any Llama models, not sure why. Fortunately the DeepSeek-R1-Distill models all work fine. I would like to figure out how to offload more to my GPU, though.
Looks like it only assigns about 3.5 GB of VRAM.
 
Joined
Mar 11, 2008
Messages
1,149 (0.19/day)
Location
Hungary / Budapest
Well, I can't seem to run any Llama models, not sure why. Fortunately the DeepSeek-R1-Distill models all work fine. I would like to figure out how to offload more to my GPU, though.
Looks like it only assigns about 3.5 GB of VRAM.
More info?
Not running? Do you mean not running at all, or not running on the GPU?
What daemon do you use to run the models?
 
Joined
Jun 26, 2023
Messages
79 (0.13/day)
Processor 7800X3D @ Curve Optimizer: All Core: -25
Motherboard TUF Gaming B650-Plus
Memory 2xKSM48E40BD8KM-32HM ECC RAM (ECC enabled in BIOS)
Video Card(s) 4070 @ 110W
Display(s) SAMSUNG S95B 55" QD-OLED TV
Power Supply RM850x
The LLM I use locally is Qwen2.5-32B-Instruct-Q6_K.gguf. It has replaced all the smaller ones for me. Frontend: text-generation-webui (this was what worked first when I tried an LLM GUI on Linux, and I stuck with it). Speed: ~2.6 tokens/second when 23 layers are offloaded to my 4070 (CPU threads set to 4, dual-channel DDR5-4800). For bigger LLMs I use Chatbot Arena's Direct Chat.
I benchmarked RAM vs VRAM offloading:
layer-vs-tokens.png
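A back-of-the-envelope model of that kind of curve, as a sketch only: it assumes token generation is limited by memory bandwidth, that every weight is read once per token, and that the GPU and CPU parts run one after the other. The 27 GB model size, 64 layers and both bandwidth figures are assumed values, not measurements from the benchmark above.

```python
def split_tokens_per_s(model_gb, n_layers, gpu_layers,
                       gpu_bw_gbs, cpu_bw_gbs):
    """Upper-bound decode speed when layers are split between VRAM and RAM.

    Assumes every weight is read once per token and that the GPU and CPU
    parts run one after the other, so their times add up.
    """
    gpu_gb = model_gb * gpu_layers / n_layers
    cpu_gb = model_gb - gpu_gb
    seconds_per_token = gpu_gb / gpu_bw_gbs + cpu_gb / cpu_bw_gbs
    return 1.0 / seconds_per_token


# Assumed figures: ~27 GB of Q6_K weights in 64 layers, ~504 GB/s VRAM,
# ~76.8 GB/s theoretical dual-channel DDR5-4800.
for n in (0, 23, 64):
    print(n, round(split_tokens_per_s(27.0, 64, n, 504.0, 76.8), 1))
```

Under these assumptions the layers left in system RAM dominate the time per token, which is why speed only climbs steeply once most of the model sits in VRAM. Measured speeds will come in below this ceiling.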
 
Joined
May 10, 2023
Messages
666 (1.00/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw

johnspack

Here For Good!
Joined
Oct 6, 2007
Messages
6,063 (0.95/day)
Location
Nelson B.C. Canada
More info?
Not running? Do you mean not running at all, or not running on the GPU?
What daemon do you use to run the models?
I'm having to use the KoboldCpp daemon; LM Studio doesn't seem to want to use my GPU at all. I just tested Kobold-nocuda and am able to run Llama, but it's horribly slow. I didn't realize my old GPU helped that much. Guess I'm stuck with DeepSeek until I get a 3090 or something.
Oh, and I just found out I can load it with CLBlast. It heats up my GPU quite a bit, but it's almost as slow as no GPU. CuBLAS is by far the fastest, but I can't run Llama models with it.
Vulkan works, but on my old computer it's terribly slow. Damn, I need a new computer!
 
Joined
Nov 23, 2023
Messages
167 (0.36/day)
I'm having to use the KoboldCpp daemon; LM Studio doesn't seem to want to use my GPU at all. I just tested Kobold-nocuda and am able to run Llama, but it's horribly slow. I didn't realize my old GPU helped that much. Guess I'm stuck with DeepSeek until I get a 3090 or something.
Oh, and I just found out I can load it with CLBlast. It heats up my GPU quite a bit, but it's almost as slow as no GPU. CuBLAS is by far the fastest, but I can't run Llama models with it.
Vulkan works, but on my old computer it's terribly slow. Damn, I need a new computer!
I have no idea what's happening, but it shouldn't be. Turn on flash attention, set the layers manually, or use --lowvram. You might be getting OOM errors when trying to run Llama.

Just to make sure it's an issue with llama.cpp and nothing else, try offloading only one layer. If that works, it's OOMing when layers are set to auto; if it doesn't, you should try recreating your venv and reinstalling the packages.
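The one-layer test can be extended into a quick search for the highest layer count that still loads. This is only a sketch: `loads_ok` is a hypothetical callback that you would implement yourself (launch the backend with n layers offloaded and report whether it started without an OOM error), and it assumes that if n layers fit, any smaller count fits too.

```python
from typing import Callable


def max_working_layers(loads_ok: Callable[[int], bool], n_layers: int) -> int:
    """Binary-search the largest GPU layer count that still loads.

    loads_ok(n) is a hypothetical callback: launch the backend with n
    layers offloaded and return True if it starts without an OOM error.
    Assumes that if n layers fit, any smaller count fits too.
    Returns 0 if not even one layer works.
    """
    if not loads_ok(1):
        return 0  # points at the install/backend rather than memory
    lo, hi = 1, n_layers
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if loads_ok(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


# Stand-in for a real launch: pretend anything above 9 layers OOMs.
print(max_working_layers(lambda n: n <= 9, 80))
```

A result of 0 matches the second branch of the advice above (look at the install), while any positive result suggests the auto layer setting was simply asking for too much VRAM.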
 
Joined
Mar 11, 2008
Messages
1,149 (0.19/day)
Location
Hungary / Budapest
I'm having to use the KoboldCpp daemon; LM Studio doesn't seem to want to use my GPU at all. I just tested Kobold-nocuda and am able to run Llama, but it's horribly slow. I didn't realize my old GPU helped that much. Guess I'm stuck with DeepSeek until I get a 3090 or something.
Oh, and I just found out I can load it with CLBlast. It heats up my GPU quite a bit, but it's almost as slow as no GPU. CuBLAS is by far the fastest, but I can't run Llama models with it.
Vulkan works, but on my old computer it's terribly slow. Damn, I need a new computer!
Sorry mate, I don't know KoboldCpp at all.
But when you get around to buying your 3090, make sure it's the 24GB version! :D
 
Joined
Feb 12, 2025
Messages
19 (0.83/day)
Location
EU
Processor AMD 5600X
Motherboard ASUS TUF GAMING B550M-Plus WiFi
Cooling be quiet! Dark Rock 4
Memory G.Skill Ripjaws 2 x 32 GB DDR4-3600 CL18-22-22-42 1.35V F4-3600C18D-64GVK
Video Card(s) Sapphire Pulse RX 7800XT 16GB
Storage Kingston KC3000 2TB + QNAP TBS-464
Display(s) LG 35" LCD 35WN75C-B 3440x1440
Case Kolink Bastion RGB Midi-Tower
Power Supply Enermax Digifanless 550W
Mouse Razer Deathadder v2
Benchmark Scores phi4 - 42.00 tokens/s
Joined
May 10, 2023
Messages
666 (1.00/day)
Location
Brazil
I believe these are known as NVIDIA L40 or L40S
The L40 has a higher bin of the AD102 compared to the 4090, but the 4090 has the faster GDDR6X, which gives it more memory bandwidth.
In raw performance the 4090 should be faster than the L40 for LLMs, and that 48GB model at a third to a quarter of the price of the L40 makes it really interesting.
 
Joined
Feb 12, 2025
Messages
19 (0.83/day)
Location
EU
The L40 has a higher bin of the AD102 compared to the 4090, but the 4090 has the faster GDDR6X, which gives it more memory bandwidth.
In raw performance the 4090 should be faster than the L40 for LLMs, and that 48GB model at a third to a quarter of the price of the L40 makes it really interesting.
OK, I checked the Chiphell links for the 48GB edition. It is indeed using GDDR6X; is it double-sided GDDR6X? That blower looks nasty, and I'm not surprised by that 50 dB measured noise/roar. For desktop use, it should come with a noise-cancelling helmet.
Getting an existing GPU to support double or triple the amount of RAM without problems is not a trivial task. The BIOS needs to be correctly modified, memory power and cooling requirements met, etc.
While these GPU-based solutions seem cool now, I think unified memory solutions like NVIDIA Project DIGITS are the way to go here. Why limit yourself to GPU memory when the whole system memory can be fast?
 
Joined
Mar 11, 2008
Messages
1,149 (0.19/day)
Location
Hungary / Budapest
OK, I checked the Chiphell links for the 48GB edition. It is indeed using GDDR6X; is it double-sided GDDR6X? That blower looks nasty, and I'm not surprised by that 50 dB measured noise/roar. For desktop use, it should come with a noise-cancelling helmet.
Getting an existing GPU to support double or triple the amount of RAM without problems is not a trivial task. The BIOS needs to be correctly modified, memory power and cooling requirements met, etc.
While these GPU-based solutions seem cool now, I think unified memory solutions like NVIDIA Project DIGITS are the way to go here. Why limit yourself to GPU memory when the whole system memory can be fast?
"For desktop use, it should come with a noise-cancelling helmet." :roll:
Oh well, sadly these are not for home use but for datacenters.
Those are not places where you want to be, at least not for long periods.
 
Joined
May 10, 2023
Messages
666 (1.00/day)
Location
Brazil
OK, I checked the Chiphell links for the 48GB edition. It is indeed using GDDR6X; is it double-sided GDDR6X?
Seems like it's a custom PCB based on the one from the 3090, so yeah, clamshell design with 24x 16Gb GDDR6X modules.
That blower looks nasty, and I'm not surprised by that 50 dB measured noise/roar. For desktop use, it should come with a noise-cancelling helmet.
Apparently this can be made way better by reducing the power limit. That blower format is also perfect for using multiple GPUs in a single setup.
While these GPU-based solutions seem cool now, I think unified memory solutions like NVIDIA Project DIGITS are the way to go here. Why limit yourself to GPU memory when the whole system memory can be fast?
While I do agree that those devices are cool and will fill a really nice niche, the issue is that so far those unified memory systems are not as fast as a dGPU.
We're yet to see the specs on DIGITS, but something like the Strix Halo only has 256GB/s, which is on the level of a 6600 XT. A 4090 does 4x that, and a 5090 does 1.8TB/s.
And that's only talking about memory bandwidth; the dGPUs also have way more raw compute power compared to those iGPUs.

The M1/M2 Ultra do have nice memory bandwidth (~800GB/s), but their actual iGPU is slower compared to the likes of a 3090/4090, and they're not cheap at all. A theoretical M4 Ultra should achieve 1TB/s, similar to a 4090, but we're yet to see how fast it'll be, and pricing should be on the higher end as well.
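To put those bandwidth numbers in LLM terms, here is a rough sketch using the common rule of thumb that token generation is bandwidth-bound: each generated token has to read all the weights once, so tokens/s cannot exceed bandwidth divided by model size. The bandwidth figures are the ones mentioned above (the 4090 taken as 4 x 256GB/s); the 20 GB model size is an assumed example, and compute limits are ignored entirely.

```python
def max_tokens_per_s(bandwidth_gbs: float, model_gb: float) -> float:
    """Bandwidth-bound ceiling: each token reads all weights once."""
    return bandwidth_gbs / model_gb


# Bandwidth figures quoted in the thread (4090 taken as 4 x 256 GB/s);
# the 20 GB model size is an assumed example.
devices = {"Strix Halo": 256, "M1/M2 Ultra": 800, "4090": 1024, "5090": 1800}
for name, bw in devices.items():
    print(f"{name}: <= {max_tokens_per_s(bw, 20.0):.0f} tokens/s")
```

This is only a ceiling. As noted above, raw compute also matters, so a device with high bandwidth but a slower GPU can still land well below its number.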
 
Joined
Feb 12, 2025
Messages
19 (0.83/day)
Location
EU
While I do agree that those devices are cool and will fill a really nice niche, the issue is that so far those unified memory systems are not as fast as a dGPU.
We're yet to see the specs on DIGITS, but something like the Strix Halo only has 256GB/s, which is on the level of a 6600 XT. A 4090 does 4x that, and a 5090 does 1.8TB/s.
And that's only talking about memory bandwidth; the dGPUs also have way more raw compute power compared to those iGPUs.
The M1/M2 Ultra do have nice memory bandwidth (~800GB/s), but their actual iGPU is slower compared to the likes of a 3090/4090, and they're not cheap at all. A theoretical M4 Ultra should achieve 1TB/s, similar to a 4090, but we're yet to see how fast it'll be, and pricing should be on the higher end as well.
I agree that Strix Halo's 256GB/s is puny, but then again that's the first-gen unified memory PC platform from AMD. It has the potential to get much better over time.
DIGITS should get 900GB/s and a GPU to match that bandwidth. With 128GB of RAM, this will best many datacenter-class GPUs that cost a lot more money.
I've seen various Apple benchmarks on LLMs and, considering the hardware, they do not fare well. I suspect software optimization is partly to blame here.
Ultimately, merging system and GPU memory over a high-speed link offers better bang for the buck than just cranking up GDDRn on a GPU each gen.
Oh well, sadly these are not for home use but for datacenters.
Those are not places where you want to be, at least not for long periods.
I've roamed around various datacenters for decades. Did so even last week. Low-oxygen ones are "fun". Most high-end compute in DCs is nowadays water-cooled (DLC), especially GPUs, but we are going off topic here.
 
Joined
May 10, 2023
Messages
666 (1.00/day)
Location
Brazil
DIGITS should get 900GB/s and a GPU to match that bandwidth. With 128GB of RAM, this will best many datacenter-class GPUs that cost a lot more money.
That's one possibility, but I think something in the 450-512GB/s range is more realistic.
Agreed with all your other points tho, those devices have lots of potential in the future and may cover most use cases.
 
Joined
Oct 17, 2021
Messages
115 (0.09/day)
System Name Nirn
Processor Amd Ryzen 7950X3D
Motherboard MSI MEG ACE X670e
Cooling Noctua NH-D15
Memory 128 GB Kingston DDR5 6000 (running at 4000)
Video Card(s) Radeon RX 7900XTX (24G) + Geforce 4070ti (12G) Physx
Storage SAMSUNG 990 EVO SSD 2TB Gen 5 x2 (OS)+SAMSUNG 980 SSD 1TB PCle 3.0x4 (Primocache) +2X 22TB WD Gold
Display(s) Samsung UN55NU8000 (Freesync)
Case Corsair Graphite Series 780T White
Audio Device(s) Creative Soundblaster AE-7 + Sennheiser GSP600
Power Supply Seasonic PRIME TX-1000 Titanium
Mouse Razer Mamba Elite Wired
Keyboard Razer BlackWidow Chroma v1
VR HMD Oculus Quest 2
Software Windows 10
OK, I checked the Chiphell links for the 48GB edition. It is indeed using GDDR6X; is it double-sided GDDR6X? That blower looks nasty, and I'm not surprised by that 50 dB measured noise/roar. For desktop use, it should come with a noise-cancelling helmet.
Getting an existing GPU to support double or triple the amount of RAM without problems is not a trivial task. The BIOS needs to be correctly modified, memory power and cooling requirements met, etc.
While these GPU-based solutions seem cool now, I think unified memory solutions like NVIDIA Project DIGITS are the way to go here. Why limit yourself to GPU memory when the whole system memory can be fast?
Didn't AMD do that years ago with the memory controller on Vega, which could address system RAM and even use NVMe storage as GPU "RAM"?
 
Joined
May 10, 2023
Messages
666 (1.00/day)
Location
Brazil
Didn't AMD do that years ago with the memory controller on Vega, which could address system RAM and even use NVMe storage as GPU "RAM"?
No, that was just a dumb PCIe switch/mux, no different than having a regular NVMe in your motherboard and using PCIe P2P to access stuff between devices.

That has nothing to do with unified memory.
 