
What local LLMs do you use?

Joined
Mar 11, 2008
Messages
1,100 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
@AusWolf
Local LLMs are very important. I really reject the trend of everything getting "cloud"-based; micro$oft even wants your Windows account to be online...
But after posting three times, you could tell us what LLMs you use, and maybe some performance data too!
 
Joined
Feb 12, 2025
Messages
10 (1.00/day)
Location
EU
Processor AMD 5600X
Motherboard ASUS TUF GAMING B550M-Plus WiFi
Cooling be quiet! Dark Rock 4
Memory G.Skill Ripjaws 2 x 32 GB DDR4-3600 CL18-22-22-42 1.35V F4-3600C18D-64GVK
Video Card(s) Sapphire Pulse RX 7800XT 16GB
Storage Kingston KC3000 2TB + QNAP TBS-464
Display(s) LG 35" LCD 35WN75C-B 3440x1440
Case Kolink Bastion RGB Midi-Tower
Power Supply Enermax Digifanless 550W
Mouse Razer Deathadder v2
Benchmark Scores phi4 - 42.00 tokens/s
Maybe I'm a little bit behind on stuff but... Got to ask... What's the point of this for any regular home user?
It's like asking "what's the point of using your brain?". For the first time in recorded history, humanity has a thinking tool. It can supercharge almost any skill you have. Just as the human brain can be used in nearly limitless ways, the same applies to a local LLM. Use it to check your kids' homework, use it to do your own homework, help analyze scientific papers, write code for you, explain why vitamin K is good, count stars in the sky, analyze insurance offerings, etc. I am not even going to pretend I know even a fraction of the use cases local LLMs will have over the next 10 years, but I know it's going to be wild. On the level of how the internet changed our lives (yeah, some of us grew up without the internet).
 
Joined
May 22, 2024
Messages
431 (1.56/day)
System Name Kuro
Processor AMD Ryzen 7 7800X3D@65W
Motherboard MSI MAG B650 Tomahawk WiFi
Cooling Thermalright Phantom Spirit 120 EVO
Memory Corsair DDR5 6000C30 2x48GB (Hynix M)@6000 30-36-36-76 1.36V
Video Card(s) PNY XLR8 RTX 4070 Ti SUPER 16G@200W
Storage Crucial T500 2TB + WD Blue 8TB
Case Lian Li LANCOOL 216
Power Supply MSI MPG A850G
Software Ubuntu 24.04 LTS + Windows 10 Home Build 19045
Benchmark Scores 17761 C23 Multi@65W
Maybe I'm a little bit behind on stuff but... Got to ask... What's the point of this for any regular home user?
That sounds interesting. Can you explain? :)
At least the larger, 70B+ models are typically sufficiently knowledgeable that you can ask them some complicated questions and expect a reasonable, not necessarily banal and expected, answer. There are things you do not want to send to commercial services, most typically personal information. Some of the latest advancements have made even 70B-scale models competent with illusion-shattering problems that previous generations of models had difficulty with, like how many r's are in strawberry, how many boys does Mary have when one of them is gross, et cetera.

Now they are usually useful for common math and programming problems when used with care, can explore human philosophy and the human condition quite competently, and can tell stories of some interest with the right prompt. They are also useful for getting familiar with what LLM output looks like. Half of the internet looks LLM-generated these days.

Some open-weight models are capable of tool/API use, such as tools provided by the framework they run on, including requesting web services. The usefulness of such capabilities is apparently unremarkable given other limitations and, for that matter, the state of the internet and search engine results these days. It requires support from the framework the model is running on, and is usually the only time the model - note, not the framework - would access the internet.
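For the curious, here is a rough sketch of what that tool use can look like through the ollama Python package (assuming a tool-capable model like llama3.1 is pulled locally; web_search here is a made-up placeholder, not a real API):

Code:
# Sketch only: tool/function calling via the ollama Python package.
# Assumes a local ollama server and a tool-capable model (e.g. llama3.1).
# web_search() is a hypothetical stand-in for an actual web request.
import ollama

def web_search(query: str) -> str:
    # Placeholder: a real implementation would call a search API here.
    return f"(pretend search results for: {query})"

tools = [{
    "type": "function",
    "function": {
        "name": "web_search",
        "description": "Search the web for a query and return a text summary",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

messages = [{"role": "user", "content": "What's new in llama.cpp this month?"}]
response = ollama.chat(model="llama3.1", messages=messages, tools=tools)

# If the model decided to call the tool, run it and feed the result back.
if response.message.tool_calls:
    messages.append(response.message)  # the assistant turn containing the tool call
    for call in response.message.tool_calls:
        result = web_search(**call.function.arguments)
        messages.append({"role": "tool", "content": result})
    response = ollama.chat(model="llama3.1", messages=messages)

print(response.message.content)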

They can also provide some silly fun, especially the roleplay-finetuned ones, in uses where hallucination actually provides some emulation of creativity. Think of it as a text-based holodeck. Throw in an image generator and it is text and image. You typically don't want a lot of that elsewhere either: as with all things requiring an account, everything you put into a networked service would be recorded by the provider and linked to you. Not everyone feels comfortable with the nothing-to-hide mentality even when they really have nothing to hide, and more than a few object to their interactions and personal info being used to train future commercial AI models.

Personally, I sized my setup in early 2024 to be able to run a "future larger model", which turned out to be mistral-large-2407, 123B, quantized. The best correctness and general task performance is probably currently achieved by the Llama 3 70B distilled version of DeepSeek R1. Anything larger would be costly and impractical for the moment, to me. Might as well make them useful while they are there.
 
Joined
Mar 11, 2008
Messages
1,100 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
70B+ models are typically sufficiently knowledgeable that you can ask them some complicated questions and expect a reasonable, not necessarily banal and expected, answer.
Did you skip DeepSeek?
 
Joined
Jan 14, 2019
Messages
14,508 (6.50/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Overclocking is overrated
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case It's not about size, but how you use it
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Mechanic
VR HMD Not yet
Software Linux gaming master race
@AusWolf
Local LLMs are very important. I really reject the trend of everything getting "cloud"-based; micro$oft even wants your Windows account to be online...
But after posting three times, you could tell us what LLMs you use, and maybe some performance data too!
I'm not using anything. I didn't even know that you could run them locally until recently. I'm only trying to learn what they're useful for, to see whether it's something I'd want to do or not.

At least the larger, 70B+ models are typically sufficiently knowledgeable that you can ask them some complicated questions and expect a reasonable, not necessarily banal and expected, answer. There are things you do not want to send to commercial services, most typically personal information. Some of the latest advancements have made even 70B-scale models competent with illusion-shattering problems that previous generations of models had difficulty with, like how many r's are in strawberry, how many boys does Mary have when one of them is gross, et cetera.

Now they are usually useful for common math and programming problems when used with care, can explore human philosophy and the human condition quite competently, and can tell stories of some interest with the right prompt. They are also useful for getting familiar with what LLM output looks like. Half of the internet looks LLM-generated these days.

Some open-weight models are capable of tool/API use, such as tools provided by the framework they run on, including requesting web services. The usefulness of such capabilities is apparently unremarkable given other limitations and, for that matter, the state of the internet and search engine results these days. It requires support from the framework the model is running on, and is usually the only time the model - note, not the framework - would access the internet.

They can also provide some silly fun, especially the roleplay-finetuned ones, in uses where hallucination actually provides some emulation of creativity. Think of it as a text-based holodeck. Throw in an image generator and it is text and image. You typically don't want a lot of that elsewhere either: as with all things requiring an account, everything you put into a networked service would be recorded by the provider and linked to you. Not everyone feels comfortable with the nothing-to-hide mentality even when they really have nothing to hide, and more than a few object to their interactions and personal info being used to train future commercial AI models.

Personally, I sized my setup in early 2024 to be able to run a "future larger model", which turned out to be mistral-large-2407, 123B, quantized. The best correctness and general task performance is probably currently achieved by the Llama 3 70B distilled version of DeepSeek R1. Anything larger would be costly and impractical for the moment, to me. Might as well make them useful while they are there.
Text-based holodeck running locally on your PC... Now that caught my attention! :)

I'm just having a hard time imagining it. LLMs still live in my head as glorified search engines. :ohwell:
 
Joined
Mar 11, 2008
Messages
1,100 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
Joined
May 22, 2024
Messages
431 (1.56/day)
System Name Kuro
Processor AMD Ryzen 7 7800X3D@65W
Motherboard MSI MAG B650 Tomahawk WiFi
Cooling Thermalright Phantom Spirit 120 EVO
Memory Corsair DDR5 6000C30 2x48GB (Hynix M)@6000 30-36-36-76 1.36V
Video Card(s) PNY XLR8 RTX 4070 Ti SUPER 16G@200W
Storage Crucial T500 2TB + WD Blue 8TB
Case Lian Li LANCOOL 216
Power Supply MSI MPG A850G
Software Ubuntu 24.04 LTS + Windows 10 Home Build 19045
Benchmark Scores 17761 C23 Multi@65W
Did you skip DeepSeek?
The way they do chain-of-thought makes for interesting reading. I think they are the first to do it well enough in an open-weight model, too. Wherever they might be from, I do not have quite enough trust to send any hosted service anything confidential or usable for profiling.

Even "free" services come with the implicit permission of using your interaction for any number of further purposes buried in the user agreement assuming it is followed, and God forbid if there is a data breach.

FWIW, 70B Q6_K quantized models run at a bit over ~0.9 tokens/s to almost 1.2 tokens/s on my setup with the official distribution of Ollama 0.5.7. The latest llama.cpp compiled from source gives ~1.2 tokens/s.
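If anyone wants to reproduce numbers like this, here is a minimal sketch against the default ollama HTTP API (assumes ollama is listening on the default port 11434; the model tag is just an example, substitute whatever you actually have pulled). It turns the eval_count/eval_duration fields ollama returns into tokens/s:

Code:
# Rough sketch: measuring generation speed (tokens/s) from a local ollama server.
# Assumes the default endpoint on port 11434; MODEL is a placeholder tag.
import requests

MODEL = "llama3.1:70b-instruct-q6_K"   # example only; use a model you have pulled
PROMPT = "What's the meaning of life?"

r = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": MODEL, "prompt": PROMPT, "stream": False},
    timeout=600,
)
data = r.json()

# eval_count = generated tokens, eval_duration = generation time in nanoseconds
tok_per_s = data["eval_count"] / (data["eval_duration"] / 1e9)
print(f"{MODEL}: {tok_per_s:.2f} tok/s over {data['eval_count']} tokens")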

Text-based holodeck running locally on your PC... Now that caught my attention! :)

I'm just having a hard time imagining it. LLMs still live in my head as glorified search engines. :ohwell:
To be fair, they are still even worse than that when used for factual stuff without verification. And whatever it is that the various search engines are integrating, they certainly aren't doing it quite right yet.

They do have uses where even current models can play to their strengths though, and even the smaller models have a superhuman passing familiarity with almost everything anyone would - or could - ever have seen in text on a computer display. As long as you don't try something too unusual, they'd often do fine.
 
Joined
Oct 16, 2018
Messages
978 (0.42/day)
Location
Uttar Pradesh, India
Processor AMD R7 1700X @ 4100Mhz
Motherboard MSI B450M MORTAR MAX (MS-7B89)
Cooling Phanteks PH-TC14PE
Memory Crucial Technology 16GB DR (DDR4-3600) - C9BLM:045M:E BL16G36C16U4W.M16FE1 X2 @ CL14
Video Card(s) XFX RX480 GTR 8GB @ 1408Mhz (AMD Auto OC)
Storage Samsung SSD 850 EVO 250GB
Display(s) Acer KG271 1080p @ 81Hz
Power Supply SuperFlower Leadex II 750W 80+ Gold
Keyboard Redragon Devarajas RGB
Software Microsoft Windows 10 (10.0) Professional 64-bit
Benchmark Scores https://valid.x86.fr/mvvj3a
Maybe I'm a little bit behind on stuff but... Got to ask... What's the point of this for any regular home user?
This video helped me get started with running LLMs locally. It should also answer many of the questions you raised.
 
Joined
Mar 11, 2008
Messages
1,100 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
This video helped me get started with running LLMs locally. It should also answer many of the questions you raised.
OMG, that scare in the first seconds of the video... :ohwell: :ohwell: :ohwell:
Instant downvote from me for this kind of "content".
There are no worries when you run it locally in a local program.
I would never install DeepSeek's app on my phone, though...
 
Joined
Jul 21, 2008
Messages
5,266 (0.87/day)
System Name [Daily Driver]
Processor [Ryzen 7 5800X3D]
Motherboard [MSI MAG B550 TOMAHAWK]
Cooling [be quiet! Dark Rock Slim]
Memory [64GB Crucial Pro 3200MHz (32GBx2)]
Video Card(s) [PNY RTX 3070Ti XLR8]
Storage [1TB SN850 NVMe, 4TB 990 Pro NVMe, 2TB 870 EVO SSD, 2TB SA510 SSD]
Display(s) [2x 27" HP X27q at 1440p]
Case [Fractal Meshify-C]
Audio Device(s) [Fanmusic TRUTHEAR IEM, HyperX Duocast]
Power Supply [CORSAIR RMx 1000]
Mouse [Logitech G Pro Wireless]
Keyboard [Logitech G512 Carbon (GX-Brown)]
Software [Windows 11 64-Bit]
I'm still fairly new to the scene, but I have phi-4 and deepseek-r1 instances locally. Right now I just use them when I run into coding issues or need some inspiration. I was using Grok a lot to fill that role, but the local stuff is neat.
 
Joined
May 10, 2023
Messages
610 (0.93/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
I've tried a bunch of models, and keep switching back and forth between them.
These are the ones currently downloaded onto my MBP:
(screenshot of the downloaded model list)


I usually use ollama as the backend, with either the Python API for some software I run, or Open WebUI when I want a chat-like thingie.
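A minimal sketch of the Python API route, assuming `pip install ollama`, a running local ollama server, and one of the models below already pulled:

Code:
# Minimal sketch of the ollama Python API; the model tag is just an example.
import ollama

response = ollama.chat(
    model="llama3.1:8b-instruct-q4_K_M",
    messages=[{"role": "user", "content": "What's the meaning of life?"}],
)
print(response.message.content)  # response["message"]["content"] on older library versions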

Some performance numbers from my setups on a "what's the meaning of life?" prompt, without using flash attention or any other software speedups:

Model | 2x3090 (tok/s) | M3 Max (tok/s)
phi4:14b-q4_K_M | 62 | 25.5
phi4:14b-q8_0 | 47.5 | 16.9
deepseek-r1:32b-qwen-distill-q4_K_M | 27.8 | 12.6
deepseek-r1:7b-qwen-distill-q4_K_M | 116 | 50.4
gemma:7b-instruct-v1.1-q4_0 | 113.7 | 46.3
llama3.1:8b-instruct-q4_K_M | 110.1 | 46.08
llama3.1:8b-instruct-fp16 | 49.8 | 17.5
deepseek-r1:32b-qwen-distill-q8_0 | 21 | -
deepseek-r1:70b-llama-distill-q4_K_M | 16.6 | -
llama3.3:70b-instruct-q4_K_M | 16.7 | -

With the models above that need 2 GPUs, both GPUs average about 50% utilization. I did not run those on my MBP since I was out of memory for them, given the other crap I had open.
Both my 3090s are also set to a 275W power limit.
 

izy

Joined
Jun 30, 2022
Messages
1,072 (1.11/day)
Which one did you guys find to be the best for code?

I'm also curious how something like the AMD Ryzen AI Max+ 395 would perform compared to 16GB GPUs.
 
Joined
Jan 12, 2023
Messages
278 (0.36/day)
System Name IZALITH (or just "Lith")
Processor AMD Ryzen 7 7800X3D (4.2Ghz base, 5.0Ghz boost, -30 PBO offset)
Motherboard Gigabyte X670E Aorus Master Rev 1.0
Cooling Deepcool Gammaxx AG400 Single Tower
Memory Corsair Vengeance 64GB (2x32GB) 6000MHz CL40 DDR5 XMP (XMP enabled)
Video Card(s) PowerColor Radeon RX 7900 XTX Red Devil OC 24GB (2.39Ghz base, 2.99Ghz boost, -30 core offset)
Storage 2x1TB SSD, 2x2TB SSD, 2x 8TB HDD
Display(s) Samsung Odyssey G51C 27" QHD (1440p 165Hz) + Samsung Odyssey G3 24" FHD (1080p 165Hz)
Case Corsair 7000D Airflow Full Tower
Audio Device(s) Corsair HS55 Surround Wired Headset/LG Z407 Speaker Set
Power Supply Corsair HX1000 Platinum Modular (1000W)
Mouse Logitech G502 X LIGHTSPEED Wireless Gaming Mouse
Keyboard Keychron K4 Wireless Mechanical Keyboard
Software Arch Linux
I'm not using anything. I didn't even know that you could run them locally until recently. I'm only trying to learn what they're useful for, to see whether it's something I'd want to do or not.


Text-based holodeck running locally on your PC... Now that caught my attention! :)

I'm just having a hard time imagining it. LLMs still live in my head as glorified search engines. :ohwell:

It's quite impressive how knowledgeable a local LLM can be without having internet access. I use a couple of models locally for benchmarking my GPU, answering the odd question or just for fun, and there have been a few times where the models have surprised me with their answers. Here's a quick example of llama3.2 translating your post into Spanish. There is no internet access involved here; it's all running on my 7900XTX. And the model is only ~9GB in size.

(screenshot of llama3.2's Spanish translation)


The obvious downside of local LLMs is that they only have knowledge up to a point, i.e. their training cutoff date. llama3.2, for example, thinks the current president of the United States is Joe Biden and that the 2024 election hasn't happened yet. I imagine if I updated to llama3.3 (which is 43GB, up from 9GB in 3.2!) it would have more up-to-date information. But I'd highly recommend you give it a try; even if it doesn't become a daily-use tool on your machine, it's a good benchmark and a neat gimmick.

To answer the original question, I have the following models installed via ollama:
- codegemma
- deepseek-coder-v2
- gemma2
- llama3.2
- nemotron-mini
- phi4
- qwen2

Note that I mostly use gemma2, as it seems to be the most "accurate". I did experiment with the coding-focused ones to assist me with work from time to time; they are mostly not helpful and prone to hallucination or just outright wrong information :(

EDIT: I would also add that ollama is an excellent starting point to get set up, especially for AMD users with the ROCm version. It is very easy to get going across all major OSes.
 
Joined
Mar 11, 2008
Messages
1,100 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
It's quite impressive how knowledgeable a local LLM can be without having internet access. I use a couple of models locally for benchmarking my GPU, answering the odd question or just for fun, and there have been a few times where the models have surprised me with their answers. Here's a quick example of llama3.2 translating your post into Spanish. There is no internet access involved here; it's all running on my 7900XTX. And the model is only ~9GB in size.

(screenshot of llama3.2's Spanish translation)

The obvious downside of local LLMs is that they only have knowledge up to a point, i.e. their training cutoff date. llama3.2, for example, thinks the current president of the United States is Joe Biden and that the 2024 election hasn't happened yet. I imagine if I updated to llama3.3 (which is 43GB, up from 9GB in 3.2!) it would have more up-to-date information. But I'd highly recommend you give it a try; even if it doesn't become a daily-use tool on your machine, it's a good benchmark and a neat gimmick.

To answer the original question, I have the following models installed via ollama:
- codegemma
- deepseek-coder-v2
- gemma2
- llama3.2
- nemotron-mini
- phi4
- qwen2

Note that I mostly use gemma2, as it seems to be the most "accurate". I did experiment with the coding-focused ones to assist me with work from time to time; they are mostly not helpful and prone to hallucination or just outright wrong information :(

EDIT: I would also add that ollama is an excellent starting point to get set up, especially for AMD users with the ROCm version. It is very easy to get going across all major OSes.
My entry point was a colleague introducing me to LM Studio; I hadn't even heard about ollama until last December.
And I was not able to get it working with my GPU, so I stuck with LM Studio, which works really nicely, plus I like the great UI, even though I use the console on a daily basis.
On llama 3.2: why do you keep the obsolete v3.2?
 
Joined
Nov 23, 2023
Messages
72 (0.16/day)
Which one did you guys find to be the best for code?

I'm also curious how something like the AMD Ryzen AI Max+ 395 would perform compared to 16GB GPUs.
I'm guessing the new DeepSeek model is gonna be the best one. I'm also waiting for Strix Halo; it should perform at minimum as well as a 16GB RX 580, seeing as they'll have similar bandwidths. The rumors I hear for Medusa are insane though, a 30% performance increase over Strix is crazy...
 
Joined
Jan 12, 2023
Messages
278 (0.36/day)
System Name IZALITH (or just "Lith")
Processor AMD Ryzen 7 7800X3D (4.2Ghz base, 5.0Ghz boost, -30 PBO offset)
Motherboard Gigabyte X670E Aorus Master Rev 1.0
Cooling Deepcool Gammaxx AG400 Single Tower
Memory Corsair Vengeance 64GB (2x32GB) 6000MHz CL40 DDR5 XMP (XMP enabled)
Video Card(s) PowerColor Radeon RX 7900 XTX Red Devil OC 24GB (2.39Ghz base, 2.99Ghz boost, -30 core offset)
Storage 2x1TB SSD, 2x2TB SSD, 2x 8TB HDD
Display(s) Samsung Odyssey G51C 27" QHD (1440p 165Hz) + Samsung Odyssey G3 24" FHD (1080p 165Hz)
Case Corsair 7000D Airflow Full Tower
Audio Device(s) Corsair HS55 Surround Wired Headset/LG Z407 Speaker Set
Power Supply Corsair HX1000 Platinum Modular (1000W)
Mouse Logitech G502 X LIGHTSPEED Wireless Gaming Mouse
Keyboard Keychron K4 Wireless Mechanical Keyboard
Software Arch Linux
My entry point was a colleague introducing me to LM Studio; I hadn't even heard about ollama until last December.
And I was not able to get it working with my GPU, so I stuck with LM Studio, which works really nicely, plus I like the great UI, even though I use the console on a daily basis.
On llama 3.2: why do you keep the obsolete v3.2?

Because llama3.3 is 45GB and I am on Starlink :) I have to pick my day to download large files or I'll be waiting a long time!
 
Joined
Mar 11, 2008
Messages
1,100 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
I am on Starlink :) I have to pick my day to download large files or I'll be waiting a long time!
I see. Well, you could get the file from a friend on a flash drive maybe, or...
With LM Studio you can also pause/resume downloads! So maybe you could have a look!
:toast:
I did not even consider this kind of limitation, sorry!
Where do you live that you need Starlink?
 
Joined
Jan 12, 2023
Messages
278 (0.36/day)
System Name IZALITH (or just "Lith")
Processor AMD Ryzen 7 7800X3D (4.2Ghz base, 5.0Ghz boost, -30 PBO offset)
Motherboard Gigabyte X670E Aorus Master Rev 1.0
Cooling Deepcool Gammaxx AG400 Single Tower
Memory Corsair Vengeance 64GB (2x32GB) 6000MHz CL40 DDR5 XMP (XMP enabled)
Video Card(s) PowerColor Radeon RX 7900 XTX Red Devil OC 24GB (2.39Ghz base, 2.99Ghz boost, -30 core offset)
Storage 2x1TB SSD, 2x2TB SSD, 2x 8TB HDD
Display(s) Samsung Odyssey G51C 27" QHD (1440p 165Hz) + Samsung Odyssey G3 24" FHD (1080p 165Hz)
Case Corsair 7000D Airflow Full Tower
Audio Device(s) Corsair HS55 Surround Wired Headset/LG Z407 Speaker Set
Power Supply Corsair HX1000 Platinum Modular (1000W)
Mouse Logitech G502 X LIGHTSPEED Wireless Gaming Mouse
Keyboard Keychron K4 Wireless Mechanical Keyboard
Software Arch Linux
I see. Well, you could get the file from a friend on a flash drive maybe, or...
With LM Studio you can also pause/resume downloads! So maybe you could have a look!
:toast:
I did not even consider this kind of limitation, sorry!
Where do you live that you need Starlink?

I'll get around to downloading it eventually. The thing is, I don't use LLMs in my day-to-day; they're essentially a novelty I fire up from time to time. So getting the latest model isn't high on the priority list.

I live in rural Australia, far from any civilization! :)
 
Joined
Feb 12, 2025
Messages
10 (1.00/day)
Location
EU
Processor AMD 5600X
Motherboard ASUS TUF GAMING B550M-Plus WiFi
Cooling be quiet! Dark Rock 4
Memory G.Skill Ripjaws 2 x 32 GB DDR4-3600 CL18-22-22-42 1.35V F4-3600C18D-64GVK
Video Card(s) Sapphire Pulse RX 7800XT 16GB
Storage Kingston KC3000 2TB + QNAP TBS-464
Display(s) LG 35" LCD 35WN75C-B 3440x1440
Case Kolink Bastion RGB Midi-Tower
Power Supply Enermax Digifanless 550W
Mouse Razer Deathadder v2
Benchmark Scores phi4 - 42.00 tokens/s
I'm also curious how something like the AMD Ryzen AI Max+ 395 would perform compared to 16GB GPUs.
The RAM speed of 256-bit LPDDR5X-8000 is not that great: it's only 256GB/s, compared to the 7800XT's 256-bit GDDR6 at 624.1GB/s. So all models that fit into 16GB will be much faster on a 7800XT or similar 16GB GPU. The only advantage comes when you go with 32GB or more RAM on a Ryzen AI setup; then the bigger models that don't fit into a 16GB GPU will run faster on the Ryzen AI laptop.
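Back-of-the-envelope, memory bandwidth roughly caps generation speed at bandwidth divided by the bytes streamed per token (about the size of the loaded weights), so a quick sketch of the arithmetic behind those numbers:

Code:
# Back-of-the-envelope arithmetic for the bandwidth comparison above.
# Rough rule of thumb: generation speed is capped near bandwidth / model size,
# since every generated token has to stream the active weights from memory.

def bandwidth_gbps(bus_bits: int, data_rate_mtps: float) -> float:
    """Peak bandwidth in GB/s for a given bus width and per-pin data rate."""
    return bus_bits / 8 * data_rate_mtps / 1000

strix_halo = bandwidth_gbps(256, 8000)    # 256-bit LPDDR5X-8000   -> 256 GB/s
rx_7800xt  = bandwidth_gbps(256, 19500)   # 256-bit GDDR6 19.5 Gbps -> 624 GB/s

model_gb = 9.0  # e.g. a ~9 GB q4_K_M model that fits comfortably in 16 GB VRAM
for name, bw in [("Strix Halo", strix_halo), ("RX 7800 XT", rx_7800xt)]:
    print(f"{name}: {bw:.0f} GB/s, ~{bw / model_gb:.0f} tok/s rough upper bound")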
 
Joined
Mar 11, 2008
Messages
1,100 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
I'll get around to downloading it eventually. The thing is, I don't use LLMs in my day-to-day; they're essentially a novelty I fire up from time to time. So getting the latest model isn't high on the priority list.

I live in rural Australia, far from any civilization! :)
Yeah, it has limited uses, but it's still good to have!
So you live in the outback, now I get it!
Well, local LLMs are great in that you don't need to be connected, yet they can answer many of the questions you have! :)
 
Joined
Oct 17, 2021
Messages
112 (0.09/day)
System Name Nirn
Processor Amd Ryzen 7950X3D
Motherboard MSI MEG ACE X670e
Cooling Noctua NH-D15
Memory 128 GB Kingston DDR5 6000 (running at 4000)
Video Card(s) Radeon RX 7900XTX (24G) + Geforce 4070ti (12G) Physx
Storage SAMSUNG 990 EVO SSD 2TB Gen 5 x2 (OS)+SAMSUNG 980 SSD 1TB PCle 3.0x4 (Primocache) +2X 22TB WD Gold
Display(s) Samsung UN55NU8000 (Freesync)
Case Corsair Graphite Series 780T White
Audio Device(s) Creative Soundblaster AE-7 + Sennheiser GSP600
Power Supply Seasonic PRIME TX-1000 Titanium
Mouse Razer Mamba Elite Wired
Keyboard Razer BlackWidow Chroma v1
VR HMD Oculus Quest 2
Software Windows 10
Anyone got any local LLMs that can generate unskinned 3D models?
 

johnspack

Here For Good!
Joined
Oct 6, 2007
Messages
6,055 (0.95/day)
Location
Nelson B.C. Canada
System Name System2 Blacknet , System1 Blacknet2
Processor System2 Threadripper 1920x, System1 2699 v3
Motherboard System2 Asrock Fatality x399 Professional Gaming, System1 Asus X99-A
Cooling System2 Noctua NH-U14 TR4-SP3 Dual 140mm fans, System1 AIO
Memory System2 64GBS DDR4 3000, System1 32gbs DDR4 2400
Video Card(s) System2 GTX 980Ti System1 GTX 970
Storage System2 4x SSDs + NVme= 2.250TB 2xStorage Drives=8TB System1 3x SSDs=2TB
Display(s) 1x27" 1440 display 1x 24" 1080 display
Case System2 Some Nzxt case with soundproofing...
Audio Device(s) Asus Xonar U7 MKII
Power Supply System2 EVGA 750 Watt, System1 XFX XTR 750 Watt
Mouse Logitech G900 Chaos Spectrum
Keyboard Ducky
Software Archlinux, Manjaro, Win11 Ent 24h2
Benchmark Scores It's linux baby!
Finally got it working, but with KoboldCpp-cuda. My video card heats up really nicely when it's thinking! There doesn't seem to be a download feature in it though, so it's difficult to install GGUFs from Hugging Face. The other day I somehow managed to download deepseek-r1-distill-qwen-32b-q5 and it runs just fine. Don't remember how I did that... glad I have 64GB of RAM; that one alone uses 30GB.
 

johnspack

Here For Good!
Joined
Oct 6, 2007
Messages
6,055 (0.95/day)
Location
Nelson B.C. Canada
System Name System2 Blacknet , System1 Blacknet2
Processor System2 Threadripper 1920x, System1 2699 v3
Motherboard System2 Asrock Fatality x399 Professional Gaming, System1 Asus X99-A
Cooling System2 Noctua NH-U14 TR4-SP3 Dual 140mm fans, System1 AIO
Memory System2 64GBS DDR4 3000, System1 32gbs DDR4 2400
Video Card(s) System2 GTX 980Ti System1 GTX 970
Storage System2 4x SSDs + NVme= 2.250TB 2xStorage Drives=8TB System1 3x SSDs=2TB
Display(s) 1x27" 1440 display 1x 24" 1080 display
Case System2 Some Nzxt case with soundproofing...
Audio Device(s) Asus Xonar U7 MKII
Power Supply System2 EVGA 750 Watt, System1 XFX XTR 750 Watt
Mouse Logitech G900 Chaos Spectrum
Keyboard Ducky
Software Archlinux, Manjaro, Win11 Ent 24h2
Benchmark Scores It's linux baby!
Well, now I've got DeepSeek R1 Distill Qwen 32B Q6 running, but only under Windows, argh. I signed up to Hugging Face, but still can't find any download links for GGUF files.
Kobold needs the actual GGUF file to load, and there is no way to add one through the interface. How do I download bigger GGUFs to use?
 
Joined
May 10, 2023
Messages
610 (0.93/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
Well, now I've got DeepSeek R1 Distill Qwen 32B Q6 running, but only under Windows, argh. I signed up to Hugging Face, but still can't find any download links for GGUF files.
Kobold needs the actual GGUF file to load, and there is no way to add one through the interface. How do I download bigger GGUFs to use?

For Q6_K: https://huggingface.co/bartowski/De...b/main/DeepSeek-R1-Distill-Qwen-32B-Q6_K.gguf
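If hunting for links on the site stays annoying, a small sketch using the huggingface_hub package also works (assumes `pip install huggingface_hub`; the repo id below is a placeholder, use the actual "user/repo" shown at the top of the model page):

Code:
# Sketch: download a GGUF from Hugging Face without digging through the web UI.
# The repo_id is a placeholder; the filename matches the Q6_K file linked above.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="someuser/DeepSeek-R1-Distill-Qwen-32B-GGUF",  # placeholder repo id
    filename="DeepSeek-R1-Distill-Qwen-32B-Q6_K.gguf",
)
print(path)  # point KoboldCpp at this local file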
 
Joined
Apr 15, 2009
Messages
1,051 (0.18/day)
Processor Ryzen 9 5900X
Motherboard Gigabyte X570 Aorus Master
Cooling ARCTIC Liquid Freezer III 360 A-RGB
Memory 32 GB Ballistix Elite DDR4-3600 CL16
Video Card(s) XFX 6800 XT Speedster Merc 319 Black
Storage Sabrent Rocket NVMe 4.0 1TB
Display(s) LG 27GL850B x 2 / ASUS MG278Q
Case be quiet! Silent Base 802
Audio Device(s) Sound Blaster AE-7 / Sennheiser HD 660S
Power Supply Seasonic Vertex PX-1200
Software Windows 11 Pro 64
Get your own data center cards and leave my gaming GPUs alone!
 