
What local LLMs do you use?

Joined
Mar 11, 2008
Messages
1,086 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB + Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
@AusWolf
Local LLMs are very important; I really reject the trend of everything becoming "cloud"-based, micro$oft even wants your Windows account to be online...
But after posting 3 times, you could tell us what LLMs you use, and maybe share some performance data too!
 
Joined
Feb 12, 2025
Messages
8 (2.00/day)
Location
EU
Processor AMD 5600X
Motherboard ASUS TUF GAMING B550M-Plus WiFi
Cooling be quiet! Dark Rock 4
Memory G.Skill Ripjaws 2 x 32 GB DDR4-3600 CL18-22-22-42 1.35V F4-3600C18D-64GVK
Video Card(s) Sapphire Pulse RX 7800XT 16GB
Storage Kingston KC3000 2TB + QNAP TBS-464
Display(s) LG 35" LCD 35WN75C-B 3440x1440
Case Kolink Bastion RGB Midi-Tower
Power Supply Enermax Digifanless 550W
Mouse Razer Deathadder v2
Benchmark Scores phi4 - 42.00 tokens/s
Maybe I'm a little bit behind on stuff but... Got to ask... What's the point of this for any regular home user?
It's like asking "what's the point of using your brain?". For the first time in recorded history, humanity has a thinking tool. It can supercharge almost any skill you have. Just as the human brain can be used in nearly limitless ways, the same applies to a local LLM. Use it to check your kids' homework, use it to do your homework, help analyze scientific papers, write code for you, explain why vitamin K is good for you, count stars in the sky, analyze insurance offers, etc. I'm not even going to pretend I know even a fraction of the use cases local LLMs will have over the next 10 years, but I know it's going to be wild. On the level of how the internet changed our lives (yeah, some of us grew up without the internet).
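To make that concrete, here is a minimal sketch of asking a locally hosted model one of those questions; it assumes the Ollama server is running on your machine with a model already pulled, and the model name and question are just placeholders:

```python
# Minimal sketch: ask a local model a question via the Ollama Python client.
# Assumes `pip install ollama`, the Ollama server running on this machine,
# and a model already pulled (e.g. `ollama pull llama3.1`).
import ollama

response = ollama.chat(
    model="llama3.1",  # placeholder: any model you have pulled
    messages=[
        {"role": "user",
         "content": "Explain in three sentences why vitamin K matters in a diet."},
    ],
)
print(response["message"]["content"])
```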
 
Joined
May 22, 2024
Messages
429 (1.59/day)
System Name Kuro
Processor AMD Ryzen 7 7800X3D@65W
Motherboard MSI MAG B650 Tomahawk WiFi
Cooling Thermalright Phantom Spirit 120 EVO
Memory Corsair DDR5 6000C30 2x48GB (Hynix M)@6000 30-36-36-76 1.36V
Video Card(s) PNY XLR8 RTX 4070 Ti SUPER 16G@200W
Storage Crucial T500 2TB + WD Blue 8TB
Case Lian Li LANCOOL 216
Power Supply MSI MPG A850G
Software Ubuntu 24.04 LTS + Windows 10 Home Build 19045
Benchmark Scores 17761 C23 Multi@65W
Maybe I'm a little bit behind on stuff but... Got to ask... What's the point of this for any regular home user?
That sounds interesting. Can you explain? :)
At least the larger, 70B+ models are typically knowledgeable enough that you can ask them some complicated questions and expect a reasonable, not necessarily banal and predictable, answer. There are things you do not want to send to commercial services, most typically personal information. Some of the latest advancements have made even 70B-scale models competent with illusion-shattering problems that previous generations of models had difficulty with, like how many r's are in "strawberry", how many boys Mary has when one of them is gross, et cetera.

Nowadays they are usually useful for common math and programming problems when used with care, can explore human philosophy and the human condition quite competently, and can tell stories of some interest with the right prompt. They are also useful for getting familiar with what LLM output looks like; half of the internet looks LLM-generated these days.

Some open-weight models are capable of API use (tool calling), such as calling tools provided by the framework they run on, including requesting web services. The usefulness of such capabilities is apparently unremarkable given other limitations, and for that matter, the state of the internet and search-engine results these days. Tool use requires support from the framework the model runs on, and it is usually the only time the model - note, not the framework - would access the Internet.
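For illustration, a rough sketch of what tool calling can look like through Ollama's Python client, assuming a tool-capable model is pulled; the weather function and its schema are made-up placeholders, not part of any framework:

```python
# Rough sketch of tool calling via the Ollama Python client (assumption:
# the pulled model supports tools). The tool itself is a placeholder.
import ollama

def get_weather(city: str) -> str:
    """Placeholder tool; a real one would request an actual web service."""
    return f"Sunny in {city}, 21 degrees C"

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = ollama.chat(
    model="llama3.1",  # placeholder model name
    messages=[{"role": "user", "content": "What's the weather in Budapest?"}],
    tools=tools,
)

# The model only emits a structured request; executing the call and
# returning its result is up to the surrounding application.
for call in resp["message"]["tool_calls"] or []:
    if call["function"]["name"] == "get_weather":
        print(get_weather(**call["function"]["arguments"]))
```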

They can also provide some silly fun, especially the roleplay-finetuned ones, for uses where hallucination actually provides some emulation of creativity. Think of it as a text-based holodeck; throw in an image generator and it is text and image. You typically don't want a lot of that elsewhere either: as with all things requiring an account, everything you put into a networked service would be recorded by the provider and linked to you. Not everyone feels comfortable with the nothing-to-hide mentality even when they really have nothing to hide, and more than a few have objections to their interactions and personal info being used to train future commercial AI models.

Personally, I sized my setup in early 2024 to be able to run a "future larger model", which turned out to be mistral-large-2407 (123B, quantized). The best correctness and general task performance is probably currently achieved by the Llama 3 70B distilled version of DeepSeek R1. Anything larger would be costly and impractical for me at the moment. Might as well make them useful while they are there.
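As a sketch of what running a quantized model like that looks like locally, here is llama-cpp-python loading a GGUF file; the file name and the n_gpu_layers value are placeholders for whatever fits your hardware:

```python
# Minimal sketch: run a quantized GGUF model with llama-cpp-python.
# Assumes `pip install llama-cpp-python` and a GGUF file on disk;
# the path and offload count below are hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="models/mistral-large-2407-Q4_K_M.gguf",  # hypothetical file
    n_ctx=4096,       # context window size
    n_gpu_layers=40,  # layers offloaded to the GPU; 0 = CPU only
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the Monty Hall problem."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```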
 
Joined
Mar 11, 2008
Messages
1,086 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB + Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
70B+ models are typically knowledgeable enough that you can ask them some complicated questions and expect a reasonable, not necessarily banal and predictable, answer.
Did you skip DeepSeek?
 
Joined
Jan 14, 2019
Messages
14,410 (6.48/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Overclocking is overrated
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case It's not about size, but how you use it
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Mechanical
VR HMD Not yet
Software Linux gaming master race
@AusWolf
Local LLMs are very important; I really reject the trend of everything becoming "cloud"-based, micro$oft even wants your Windows account to be online...
But after posting 3 times, you could tell us what LLMs you use, and maybe share some performance data too!
I'm not using anything. I didn't even know you could run them locally until recently. I'm only trying to learn what it's useful for, to see whether it's something I'd want to do or not.

At least the larger, 70B+ models are typically knowledgeable enough that you can ask them some complicated questions and expect a reasonable, not necessarily banal and predictable, answer. There are things you do not want to send to commercial services, most typically personal information. Some of the latest advancements have made even 70B-scale models competent with illusion-shattering problems that previous generations of models had difficulty with, like how many r's are in "strawberry", how many boys Mary has when one of them is gross, et cetera.

Nowadays they are usually useful for common math and programming problems when used with care, can explore human philosophy and the human condition quite competently, and can tell stories of some interest with the right prompt. They are also useful for getting familiar with what LLM output looks like; half of the internet looks LLM-generated these days.

Some open-weight models are capable of API use (tool calling), such as calling tools provided by the framework they run on, including requesting web services. The usefulness of such capabilities is apparently unremarkable given other limitations, and for that matter, the state of the internet and search-engine results these days. Tool use requires support from the framework the model runs on, and it is usually the only time the model - note, not the framework - would access the Internet.

They can also provide some silly fun, especially the roleplay-finetuned ones, for uses where hallucination actually provides some emulation of creativity. Think of it as a text-based holodeck; throw in an image generator and it is text and image. You typically don't want a lot of that elsewhere either: as with all things requiring an account, everything you put into a networked service would be recorded by the provider and linked to you. Not everyone feels comfortable with the nothing-to-hide mentality even when they really have nothing to hide, and more than a few have objections to their interactions and personal info being used to train future commercial AI models.

Personally, I sized my setup in early 2024 to be able to run a "future larger model", which turned out to be mistral-large-2407 (123B, quantized). The best correctness and general task performance is probably currently achieved by the Llama 3 70B distilled version of DeepSeek R1. Anything larger would be costly and impractical for me at the moment. Might as well make them useful while they are there.
Text-based holodeck running locally on your PC... Now that caught my attention! :)

I'm just having a hard time imagining it. LLMs still live in my head as glorified search engines. :ohwell:
 
Joined
Mar 11, 2008
Messages
1,086 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB + Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
Joined
May 22, 2024
Messages
429 (1.59/day)
System Name Kuro
Processor AMD Ryzen 7 7800X3D@65W
Motherboard MSI MAG B650 Tomahawk WiFi
Cooling Thermalright Phantom Spirit 120 EVO
Memory Corsair DDR5 6000C30 2x48GB (Hynix M)@6000 30-36-36-76 1.36V
Video Card(s) PNY XLR8 RTX 4070 Ti SUPER 16G@200W
Storage Crucial T500 2TB + WD Blue 8TB
Case Lian Li LANCOOL 216
Power Supply MSI MPG A850G
Software Ubuntu 24.04 LTS + Windows 10 Home Build 19045
Benchmark Scores 17761 C23 Multi@65W
Did you skip DeepSeek?
The way they do chain-of-thought makes for interesting reading. I think they are the first to do it well enough in an open-weight model, too. Wherever they might be from, I don't have quite enough trust in any hosted service to send it anything confidential or profiling.

Even "free" services come with implicit permission to use your interactions for any number of further purposes, buried in the user agreement, and that is assuming the agreement is even followed; God forbid there is a data breach.

FWIW, 70B Q6_K quantized models run at a bit more than ~0.9 tokens/s to almost 1.2 tokens/s on my setup with the official distribution of Ollama 0.5.7. The latest llama.cpp compiled from source gives ~1.2 tokens/s.
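If you want to reproduce numbers like these, Ollama's REST API reports the token count and generation time with every response, so the tokens/s figure can be computed directly; the model name below is a placeholder:

```python
# Measure generation speed through Ollama's REST API.
# A non-streamed /api/generate response includes eval_count (tokens
# generated) and eval_duration (nanoseconds spent generating them).
import requests

r = requests.post("http://localhost:11434/api/generate", json={
    "model": "llama3.1",  # placeholder: whichever model you benchmark
    "prompt": "Write a haiku about GPUs.",
    "stream": False,
})
data = r.json()
tokens_per_s = data["eval_count"] / (data["eval_duration"] / 1e9)
print(f"{tokens_per_s:.2f} tokens/s")
```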

Text-based holodeck running locally on your PC... Now that caught my attention! :)

I'm just having a hard time imagining it. LLMs still live in my head as glorified search engines. :ohwell:
To be fair, they are actually even worse than that when used for factual stuff without verification. And whatever it is that the various search engines are integrating, they certainly aren't doing it quite right yet.

They do have uses where even current models can play to their strengths, though, and even the smaller models have a superhuman passing familiarity with almost everything anyone would - or could - ever have seen as text on a computer display. As long as you don't try anything too unusual, they'll often do fine.
 
Joined
Oct 16, 2018
Messages
976 (0.42/day)
Location
Uttar Pradesh, India
Processor AMD R7 1700X @ 4100Mhz
Motherboard MSI B450M MORTAR MAX (MS-7B89)
Cooling Phanteks PH-TC14PE
Memory Crucial Technology 16GB DR (DDR4-3600) - C9BLM:045M:E BL16G36C16U4W.M16FE1 X2 @ CL14
Video Card(s) XFX RX480 GTR 8GB @ 1408Mhz (AMD Auto OC)
Storage Samsung SSD 850 EVO 250GB
Display(s) Acer KG271 1080p @ 81Hz
Power Supply SuperFlower Leadex II 750W 80+ Gold
Keyboard Redragon Devarajas RGB
Software Microsoft Windows 10 (10.0) Professional 64-bit
Benchmark Scores https://valid.x86.fr/mvvj3a
Maybe I'm a little bit behind on stuff but... Got to ask... What's the point of this for any regular home user?
This video helped me get started with running LLMs locally. It should also answer many of the questions you raised.
 
Joined
Mar 11, 2008
Messages
1,086 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB + Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
This video helped me get started with running llm locally. It should also answer many of the questions you raised.
OMG, that scare in the first seconds of the video... :ohwell: :ohwell: :ohwell:
Instant downvote from me for this kind of "content".
There's nothing to worry about when you run it locally in a local program.
I would never install DeepSeek's app on my phone tho...
 
Joined
Jul 21, 2008
Messages
5,266 (0.87/day)
System Name [Daily Driver]
Processor [Ryzen 7 5800X3D]
Motherboard [MSI MAG B550 TOMAHAWK]
Cooling [be quiet! Dark Rock Slim]
Memory [64GB Crucial Pro 3200MHz (32GBx2)]
Video Card(s) [PNY RTX 3070Ti XLR8]
Storage [1TB SN850 NVMe, 4TB 990 Pro NVMe, 2TB 870 EVO SSD, 2TB SA510 SSD]
Display(s) [2x 27" HP X27q at 1440p]
Case [Fractal Meshify-C]
Audio Device(s) [Fanmusic TRUTHEAR IEM, HyperX Duocast]
Power Supply [CORSAIR RMx 1000]
Mouse [Logitech G Pro Wireless]
Keyboard [Logitech G512 Carbon (GX-Brown)]
Software [Windows 11 64-Bit]
I'm still fairly new to the scene, but I have phi-4 and deepseek-r1 instances running locally. Right now I just use them when I run into coding issues or need some inspiration. I was using Grok a lot to fill that role, but the local stuff is neat.
 