
What local LLMs do you use?

Joined
Mar 11, 2008
Messages
1,083 (0.18/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 990PRO 2TB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
@AusWolf
Local LLMs are very important. I really reject the trend of everything going "cloud"-based; micro$oft even wants your Windows account to be online...
But after three posts, you could tell us what LLMs you use, and maybe share some performance data too!
 
Joined
Feb 12, 2025
Messages
8 (2.67/day)
Location
EU
Processor AMD 5600X
Motherboard ASUS TUF GAMING B550M-Plus WiFi
Cooling be quiet! Dark Rock 4
Memory G.Skill Ripjaws 2 x 32 GB DDR4-3600 CL18-22-22-42 1.35V F4-3600C18D-64GVK
Video Card(s) Sapphire Pulse RX 7800XT 16GB
Storage Kingston KC3000 2TB + QNAP TBS-464
Display(s) LG 35" LCD 35WN75C-B 3440x1440
Case Kolink Bastion RGB Midi-Tower
Power Supply Enermax Digifanless 550W
Mouse Razer Deathadder v2
Benchmark Scores phi4 - 42.00 tokens/s
Maybe I'm a little bit behind on stuff but... Got to ask... What's the point of this for any regular home user?
It's like asking "what's the point of using your brain?". For the first time in recorded history, humanity has a tool that thinks. It can supercharge almost any skill you have. Just as the human brain can be used in nearly limitless ways, the same applies to a local LLM: use it to check your kids' homework, use it to do your own homework, analyze scientific papers, write code for you, explain why vitamin K is good for you, count stars in the sky, compare insurance offers, etc. etc. etc. I'm not even going to pretend I know a fraction of the use cases local LLMs will have over the next 10 years, but I know it's going to be wild, on the level of how the internet changed our lives (yeah, some of us grew up without the internet).
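On the performance-data side of this thread (like the phi4 42 tokens/s figure in the specs above): if you run models through Ollama, its /api/generate response reports eval_count (tokens generated) and eval_duration (in nanoseconds), so throughput falls out directly. A minimal sketch; the helper name is mine, not part of any API:

```python
# Tokens-per-second from Ollama-style response fields: eval_count is the
# number of tokens generated, eval_duration is the generation time in ns.
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generation throughput in tokens/s."""
    return eval_count / (eval_duration_ns / 1e9)

# Example: 420 tokens generated in 10 seconds -> 42.0 tokens/s,
# the same figure reported for phi4 above.
print(tokens_per_second(420, 10_000_000_000))  # 42.0
```

Comparing this number across quant levels and context sizes is usually more informative than a single headline figure.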
 
Joined
May 22, 2024
Messages
423 (1.57/day)
System Name Kuro
Processor AMD Ryzen 7 7800X3D@65W
Motherboard MSI MAG B650 Tomahawk WiFi
Cooling Thermalright Phantom Spirit 120 EVO
Memory Corsair DDR5 6000C30 2x48GB (Hynix M)@6000 30-36-36-76 1.36V
Video Card(s) PNY XLR8 RTX 4070 Ti SUPER 16G@200W
Storage Crucial T500 2TB + WD Blue 8TB
Case Lian Li LANCOOL 216
Power Supply MSI MPG A850G
Software Ubuntu 24.04 LTS + Windows 10 Home Build 19045
Benchmark Scores 17761 C23 Multi@65W
Maybe I'm a little bit behind on stuff but... Got to ask... What's the point of this for any regular home user?
That sounds interesting. Can you explain? :)
The larger 70B+ models, at least, are typically knowledgeable enough that you can ask them complicated questions and expect a reasonable, not necessarily banal, answer. There are also things you do not want to send to commercial services, most typically personal information. Some of the latest advancements have made even 70B-scale models competent at illusion-shattering problems previous generations of models had difficulty with, like how many r's are in "strawberry", how many boys Mary has when one of them is gross, et cetera.

Nowadays they are usually useful for common math and programming problems when used with care, can explore human philosophy and the human condition quite competently, and can tell stories of some interest with the right prompt. They are also useful for getting familiar with what LLM output looks like; half of the internet looks LLM-generated these days.

Some open-weight models are capable of tool/API use, such as the tools provided by the framework they run on, including requesting web services. The usefulness of such capabilities is apparently unremarkable given other limitations, and for that matter, the state of the internet and search-engine results these days. It requires support from the framework the model runs on, and is usually the only time the model (note: the model, not just the framework) would access the Internet.
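To illustrate the point that the framework performs the actual call on the model's behalf: a toy dispatcher might look like the sketch below. The JSON shape and the tool name are hypothetical; every real framework (llama.cpp, Ollama, etc.) defines its own tool-calling schema.

```python
import json

# Hypothetical tool registry: the model can only *ask* for these by name.
TOOLS = {
    "word_count": lambda text: len(text.split()),
}

def dispatch(model_output: str):
    """Parse a model-emitted tool call and execute it on the model's behalf.
    The model itself never touches the network or filesystem."""
    call = json.loads(model_output)  # e.g. {"tool": "...", "args": {...}}
    fn = TOOLS[call["tool"]]
    return fn(**call["args"])

result = dispatch('{"tool": "word_count", "args": {"text": "local models are fun"}}')
print(result)  # 4
```

The framework then feeds the result back into the model's context so it can continue the answer.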

They can also provide some silly fun, especially the roleplay-finetuned ones, for uses where hallucination actually provides some emulation of creativity. Think of it as a text-based holodeck; throw in an image generator and it's text and image. You typically don't want a lot of that on a remote service either: as with all things requiring an account, everything you put into a networked service can be recorded by the provider and linked to you. Not everyone feels comfortable with the nothing-to-hide mentality, even when they really have nothing to hide, and more than a few object to their interactions and personal info being used to train future commercial AI models.

Personally, in early 2024 I sized my setup to be able to run a "future larger model", which turned out to be mistral-large-2407 (123B), quantized. The best correctness and general task performance is probably currently achieved by the Llama 3 70B distilled version of DeepSeek R1. Anything larger would be costly and impractical for me at the moment. Might as well make them useful while they are there.
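For anyone wondering how a 123B model fits into a home setup at all, a back-of-the-envelope sketch of the weight memory under quantization helps. The bits-per-weight values are approximate; real GGUF quants mix precisions across tensors, and KV cache and context come on top.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Rough memory footprint of the weights alone, in GB
    (KV cache, context, and framework overhead are extra)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# mistral-large-2407 (123B) at 4 bits per weight:
print(weight_memory_gb(123, 4))   # 61.5 GB for the weights alone
# The same model at 5 bits per weight:
print(weight_memory_gb(123, 5))   # 76.875 GB
```

That is why 123B quantized is roughly the ceiling for a 96 GB RAM box, and why anything larger gets costly fast.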
 
Joined
Mar 11, 2008
Messages
1,083 (0.18/day)
Location
Hungary / Budapest
70B+ models, at least, are typically knowledgeable enough that you can ask them complicated questions and expect a reasonable, not necessarily banal, answer.
Did you skip DeepSeek?
 
Joined
Jan 14, 2019
Messages
14,397 (6.47/day)
Location
Midlands, UK
Processor Various Intel and AMD CPUs
Motherboard Micro-ATX and mini-ITX
Cooling Yes
Memory Overclocking is overrated
Video Card(s) Various Nvidia and AMD GPUs
Storage A lot
Display(s) Monitors and TVs
Case It's not about size, but how you use it
Audio Device(s) Speakers and headphones
Power Supply 300 to 750 W, bronze to gold
Mouse Wireless
Keyboard Mechanic
VR HMD Not yet
Software Linux gaming master race
@AusWolf
Local LLMs are very important. I really reject the trend of everything going "cloud"-based; micro$oft even wants your Windows account to be online...
But after three posts, you could tell us what LLMs you use, and maybe share some performance data too!
I'm not using anything. I didn't even know that you could run them locally until recently. I'm only trying to learn what use it is, to see whether it's something I'd want to do or not.

They can also provide some silly fun, especially the roleplay-finetuned ones, for uses where hallucination actually provides some emulation of creativity. Think of it as a text-based holodeck; throw in an image generator and it's text and image.
Text-based holodeck running locally on your PC... Now that caught my attention! :)

I'm just having a hard time imagining it. LLM still lives in my head as a glorified search engine. :ohwell:
 