Dear AMD, NVIDIA, INTEL and others, we need cheap (192-bit to 384-bit), high VRAM, consumer, GPUs to locally self-host/inference AI/LLMs

10tothemin9volts · 2025-02-03T19:59:43+0000

With the AI age being here, we need fast memory and lots of it, so we can host our favorite LLMs locally.
Even Edward Snowden is complaining.

We also need DDR6 quad channel (or something entirely new and faster) consumer desktop motherboards with up to 256 or 384GB RAM (a/my current B650 mobo supports only up to 128GB RAM), so we can self-host our favourite big (MOE) LLMs like DeepSeek-R1 (quants: e.g. 1, 2) (real DeepSeek are the ones without the "Distill" in the names) (MOE LLMs run (much) faster than dense LLM, when both have the same number of parameters, DS-R1 tested on 4.gen Epyc server at like 8 tokens/s.) or Llama-3.1 405B quants. Bigger LLMs will always be better than smaller ones (everything else being equal, and not specialized). Please, don't hold humanity back.

lilhasselhoffer · 2025-02-03T20:12:22+0000

The response back, amidst peals of laughter:

Dear consumer, silicon production is a finite resource. It's expensive, and there are a lot of people out there who believe the next person who nails AI as something more than a novelty will rewrite our economy (and thus be stupidly rich). As such, the demand for processing cards which mirror or match those that used to be GPUs is functionally infinite. The only reason that we aren't royally paddling your wallet is that anti-trust lawsuits hurt, and now that billion dollar investments are happening even a slightly unfair pricing policy would trigger legal issues that we can entirely avoid and still print money.

In short, either make your own or pay for ours in a market that will take almost any price we can imagine.

-GPU Makers.

P.S. In a little bit of time, when this shell of an industry collapses under its own weight, we won't be bailing ourselves out. Because we now control enough resources critical to the defense industry, we literally are incapable of failing. Please continue your stupid team red vs. team green shenanigans, while we count our green all the way to the bank.

-I like these thought exercises. Imagining Nvidia as a Bond villain with steepled hands in a volcano seems...somehow fitting.

adilazimdegilx · 2025-02-03T20:20:18+0000

There are B650 mbs with 256GB ram support even on budget side (gigabyte b650m ds3h, msi b650 gaming plus wifi...) I'm surprised (well not really cuz Asus) Asus one doesnt have that.

I think more VRAM or more system RAM alone is not a good solution to this and I think AMDs new laptops with up to 128GB shared ram specially targeted for LLM works are better approach. It solves both problems really. GPU power and bandwidth are the next problems but they can only get better from this. And Deepseek kinda showed that LLMs will also get better and will be more usable on consumer devices day by day.

lilhasselhoffer · 2025-02-03T21:07:07+0000

adilazimdegilx said:
There are B650 mbs with 256GB ram support even on budget side (gigabyte b650m ds3h, msi b650 gaming plus wifi...) I'm surprised (well not really cuz Asus) Asus one doesnt have that.

I think more VRAM or more system RAM alone is not a good solution to this and I think AMDs new laptops with up to 128GB shared ram specially targeted for LLM works are better approach. It solves both problems really. GPU power and bandwidth are the next problems but they can only get better from this. And Deepseek kinda showed that LLMs will also get better and will be more usable on consumer devices day by day.

This is where I get to the point of laughing.

Research DeepSeek...I'll wait a second.

1) Company came from nowhere.
2) Company is in China.
3) Company sensors its responses, see 2.
4) Company has breakthrough that is basically repackaging the technique of teaching an LLM with a more complex LLM...and thus requiring it to use less resources.
5) Company is seeking large capital investment...because their "new" ideas will get it.
6) Company has nothing really to show...but through social media manipulation creates a spark of ignorance that burns the monopoly money making Nvidia stupid rich...promising the usual from China. They'll take something complex, copy it incompletely, repackage it, and claim it as their globe leading technology until 6-18 months from now when it fails to deliver anything new and everyone disappears with the money.

I love me some Oroboros levels of cyclical stupidity...but the OP is asking for the horsepower of a supercar while paying for a four-banger. This is genuinely funny, because it's like asking Ferrari to make something like a Kia Sol. It's...if it happened then I'm sure next week it'd rain frogs and absolutely apoplectic former Ferrari personnel.

mb194dc · 2025-02-03T21:59:57+0000

lilhasselhoffer said:
This is where I get to the point of laughing.

Research DeepSeek...I'll wait a second.

1) Company came from nowhere.
2) Company is in China.
3) Company sensors its responses, see 2.
4) Company has breakthrough that is basically repackaging the technique of teaching an LLM with a more complex LLM...and thus requiring it to use less resources.
5) Company is seeking large capital investment...because their "new" ideas will get it.
6) Company has nothing really to show...but through social media manipulation creates a spark of ignorance that burns the monopoly money making Nvidia stupid rich...promising the usual from China. They'll take something complex, copy it incompletely, repackage it, and claim it as their globe leading technology until 6-18 months from now when it fails to deliver anything new and everyone disappears with the money.

I love me some Oroboros levels of cyclical stupidity...but the OP is asking for the horsepower of a supercar while paying for a four-banger. This is genuinely funny, because it's like asking Ferrari to make something like a Kia Sol. It's...if it happened then I'm sure next week it'd rain frogs and absolutely apoplectic former Ferrari personnel.

The Chinese trained model is pretty useless. Only the model structure is useful.

The interesting part will be when a model is trained using western data with deepseek code. Which given it's open source, shouldn't be too long.

Could be useful to run the full model locally, can do with 1TB of ram, are some ideas around using server boards to do it in not too expensively.

qxp · 2025-02-03T22:24:20+0000

HBM memory is even better, and I can think of far more useful things to do with my GPUs than play with AI. I think this is coming as the high margins on professional chips will lead to overbuilt capacity. If we are really lucky CPUs will converge to look like Xeon Max, but with more cores and far smaller pricing and then who needs GPUs.

igormp · 2025-02-03T23:26:34+0000

10tothemin9volts said:
With the AI age being here, we need fast memory and lots of it, so we can host our favorite LLMs locally.
Even Edward Snowden is complaining.

You'll be seeing such products with higher VRAM amount in their datacenter/workstation offerings with a heavier price tag, not in the consumer space.

Don't expect those to be any cheaper than a 5090. All in all, if you want a large vram consumer product, the 5090 existis for this sole reason.

10tothemin9volts said:
consumer desktop motherboards with up to 256 or 384GB RAM (a/my current B650 mobo supports only up to 128GB RAM),

Most DDR5 motherboards should support 256GB of RAM. The problem is that there are no 64GB UDIMMs available for sale yet.

qxp · 2025-02-04T01:01:55+0000

igormp said:
Most DDR5 motherboards should support 256GB of RAM. The problem is that there are no 64GB UDIMMs available for sale yet.

What would be nicer is to have more channels. Why are we limited to only 4 sticks ?

R0H1T · 2025-02-04T01:05:57+0000

Dear ~~suckers~~ gamers your concerns are duly noted, we'll take that into consideration before announcing our next set of blowout results.

Regards,
Your capitalist overlords.

Japanese RTX 50 retailer apologizes for lottery-fueled mayhem — unruly crowd caused injuries, damaged nearby kindergarten

Store and neighboring businesses were overwhelmed by crowds.

www.tomshardware.com

igormp · 2025-02-04T01:24:20+0000

qxp said:
What would be nicer is to have more channels. Why are we limited to only 4 sticks ?

More channels = more CPU pins = more mobo traces = higher costs.
Even Threadripper non-pro only does 1DPC with its 4 channels, likely for market segmentation reasons.

R0H1T · 2025-02-04T01:34:59+0000

You'd need an entirely new I/O die for that as well, TR reuses the EPYC(?) ones IIRC.

DirtyDingusMcgee · 2025-02-04T01:51:16+0000

What is being asked for simply isn't consumer equipment. There is not a single thing stopping you from buying any of the things you want. That is what they are.....wants. They are not needs. Humanity needs a lot of things......nothing pertaining to computer equipment even comes close to making the list.

Hyderz · 2025-02-04T02:32:58+0000

Don’t chase tech… just wait for the parts that brings you at least 2-4x performance upgrade. Currently my 9900k compared to 9800x3d is around 2.7x for gaming and rtx3090 to rtx5090 is around 2.6x and after this year I should be looking at over 3.5x performance upgrade over my current pc…

I went from 1070ti to 3090 and that was just over 2x but I reckon I could have held out for 40 series but I upgraded my monitor 3440x1440 and the 1070ti can play several titles well but a lot of the newer games struggled this was back 2020

kondamin · 2025-02-04T02:50:08+0000

If only xpoint wasn’t dropped, they could have made a killing in the ai boom

eidairaman1 · 2025-02-04T05:18:47+0000

They won't see your swan song here, go to their websites and email them. h
Heck get on their social media and their forums and make your requests be known there.

qxp · 2025-02-04T12:48:22+0000

igormp said:
More channels = more CPU pins = more mobo traces = higher costs.
Even Threadripper non-pro only does 1DPC with its 4 channels, likely for market segmentation reasons.

Market segmentation is right, and that's why Intel is having problems right now. They had perfectly good competitor to GPUs in Xeon Phi cards, but they had to market segment it away to prevent Xeon Phis competing with their server processors. Result - lost market share and lost revenue streams they need to stay competitive.

DirtyDingusMcgee said:
What is being asked for simply isn't consumer equipment. There is not a single thing stopping you from buying any of the things you want. That is what they are.....wants. They are not needs. Humanity needs a lot of things......nothing pertaining to computer equipment even comes close to making the list.

You could say a computer capable of running nuclear explosion simulations is not consumer equipment either, and yet, quite likely, you have a phone in your pocket. Consumer equipment is simply what consumer is willing to buy. Some of it is frivolous, and some becomes indispensable as use increases.

Denver · 2025-02-04T13:09:27+0000

U$ 9999 is the best I can do.

It won't happen because all the companies can sell the precious silicon at much better prices directly to hyperscalers.

However, it seems to me that the current trend of running large LLMs locally will initially make powerful APUs like the Halo Strix scarce. In a second phase, however, it will stimulate the development of bigger and better APUs. Just my theory, but I believe this will bring significant changes to the market.

The big three players will likely try to sell CPU+GPU as a single product, effectively eliminating the low-end and mid-end dGPU market in the medium term.

Assimilator · 2025-02-04T13:13:17+0000

The self-entitlement from the OP is exactly what I've come to expect from "AI" companies and the people who believe those companies are in any way shape or form useful to humanity.

qxp said:
You could say a computer capable of running nuclear explosion simulations is not consumer equipment either, and yet, quite likely, you have a phone in your pocket.

"Capable of running" is not in the same solar system as "good at running". If you want the latter for a nonstandard consumer use case you're not a consumer, you're a professional, and you need to pull the stick outta your a** and pony up the cash for professional products.

azrael · 2025-02-04T13:19:41+0000

lexluthermiester · 2025-02-04T13:24:44+0000

10tothemin9volts said:
Dear AMD, NVIDIA, INTEL and others, we need cheap (192-bit to 384-bit), high VRAM, consumer, GPUs to locally self-host/inference AI/LLMs

No, we really don't. DeepSeek has proven very well that one needs only a 6GB GPU and a RaspberryPi4 or 5 to get the job done in a very good way. Your "request" is not very applicable to the general consumer anyway.

Seriously, how many people need AI at all? Hmm? What would they use it for?
(those are rhetorical questions, meaning they do not need answers)

azrael said:

Out of touch is my vote.

phanbuey · 2025-02-04T13:52:51+0000

Right now Apple M chips with the unified memory are crushing local LLM development. I think as AI devs flock to apple devices nvidia will react and release their N1X chips with unified memory or start offering higher GB consumer cards.

Deepseek 14B model is capable of fitting into the 4090's buffer, but it's far inferior to the 32B model that's available (like if you ask it to code a typescript website, it will create jsx files and make a bunch of basic mistakes) IMO the 32B model is better than chat 4o. 32B runs brutally slow and only uses 60% GPU since it runs out of framebuffer -- I would love to be able to run the larger models.

The models that run on a 6GB gpu have a difficult time answering simple math problems.

Something like this is probably an ideal development workstatoin if you want to run local LLM - $/performance.

Dristun · 2025-02-04T14:03:15+0000

Used 3090s will be the cheapest ticket to running stuff locally for at least a couple more years. Sure, "Arc Pro B580 24GB" or whatever Intel will name it is on the way but nothing will work out of the box on that, ever + it's still going to be slower than a 3090. Then you have 4090s and 5090s. It is what it is - I'd love to see more options but let's be real they're not coming because there's a ton more money to be made elsewhere for anyone producing GPUs. Nobody's running a charity in this biz.

Frick · 2025-02-04T14:04:08+0000

Assimilator said:
The self-entitlement from the OP is exactly what I've come to expect from "AI" companies and the people who believe those companies are in any way shape or form useful to humanity.

Sam Altman was talking about how we need to reform the social contract....

phanbuey · 2025-02-04T14:08:10+0000

Dristun said:
Used 3090s will be the cheapest ticket to running stuff locally for at least a couple more years. Sure, "Arc Pro B580 24GB" or whatever Intel will name it is on the way but nothing will work out of the box on that, ever + it's still going to be slower than a 3090. Then you have 4090s and 5090s. It is what it is - I'd love to see more options but let's be real they're not coming because there's a ton more money to be made elsewhere for anyone producing GPUs. Nobody's running a charity in this biz.

but even then you only have 24GB or 32gb if you pony up $2k and can even find a 5090.

if you get a refurb M chip you can get 64GB unified ~ 50-55GB usable to load in the model for $2500 or over 85GB if you can get the 96GB version for $3200.

For the same money you would build a 5090 rig. -- Granted the models will run alot slower, but if you're looking for ram size m4 max might be the best price/performance.

windwhirl · 2025-02-04T14:10:37+0000

lexluthermiester said:
Seriously, how many people need AI at all? Hmm? What would they use it for?

I have like two uses for it and:
1-Image generation is an ethics minefield to say the least due to the lack of proper crediting/repayment to the people whose works were used to build the AI's models
2-Instead of spending a week trying to make the AI draw whatever it is I want, I could probably ask a human and get a better result in less time.
3-Asking anything written from an AI is bound to bring in errors of varied nature. I'm better off reading and writing stuff by myself.

Processor	7800X3D @ Curve Optimizer: All Core: -25
Motherboard	TUF Gaming B650-Plus
Memory	2xKSM48E40BD8KM-32HM ECC RAM (ECC enabled in BIOS)
Video Card(s)	4070 @ 110W
Display(s)	SAMSUNG S95B 55" QD-OLED TV
Power Supply	RM850x

Processor	Ryzen 7 5800x3D
Motherboard	Gigabyte B550 Gaming X v2
Cooling	Thermalright Phantom Spirit 120 SE
Memory	Corsair Vengeance LPX 2x32GB 3600Mhz C18
Video Card(s)	XFX RX 6800 XT Merc 319
Storage	Kingston KC2500 2TB NVMe + Crucial MX100 256GB + Samsung 860QVO 1TB + Samsung Spinpoint F3 500GB HDD
Display(s)	Samsung CJG5 27" 144 Hz QHD
Case	Phanteks Eclipse P360A DRGB Black + 3x Thermalright TL-C12C-S ARGB
Audio Device(s)	Logitech X530 5.1 + Logitech G35 7.1 Headset
Power Supply	Cougar GEX850 80+ Gold
Mouse	Razer Viper 8K
Keyboard	Logitech G105

Processor	5950x
Motherboard	B550 ProArt
Cooling	Fuma 2
Memory	4x32GB 3200MHz Corsair LPX
Video Card(s)	2x RTX 3090
Display(s)	LG 42" C2 4k OLED
Power Supply	XPG Core Reactor 850W
Software	I use Arch btw

Processor	5950x
Motherboard	B550 ProArt
Cooling	Fuma 2
Memory	4x32GB 3200MHz Corsair LPX
Video Card(s)	2x RTX 3090
Display(s)	LG 42" C2 4k OLED
Power Supply	XPG Core Reactor 850W
Software	I use Arch btw

System Name	Obliterator
Processor	Ryzen 7 7700x PBO
Motherboard	ASRock x670e Steel Legend
Cooling	Noctua NH-D15 G2 LBC
Memory	G.skill Trident Z5 Neo 6000@CL30
Video Card(s)	ASRock rx7900 GRE Steel Legend
Storage	2 x 2TB Samsung 990 pro nmve ssd 2 X 4TB Samsung 870 evo sata ssd 1 X 18TB WD Gold sata hdd
Display(s)	LG 27GN750-B
Case	Fractal Torrent
Audio Device(s)	Klipsch promedia heritage 2.1
Power Supply	FSP Hydro TI 1000w
Mouse	SteelSeries Prime+
Keyboard	Lenovo SK-8825 (L)
Software	Windows 10 Enterprise LTSC 21H2 / Windows 11 Enterprise LTSC 24H2 with multiple flavors of VM

Dear AMD, NVIDIA, INTEL and others, we need cheap (192-bit to 384-bit), high VRAM, consumer, GPUs to locally self-host/inference AI/LLMs

10tothemin9volts

lilhasselhoffer

adilazimdegilx

lilhasselhoffer

mb194dc

qxp

igormp

qxp

R0H1T

Japanese RTX 50 retailer apologizes for lottery-fueled mayhem — unruly crowd caused injuries, damaged nearby kindergarten

igormp

R0H1T

DirtyDingusMcgee

Hyderz

kondamin

eidairaman1

The Exiled Airman

qxp

Denver

Assimilator

azrael

lexluthermiester

phanbuey

Dristun

Frick

Fishfaced Nincompoop

phanbuey

windwhirl

System Name	Custom
Processor	i9 9900k
Motherboard	Gigabyte Z390 arous master
Cooling	corsair h150i
Memory	4x8 3200mhz corsair
Video Card(s)	Galax RTX 3090 EX Gamer White OC
Storage	500gb Samsung 970 Evo PLus
Display(s)	MSi MAG341CQ
Case	Lian Li Pc-011 Dynamic
Audio Device(s)	Arctis Pro Wireless
Power Supply	850w Seasonic Focus Platinum
Mouse	Logitech G403
Keyboard	Logitech G110

System Name	PCGOD
Processor	AMD FX 8350@ 5.0GHz
Motherboard	Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling	Scythe Ashura, 2×BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory	16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s)	AMD Radeon 290 Sapphire Vapor-X
Storage	Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s)	NEC Multisync LCD 1700V (Display Port Adapter)
Case	AeroCool Xpredator Evil Blue Edition
Audio Device(s)	Creative Labs Sound Blaster ZxR
Power Supply	Seasonic 1250 XM2 Series (XP3)
Mouse	Roccat Kone XTD
Keyboard	Roccat Ryos MK Pro
Software	Windows 7 Pro 64

System Name	Firelance.
Processor	Threadripper 3960X
Motherboard	ROG Strix TRX40-E Gaming
Cooling	IceGem 360 + 6x Arctic Cooling P12
Memory	8x 16GB Patriot Viper DDR4-3200 CL16
Video Card(s)	MSI GeForce RTX 4060 Ti Ventus 2X OC
Storage	2TB WD SN850X (boot), 4TB Crucial P3 (data)
Display(s)	Dell S3221QS(A) (32" 38x21 60Hz) + 2x AOC Q32E2N (32" 25x14 75Hz)
Case	Enthoo Pro II Server Edition (Closed Panel) + 6 fans
Power Supply	Fractal Design Ion+ 2 Platinum 760W
Mouse	Logitech G604
Keyboard	Razer Pro Type Ultra
Software	Windows 10 Professional x64

Processor	AMD Ryzen 7 3800X
Motherboard	ASUS Prime X470-Pro
Cooling	bequiet! Dark Rock Slim
Memory	64 GB ECC DDR4 2666 MHz (Samsung M391A2K43BB1-CTD)
Video Card(s)	eVGA GTX 1080 SC Gaming, 8 GB
Storage	1 TB Samsung 970 EVO Plus, 1 TB Samsung 850 EVO, 4 TB Lexar NM790, 12 TB WD HDDs
Display(s)	Acer Predator XB271HU
Case	Corsair Obsidian 550D
Audio Device(s)	Creative X-Fi Fatal1ty
Power Supply	Seasonic X-Series 560W
Mouse	Logitech G502
Keyboard	Glorious GMMK

System Name	stress-less
Processor	9800X3D @ 5.42GHZ
Motherboard	MSI PRO B650M-A Wifi
Cooling	Thermalright Phantom Spirit EVO
Memory	64GB DDR5 6000 1:1 CL30-36-36-96 FCLK 2000
Video Card(s)	RTX 4090 FE
Storage	2TB WD SN850, 4TB WD SN850X
Display(s)	Alienware 32" 4k 240hz OLED
Case	Jonsbo Z20
Audio Device(s)	Yes
Power Supply	RIP Corsair SF750... Waiting for SF1000
Mouse	DeathadderV2 X Hyperspeed
Keyboard	65% HE Keyboard
Software	Windows 11
Benchmark Scores	They're pretty good, nothing crazy.

Processor	Intel 12600K
Motherboard	Gigabyte Z690 Gaming X
Cooling	CPU: Noctua NH-D15S; Case: 2xNoctua NF-A14, 1xNF-S12A.
Memory	Ballistix Sport LT DDR4 @3600CL16 2*16GB
Video Card(s)	Palit RTX 4080
Storage	Samsung 970 Pro 512GB + Crucial MX500 500gb + WD Red 6TB
Display(s)	Dell S2721qs
Case	Phanteks P300A Mesh
Audio Device(s)	Behringer UMC204HD
Power Supply	Fractal Design Ion+ 560W
Mouse	Glorious Model D-

System Name	Black MC in Tokyo
Processor	Ryzen 5 7600
Motherboard	MSI X670E Gaming Plus Wifi
Cooling	Be Quiet! Pure Rock 2
Memory	2 x 16GB Corsair Vengeance @ 6000Mhz
Video Card(s)	XFX 6950XT Speedster MERC 319
Storage	Kingston KC3000 1TB \| WD Black SN750 2TB \|WD Blue 1TB x 2 \| Toshiba P300 2TB \| Seagate Expansion 8TB
Display(s)	Samsung U32J590U 4K + BenQ GL2450HT 1080p
Case	Fractal Design Define R4
Audio Device(s)	Plantronics 5220, Nektar SE61 keyboard
Power Supply	Corsair RM850x v3
Mouse	Logitech G602
Keyboard	Dell SK3205
Software	Windows 10 Pro
Benchmark Scores	Rimworld 4K ready!

System Name	System V
Processor	AMD Ryzen 5 3600
Motherboard	Asus Prime X570-P
Cooling	Cooler Master Hyper 212 // a bunch of 120 mm Xigmatek 1500 RPM fans (2 ins, 3 outs)
Memory	2x8GB Ballistix Sport LT 3200 MHz (BLS8G4D32AESCK.M8FE) (CL16-18-18-36)
Video Card(s)	Gigabyte AORUS Radeon RX 580 8 GB
Storage	SHFS37A240G / DT01ACA200 / ST10000VN0008 / ST8000VN004 / SA400S37960G / SNV21000G / NM620 2TB
Display(s)	LG 22MP55 IPS Display
Case	NZXT Source 210
Audio Device(s)	Logitech G430 Headset
Power Supply	Corsair CX650M
Software	Whatever build of Windows 11 is being served in Canary channel at the time.
Benchmark Scores	Corona 1.3: 3120620 r/s Cinebench R20: 3355 FireStrike: 12490 TimeSpy: 4624