
Dear AMD, NVIDIA, INTEL and others: we need cheap (192-bit to 384-bit), high-VRAM consumer GPUs to locally self-host and run inference on AI/LLMs

Joined
Jun 26, 2023
Messages
58 (0.10/day)
Processor 7800X3D @ Curve Optimizer: All Core: -25
Motherboard TUF Gaming B650-Plus
Memory 2xKSM48E40BD8KM-32HM ECC RAM (ECC enabled in BIOS)
Video Card(s) 4070 @ 110W
Display(s) SAMSUNG S95B 55" QD-OLED TV
Power Supply RM850x
With the AI age here, we need fast memory and lots of it, so we can host our favorite LLMs locally.
Even Edward Snowden is complaining.
[Attached image: vram.png]


We also need quad-channel DDR6 (or something entirely new and faster) consumer desktop motherboards with up to 256 or 384GB RAM (my current B650 mobo supports only up to 128GB), so we can self-host our favourite big MoE LLMs like DeepSeek-R1 (quants: e.g. 1, 2; the real DeepSeek models are the ones without "Distill" in the name) or Llama-3.1 405B quants. MoE LLMs run much faster than dense LLMs with the same number of parameters; DeepSeek-R1 has been tested on a 4th-gen EPYC server at around 8 tokens/s. Bigger LLMs will always be better than smaller ones (everything else being equal, and not specialized). Please don't hold humanity back.
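To put rough numbers on that (a minimal sketch in Python; the 1.2x overhead factor is my assumption, and real runtimes also need room for the context cache):

Code:
# Weights-only memory footprint for a quantized LLM.
# The 1.2x overhead factor is an assumption, not a measurement.
def model_footprint_gb(params_b: float, bits_per_weight: float,
                       overhead: float = 1.2) -> float:
    return params_b * 1e9 * bits_per_weight / 8 * overhead / 1e9

for name, params_b, bits in [
    ("Llama-3.1 405B @ 4-bit", 405, 4.0),
    ("DeepSeek-R1 671B @ 1.58-bit", 671, 1.58),
    ("DeepSeek-R1 671B @ 4-bit", 671, 4.0),
]:
    print(f"{name}: ~{model_footprint_gb(params_b, bits):.0f} GB")
# -> ~243 GB, ~159 GB and ~403 GB: all far beyond a 128GB board limit.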
 
Joined
Apr 2, 2011
Messages
2,896 (0.57/day)
The response back, amidst peals of laughter:

Dear consumer, silicon production is a finite resource. It's expensive, and there are a lot of people out there who believe the next person who nails AI as something more than a novelty will rewrite our economy (and thus be stupidly rich). As such, the demand for processing cards which mirror or match those that used to be GPUs is functionally infinite. The only reason that we aren't royally paddling your wallet is that anti-trust lawsuits hurt, and now that billion dollar investments are happening even a slightly unfair pricing policy would trigger legal issues that we can entirely avoid and still print money.

In short, either make your own or pay for ours in a market that will take almost any price we can imagine.

-GPU Makers.



P.S. In a little bit of time, when this shell of an industry collapses under its own weight, we won't be bailing ourselves out. Because we now control enough resources critical to the defense industry, we literally are incapable of failing. Please continue your stupid team red vs. team green shenanigans, while we count our green all the way to the bank.




-I like these thought exercises. Imagining Nvidia as a Bond villain with steepled hands in a volcano seems...somehow fitting.
 
Joined
Dec 26, 2013
Messages
201 (0.05/day)
Processor Ryzen 7 5800x3D
Motherboard Gigabyte B550 Gaming X v2
Cooling Thermalright Phantom Spirit 120 SE
Memory Corsair Vengeance LPX 2x32GB 3600Mhz C18
Video Card(s) XFX RX 6800 XT Merc 319
Storage Kingston KC2500 2TB NVMe + Crucial MX100 256GB + Samsung 860QVO 1TB + Samsung Spinpoint F3 500GB HDD
Display(s) Samsung CJG5 27" 144 Hz QHD
Case Phanteks Eclipse P360A DRGB Black + 3x Thermalright TL-C12C-S ARGB
Audio Device(s) Logitech X530 5.1 + Logitech G35 7.1 Headset
Power Supply Cougar GEX850 80+ Gold
Mouse Razer Viper 8K
Keyboard Logitech G105
There are B650 motherboards with 256GB RAM support even on the budget side (Gigabyte B650M DS3H, MSI B650 Gaming Plus WiFi...). I'm surprised (well, not really, cuz Asus) that the Asus one doesn't have that.


I think more VRAM or more system RAM alone is not a good solution to this; I think AMD's new laptops with up to 128GB of shared RAM, specifically targeted at LLM work, are a better approach. That solves both problems. GPU power and bandwidth are the next problems, but those can only get better from here. And DeepSeek kinda showed that LLMs will also get better and more usable on consumer devices day by day.
 
Joined
Apr 2, 2011
Messages
2,896 (0.57/day)
There are B650 motherboards with 256GB RAM support even on the budget side (Gigabyte B650M DS3H, MSI B650 Gaming Plus WiFi...). I'm surprised (well, not really, cuz Asus) that the Asus one doesn't have that.


I think more VRAM or more system RAM alone is not a good solution to this; I think AMD's new laptops with up to 128GB of shared RAM, specifically targeted at LLM work, are a better approach. That solves both problems. GPU power and bandwidth are the next problems, but those can only get better from here. And DeepSeek kinda showed that LLMs will also get better and more usable on consumer devices day by day.

This is where I get to the point of laughing.

Research DeepSeek...I'll wait a second.


1) Company came from nowhere.
2) Company is in China.
3) Company censors its responses, see 2.
4) Company has a breakthrough that is basically repackaging the technique of teaching an LLM with a more complex LLM (distillation)...and thus requiring fewer resources.
5) Company is seeking large capital investment...because their "new" ideas will get it.
6) Company has nothing really to show...but through social media manipulation creates a spark of ignorance that burns the monopoly money making Nvidia stupid rich...promising the usual from China. They'll take something complex, copy it incompletely, repackage it, and claim it as their globe-leading technology until 6-18 months from now, when it fails to deliver anything new and everyone disappears with the money.


I love me some Ouroboros levels of cyclical stupidity...but the OP is asking for the horsepower of a supercar while paying for a four-banger. This is genuinely funny, because it's like asking Ferrari to make something like a Kia Soul. It's... if it happened, then I'm sure next week it'd rain frogs and absolutely apoplectic former Ferrari personnel.
 
Joined
Jan 18, 2020
Messages
925 (0.50/day)
This is where I get to the point of laughing.

Research DeepSeek...I'll wait a second.


1) Company came from nowhere.
2) Company is in China.
3) Company censors its responses, see 2.
4) Company has a breakthrough that is basically repackaging the technique of teaching an LLM with a more complex LLM (distillation)...and thus requiring fewer resources.
5) Company is seeking large capital investment...because their "new" ideas will get it.
6) Company has nothing really to show...but through social media manipulation creates a spark of ignorance that burns the monopoly money making Nvidia stupid rich...promising the usual from China. They'll take something complex, copy it incompletely, repackage it, and claim it as their globe-leading technology until 6-18 months from now, when it fails to deliver anything new and everyone disappears with the money.


I love me some Ouroboros levels of cyclical stupidity...but the OP is asking for the horsepower of a supercar while paying for a four-banger. This is genuinely funny, because it's like asking Ferrari to make something like a Kia Soul. It's... if it happened, then I'm sure next week it'd rain frogs and absolutely apoplectic former Ferrari personnel.

The Chinese-trained model is pretty useless. Only the model structure is useful.

The interesting part will be when a model is trained on Western data with the DeepSeek code. Given that it's open source, that shouldn't take too long.

It could be useful to run the full model locally; you can do it with 1TB of RAM, and there are some ideas around using server boards to do it not too expensively.
 

qxp

Joined
Oct 27, 2024
Messages
130 (1.29/day)
HBM memory is even better, and I can think of far more useful things to do with my GPUs than play with AI. I think this is coming, as the high margins on professional chips will lead to overbuilt capacity. If we are really lucky, CPUs will converge to look like Xeon Max, but with more cores and far lower pricing, and then who needs GPUs?
 
Joined
May 10, 2023
Messages
582 (0.91/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
With the AI age here, we need fast memory and lots of it, so we can host our favorite LLMs locally.
Even Edward Snowden is complaining.
[Attached image: vram.png]
You'll see such products with higher VRAM amounts in datacenter/workstation offerings with a heavier price tag, not in the consumer space.

Don't expect those to be any cheaper than a 5090. All in all, if you want a large-VRAM consumer product, the 5090 exists for this sole reason.
consumer desktop motherboards with up to 256 or 384GB RAM (my current B650 mobo supports only up to 128GB),
Most DDR5 motherboards should support 256GB of RAM. The problem is that there are no 64GB UDIMMs available for sale yet.
 

qxp

Joined
Oct 27, 2024
Messages
130 (1.29/day)
Most DDR5 motherboards should support 256GB of RAM. The problem is that there are no 64GB UDIMMs available for sale yet.
What would be nicer is to have more channels. Why are we limited to only 4 sticks?
 
Joined
Apr 12, 2013
Messages
7,644 (1.77/day)
Joined
May 10, 2023
Messages
582 (0.91/day)
Location
Brazil
Processor 5950x
Motherboard B550 ProArt
Cooling Fuma 2
Memory 4x32GB 3200MHz Corsair LPX
Video Card(s) 2x RTX 3090
Display(s) LG 42" C2 4k OLED
Power Supply XPG Core Reactor 850W
Software I use Arch btw
What would be nicer is to have more channels. Why are we limited to only 4 sticks ?
More channels = more CPU pins = more mobo traces = higher costs.
Even Threadripper non-pro only does 1DPC with its 4 channels, likely for market segmentation reasons.
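To put numbers on why channels matter for LLMs (a minimal sketch using theoretical peaks; sustained bandwidth is lower, and decode speed for a memory-bound model scales roughly with bandwidth):

Code:
# Theoretical peak DDR5 bandwidth: transfers/s x 8 bytes per 64-bit channel.
def ddr5_peak_gbs(mt_per_s: int, channels: int) -> float:
    return mt_per_s * 1e6 * 8 * channels / 1e9

for ch in (2, 4, 8, 12):
    print(f"{ch} channels @ DDR5-6000: {ddr5_peak_gbs(6000, ch):.0f} GB/s")
# -> 96, 192, 384 and 576 GB/s: linear in channel count.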
 
Joined
Apr 12, 2013
Messages
7,644 (1.77/day)
You'd need an entirely new I/O die for that as well; TR reuses the EPYC(?) ones, IIRC.
 
Joined
Aug 19, 2024
Messages
473 (2.78/day)
Location
Texas, USA
System Name Obliterator
Processor Ryzen 7 7700x PBO
Motherboard ASRock x670e Steel Legend
Cooling Noctua NH-D15 G2 LBC
Memory G.skill Trident Z5 Neo 6000@CL30
Video Card(s) ASRock rx7900 GRE Steel Legend
Storage 2 x 2TB Samsung 990 pro nmve ssd 2 X 4TB Samsung 870 evo sata ssd 1 X 18TB WD Gold sata hdd
Display(s) LG 27GN750-B
Case Fractal Torrent
Audio Device(s) Klipsch promedia heritage 2.1
Power Supply FSP Hydro TI 1000w
Mouse SteelSeries Prime+
Keyboard Lenovo SK-8825 (L)
Software Windows 10 Enterprise LTSC 21H2 / Windows 11 Enterprise LTSC 24H2 with multiple flavors of VM
What is being asked for simply isn't consumer equipment. There is not a single thing stopping you from buying any of the things you want. That is what they are... wants. They are not needs. Humanity needs a lot of things... nothing pertaining to computer equipment even comes close to making the list.
 
Joined
Aug 12, 2019
Messages
2,283 (1.14/day)
Location
LV-426
System Name Custom
Processor i9 9900k
Motherboard Gigabyte Z390 arous master
Cooling corsair h150i
Memory 4x8 3200mhz corsair
Video Card(s) Galax RTX 3090 EX Gamer White OC
Storage 500gb Samsung 970 Evo PLus
Display(s) MSi MAG341CQ
Case Lian Li Pc-011 Dynamic
Audio Device(s) Arctis Pro Wireless
Power Supply 850w Seasonic Focus Platinum
Mouse Logitech G403
Keyboard Logitech G110
Don’t chase tech… just wait for the parts that bring you at least a 2-4x performance upgrade. Currently, my 9900K compared to a 9800X3D is around 2.7x for gaming, and an RTX 3090 to an RTX 5090 is around 2.6x, so after this year I should be looking at over a 3.5x performance upgrade over my current PC…

I went from a 1070 Ti to a 3090, and that was just over 2x. I reckon I could have held out for the 40 series, but I upgraded my monitor to 3440x1440, and while the 1070 Ti could play several titles well, a lot of the newer games struggled. This was back in 2020.
 

eidairaman1

The Exiled Airman
Joined
Jul 2, 2007
Messages
43,448 (6.76/day)
Location
Republic of Texas (True Patriot)
System Name PCGOD
Processor AMD FX 8350@ 5.0GHz
Motherboard Asus TUF 990FX Sabertooth R2 2901 Bios
Cooling Scythe Ashura, 2Ă—BitFenix 230mm Spectre Pro LED (Blue,Green), 2x BitFenix 140mm Spectre Pro LED
Memory 16 GB Gskill Ripjaws X 2133 (2400 OC, 10-10-12-20-20, 1T, 1.65V)
Video Card(s) AMD Radeon 290 Sapphire Vapor-X
Storage Samsung 840 Pro 256GB, WD Velociraptor 1TB
Display(s) NEC Multisync LCD 1700V (Display Port Adapter)
Case AeroCool Xpredator Evil Blue Edition
Audio Device(s) Creative Labs Sound Blaster ZxR
Power Supply Seasonic 1250 XM2 Series (XP3)
Mouse Roccat Kone XTD
Keyboard Roccat Ryos MK Pro
Software Windows 7 Pro 64
They won't see your swan song here; go to their websites and email them.
Heck, get on their social media and their forums and make your requests known there.
 

qxp

Joined
Oct 27, 2024
Messages
130 (1.29/day)
More channels = more CPU pins = more mobo traces = higher costs.
Even Threadripper non-pro only does 1DPC with its 4 channels, likely for market segmentation reasons.

Market segmentation is right, and that's why Intel is having problems right now. They had a perfectly good competitor to GPUs in the Xeon Phi cards, but they had to segment it away to prevent Xeon Phi from competing with their server processors. The result: lost market share and lost revenue streams they need to stay competitive.

What is being asked for simply isn't consumer equipment. There is not a single thing stopping you from buying any of the things you want. That is what they are.....wants. They are not needs. Humanity needs a lot of things......nothing pertaining to computer equipment even comes close to making the list.
You could say a computer capable of running nuclear explosion simulations is not consumer equipment either, and yet, quite likely, you have a phone in your pocket. Consumer equipment is simply what the consumer is willing to buy. Some of it is frivolous, and some becomes indispensable as use increases.
 
Joined
Oct 6, 2021
Messages
1,660 (1.36/day)
U$ 9999 is the best I can do.

It won't happen because all the companies can sell the precious silicon at much better prices directly to hyperscalers.

However, it seems to me that the current trend of running large LLMs locally will initially make powerful APUs like Strix Halo scarce. In a second phase, though, it will stimulate the development of bigger and better APUs. Just my theory, but I believe this will bring significant changes to the market.

The big three players will likely try to sell CPU+GPU as a single product, effectively eliminating the low-end and mid-range dGPU market in the medium term.
 
Joined
Feb 18, 2005
Messages
5,993 (0.82/day)
Location
Ikenai borderline!
System Name Firelance.
Processor Threadripper 3960X
Motherboard ROG Strix TRX40-E Gaming
Cooling IceGem 360 + 6x Arctic Cooling P12
Memory 8x 16GB Patriot Viper DDR4-3200 CL16
Video Card(s) MSI GeForce RTX 4060 Ti Ventus 2X OC
Storage 2TB WD SN850X (boot), 4TB Crucial P3 (data)
Display(s) Dell S3221QS(A) (32" 38x21 60Hz) + 2x AOC Q32E2N (32" 25x14 75Hz)
Case Enthoo Pro II Server Edition (Closed Panel) + 6 fans
Power Supply Fractal Design Ion+ 2 Platinum 760W
Mouse Logitech G604
Keyboard Razer Pro Type Ultra
Software Windows 10 Professional x64
The self-entitlement from the OP is exactly what I've come to expect from "AI" companies and the people who believe those companies are in any way shape or form useful to humanity.

You could say a computer capable of running nuclear explosion simulations is not consumer equipment either, and yet, quite likely, you have a phone in your pocket.
"Capable of running" is not in the same solar system as "good at running". If you want the latter for a nonstandard consumer use case you're not a consumer, you're a professional, and you need to pull the stick outta your a** and pony up the cash for professional products.
 
Joined
Dec 16, 2021
Messages
378 (0.33/day)
Location
Denmark
Processor AMD Ryzen 7 3800X
Motherboard ASUS Prime X470-Pro
Cooling bequiet! Dark Rock Slim
Memory 64 GB ECC DDR4 2666 MHz (Samsung M391A2K43BB1-CTD)
Video Card(s) eVGA GTX 1080 SC Gaming, 8 GB
Storage 1 TB Samsung 970 EVO Plus, 1 TB Samsung 850 EVO, 4 TB Lexar NM790, 12 TB WD HDDs
Display(s) Acer Predator XB271HU
Case Corsair Obsidian 550D
Audio Device(s) Creative X-Fi Fatal1ty
Power Supply Seasonic X-Series 560W
Mouse Logitech G502
Keyboard Glorious GMMK
Joined
Jul 5, 2013
Messages
28,958 (6.84/day)
Dear AMD, NVIDIA, INTEL and others: we need cheap (192-bit to 384-bit), high-VRAM consumer GPUs to locally self-host and run inference on AI/LLMs
No, we really don't. DeepSeek has proven very well that one needs only a 6GB GPU and a Raspberry Pi 4 or 5 to get the job done in a very good way. Your "request" is not very applicable to the general consumer anyway.
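To be fair, the small distilled variants really do run on modest hardware. A minimal sketch with the ollama Python client (assumptions: the Ollama daemon is running, the model has already been pulled, and the 7B tag is just an example):

Code:
# Query a small distilled DeepSeek model through a local Ollama daemon.
# Assumes `pip install ollama`, a running daemon, and
# `ollama pull deepseek-r1:7b` done beforehand (tag is illustrative).
import ollama

response = ollama.chat(
    model="deepseek-r1:7b",  # small distill that fits modest GPUs at 4-bit
    messages=[{"role": "user", "content": "What is 17 * 23?"}],
)
print(response["message"]["content"])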

Seriously, how many people need AI at all? Hmm? What would they use it for?
(those are rhetorical questions, meaning they do not need answers)

Out of touch is my vote.
 
Joined
Nov 13, 2007
Messages
10,970 (1.74/day)
Location
Austin Texas
System Name stress-less
Processor 9800X3D @ 5.42GHZ
Motherboard MSI PRO B650M-A Wifi
Cooling Thermalright Phantom Spirit EVO
Memory 64GB DDR5 6000 1:1 CL30-36-36-96 FCLK 2000
Video Card(s) RTX 4090 FE
Storage 2TB WD SN850, 4TB WD SN850X
Display(s) Alienware 32" 4k 240hz OLED
Case Jonsbo Z20
Audio Device(s) Yes
Power Supply RIP Corsair SF750... Waiting for SF1000
Mouse DeathadderV2 X Hyperspeed
Keyboard 65% HE Keyboard
Software Windows 11
Benchmark Scores They're pretty good, nothing crazy.
Right now, Apple M chips with unified memory are crushing local LLM development. I think as AI devs flock to Apple devices, Nvidia will react and release their N1X chips with unified memory or start offering higher-capacity consumer cards.

The DeepSeek 14B model is capable of fitting into the 4090's buffer, but it's far inferior to the 32B model that's available (for example, if you ask it to code a TypeScript website, it will create JSX files and make a bunch of basic mistakes). IMO the 32B model is better than ChatGPT's 4o. The 32B runs brutally slow and only uses 60% of the GPU since it runs out of framebuffer -- I would love to be able to run the larger models.
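The usual workaround is partial offload; a minimal sketch with llama-cpp-python (the GGUF filename and layer count are illustrative assumptions, so tune n_gpu_layers to whatever fits your VRAM):

Code:
# Split a too-big model between VRAM and system RAM with llama-cpp-python.
# The model path and n_gpu_layers value are assumptions for illustration.
from llama_cpp import Llama

llm = Llama(
    model_path="deepseek-r1-distill-qwen-32b-q4_k_m.gguf",  # hypothetical file
    n_gpu_layers=40,  # layers offloaded to the GPU; the rest stay in RAM
    n_ctx=4096,       # context window; larger contexts cost more memory
)
out = llm("Write a TypeScript login component.", max_tokens=256)
print(out["choices"][0]["text"])

It still crawls once layers spill into system RAM, but at least the whole model loads.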

The models that run on a 6GB GPU have a difficult time answering simple math problems.



Something like this is probably an ideal development workstation if you want to run a local LLM - $/performance.
 
Last edited:
Joined
Jun 11, 2019
Messages
685 (0.33/day)
Location
Moscow, Russia
Processor Intel 12600K
Motherboard Gigabyte Z690 Gaming X
Cooling CPU: Noctua NH-D15S; Case: 2xNoctua NF-A14, 1xNF-S12A.
Memory Ballistix Sport LT DDR4 @3600CL16 2*16GB
Video Card(s) Palit RTX 4080
Storage Samsung 970 Pro 512GB + Crucial MX500 500gb + WD Red 6TB
Display(s) Dell S2721qs
Case Phanteks P300A Mesh
Audio Device(s) Behringer UMC204HD
Power Supply Fractal Design Ion+ 560W
Mouse Glorious Model D-
Used 3090s will be the cheapest ticket to running stuff locally for at least a couple more years. Sure, the "Arc Pro B580 24GB" or whatever Intel will name it is on the way, but nothing will work out of the box on that, ever, and it's still going to be slower than a 3090. Then you have 4090s and 5090s. It is what it is - I'd love to see more options, but let's be real, they're not coming, because there's a ton more money to be made elsewhere for anyone producing GPUs. Nobody's running a charity in this biz.
 

Frick

Fishfaced Nincompoop
Joined
Feb 27, 2006
Messages
19,805 (2.86/day)
Location
north
System Name Black MC in Tokyo
Processor Ryzen 5 7600
Motherboard MSI X670E Gaming Plus Wifi
Cooling Be Quiet! Pure Rock 2
Memory 2 x 16GB Corsair Vengeance @ 6000Mhz
Video Card(s) XFX 6950XT Speedster MERC 319
Storage Kingston KC3000 1TB | WD Black SN750 2TB |WD Blue 1TB x 2 | Toshiba P300 2TB | Seagate Expansion 8TB
Display(s) Samsung U32J590U 4K + BenQ GL2450HT 1080p
Case Fractal Design Define R4
Audio Device(s) Plantronics 5220, Nektar SE61 keyboard
Power Supply Corsair RM850x v3
Mouse Logitech G602
Keyboard Dell SK3205
Software Windows 10 Pro
Benchmark Scores Rimworld 4K ready!
The self-entitlement from the OP is exactly what I've come to expect from "AI" companies and the people who believe those companies are in any way shape or form useful to humanity.

Sam Altman was talking about how we need to reform the social contract....
 
Joined
Nov 13, 2007
Messages
10,970 (1.74/day)
Location
Austin Texas
System Name stress-less
Processor 9800X3D @ 5.42GHZ
Motherboard MSI PRO B650M-A Wifi
Cooling Thermalright Phantom Spirit EVO
Memory 64GB DDR5 6000 1:1 CL30-36-36-96 FCLK 2000
Video Card(s) RTX 4090 FE
Storage 2TB WD SN850, 4TB WD SN850X
Display(s) Alienware 32" 4k 240hz OLED
Case Jonsbo Z20
Audio Device(s) Yes
Power Supply RIP Corsair SF750... Waiting for SF1000
Mouse DeathadderV2 X Hyperspeed
Keyboard 65% HE Keyboard
Software Windows 11
Benchmark Scores They're pretty good, nothing crazy.
Used 3090s will be the cheapest ticket to running stuff locally for at least a couple more years. Sure, the "Arc Pro B580 24GB" or whatever Intel will name it is on the way, but nothing will work out of the box on that, ever, and it's still going to be slower than a 3090. Then you have 4090s and 5090s. It is what it is - I'd love to see more options, but let's be real, they're not coming, because there's a ton more money to be made elsewhere for anyone producing GPUs. Nobody's running a charity in this biz.
But even then you only have 24GB, or 32GB if you pony up $2k and can even find a 5090.

If you get a refurb M chip you can get 64GB unified, ~50-55GB usable to load in the model, for $2500, or over 85GB if you can get the 96GB version for $3200.

That's the same money as building a 5090 rig. Granted, the models will run a lot slower, but if you're looking for RAM size, the M4 Max might be the best price/performance.
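A rough way to sanity-check that trade-off: each generated token has to stream the active weights through memory once, so bandwidth divided by model size gives a ceiling on decode speed (a minimal sketch; the bandwidth figures are approximate public specs, used here as assumptions):

Code:
# Decode-speed ceiling: tokens/s <= memory bandwidth / bytes read per token.
# Bandwidth numbers are approximate public specs, not measurements.
def max_tokens_per_s(bandwidth_gbs: float, model_gb: float) -> float:
    return bandwidth_gbs / model_gb

MODEL_GB = 40  # e.g. a ~70B dense model at 4-bit quantization
for name, bw in [("M4 Max (~546 GB/s)", 546.0),
                 ("RTX 5090 (~1792 GB/s)", 1792.0)]:
    print(f"{name}: <= {max_tokens_per_s(bw, MODEL_GB):.0f} tokens/s")
# The Mac is slower per token, but it can actually hold the bigger model.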
 
Last edited:
Joined
Dec 16, 2017
Messages
2,990 (1.15/day)
System Name System V
Processor AMD Ryzen 5 3600
Motherboard Asus Prime X570-P
Cooling Cooler Master Hyper 212 // a bunch of 120 mm Xigmatek 1500 RPM fans (2 ins, 3 outs)
Memory 2x8GB Ballistix Sport LT 3200 MHz (BLS8G4D32AESCK.M8FE) (CL16-18-18-36)
Video Card(s) Gigabyte AORUS Radeon RX 580 8 GB
Storage SHFS37A240G / DT01ACA200 / ST10000VN0008 / ST8000VN004 / SA400S37960G / SNV21000G / NM620 2TB
Display(s) LG 22MP55 IPS Display
Case NZXT Source 210
Audio Device(s) Logitech G430 Headset
Power Supply Corsair CX650M
Software Whatever build of Windows 11 is being served in Canary channel at the time.
Benchmark Scores Corona 1.3: 3120620 r/s Cinebench R20: 3355 FireStrike: 12490 TimeSpy: 4624
Seriously, how many people need AI at all? Hmm? What would they use it for?
I have like two uses for it, and even then:
1 - Image generation is an ethics minefield, to say the least, due to the lack of proper crediting/repayment to the people whose works were used to build the AI's models.
2 - Instead of spending a week trying to make the AI draw whatever it is I want, I could probably ask a human and get a better result in less time.
3 - Asking an AI for anything written is bound to bring in errors of varied nature. I'm better off reading and writing stuff by myself.
 