Should we have a "benchmarks" type thread for LLMs?

Easy Rhino

Linux Advocate
Staff member
Joined
Nov 13, 2006
Messages
15,649 (2.35/day)
Location
Mid-Atlantic
System Name Desktop
Processor i5 13600KF
Motherboard AsRock B760M Steel Legend Wifi
Cooling Noctua NH-U9S
Memory 4x 16 Gb Gskill S5 DDR5 @6000
Video Card(s) Gigabyte Gaming OC 6750 XT 12GB
Storage WD_BLACK 4TB SN850x
Display(s) Gigabyte M32U
Case Corsair Carbide 400C
Audio Device(s) On Board
Power Supply EVGA Supernova 650 P2
Mouse MX Master 3s
Keyboard Logitech G915 Wireless Clicky
Software Fedora KDE Spin
I can't seem to find any good data that highlights tokens per second with local LLMs and graphics cards.

I am proposing something like a table that includes: TPU username, graphics card and VRAM, graphics driver and version, platform (Ollama, LM Studio, etc.), CPU, RAM, operating system and version, model, tokens per second, and date of benchmark.
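
To make the proposed table concrete, here is a minimal sketch of one row as a Python dataclass. The field names, types, and the pipe-separated layout are assumptions for illustration only, not an agreed format.

```python
from dataclasses import dataclass, asdict, fields
from datetime import date


@dataclass
class BenchmarkEntry:
    """One row of the proposed table (field names are hypothetical)."""
    tpu_username: str
    gpu: str
    vram_gb: int
    gpu_driver: str          # driver name and version
    platform: str            # e.g. Ollama, LM Studio
    cpu: str
    ram_gb: int
    os: str                  # operating system and version
    model: str
    tokens_per_second: float
    benchmark_date: date


def header_row() -> str:
    """Column names joined with ' | ', in declaration order."""
    return " | ".join(f.name for f in fields(BenchmarkEntry))


def table_row(entry: BenchmarkEntry) -> str:
    """Values of one entry joined with ' | ', matching header_row()."""
    return " | ".join(str(value) for value in asdict(entry).values())
```

Keeping the fields in one typed record means every submission carries the same columns, so posts can be pasted into a single table without hand-editing.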

Are there any other relevant fields to add? Are other people interested in this kind of information?
 

Solaris17

Super Dainty Moderator
Staff member
Joined
Aug 16, 2005
Messages
27,258 (3.83/day)
Location
Alabama
System Name RogueOne
Processor Xeon W9-3495x
Motherboard ASUS w790E Sage SE
Cooling SilverStone XE360-4677
Memory 128gb Gskill Zeta R5 DDR5 RDIMMs
Video Card(s) MSI SUPRIM Liquid X 4090
Storage 1x 2TB WD SN850X | 2x 8TB GAMMIX S70
Display(s) 49" Philips Evnia OLED (49M2C8900)
Case Thermaltake Core P3 Pro Snow
Audio Device(s) Moondrop S8's on schitt Gunnr
Power Supply Seasonic Prime TX-1600
Mouse Razer Viper mini signature edition (mercury white)
Keyboard Monsgeek M3 Lavender, Moondrop Luna lights
VR HMD Quest 3
Software Windows 11 Pro Workstation
Benchmark Scores I dont have time for that.
I was thinking about doing something like this so I’m for it.
 

Easy Rhino

I think the challenge will be verifying the results. People could just post crap because reasons. It would be neat if projects like Ollama had a way to upload results to a database.
 

Solaris17

I was just thinking that. I was going to maybe make something bundled that just runs the benchmarks, but I'm legit stupid when it comes to DB APIs. We should def think-tank this though. In my case I was going to start a thread of my own literally last week, specifically for the Intel cards using AI Playground, just for the lols, but I had my teeth yanked, so I'm on hella drugs right now and don't feel like the song and dance.

I think a good place to start is reading some of the GPU reviews where w1zz displays some AI stats to see what is important.

Off the top of my head (imo)

- Tokens total (some models and services charge by tokens total)
- Tokens/s (tokens/s determines readability, and whether you're talking to an adult at normal speech cadence or someone who barely knows how to read.)
- Time to completion
- VRAM Usage
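
As a rough sketch, the first three of these could be derived from the final JSON object that Ollama's `/api/generate` endpoint returns. This assumes the response carries `prompt_eval_count`, `eval_count`, `eval_duration`, and `total_duration`, with durations in nanoseconds; `summarize_run` is a hypothetical helper name. VRAM usage is not part of that response and would need a separate vendor tool.

```python
NANOSECONDS_PER_SECOND = 1e9


def summarize_run(resp: dict) -> dict:
    """Derive table metrics from one generate response (already parsed JSON)."""
    # Prompt tokens may be absent from a response, so default to 0.
    prompt_tokens = resp.get("prompt_eval_count", 0)
    output_tokens = resp["eval_count"]
    generation_seconds = resp["eval_duration"] / NANOSECONDS_PER_SECOND

    return {
        # Prompt plus generated tokens, the figure token-billed services use.
        "tokens_total": prompt_tokens + output_tokens,
        # Generation speed only; prompt processing time is excluded.
        "tokens_per_second": (
            output_tokens / generation_seconds if generation_seconds > 0 else 0.0
        ),
        # Wall-clock time for the whole request.
        "time_to_completion_s": resp["total_duration"] / NANOSECONDS_PER_SECOND,
    }
```

Other platforms such as LM Studio report their own statistics, so a shared table would need everyone to agree on whether tokens/s means generation speed only or includes prompt processing.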

I was also thinking "accuracy". In my case I was going to throw it a known problem: have it make a script I already know how to write as a human, and see how well it does at replicating it, or at least producing something that works, in a vanilla sense, i.e. not telling it to iterate on or improve my own.

Accuracy is a little harder though, and I was in the middle of thinking about it when I started taking painkillers. Love the idea though.
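
One hedged way to pin "accuracy" down is to score the model's script against inputs whose correct outputs are already known. The sketch below assumes the generated script has been reviewed and wrapped as a plain Python callable; `score_candidate` and the test-case format are hypothetical.

```python
from typing import Any, Callable, Sequence, Tuple

# Each case pairs a tuple of positional arguments with the expected result.
Case = Tuple[tuple, Any]


def score_candidate(candidate: Callable[..., Any], cases: Sequence[Case]) -> float:
    """Return the fraction of known cases the candidate gets right (0.0 to 1.0)."""
    if not cases:
        return 0.0
    passed = 0
    for args, expected in cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            # A crash on a known input counts as a failed case.
            pass
    return passed / len(cases)
```

This only measures whether the output works on the chosen cases, not how closely the code resembles the human-written version, so the cases themselves would have to be part of the posted benchmark.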
 
Joined
Jun 24, 2017
Messages
189 (0.07/day)
Local LLMs should be the way to go in most cases. TPU should encourage local testing on non-workstation PCs.
 

Solaris17

I don't think any of the testing done in the GPU reviews is on workstation PCs. We actually haven't done any testing on workstation-class gear in forever. I think Easy is talking about member-contributed results here, not a new way to review.
 
Joined
Jun 24, 2017
Messages
189 (0.07/day)
My bad, what I meant is that TPU is the perfect place to test local LLMs on non-workstation hardware.

Smaller sites don't have the resources or the variety of hardware TPU has.

For workstation-class hardware there's ServeTheHome, Level1Techs, or maybe the articles from Puget. I must say it's difficult to find good "workstation" review sites, or at least ones as "wide" as TPU, since brands (HP, Dell, etc.) don't usually want their hardware exposed and non-branded workstations are such a niche market.
 

Solaris17

I always dreamed that one day TPU would expand their review market; I would jump on signing up to do WS/Enterprise gear. Maybe someday.

As far as doing more AI stuff goes, I hope that given its proliferation more tests will be included in things like GPU reviews. The fact that it was added at all is a good sign; maybe one day there will be more targeted tests in an official capacity from w1zz.

We're just peons though. So humble threads are the best we can do.
 