
Tech Stocks Brace for a DeepSeek Haircut, NVIDIA Down 12% in Pre-market Trading

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,448 (7.50/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
The DeepSeek open-source large language model from China has been the hottest topic in the AI industry over the weekend. The model promises a leap in performance over offerings from OpenAI and Meta, and can be accelerated by far less complex hardware. The AI enthusiast community has been able to get it running on much simpler accelerators, such as the M4 SoCs in Mac minis, and on gaming GPUs. The model could cause companies to completely reassess their AI strategy, perhaps pulling them away from big cloud providers toward local acceleration on cheaper hardware; and cloud companies themselves may want to reconsider their AI GPU orders in the short-to-medium term.

All this puts the supply chain of AI acceleration hardware in a bit of a spot. NVIDIA stock is down 12 percent in pre-market trading as of this writing. Microsoft and Meta Platforms also took a cut, shedding over 3% each; Alphabet lost 3% and Apple 1.5%. Microsoft, Meta, and Apple are slated to post their quarterly earnings this week. Companies within NVIDIA's supply chain, such as ASML and TSMC, also saw drops, with ASML and ASM International losing 10-14% in European pre-market trading.



View at TechPowerUp Main Site | Source
 
Joined
Nov 11, 2016
Messages
3,543 (1.18/day)
System Name The de-ploughminator Mk-III
Processor 9800X3D
Motherboard Gigabyte X870E Aorus Master
Cooling DeepCool AK620
Memory 2x32GB G.SKill 6400MT Cas32
Video Card(s) Asus RTX4090 TUF
Storage 4TB Samsung 990 Pro
Display(s) 48" LG OLED C4
Case Corsair 5000D Air
Audio Device(s) KEF LSX II LT speakers + KEF KC62 Subwoofer
Power Supply Corsair HX850
Mouse Razor Death Adder v3
Keyboard Razor Huntsman V3 Pro TKL
Software win11
Woop, Nvidia better get back to their gaming market soon :roll:.

I'm interested in trying out DeepSeek on my computer too.
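For anyone else wanting to try it locally, the distilled smaller variants are the realistic target for consumer hardware. A minimal sketch using Ollama — the model tags and VRAM thresholds here are my assumptions, so check Ollama's model library for the current names before pulling anything:

```shell
#!/bin/sh
# Pick a distilled DeepSeek-R1 variant based on available VRAM (GiB).
# Tags and thresholds are illustrative assumptions, not official guidance.
pick_model() {
  vram_gib="$1"
  if [ "$vram_gib" -ge 24 ]; then
    echo "deepseek-r1:32b"
  elif [ "$vram_gib" -ge 12 ]; then
    echo "deepseek-r1:14b"
  else
    echo "deepseek-r1:7b"
  fi
}

model="$(pick_model 16)"
echo "Selected: $model"
# Then, with Ollama installed:
#   ollama pull "$model"
#   ollama run "$model" "Hello"
```

The thresholds just reflect that a 4-bit-quantized model needs very roughly half a gigabyte of VRAM per billion parameters, plus headroom.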
 
Joined
Dec 12, 2016
Messages
2,152 (0.72/day)
Down almost 15% including Friday.

Also, when the US sanctions a country, that country has to learn how to do more with less. It looks like China has succeeded in doing exactly that.

Finally, Strix Halo is probably looking really good right about now, and SoCs aren't under sanctions, so AMD is free to sell as many of them to China as it wants. Their close relationship with Lenovo might also pay off.
 
Joined
Jan 8, 2017
Messages
9,651 (3.28/day)
System Name Good enough
Processor AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge
Motherboard ASRock B650 Pro RS
Cooling 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30
Memory 32GB - FURY Beast RGB 5600 Mhz
Video Card(s) Sapphire RX 7900 XT - Alphacool Eisblock Aurora
Storage 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB
Display(s) LG UltraGear 32GN650-B + 4K Samsung TV
Case Phanteks NV7
Power Supply GPS-750C
Everyone can see that everything is heading toward cheaper-to-train/infer models, not bigger and bigger ones as the narrative has been until now. DeepSeek isn't going to be what bursts this bubble; it will simply be the unsustainability of investing more and more without enough revenue to back it up. Reminder that OpenAI is still losing billions of dollars to this day.
 
Joined
Oct 4, 2017
Messages
716 (0.27/day)
Location
France
Processor RYZEN 7 5800X3D
Motherboard Aorus B-550I Pro AX
Cooling HEATKILLER IV PRO , EKWB Vector FTW3 3080/3090 , Barrow res + Xylem DDC 4.2, SE 240 + Dabel 20b 240
Memory Viper Steel 4000 PVS416G400C6K
Video Card(s) EVGA 3080Ti FTW3
Storage XPG SX8200 Pro 512 GB NVMe + Samsung 980 1TB
Display(s) ROG Strix OLED XG27AQDMG
Case NR 200
Power Supply CORSAIR SF750
Mouse Logitech G PRO
Keyboard Meletrix Zoom 75 GT Silver
Software Windows 11 22H2
Chad CHINA single-handedly saving gaming, in CCP we trust!!!
 
Joined
Oct 29, 2016
Messages
114 (0.04/day)
More efficient AI was always going to be the next obvious advancement. Better to invest wisely and not forget the fundamentally disruptive nature of technology in the first place.
 
Joined
Jun 22, 2012
Messages
312 (0.07/day)
Processor Intel i7-12700K
Motherboard MSI PRO Z690-A WIFI
Cooling Noctua NH-D15S
Memory Corsair Vengeance 4x16 GB (64GB) DDR4-3600 C18
Video Card(s) MSI GeForce RTX 3090 GAMING X TRIO 24G
Storage Samsung 980 Pro 1TB, SK hynix Platinum P41 2TB
Case Fractal Define C
Power Supply Corsair RM850x
Mouse Logitech G203
Software openSUSE Tumbleweed
The AI enthusiast community has been able to get it running on much simpler accelerators, such as the M4 SoCs in Mac minis, and on gaming GPUs.

Worth pointing out that the models people are running on M4 Mac minis and gaming GPUs have very little to do with the actually capable one DeepSeek operates on its website. That is a completely different, much larger model, requiring at least 700 GB of VRAM.

What the market is concerned about is that such a capable model could be trained with ~5M USD worth of compute (excluding GPU costs). That doesn't mean though that putting more compute on it won't improve the results...
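The 700 GB figure is easy to sanity-check with back-of-the-envelope math. This sketch only counts weight memory (it ignores KV cache and activations, which push the real requirement higher); the byte-per-parameter figures are my assumptions based on the full model shipping in FP8:

```python
def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes), weights only."""
    return params_billions * bytes_per_param

# DeepSeek-V3/R1: 671B parameters, released in FP8 (~1 byte per parameter)
full_fp8 = weights_gb(671, 1.0)      # ~671 GB before KV cache and activations
full_int4 = weights_gb(671, 0.5)     # ~335 GB even with aggressive 4-bit quantization
distilled_14b = weights_gb(14, 0.5)  # ~7 GB: this is what fits on a gaming GPU

print(full_fp8, full_int4, distilled_14b)
```

So even heroically quantized, the real model is data-center hardware; only the distilled variants are desktop material.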
 
Joined
Dec 12, 2016
Messages
2,152 (0.72/day)
More efficient AI was always going to be the next obvious advancement. Better to invest wisely and not forget the fundamentally disruptive nature of technology in the first place.
Don’t worry. Huang will save NVIDIA by signing his name on everything and creating more bitmojis of himself.

Edit: Damnit. He already found my post!
 
Joined
May 29, 2012
Messages
538 (0.12/day)
System Name CUBE_NXT
Processor i9 12900K @ 5.0Ghz all P-cores with E-cores enabled
Motherboard Gigabyte Z690 Aorus Master
Cooling EK AIO Elite Cooler w/ 3 Phanteks T30 fans
Memory 64GB DDR5 @ 5600Mhz
Video Card(s) EVGA 3090Ti Ultra Hybrid Gaming w/ 3 Phanteks T30 fans
Storage 1 x SK Hynix P41 Platinum 1TB, 1 x 2TB, 1 x WD_BLACK SN850 2TB, 1 x WD_RED SN700 4TB
Display(s) Alienware AW3418DW
Case Lian-Li O11 Dynamic Evo w/ 3 Phanteks T30 fans
Power Supply Seasonic PRIME 1000W Titanium
Software Windows 11 Pro 64-bit
Joined
Sep 17, 2014
Messages
23,084 (6.09/day)
Location
The Washing Machine
System Name Tiny the White Yeti
Processor 7800X3D
Motherboard MSI MAG Mortar b650m wifi
Cooling CPU: Thermalright Peerless Assassin / Case: Phanteks T30-120 x3
Memory 32GB Corsair Vengeance 30CL6000
Video Card(s) ASRock RX7900XT Phantom Gaming
Storage Lexar NM790 4TB + Samsung 850 EVO 1TB + Samsung 980 1TB + Crucial BX100 250GB
Display(s) Gigabyte G34QWC (3440x1440)
Case Lian Li A3 mATX White
Audio Device(s) Harman Kardon AVR137 + 2.1
Power Supply EVGA Supernova G2 750W
Mouse Steelseries Aerox 5
Keyboard Lenovo Thinkpad Trackpoint II
VR HMD HD 420 - Green Edition ;)
Software W11 IoT Enterprise LTSC
Benchmark Scores Over 9000
Everyone can see that everything is heading toward cheaper-to-train/infer models, not bigger and bigger ones as the narrative has been until now. DeepSeek isn't going to be what bursts this bubble; it will simply be the unsustainability of investing more and more without enough revenue to back it up. Reminder that OpenAI is still losing billions of dollars to this day.
It's all just a big gamble using our wallets, our planet, our data, and our attention spans...

X/Twitter has yet to post a profitable year as well... Money isn't the biggest thing being chased here. The real prize is power and information dominance.

Money is just a means, not an end. There's a war going on here.
Remember that Sutskever circus about responsible AI? It fits right in. The power struggle at OpenAI, MS, etc.? Yep. This is all big tech trying to create their cyberpunk future. They've just openly accepted government posts, too. The dystopia is real.
 
Last edited:
Joined
Feb 20, 2019
Messages
8,647 (3.98/day)
System Name Bragging Rights
Processor Atom Z3735F 1.33GHz
Motherboard It has no markings but it's green
Cooling No, it's a 2.2W processor
Memory 2GB DDR3L-1333
Video Card(s) Gen7 Intel HD (4EU @ 311MHz)
Storage 32GB eMMC and 128GB Sandisk Extreme U3
Display(s) 10" IPS 1280x800 60Hz
Case Veddha T2
Audio Device(s) Apparently, yes
Power Supply Samsung 18W 5V fast-charger
Mouse MX Anywhere 2
Keyboard Logitech MX Keys (not Cherry MX at all)
VR HMD Samsung Oddyssey, not that I'd plug it into this though....
Software W10 21H1, barely
Benchmark Scores I once clocked a Celeron-300A to 564MHz on an Abit BE6 and it scored over 9000.
Good. OpenAI isn't open and Meta are anti-consumer in every valid definition of the term.

The situation is so bad that I don't even care that DeepSeek is Chinese! They are the lesser evil here.
 
Joined
Mar 11, 2008
Messages
1,054 (0.17/day)
Location
Hungary / Budapest
System Name Kincsem
Processor AMD Ryzen 9 9950X
Motherboard ASUS ProArt X870E-CREATOR WIFI
Cooling Be Quiet Dark Rock Pro 5
Memory Kingston Fury KF560C32RSK2-96 (2×48GB 6GHz)
Video Card(s) Sapphire AMD RX 7900 XT Pulse
Storage Samsung 970PRO 500GB + Samsung 980PRO 2TB + FURY Renegade 2TB+ Adata 2TB + WD Ultrastar HC550 16TB
Display(s) Acer QHD 27"@144Hz 1ms + UHD 27"@60Hz
Case Cooler Master CM 690 III
Power Supply Seasonic 1300W 80+ Gold Prime
Mouse Logitech G502 Hero
Keyboard HyperX Alloy Elite RGB
Software Windows 10-64
Benchmark Scores https://valid.x86.fr/9qw7iq https://valid.x86.fr/4d8n02 X570 https://www.techpowerup.com/gpuz/g46uc
Competition is good for consumers.
Looking forward to seeing better models from Meta, Microsoft, and DeepSeek.
And looking forward to seeing some 40 GB cards for consumers from NVIDIA in the near future!
When I bought my 7900 XT, I didn't even think the 20 GB of VRAM would be the thing I'd end up happiest about :D
 
Joined
Jan 18, 2020
Messages
907 (0.49/day)
Good. OpenAI isn't open and Meta are anti-consumer in every valid definition of the term.

The situation is so bad that I don't even care that DeepSeek is Chinese! They are the lesser evil here.

Their model is open source, so anyone can copy their methodology; you don't need to run their version from China. You just need training data, which shouldn't be too difficult. Then, if you can pay the few million for GPU training time... you've got a business similar to OpenAI for less than 5% of the cost (possibly far less than that, we'll see...).

Which, I guess, is a fairly big problem for those that have already sunk billions into hardware...
 
Joined
Aug 12, 2010
Messages
163 (0.03/day)
Location
Brazil
Processor Ryzen 7 7800X3D
Motherboard ASRock B650M PG Riptide
Cooling AMD Wraith Max + 2x Noctua Redux NF-P14r + 2x NF-P12
Memory 2x16GB ADATA XPG Lancer Blade DDR5-6000
Video Card(s) Powercolor RX 7800 XT Fighter OC
Storage ADATA Legend 970 2TB PCIe 5.0
Display(s) Dell 32" S3222DGM - 1440P 165Hz + P2422H
Case HYTE Y40
Audio Device(s) Microsoft Xbox TLL-00008
Power Supply Cooler Master MWE 750 V2
Mouse Alienware AW320M
Keyboard Alienware AW510K
Software Windows 11 Pro
Isn't that the bottom line with everything that comes from China? They end up doing it better or just as well, charging a fraction of the price.

Just look at the car industry, BYD is eating the EV market up, and mr orange face can't do anything about it, other than imposing his most beloved word: tariffs.
 
Joined
Nov 13, 2007
Messages
10,945 (1.74/day)
Location
Austin Texas
System Name stress-less
Processor 9800X3D @ 5.42GHZ
Motherboard MSI PRO B650M-A Wifi
Cooling Thermalright Phantom Spirit EVO
Memory 64GB DDR5 6000 1:1 CL30-36-36-96 FCLK 2000
Video Card(s) RTX 4090 FE
Storage 2TB WD SN850, 4TB WD SN850X
Display(s) Alienware 32" 4k 240hz OLED
Case Jonsbo Z20
Audio Device(s) Yes
Power Supply RIP Corsair SF750... Waiting for SF1000
Mouse DeathadderV2 X Hyperspeed
Keyboard 65% HE Keyboard
Software Windows 11
Benchmark Scores They're pretty good, nothing crazy.
Their model is open source, so anyone can copy their methodology; you don't need to run their version from China. You just need training data, which shouldn't be too difficult. Then, if you can pay the few million for GPU training time... you've got a business similar to OpenAI for less than 5% of the cost (possibly far less than that, we'll see...).

Which, I guess, is a fairly big problem for those that have already sunk billions into hardware...
+1

It seems the majority of the cost is training the model. They took a model that was already trained and "distilled" it, i.e. they essentially copied OpenAI's homework, which OpenAI spent so much time and money crunching, and then added their own improvements on top.

It's like if Louis Vuitton made a handbag and someone took that design, improved it, and then manufactured it for 1/20th of the cost...

But essentially this wouldn't exist if it were not for the already-trained models out there, so there's a fight about IP in the AI space coming soon. Especially all those people that sunk money into OpenAI; I imagine they're a bit salty about this.

That's what's really dangerous about this... it isn't the AI portion; it's the fact that a lot of very rich people sunk a lot of money into making something that just got used to leapfrog the thing they sunk money into. Historically, that tends not to end well.
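For anyone unfamiliar with the term, "distillation" in the technical sense means training a smaller student model to match a teacher model's output distribution, not copying its weights. A toy pure-Python sketch of the standard distillation loss (temperature-scaled KL divergence, in the style of Hinton et al.; real pipelines use tensor libraries, this is just the arithmetic):

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution, softened by temperature."""
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return temperature ** 2 * kl

# A student that matches the teacher exactly incurs zero loss:
print(distill_loss([2.0, 1.0, 0.1], [2.0, 1.0, 0.1]))  # 0.0
```

The student is pushed toward the teacher's full probability distribution over tokens, which carries much more signal per example than hard labels — that's why distilling is so much cheaper than training from scratch.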
 
Last edited:
Joined
Jun 22, 2012
Messages
312 (0.07/day)
Processor Intel i7-12700K
Motherboard MSI PRO Z690-A WIFI
Cooling Noctua NH-D15S
Memory Corsair Vengeance 4x16 GB (64GB) DDR4-3600 C18
Video Card(s) MSI GeForce RTX 3090 GAMING X TRIO 24G
Storage Samsung 980 Pro 1TB, SK hynix Platinum P41 2TB
Case Fractal Define C
Power Supply Corsair RM850x
Mouse Logitech G203
Software openSUSE Tumbleweed
It seems the majority of the cost is training the model. They took a model that was already trained and "distilled" it, i.e. they essentially copied OpenAI's homework, which OpenAI spent so much time and money crunching, and then added their own improvements on top.

That's not what they did for DeepSeek R1. That one was trained from scratch using an innovative method:

It's the smaller models released together with it that were "distilled" from DeepSeek-R1 outputs.

 
Joined
Nov 13, 2007
Messages
10,945 (1.74/day)
Location
Austin Texas
System Name stress-less
Processor 9800X3D @ 5.42GHZ
Motherboard MSI PRO B650M-A Wifi
Cooling Thermalright Phantom Spirit EVO
Memory 64GB DDR5 6000 1:1 CL30-36-36-96 FCLK 2000
Video Card(s) RTX 4090 FE
Storage 2TB WD SN850, 4TB WD SN850X
Display(s) Alienware 32" 4k 240hz OLED
Case Jonsbo Z20
Audio Device(s) Yes
Power Supply RIP Corsair SF750... Waiting for SF1000
Mouse DeathadderV2 X Hyperspeed
Keyboard 65% HE Keyboard
Software Windows 11
Benchmark Scores They're pretty good, nothing crazy.
That's not what they did for DeepSeek R1. That one was trained from scratch using an innovative method:

It's the smaller models that were released together with it that were "distilled" from DeepSeek-R1 outputs.

They have clips of it saying it's ChatGPT, though... so the accusation will be there.
 
Joined
Jan 18, 2020
Messages
907 (0.49/day)
+1

It seems the majority of the cost is training the model. They took a model that was already trained and "distilled" it, i.e. they essentially copied OpenAI's homework, which OpenAI spent so much time and money crunching, and then added their own improvements on top.

It's like if Louis Vuitton made a handbag and someone took that design, improved it, and then manufactured it for 1/20th of the cost...

But essentially this wouldn't exist if it were not for the already-trained models out there, so there's a fight about IP in the AI space coming soon. Especially all those people that sunk money into OpenAI; I imagine they're a bit salty about this.

Isn't a lot of the training data even OpenAI used copyrighted/proprietary stuff? They just scraped the web, Reddit, and the rest, or scanned thousands of books and other sources, and used that as the training data. It's not used directly in the output, which is why they've not been sued back to the stone age for it. It's just the language linking and knowledge they want.

There are reams of sites on the internet offering eBooks and the like without any DRM (not going to link them here, obviously!). Nothing stops anyone who wants to train a model from scraping (downloading) all of that and feeding it in. Literally tens or hundreds of thousands of books; then there's all the content out on the web you can crawl with a bot, like Google et al. do. Sites can try to block you, but if you're spoofing user agents and using consumer IP addresses combined with bot-detection-defeating techniques, it's very hard to stop.

Even getting data to train a model can be done on a shoe string.
 

BUCK NASTY

4P Enthusiust
Joined
Aug 8, 2007
Messages
5,016 (0.79/day)
Location
Fort Pierce, FL. U.S.A.
System Name Main Rig
Processor AMD RYZEN 5600X
Motherboard MSI MPG X570 GAMING EDGE WIFI
Cooling CORSAIR H115i
Memory 16gb DDR4 @ 1200MHZ
Video Card(s) GIGABYTE RTX 4080 SUPER
Storage SAMSUNG M.2 1TB
Display(s) ASUS 24" IPS
Case Coolermaster Centurion 590
Audio Device(s) Onboard
Power Supply EVGA SUPERNOVA 1000
Software Windows 10 64
Their model is open source, so anyone can copy their methodology; you don't need to run their version from China. You just need training data, which shouldn't be too difficult. Then, if you can pay the few million for GPU training time... you've got a business similar to OpenAI for less than 5% of the cost (possibly far less than that, we'll see...).

Which, I guess, is a fairly big problem for those that have already sunk billions into hardware...
Even if someone copies the methodology, will they be able to innovate and keep it current, as well as build on it enough to compete with for-profit firms? Look at Linux being open source... it is always trailing commercial products from Microsoft and others. Profit will always drive the highest levels of innovation, although open-source products have merit and are chivalrous.

Maybe I'm just butthurt because my NVIDIA stock is tanking.....
 
Joined
Jun 22, 2012
Messages
312 (0.07/day)
Processor Intel i7-12700K
Motherboard MSI PRO Z690-A WIFI
Cooling Noctua NH-D15S
Memory Corsair Vengeance 4x16 GB (64GB) DDR4-3600 C18
Video Card(s) MSI GeForce RTX 3090 GAMING X TRIO 24G
Storage Samsung 980 Pro 1TB, SK hynix Platinum P41 2TB
Case Fractal Define C
Power Supply Corsair RM850x
Mouse Logitech G203
Software openSUSE Tumbleweed
They have clips of it saying it's chat gpt tho... So the accusation will be there.

That's because the training data included 600k samples of reasoning outputs from DeepSeek-R1-Zero (the model trained purely via reinforcement learning), plus 200k samples of non-reasoning outputs from a variety of sources, presumably including publicly available data originally obtained from GPT-4 and likely not cleaned well enough. Putting that aside, GPT refusals and GPT-style text have now contaminated the web as well, so it could have picked them up from there too.

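The "not cleaned well enough" part is a mundane data-pipeline problem. A simplistic sketch of the kind of filter labs run over training samples to drop contaminated ones — the marker strings here are purely illustrative, not DeepSeek's or anyone's actual list:

```python
# Illustrative contamination markers -- not any lab's actual filter list.
CONTAMINATION_MARKERS = [
    "as an ai language model",
    "i'm sorry, but i can't",
    "trained by openai",
]

def clean_samples(samples: list[str]) -> list[str]:
    """Drop training samples that contain a known contamination marker."""
    return [
        s for s in samples
        if not any(marker in s.lower() for marker in CONTAMINATION_MARKERS)
    ]

samples = [
    "The capital of France is Paris.",
    "As an AI language model trained by OpenAI, I cannot...",
]
print(clean_samples(samples))  # only the first sample survives
```

Substring matching like this is crude (real pipelines also use classifiers and dedup), which is exactly why GPT-style artifacts slip through at scale.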
 
Last edited:
Joined
Nov 13, 2007
Messages
10,945 (1.74/day)
Location
Austin Texas
System Name stress-less
Processor 9800X3D @ 5.42GHZ
Motherboard MSI PRO B650M-A Wifi
Cooling Thermalright Phantom Spirit EVO
Memory 64GB DDR5 6000 1:1 CL30-36-36-96 FCLK 2000
Video Card(s) RTX 4090 FE
Storage 2TB WD SN850, 4TB WD SN850X
Display(s) Alienware 32" 4k 240hz OLED
Case Jonsbo Z20
Audio Device(s) Yes
Power Supply RIP Corsair SF750... Waiting for SF1000
Mouse DeathadderV2 X Hyperspeed
Keyboard 65% HE Keyboard
Software Windows 11
Benchmark Scores They're pretty good, nothing crazy.
That's because the training data included 600k samples of reasoning outputs from DeepSeek-R1-Zero (the model trained purely via reinforcement learning), plus 200k samples of non-reasoning outputs from a variety of sources, presumably including publicly available data originally obtained from GPT-4 and likely not cleaned well enough. Putting that aside, GPT refusals and GPT-style text have now contaminated the web as well, so it could have picked them up from there too.

Sure, but to play devil's advocate, "we reuse portions of the SFT dataset of DeepSeek-V3" will read as "we bypassed a lot of the heavy computational expense by using output from other models."

The people that sank billions of dollars into OpenAI, and said other models, that's what they will see. Add to that a general distrust of AI announcements from Chinese firms/labs, and voilà: a large amount of capital is about to have, or is having, a hissy fit.
 
Last edited:
Joined
Feb 23, 2019
Messages
6,178 (2.85/day)
Location
Poland
Processor Ryzen 7 5800X3D
Motherboard Gigabyte X570 Aorus Elite
Cooling Thermalright Phantom Spirit 120 SE
Memory 2x16 GB Crucial Ballistix 3600 CL16 Rev E @ 3600 CL14
Video Card(s) RTX3080 Ti FE
Storage SX8200 Pro 1 TB, Plextor M6Pro 256 GB, WD Blue 2TB
Display(s) LG 34GN850P-B
Case SilverStone Primera PM01 RGB
Audio Device(s) SoundBlaster G6 | Fidelio X2 | Sennheiser 6XX
Power Supply SeaSonic Focus Plus Gold 750W
Mouse Endgame Gear XM1R
Keyboard Wooting Two HE
Here we go. You don't need all that hardware to create a good model, evidently.
They clearly stated that they're using H800's for their model:
Deepseek trained its DeepSeek-V3 Mixture-of-Experts (MoE) language model with 671 billion parameters using a cluster containing 2,048 Nvidia H800 GPUs in just two months, which means 2.8 million GPU hours, according to its paper. For comparison, it took Meta 11 times more compute power (30.8 million GPU hours) to train its Llama 3 with 405 billion parameters using a cluster containing 16,384 H100 GPUs over the course of 54 days.
Maybe they don't need that much hardware, but they still need it.
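The quoted numbers are easy to sanity-check: 2,048 GPUs running for roughly two months does work out to the stated 2.8 million GPU hours, and the "11 times" comparison falls straight out of the two totals (the 57-day figure below is my back-calculation from the paper's total, not a number the quote gives):

```python
# DeepSeek-V3: 2,048 H800s for roughly two months (~57 days back-calculated)
deepseek_gpus, deepseek_days = 2048, 57
deepseek_hours = deepseek_gpus * deepseek_days * 24
print(deepseek_hours)  # 2801664, i.e. ~2.8 million GPU hours

# Llama 3 405B: Meta's reported total compute budget
llama3_hours = 30.8e6
print(round(llama3_hours / deepseek_hours, 1))  # 11.0 -> the "11 times more compute" claim
```

So the headline efficiency claim is internally consistent; the open question is only whether the reported cluster size and duration are accurate.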
 
Last edited: