Monday, January 27th 2025

Tech Stocks Brace for a DeepSeek Haircut, NVIDIA Down 12% in Pre-market Trading

Jan 27th, 2025 07:35

The open-source DeepSeek large language model from China was the hottest topic in the AI industry over the weekend. The model promises a leap in performance over models from OpenAI and Meta, and can be accelerated by far less complex hardware. The AI enthusiast community has been able to get it to run on much less complex accelerators such as the M4 SoCs of Mac minis, and gaming GPUs. The model could cause companies to reassess their AI strategy completely, perhaps pulling them away from big cloud companies and toward local acceleration on cheaper hardware; cloud companies themselves may want to reconsider their orders of AI GPUs in the short-to-medium term.

All this puts the supply chain of AI acceleration hardware in a tight spot. NVIDIA stock is down 12 percent in pre-market trading as of this writing. Microsoft and Meta Platforms also took a hit, shedding over 3% each. Alphabet lost 3% and Apple 1.5%. Microsoft, Meta and Apple are slated to post their quarterly earnings this week. Companies within NVIDIA's supply chain, such as ASML and TSMC, also saw drops, with ASML and ASM International losing 10-14% in European pre-market trading.

Sources: LiveMint, The Kobeissi Letter

102 Comments on Tech Stocks Brace for a DeepSeek Haircut, NVIDIA Down 12% in Pre-market Trading

nguyen

Woop, Nvidia better get back to their gaming market soon :roll:.

I'm interested in trying out DeepSeek on my computer too.

Baks

It's time to introduce sanctions. And put a couple of people in jail.

mb194dc

Here we go? You don't need all that hardware to create a good model, evidently. Huang has been selling stock for 6 months... He knew?

If we get a 2000/2 style tech correction.
Looks like the bond market knew, 10y-3m, similar to 2020.

R0H1T

Baks: And put a couple of people in jail.

Starting with the guy who has trouble spelling "China" or the one who wants to (re)name the English channel after the Continental Army's hero?

Daven

Down almost 15% including Friday.

Also when the US sanctions a country, that country has to learn how to do more with less. Looks like China succeeded in doing this.

Finally, Strix Halo is probably looking really good right about now, and SoCs aren't under sanctions, so AMD is free to sell as many SoCs to China as they want. Their close relationship with Lenovo might also pay off.

Vya Domus

Everyone can see everything is heading towards models that are cheaper to train and run, not bigger and bigger ones as the narrative has been until now. DeepSeek isn't what will burst this bubble; the unsustainability of investing more and more without enough revenue to back it up will. Reminder that OpenAI is still losing billions of dollars to this day.

RH92

Chad CHINA single-handedly saving gaming, in CCP we trust!!!

MaMoo

More efficient AI was always going to come as the next most obvious advancement. Better invest wisely and not forget the fundamental disruptiveness of technology in the first place.

Solid State Brain

The AI enthusiast community has been able to get it to run on much less complex accelerators such as the M4 SoCs of Mac minis, and gaming GPUs.

Worth pointing out that the models people are running on M4 Mac minis and gaming GPUs have very little to do with the actually capable one DeepSeek is operating on its website. That is a completely different, much larger model requiring at least 700GB of VRAM.

What the market is concerned about is that such a capable model could be trained with ~5M USD worth of compute (excluding GPU costs). That doesn't mean though that putting more compute on it won't improve the results...
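As a rough sanity check on the 700GB figure, here is a minimal sketch of how much memory the weights alone would take at different precisions. It assumes the 671 billion parameter count reported for DeepSeek-V3; the `weight_memory_gb` helper and the list of precisions are illustrative only, and KV cache and activations are ignored.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed just to hold the weights, in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


# Assumption: 671 billion parameters, as reported for DeepSeek-V3.
PARAMS = 671e9

# Illustrative precisions; which one a given deployment uses is an assumption.
for label, bytes_per_param in [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]:
    size = weight_memory_gb(PARAMS, bytes_per_param)
    print(f"{label} weights: about {size:.0f} GB")
```

At one byte per parameter the weights alone come to roughly 671 GB, so a figure of around 700GB is plausible once some working memory is added on top.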

#10

Daven

MaMoo: More efficient AI was always going to come as the next most obvious advancement. Better invest wisely and not forget the fundamental disruptiveness of technology in the first place.

Don’t worry. Huang will save Nvidia by signing his name on everything and creating more bitmojis of himself.

Edit: Damnit. He already found my post!

#11

Solidstate89

lmao

#12

Vayra86

Vya Domus: Everyone can see everything is heading towards models that are cheaper to train and run, not bigger and bigger ones as the narrative has been until now. DeepSeek isn't what will burst this bubble; the unsustainability of investing more and more without enough revenue to back it up will. Reminder that OpenAI is still losing billions of dollars to this day.

It's all just a big gamble using our wallets, our planet, our data, and our attention spans...

X/Twitter has yet to make money for a year as well... Money isn't the biggest thing to chase here. The real prize is power and information dominance.

Money is just a means not an end. There's a war going on here.
Remember that Sutskever circus about responsible AI? It fits right in. The power struggle at OpenAI, MS, etc.? Yep. This is all big tech trying to create their Cyberpunk future here. They've already openly accepted government posts just now. The dystopia is real.

#13

Chrispy_

Good. OpenAI isn't open and Meta are anti-consumer in every valid definition of the term.

The situation is so bad that I don't even care that DeepSeek is Chinese! They are the lesser evil here.

#14

csendesmark

Competition is good for the consumers.
Looking forward to seeing better models from Meta, Microsoft and DeepSeek.
And looking forward to seeing some 40GB cards for consumers in the near future from Nvidia!
When I bought my 7900XT, I didn't even think that the 20GB of VRAM would be what I'm happiest about :D

#15

mb194dc

Chrispy_: Good. OpenAI isn't open and Meta are anti-consumer in every valid definition of the term.

The situation is so bad that I don't even care that DeepSeek is Chinese! They are the lesser evil here.

Their model is open source, so anyone can copy their methodology, don't need to run their version of it from China. You just need training data which shouldn't be too difficult. Then if you can pay the few million for GPU training time... You've got a business similar to OpenAI for less than 5% of the cost... (possibly even for far less than that, we'll see...)

Which I guess is a fairly big problem for those that have already sunk billions into hardware...

#16

ThomasK

Isn't that the bottom line with everything that comes from China? They end up doing it better or just as well, charging a fraction of the price.

Just look at the car industry, BYD is eating the EV market up, and mr orange face can't do anything about it, other than imposing his most beloved word: tariffs.

#17

phanbuey

mb194dc: Their model is open source, so anyone can copy their methodology, don't need to run their version of it from China. You just need training data which shouldn't be too difficult. Then if you can pay the few million for GPU training time... You've got a business similar to OpenAI for less than 5% of the cost... (possibly even for far less than that, we'll see...)

Which I guess is a fairly big problem for those that have already sunk billions into hardware...

+1

It seems the majority of the cost is training the model. They took the model that was trained and "distilled" it, i.e. they essentially copied OpenAI's homework that they spent so much time and money crunching and then added their own improvements on top.

It's like if Louis Vuitton made a handbag and they took that design, improved it and then manufactured it for 1/20th of the cost...

But essentially this wouldn't exist if it was not for the already trained models out there, so there will be a fight about IP in the AI space coming soon. Especially all those people that sunk money into OpenAI, I imagine they're a bit salty about this.

That's what's really dangerous about this... it isn't the AI portion of it - it's the fact that a lot of very rich people sunk a lot of money into making something that just got used to leapfrog the thing they sunk money into. Historically that tends not to end too well.

#18

Solid State Brain

phanbuey: It seems the majority of the cost is training the model. They took the model that was trained and "distilled" it, i.e. they essentially copied OpenAI's homework that they spent so much time and money crunching and then added their own improvements on top.

That's not what they did for DeepSeek R1. That one was trained from scratch using an innovative method:
arxiv.org/abs/2501.12948

It's the smaller models released together with it that were "distilled" from DeepSeek R1 outputs.

#19

phanbuey

Solid State Brain: That's not what they did for DeepSeek R1. That one was trained from scratch using an innovative method:
arxiv.org/abs/2501.12948

It's the smaller models released together with it that were "distilled" from DeepSeek R1 outputs.

They have clips of it saying it's ChatGPT tho... So the accusation will be there.

#20

mb194dc

phanbuey: +1

It seems the majority of the cost is training the model. They took the model that was trained and "distilled" it, i.e. they essentially copied OpenAI's homework that they spent so much time and money crunching and then added their own improvements on top.

It's like if Louis Vuitton made a handbag and they took that design, improved it and then manufactured it for 1/20th of the cost...

But essentially this wouldn't exist if it was not for the already trained models out there, so there will be a fight about IP in the AI space coming soon. Especially all those people that sunk money into OpenAI, I imagine they're a bit salty about this.

Isn't a lot of the training data that even OpenAI used copyrighted / proprietary stuff? They've just scraped the web, Reddit and the rest, or scanned thousands of books and other sources and used that as the training data. It's not used directly in the output, which is why they've not been sued back to the stone age for it. It's just for the language linking and knowledge they want.

There are reams of sites on the internet doing eBooks and the like without any DRM (not going to link them here obv!). Nothing to stop anyone who wants to train a model just scraping (downloading) all that and inputting it. Literally tens or hundreds of thousands of books, then there's all the content out on the web you can just create a bot and try to crawl like Google et al do. They can try to block you, but if you're spoofing user agents and using consumer IP addresses combined with bot detection defeating technology, it's very hard to.

Even getting data to train a model can be done on a shoestring.

#21

BUCK NASTY

4P Enthusiust

mb194dc: Their model is open source, so anyone can copy their methodology, don't need to run their version of it from China. You just need training data which shouldn't be too difficult. Then if you can pay the few million for GPU training time... You've got a business similar to OpenAI for less than 5% of the cost... (possibly even for far less than that, we'll see...)

Which I guess is a fairly big problem for those that have already sunk billions into hardware...

Even if someone copies the "methodology", will they be able to innovate and keep it current, as well as build on it enough to compete with for-profit firms? Look at Linux being open source... They are always trailing commercial products from Microsoft and others. Profit will always drive the highest levels of innovation, although open source products have merit and are chivalrous.

Maybe I'm just butthurt because my nVidia stock is tanking.....

#22

Solid State Brain

phanbuey: They have clips of it saying it's ChatGPT tho... So the accusation will be there.

That's because the training data included 600k samples of reasoning outputs from DeepSeek-R1-Zero (the model purely trained via so-called Reinforcement Learning), plus 200k samples of non-reasoning outputs from a variety of sources, including presumably publicly available data originally obtained from GPT4 and likely not cleaned well enough. Putting that aside, GPT refusals and GPT-style text have now also contaminated the web, so it could have picked it up from there too.

#23

phanbuey

Solid State Brain: That's because the training data included 600k samples of reasoning outputs from DeepSeek-R1-Zero (the model purely trained via so-called Reinforcement Learning), plus 200k samples of non-reasoning outputs from a variety of sources, including presumably publicly available data originally obtained from GPT4 and likely not cleaned well enough. Putting that aside, GPT refusals and GPT-style text have now also contaminated the web, so it could have picked it up from there too.

Sure, but to play devil's advocate, 'we reuse portions of the SFT dataset of DeepSeek-V3' will read as "we bypassed a lot of the heavy computational expense by using output from other models".

The people that sank trillions of dollars into OpenAI and said other models, that's what they will see. Add to that a general distrust of AI announcements from Chinese firms/labs and voila. A large amount of capital is about to have/is having a hissy fit.

#24

Chomiq

mb194dc: Here we go? You don't need all that hardware to create a good model, evidently.

They clearly stated that they're using H800s for their model:

Deepseek trained its DeepSeek-V3 Mixture-of-Experts (MoE) language model with 671 billion parameters using a cluster containing 2,048 Nvidia H800 GPUs in just two months, which means 2.8 million GPU hours, according to its paper. For comparison, it took Meta 11 times more compute power (30.8 million GPU hours) to train its Llama 3 with 405 billion parameters using a cluster containing 16,384 H100 GPUs over the course of 54 days.

Maybe they don't need that much hardware, but they still need it.
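The quoted figures can be cross-checked with some simple arithmetic. This is only a sketch using the numbers from the quote above; the helper names are made up for illustration.

```python
HOURS_PER_DAY = 24


def gpu_hours(num_gpus: int, days: float) -> float:
    """Total GPU hours if every GPU in the cluster runs for the whole period."""
    return num_gpus * days * HOURS_PER_DAY


def days_for(total_gpu_hours: float, num_gpus: int) -> float:
    """Wall-clock days implied by a GPU-hour total on a cluster of a given size."""
    return total_gpu_hours / (num_gpus * HOURS_PER_DAY)


# Figures taken from the quoted passage.
deepseek_hours = 2.8e6   # DeepSeek-V3, 2,048 H800 GPUs
llama_hours = 30.8e6     # Llama 3 405B, 16,384 H100 GPUs

print(f"Compute ratio: {llama_hours / deepseek_hours:.1f}x")
print(f"DeepSeek-V3 implied run time: {days_for(deepseek_hours, 2048):.0f} days")
print(f"16,384 GPUs for 54 days: {gpu_hours(16384, 54) / 1e6:.1f}M GPU hours")
```

The 11x ratio and the roughly two-month run both follow from the quoted totals. The Llama 3 cluster size and duration multiply out to fewer GPU hours than the quoted 30.8 million, so those two numbers presumably do not describe exactly the same run.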

#25

Albatros39

Solid State Brain: Worth pointing out that the models people are running on M4 Mac minis and gaming GPUs have very little to do with the actually capable one DeepSeek is operating on its website. That is a completely different, much larger model requiring at least 700GB of VRAM.

What the market is concerned about is that such a capable model could be trained with ~5M USD worth of compute (excluding GPU costs). That doesn't mean though that putting more compute on it won't improve the results...

But some people are also running the 685B model.
It gets 6 tokens/s with one Epyc CPU on the 12-channel SP5 platform, which has a maximum memory bandwidth of 576 GB/s per CPU.
The DDR4 platform gets <2 tokens/s, but with a few RTX 3090s you can get it to a usable speed.
You still need an expensive machine, but with used parts, it's doable for an individual.
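The 576 GB/s figure is consistent with 12 channels of 64-bit DDR5 at 6000 MT/s, and it also puts a ceiling on token throughput, since every generated token has to stream the active weights from memory. A rough sketch, assuming DDR5-6000, about 37 billion active parameters per token (the figure DeepSeek reports for its MoE design), and 8-bit weights; none of these assumptions come from the post above, and the function names are hypothetical.

```python
def peak_bandwidth_gbs(channels: int, transfers_mts: float, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in decimal GB/s.

    channels      -- number of memory channels
    transfers_mts -- transfer rate in MT/s (assumed DDR5-6000 below)
    bus_bytes     -- bytes per transfer per channel (64-bit channel = 8 bytes)
    """
    return channels * transfers_mts * 1e6 * bus_bytes / 1e9


def max_tokens_per_s(bandwidth_gbs: float, active_params: float, bytes_per_param: float) -> float:
    """Upper bound on decode speed if each token reads all active weights once."""
    return bandwidth_gbs * 1e9 / (active_params * bytes_per_param)


bandwidth = peak_bandwidth_gbs(channels=12, transfers_mts=6000)
print(f"Peak bandwidth: {bandwidth:.0f} GB/s")

# Assumptions: 37e9 active parameters per token, 1 byte per parameter.
ceiling = max_tokens_per_s(bandwidth, active_params=37e9, bytes_per_param=1.0)
print(f"Bandwidth-bound ceiling: {ceiling:.1f} tokens/s")
```

Under these assumptions the reported 6 tokens/s sits below the theoretical ceiling, which is what you would expect once real-world memory efficiency and compute overhead are taken into account.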
