Wednesday, January 29th 2025
Reports Suggest DeepSeek Running Inference on Huawei Ascend 910C AI GPUs
Huawei's Ascend 910C AI chip was positioned as one of the better Chinese-developed alternatives to NVIDIA's H100 accelerator—reports from last autumn suggested that samples were being sent to highly important customers. The likes of Alibaba, Baidu, and Tencent have long relied on Team Green enterprise hardware for all manner of AI crunching, but trade sanctions have severely limited the supply and potency of Western-developed AI chips. NVIDIA's region-specific B20 "Blackwell" accelerator is due for release this year, but industry watchdogs reckon that the Ascend 910C AI GPU is a strong rival. The latest online rumblings have pointed to another major Huawei customer—DeepSeek—having Ascend silicon in their back pockets.
DeepSeek's recent unveiling of its R1 open-source large language model has disrupted international AI markets. A lot of press attention has focused on DeepSeek's CEO stating that his team can access up to 50,000 NVIDIA H100 GPUs, but many have not looked into the company's (alleged) pool of natively-made chips. Yesterday, Alexander Doria—an LLM enthusiast—shared an interesting insight: "I feel this should be a much bigger story—DeepSeek has trained on NVIDIA H800, but is running inference on the new home Chinese chips made by Huawei, the 910C." Experts believe that there will be a plentiful supply of Ascend 910C GPUs—estimates from last September posit that 70,000 chips (worth around $2 billion) were in the mass production pipeline. Additionally, industry whispers suggest that Huawei is already working on a—presumably, even more powerful—successor.
Sources:
Alexander Doria Tweet, Wccftech
10 Comments on Reports Suggest DeepSeek Running Inference on Huawei Ascend 910C AI GPUs
Also keep in mind that OP is talking about inference, not training.
Had to check what inference and training mean in regards to AI. I thought they were the same, or maybe English as a 2nd language plays a role :)
Training is more intensive but you perform inference a lot more times.
The hardware functions required for neural networks are basic mathematical operations on huge matrices and vectors. This is neither rocket science nor secret sauce.
The performance difference between a 3/4 nm chip and a 7 nm one (which, as I understand it, the Chinese are able to produce) does not seem that important.
And really, trying to have smaller nodes and large numbers of GPUs is just brute-forcing the problem.
DeepSeek just demonstrated that cleverness and algorithmic optimizations provide much more benefit than brute force.
That's why I think the use of NVIDIA chips has more to do with the associated software stack (CUDA) than with the hardware, as it makes it easier to add, remove, and change things while developing the model. Inference, once you have the model with its layers and weights, is straightforward and does not need such a flexible development environment.
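The point above, that inference boils down to basic matrix and vector operations, can be sketched in a few lines of plain Python. The weights and sizes below are made-up illustrative numbers, not from any real model; a real LLM just does the same thing at vastly larger scale.

```python
# Minimal sketch: at inference time, a neural-network layer is just a
# matrix-vector multiply, a bias add, and an elementwise activation.
# All numbers here are invented for illustration.

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(v):
    """Elementwise activation: max(0, a)."""
    return [max(0.0, a) for a in v]

def layer(W, b, x):
    """One dense layer: relu(W @ x + b)."""
    return relu([s + bi for s, bi in zip(matvec(W, x), b)])

# A tiny two-layer network: 3 inputs -> 2 hidden units -> 1 output.
W1 = [[0.5, -0.2, 0.1],
      [0.3,  0.8, -0.5]]
b1 = [0.1, -0.1]
W2 = [[1.0, -1.0]]
b2 = [0.0]

x = [1.0, 2.0, 3.0]
h = layer(W1, b1, x)   # hidden activations
y = layer(W2, b2, h)   # network output
```

Training needs much more than this (gradients, backpropagation, optimizer state), which is where a flexible stack like CUDA earns its keep; the forward pass alone is this simple.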
The market reaction to this is dumb and extremely short-sighted.
The problem with the bubble remains: how are they going to make money with it? It's not going to be 60 queries a month for $20.
Hence all the media chatter trying to smear it or fearmonger over it.
Also, you can always find better optimizations, and that is not something the US can somehow restrict China from developing.
And that it is China that did it while the US was stuck on brute force is quite a problem, I think.