Saturday, August 3rd 2024
Design Issues May Postpone Launch of NVIDIA's Advanced Blackwell AI Chips
NVIDIA may face delays in releasing its newest artificial intelligence chips due to design issues, according to anonymous sources involved in chip and server hardware production cited by The Information. The delay could extend to three months or more, potentially affecting major customers such as Meta, Google, and Microsoft. An unnamed Microsoft employee and another source claim that NVIDIA has already informed Microsoft about delays affecting the most advanced models in the Blackwell AI chip series. As a result, significant shipments are not expected until the first quarter of 2025.
Source: Bloomberg
When approached for comment, an NVIDIA spokesperson did not address communications with customers regarding the delay but stated that "production is on track to ramp" later this year. The Information reports that Microsoft, Google, Amazon Web Services, and Meta declined to comment on the matter, while Taiwan Semiconductor Manufacturing Company (TSMC) did not respond to inquiries.
Update 1:
"The production issue was discovered by manufacturer TSMC, and involves the processor die that connects two Blackwell GPUs on a GB200." (via Data Center Dynamics)
NVIDIA needs to redesign its chip, which requires a new TSMC production test before mass production can begin. Rumors suggest the company is considering a single-GPU version to expedite delivery. In the meantime, the delay leaves TSMC production lines temporarily idle.
Update 2:
SemiAnalysis's Dylan Patel reports in a message on Twitter (now known as X) that Blackwell supply will be considerably lower than anticipated in Q4 2024 and H1 2025. This shortage stems from TSMC's transition from CoWoS-S to CoWoS-L technology, required for NVIDIA's advanced Blackwell chips. Currently, TSMC's AP3 packaging facility is dedicated to CoWoS-S production, while initial CoWoS-L capacity is being installed in the AP6 facility.
Additionally, NVIDIA appears to be prioritizing production of GB200 NVL72 units over NVL36 units. The GB200 NVL36 configuration features 36 GPUs in a single rack with 18 individual GB200 compute nodes. In contrast, the NVL72 design incorporates 72 GPUs, either in a single rack with 18 double GB200 compute nodes or spread across two racks, each containing 18 single nodes.
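The GPU counts quoted for the NVL36 and NVL72 configurations can be checked with a little arithmetic. The sketch below is only illustrative: it assumes each GB200 carries two Blackwell GPUs, as the Data Center Dynamics quote describes, and that a "single" node holds one GB200 while a "double" node holds two. The function and variable names are hypothetical and not taken from NVIDIA documentation.

```python
# Assumption: one GB200 pairs two Blackwell GPUs (per the quote in Update 1).
GPUS_PER_GB200 = 2


def total_gpus(nodes_per_rack: int, gb200_per_node: int, racks: int = 1) -> int:
    """Return the GPU count for a set of identical racks."""
    return racks * nodes_per_rack * gb200_per_node * GPUS_PER_GB200


# Assumption: "single" node = 1 GB200, "double" node = 2 GB200.
configs = {
    "NVL36, one rack, 18 single nodes": total_gpus(18, 1),
    "NVL72, one rack, 18 double nodes": total_gpus(18, 2),
    "NVL72, two racks, 18 single nodes each": total_gpus(18, 1, racks=2),
}

for name, count in configs.items():
    print(f"{name}: {count} GPUs")
```

Under these assumptions, all three layouts reproduce the 36 and 72 GPU totals given in the article.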
105 Comments on Design Issues May Postpone Launch of NVIDIA's Advanced Blackwell AI Chips
And of course the modern stock markets are the perfect culmination/demonstration of what some would call late-stage capitalism! The greedy shareholders at Intel, for instance, chose to reward themselves first over doing more R&D or paying their employees more, and look where they are now.
But yeah, I'll leave it at that. :)
Edit: my post was a bit more aggressive than I intended and didn't have much explanation. Inflation is just calculated as an average price increase across a basket of products. You could have inflation at 0% while groceries increased by 100%. Inflation also isn't a reason for increasing prices. Just because products A and B increased in price doesn't mean product C also needs to increase in price. Things that are very price sensitive tend not to follow the rate of inflation at all. Consoles and games, for example.
And why are the perf increases in new GPU generations so much higher than new CPU generations?
We're not hearing as much about it as, say, 2 years ago, but crypto is still a thing.
now if you had said NFT's...
And people are surprised inflation appears. Oh nooo WHO could have seen this coming?!
Let’s just blame the hardware vendors, grocery stores and so on for the price jump.
If just .0001% of the people scrutinising Nvidia prices actually scrutinised their money printing organisations with the same ardour …
www.tomshardware.com/pc-components/gpus/nvidia-bans-using-translation-layers-for-cuda-software-to-run-on-other-chips-new-restriction-apparently-targets-zluda-and-some-chinese-gpu-makers
I'm not disputing the fact that AMD made some wonderful processors with recent Instinct chips, it's just that the software isn't and won't be there. Not until an open-source solution which is more flexible and performant than OpenCL comes to exist. Some newer AI software for training and inference has been written for Radeon given that Instincts perform so wonderfully, but it remains that CUDA software is still superior, corporations are risk averse, and Nvidia's toolkit is just better.