Saturday, August 3rd 2024

Design Issues May Postpone Launch of NVIDIA's Advanced Blackwell AI Chips

NVIDIA may face delays in releasing its newest artificial intelligence chips due to design issues, according to anonymous sources involved in chip and server hardware production cited by The Information. The delay could extend to three months or more, potentially affecting major customers such as Meta, Google, and Microsoft. An unnamed Microsoft employee and another source claim that NVIDIA has already informed Microsoft about delays affecting the most advanced models in the Blackwell AI chip series. As a result, significant shipments are not expected until the first quarter of 2025.

When approached for comment, an NVIDIA spokesperson did not address communications with customers regarding the delay but stated that "production is on track to ramp" later this year. The Information reports that Microsoft, Google, Amazon Web Services, and Meta declined to comment on the matter, while Taiwan Semiconductor Manufacturing Company (TSMC) did not respond to inquiries.

Update 1:
"The production issue was discovered by manufacturer TSMC, and involves the processor die that connects two Blackwell GPUs on a GB200." — via Data Center Dynamics
NVIDIA must now redesign the chip and pass a new production test at TSMC before mass production can begin. Rumors suggest the company is also considering a single-GPU version to expedite delivery. In the meantime, the delay leaves some TSMC production lines temporarily idle.

Update 2:
SemiAnalysis's Dylan Patel reports in a message on Twitter (now known as X) that Blackwell supply will be considerably lower than anticipated in Q4 2024 and H1 2025. This shortage stems from TSMC's transition from CoWoS-S to CoWoS-L technology, required for NVIDIA's advanced Blackwell chips. Currently, TSMC's AP3 packaging facility is dedicated to CoWoS-S production, while initial CoWoS-L capacity is being installed in the AP6 facility.
Additionally, NVIDIA appears to be prioritizing production of GB200 NVL72 units over NVL36 units. The GB200 NVL36 configuration features 36 GPUs in a single rack with 18 individual GB200 compute nodes. In contrast, the NVL72 design incorporates 72 GPUs, either in a single rack with 18 double GB200 compute nodes or spread across two racks, each containing 18 single nodes.
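As a quick sanity check of how those configurations add up, here is a minimal sketch using only the node and GPU counts stated above (i.e., assuming a single GB200 compute node carries two GPUs and a double node four):

```python
# Verify the rack math quoted above; counts come from the update, not NVIDIA specs.
def rack_gpus(nodes_per_rack: int, gpus_per_node: int, racks: int = 1) -> int:
    return nodes_per_rack * gpus_per_node * racks

assert rack_gpus(18, 2) == 36           # NVL36: one rack of 18 single nodes
assert rack_gpus(18, 4) == 72           # NVL72: one rack of 18 double nodes
assert rack_gpus(18, 2, racks=2) == 72  # NVL72: two racks of 18 single nodes
```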
Source: Bloomberg

105 Comments on Design Issues May Postpone Launch of NVIDIA's Advanced Blackwell AI Chips

#26
AusWolf
R0H1T: I'd argue the bigger reason is that management & the board keep giving themselves massive bonuses for work actually done by hard-working employees! Just saw a minor rant on reddit/Intel that the workers were paid less & got fewer bonuses over the last few years! And yet they had money for dividends & stock buybacks? That's my experience in the corporate world as well, though I'd have far better/choice words to sink into their fat hides!

They have increased prices across the board for all their products, so they have already passed on the TSMC price hikes to customers & then some.
That's today's corporate world in general. That's why you're on minimum wage regardless of what you do here in the UK (unless you're a lawyer, doctor, or CEO). Absolutely disgusting. But it's kind of off-topic.
Posted on Reply
#27
R0H1T
This is why in general I'm against this West vs. East or US vs. China narrative, because when you get down to the bottom, it's really all about money, & your own coworkers at the top are stabbing you in the back! It isn't China that set up factories 40-50 years back with slave-labor-like conditions or salaries to take jobs away from the West ~ it was your own greedy corporations :nutkick:

And of course the modern stock markets are the perfect culmination/demonstration of what some would call late-stage capitalism! The greedy shareholders at Intel, for instance, chose to reward themselves first over doing more R&D or paying their employees more, & look where they are now?
Posted on Reply
#28
AusWolf
R0H1T: This is why in general I'm against this West vs. East or US vs. China narrative, because when you get down to the bottom, it's really all about money, & your own coworkers at the top are stabbing you in the back! It isn't China that set up factories 40-50 years back with slave-labor-like conditions or salaries to take jobs away from the West ~ it was your own greedy corporations :nutkick:

And of course the modern stock markets are the perfect culmination/demonstration of what some would call late-stage capitalism! The greedy shareholders at Intel, for instance, chose to reward themselves first over doing more R&D or paying their employees more, & look where they are now?
I'm also against the East vs. West narrative, because you and I get f*ed either way, wherever we are. Greedy corporations and corrupt governments are everywhere. The problem is the system, not its players.

But yeah, I'll leave it at that. :)
Posted on Reply
#29
danc
This is good for the 5090! Nvidia has GB200 dies sitting idle, give them to us!
Posted on Reply
#30
Dragokar
nguyen: ouch, let's hope it does not affect the RTX 5000 series
Well, we are only a side effect nowadays...
Posted on Reply
#31
stimpy88
Everything nGreedia does these days is pure BS. For all we know this could be true, but it's unlikely; or it's punishment for the complaint/investigation.
Posted on Reply
#32
kapone32
AusWolf: I'm also against the East vs. West narrative, because you and I get f*ed either way, wherever we are. Greedy corporations and corrupt governments are everywhere. The problem is the system, not its players.

But yeah, I'll leave it at that. :)
There was only one government that triggered that.
Posted on Reply
#33
Konomi
starfals: K, I refuse to buy this obsolete, DisplayPort 2.1-less generation... that doesn't even have enough power to max out current games under 1000 dollars. So see ya in 2025. Shame, I was ready to give them money now... cus playing laggy games ain't fun... but oh well. Back to Witcher 3. No more modern games for me till next year. Even the 4080 is not enough in some cases, even less so if it's native without DLSS lol. The 4080 that was supposed to be 1000 bucks is actually still 1300... and some high-end models even sell for 1600. God knows I ain't spending 1300-1600 so close to 2025 and the next-gen stuff. Yes, I'm in Europe. Tax is one thing, but these cards cost a lot more. No comment on the 4090. Even some models of the 4070 Ti Super are 1100!! The sooner this horrible generation is over, the better.
To be fair, the average mid-range GPU can play most games just fine if you adjust your expectations and game settings. A lot of issues can be solved by not running the game at 4K Ultra or whatever absurd settings one may desire. You just don't need to, and you're contributing to the problem. You can play perfectly fine now and revisit games in the future; it might be difficult to believe, but this is an option. If enough people lower their expectations (and game settings), GPU pricing will have to change due to lower demand.
Posted on Reply
#34
Klemc
They can postpone as much as they want, idc, I'll keep my 4070 Ti forever anyway.
Posted on Reply
#35
evernessince
ZoneDymo: Why? RTX 4090 not doing it for you anymore?
Not the person you're replying to, but honestly, no. The next-gen text-to-image model Flux just dropped and it uses 24 GB of VRAM; the 4090 feels pretty sluggish on it. 24 GB isn't enough for complicated SDXL workloads either, and spillover to RAM/disk causes a huge performance slowdown.
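For anyone curious how close they are to spilling over, here's a rough sketch (assuming a CUDA build of PyTorch; the 24 GB threshold is just the figure mentioned above):

```python
# Report free vs. total VRAM before loading a large model.
import torch

free, total = torch.cuda.mem_get_info()  # bytes, for the current device
print(f"VRAM free: {free / 2**30:.1f} GiB of {total / 2**30:.1f} GiB")
if free < 24 * 2**30:
    print("A ~24 GB model will likely spill into system RAM and crawl.")
```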
nguyen: ouch, let's hope it does not affect the RTX 5000 series
According to the article, this only impacts their dual-chip flagship part, which was never going to get a consumer release anyway.
Posted on Reply
#36
Sound_Card
The upcoming Instinct MI325X is poised to grab a nice chunk of market share from Nvidia. Probably going to further boost AMD stock.
Posted on Reply
#37
scottslayer
I was waiting for "according to MLID" to show up somewhere in the article.
Posted on Reply
#38
napata
dgianstefani: Inflation exists, and seems to be much higher than the "3%" governments quote, if groceries are anything to go by.

Personally, I don't mind a big GPU expense every four years, since the old one is still fine anyway and typically goes into a friend's PC.

I'm not sure I would say there are zero price/performance improvements either with current-gen cards. Maybe if you're the type to buy an xx60/xx70/x7xx-class card every generation, but the budget trap has always been a thing.
What do you think inflation is, or how it's calculated? The fact that you use groceries to dispute 3% inflation makes it seem like you don't really understand it.

Edit: that post was a bit more aggressive than I intended, without much explanation. Inflation is just calculated as an average price increase across a basket of products. You could have inflation at 0% while groceries increased by 100%. Inflation also isn't a reason for increasing prices: just because products A & B increased in price doesn't mean product C also needs to increase. Things that are very price-sensitive tend not to follow the rate of inflation at all. Consoles & games, for example.
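A toy illustration of why the headline number and your grocery bill can diverge (the weights and price changes below are made up for the example):

```python
# Headline inflation is a weighted average across a basket of goods,
# so one category can far outrun the headline rate.
# All weights and price changes here are invented for illustration.
basket = {
    "groceries": {"weight": 0.15, "change": 0.12},  # +12%
    "housing":   {"weight": 0.35, "change": 0.02},  # +2%
    "other":     {"weight": 0.50, "change": 0.01},  # +1%
}

inflation = sum(g["weight"] * g["change"] for g in basket.values())
print(f"Headline inflation: {inflation:.1%}")  # 3.0%, despite groceries at +12%
```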
Posted on Reply
#39
Upgrayedd
Why do GPUs come out every two years while CPUs come out every year?
And why are the performance increases in new GPU generations so much higher than in new CPU generations?
Posted on Reply
#40
64K
napata: What do you think inflation is, or how it's calculated? The fact that you use groceries to dispute 3% inflation makes it seem like you don't really understand it.

Edit: that post was a bit more aggressive than I intended, without much explanation. Inflation is just calculated as an average price increase across a basket of products. You could have inflation at 0% while groceries increased by 100%. Inflation also isn't a reason for increasing prices: just because products A & B increased in price doesn't mean product C also needs to increase. Things that are very price-sensitive tend not to follow the rate of inflation at all. Consoles & games, for example.
Food and energy aren't factored into the government's core inflation statistics.
Posted on Reply
#41
kondamin
AusWolf: I wouldn't even go that far. ;) Staying within the tech world, it's just like crypto, which was "the future" according to believers, until one day, it suddenly wasn't anymore.
Bitcoin is still at $60K, and I can see it reaching $70-80K in September as the Fed lowers interest rates. We're not hearing as much about it as, say, two years ago, but crypto is still a thing.

Now if you had said NFTs...
Posted on Reply
#42
thesmokingman
The powers that be are really timing this to depress NV.
Posted on Reply
#43
Qwerty101
dgianstefani: Inflation exists, and seems to be much higher than the "3%" governments quote, if groceries are anything to go by.
The powers that be "printed" literally trillions of dollars/euros, massively expanding the money supply.

And people are surprised inflation appears. Oh nooo, WHO could have seen this coming?!

Let's just blame the hardware vendors, grocery stores and so on for the price jumps.

If just 0.0001% of the people scrutinising Nvidia prices actually scrutinised their money-printing organisations with the same ardour...
Posted on Reply
#44
mb194dc
Any decent, cost-effective use case for these chips, or the technology generally, yet? It'll be interesting to see how long the hardware sales keep up given that.
Posted on Reply
#45
Totally
AusWolf: I wouldn't even go that far. ;) Staying within the tech world, it's just like crypto, which was "the future" according to believers, until one day, it suddenly wasn't anymore.
Nah, they just became too poor to afford an internet connection to tell everyone that the crypto Ponzi scheme is the future.
Posted on Reply
#46
Darmok N Jalad
Darkholm: Everything seems so rushed. Why do we have to see a new uArch every year or every 18 months? Everywhere we look, "beta" products are out...
You gotta do that so you can say there's some unfixable flaw in a perfectly good previous generation.
Posted on Reply
#47
Dr. Dro
Sound_Card: The upcoming Instinct MI325X is poised to grab a nice chunk of market share from Nvidia. Probably going to further boost AMD stock.
It will never happen. Nvidia is king in the datacenter, and unlike Xeon vs. Epyc, there is a software side at play.
Posted on Reply
#48
Sound_Card
Dr. Dro: It will never happen. Nvidia is king in the datacenter, and unlike Xeon vs. Epyc, there is a software side at play.
You do realize that the MI300 is AMD's fastest-ramping product in its entire history? Their data center GPUs sold so well in Q2 that they raised their market forecast by 500 million dollars. People are buying them and will continue to buy them because they perform really well. Nvidia failing to meet demand means more potential market share for AMD.
Posted on Reply
#49
Darmok N Jalad
Sound_Card: You do realize that the MI300 is AMD's fastest-ramping product in its entire history? Their data center GPUs sold so well in Q2 that they raised their market forecast by 500 million dollars. People are buying them and will continue to buy them because they perform really well. Nvidia failing to meet demand means more potential market share for AMD.
The MI300 is impressive, and the MI300X looks even better. A beast of a setup and really puts the whole AMD portfolio on one package.
Posted on Reply
#50
Dr. Dro
Sound_Card: You do realize that the MI300 is AMD's fastest-ramping product in its entire history? Their data center GPUs sold so well in Q2 that they raised their market forecast by 500 million dollars. People are buying them and will continue to buy them because they perform really well. Nvidia failing to meet demand means more potential market share for AMD.
It is, and it is still not enough to dethrone or even dent Nvidia's business. It's huge. ZLUDA (a CUDA translation layer for ROCm/HIP) has no future ahead of it due to CUDA licensing terms blocking it:

www.tomshardware.com/pc-components/gpus/nvidia-bans-using-translation-layers-for-cuda-software-to-run-on-other-chips-new-restriction-apparently-targets-zluda-and-some-chinese-gpu-makers

I'm not disputing the fact that AMD has made some wonderful processors with the recent Instinct chips; it's just that the software isn't and won't be there, not until an open-source solution that is more flexible and performant than OpenCL emerges. Some newer AI software for training and inference has been written for Radeon given that the Instincts perform so wonderfully, but the fact remains that CUDA software is still superior, corporations are risk-averse, and Nvidia's toolkit is just better.
Posted on Reply