Saturday, August 3rd 2024

Design Issues May Postpone Launch of NVIDIA's Advanced Blackwell AI Chips

Aug 3rd, 2024 03:53 Updated: Aug 3rd, 2024 07:15

NVIDIA may face delays in releasing its newest artificial intelligence chips due to design issues, according to anonymous sources involved in chip and server hardware production cited by The Information. The delay could extend to three months or more, potentially affecting major customers such as Meta, Google, and Microsoft. An unnamed Microsoft employee and another source claim that NVIDIA has already informed Microsoft about delays affecting the most advanced models in the Blackwell AI chip series. As a result, significant shipments are not expected until the first quarter of 2025.

When approached for comment, an NVIDIA spokesperson did not address communications with customers regarding the delay but stated that "production is on track to ramp" later this year. The Information reports that Microsoft, Google, Amazon Web Services, and Meta declined to comment on the matter, while Taiwan Semiconductor Manufacturing Company (TSMC) did not respond to inquiries.

Update 1:

"The production issue was discovered by manufacturer TSMC, and involves the processor die that connects two Blackwell GPUs on a GB200." (via Data Center Dynamics)

NVIDIA needs to redesign its chip, which requires a new TSMC production test before mass production can begin. Rumors say the company is considering a single-GPU version to expedite delivery. In the meantime, the delay leaves TSMC production lines temporarily idle.

Update 2:

SemiAnalysis's Dylan Patel reports in a post on X (formerly Twitter) that Blackwell supply will be considerably lower than anticipated in Q4 2024 and H1 2025. This shortage stems from TSMC's transition from CoWoS-S to CoWoS-L packaging technology, which NVIDIA's advanced Blackwell chips require. Currently, TSMC's AP3 packaging facility is dedicated to CoWoS-S production, while initial CoWoS-L capacity is being installed in the AP6 facility.

Additionally, NVIDIA appears to be prioritizing production of GB200 NVL72 units over NVL36 units. The GB200 NVL36 configuration features 36 GPUs in a single rack with 18 individual GB200 compute nodes. In contrast, the NVL72 design incorporates 72 GPUs, either in a single rack with 18 double GB200 compute nodes or spread across two racks, each containing 18 single nodes.
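The rack arithmetic above can be checked with a short sketch. It assumes two Blackwell GPUs per GB200 (as described in the Update 1 quote) and treats a "double" compute node as one holding two GB200s. The names `GPUS_PER_GB200`, `RackConfig`, and `total_gpus` are illustrative only and are not NVIDIA terminology.

```python
from dataclasses import dataclass

# Assumption: each GB200 pairs two Blackwell GPUs, per the Update 1 quote.
GPUS_PER_GB200 = 2


@dataclass(frozen=True)
class RackConfig:
    """Hypothetical description of one GB200 rack layout."""
    name: str
    racks: int
    nodes_per_rack: int
    gb200_per_node: int  # 1 = single node, 2 = "double" node (assumed meaning)

    def total_gpus(self) -> int:
        # GPUs = racks x nodes per rack x GB200s per node x GPUs per GB200
        return (self.racks * self.nodes_per_rack
                * self.gb200_per_node * GPUS_PER_GB200)


CONFIGS = [
    RackConfig("NVL36", racks=1, nodes_per_rack=18, gb200_per_node=1),
    RackConfig("NVL72 (one rack)", racks=1, nodes_per_rack=18, gb200_per_node=2),
    RackConfig("NVL72 (two racks)", racks=2, nodes_per_rack=18, gb200_per_node=1),
]

if __name__ == "__main__":
    for cfg in CONFIGS:
        print(f"{cfg.name}: {cfg.total_gpus()} GPUs")
```

Under these assumptions, both NVL72 layouts reach the same GPU count by different routes: doubling the GB200s per node in one rack, or doubling the number of racks.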

Source: Bloomberg

105 Comments on Design Issues May Postpone Launch of NVIDIA's Advanced Blackwell AI Chips

Quicks

Sure they just figured out there is such a huge demand for their old chips and no need to rush out something better. When sales start dropping then release something new and coin it all over again.

john_

Damn those coincidences lately.

So, let's go in conspiracy mode.

NVIDIA is hit with a US Department of Justice (DOJ) antitrust probe and just the next day we hear that "Design Issues May Postpone Launch of NVIDIA's Advanced Blackwell AI Chips" which of course gives some time for Nvidia's competitors to manage to get some customers who just can't keep waiting for months for the new chips?

ARF

Quicks wrote: Sure they just figured out there is such a huge demand for their old chips and no need to rush out something better. When sales start dropping then release something new and coin it all over again.

There are always seasonal fluctuations, recession, depression, etc, great things.
Just vote with your wallet and don't buy.

nguyen

ouch, let's hope it does not affect rtx5000 series

Dr. Dro

ARF wrote: There are always seasonal fluctuations, recession, depression, etc, great things. Just vote with your wallet and don't buy.

The target segment of these GPUs certainly votes with their wallet - that is why they purchase their most advanced products in bulk.

AusWolf

Let's hope that TSMC won't start charging extra for the lost production and revenue.

Hakker

nguyen wrote: ouch, let's hope it does not affect rtx5000 series

In other news RTX5090 suspected to ship mid 2025 at the earliest. -soon on TechPowerup

GoldenX

Heh, what a year.

Easy Rhino

Linux Advocate

"may face delays"

"according to anonymous sources"

:laugh:

Darkholm

GoldenX wrote: Heh, what a year.

Everything seems so rushed. Why do we have to see a new uArch every year or every 18 months? Everywhere we look, "beta" products are out...

AusWolf

Darkholm wrote: Everything seems so rushed. Why do we have to see a new uArch every year or every 18 months? Everywhere we look, "beta" products are out...

Not to mention that the cost of making said architectures is exponentially increasing, while the end-user benefits are diminishing. At least when you have a 50% faster card for a 50% higher price, it's not progress.

Darkholm

Agreed.
I read somewhere that AI profits are in decline or not growing as expected, at least for companies that built AI tools etc. The only ones who profit are HW manufacturers such as nVidia and AMD. Could that also be a balloon like "everyone is a junior developer" during 2020-2023 and then BOOM.

ARF

AusWolf wrote: Not to mention that the cost of making said architectures is exponentially increasing, while the end-user benefits are diminishing. At least when you have a 50% faster card for a 50% higher price, it's not progress.

Unsustainable development. A tricky, vapour bubble that is supposed to burst at any given moment.

AusWolf

Darkholm wrote: Agreed. I read somewhere that AI profits are in decline or not growing as expected, at least for companies that built AI tools etc. The only ones who profit are HW manufacturers such as nVidia and AMD. Could that also be a balloon like "everyone is a junior developer" during 2020-2023 and then BOOM.

AI isn't as useful as these companies would like us to think, imo. It is definitely a balloon.

R0H1T

More like a time bomb; if and when it pops, it'll cause a massive ripple for sure!

Darkholm

AusWolf wrote: AI isn't as useful as these companies would like us to think, imo. It is definitely a balloon.

All of this reminds me of those super magical diet beverages from tele-sales: "buy it now, drink it and you will lose 30 kg in 30 days doing nothing"... fine print of course "do not eat anything" :D

AusWolf

Darkholm wrote: All of this reminds me of those super magical diet beverages from tele-sales: "buy it now, drink it and you will lose 30 kg in 30 days doing nothing"... fine print of course "do not eat anything" :D

I wouldn't even go that far. ;) Staying within the tech world, it's just like crypto, which was "the future" according to believers, until one day, it suddenly wasn't anymore.

Darkholm

Yes, it's always "we must support the next best thing". Currently it's AI. Before that it was cryptomining.

Dr. Dro

Easy Rhino wrote: :laugh:

"Trust me bro. I once won 5 bucks at the lottery!"

starfals

K, i refuse to buy this obsolete DisplayPort 2.1-less generation... that doesn't even have enough power to max out the current games under 1000 dollars. So see yaa in 2025. Shame, i was ready to give them money now... cus playing laggy games ain't fun.. but oh well. Back to Witcher 3. No more modern games for me till next year. Even 4080 is not enough in some cases, even less if it's native without DLSS lol. The 4080 that was supposed to be 1000 bucks is actually 1300 still... and some high-end models even sell for 1600. God knows i ain't spending 1300-1600 so close to 2025 and the next gen stuff. Yes, i'm in Europe. Tax is one thing, but these cards cost a lot more. No comment on 4090. Even some models of 4070ti super are 1100!! The sooner this horrible generation is over the better.

dgianstefani

TPU Proofreader

AusWolf wrote: Not to mention that the cost of making said architectures is exponentially increasing, while the end-user benefits are diminishing. At least when you have a 50% faster card for a 50% higher price, it's not progress.

Inflation exists, and seems to be much higher than what governments quote "3%", if groceries are anything to go by.

Personally I don't mind a big GPU expense every four years, since the old one is still fine anyway and typically goes into a friend's PC.

I'm not sure I would say there are zero price/performance improvements either, with current gen cards. Maybe if you're the type to buy an xx60/xx70/x7xx class card every generation, but the budget trap has always been a thing.

AusWolf

dgianstefani wrote: Inflation exists, and seems to be much higher than what governments quote "3%", if groceries are anything to go by.

Not just that. I think the main reason for the stagnation on the price-to-performance front is the fact that TSMC charges an arm and a leg for their most advanced nodes, and since we need them for cutting-edge GPUs, AMD and Nvidia don't have a choice but to pay it and pass the increased cost on to the customer.

dgianstefani wrote: Personally I don't mind a big GPU expense every four years, since the old one is still fine anyway and typically goes into a friend's PC.

We don't need more anyway, considering that game graphics aren't evolving fast enough to warrant an upgrade every generation.

dgianstefani wrote: I'm not sure I would say there are zero price/performance improvements either, with current gen cards. Maybe if you're the type to buy an xx60/xx70/x7xx class card every generation, but the budget trap has always been a thing.

I'm the type who buys all sorts of unnecessary hardware just out of curiosity. Not saying it's a good idea, though (it's really not). :laugh: I rarely go above x70 level, if ever, though.

Vya Domus

Called it some time ago, they've clearly gotten into a habit of announcing stuff long before it's ready for release to keep the momentum of the AI hype.

ZoneDymo

nguyen wrote: ouch, let's hope it does not affect rtx5000 series

why? rtx4090 not doing it for you anymore?

R0H1T

AusWolf wrote: I think the main reason for the stagnation on the price-to-performance front is the fact that TSMC charges an arm and a leg for their most advanced nodes, and since we need them for cutting-edge GPUs, AMD and Nvidia don't have a choice but to pay it and pass the increased cost on to the customer.

I'd argue the bigger reason is that management & the board keep giving themselves massive bonuses for a lot of work built/done by actually hard working employees! Just saw a minor rant on reddit/Intel that the workers were paid smaller & smaller bonuses over the last few years! And yet they had money for dividends & stock buybacks? That's my experience in the corporate world as well, though I'd have far better/choice words to sink into their fat hides!

They have increased prices across the board for all their products, so they have already passed on the TSMC price hikes to customers & then some.
