Friday, July 26th 2024
Intel Will Not Recall Failing 13th and 14th Gen CPUs
It's official: Intel will not issue a recall for its failing 13th and 14th gen CPUs, despite the problem being much bigger than initially thought. The company was approached by The Verge, and the answers to the questions asked are not looking great. First of all, it appears that at least all Intel 13th and 14th gen CPUs with a base power of 65 W or higher are affected by the so-called elevated voltage issue, regardless of SKU and lettering. To be clear, this doesn't mean all of these CPUs will start to fail, and Intel claims that its microcode update will solve the issue for CPUs that haven't shown any signs of instability. However, Intel is not promising that the microcode update will fix CPUs that are already experiencing problems; instead it states that "It is possible the patch will provide some instability improvements" and asks those with stability issues to contact customer support. The patch is, on the other hand, expected to solve the issue for new CPUs, but that doesn't help those who are already experiencing stability issues.
Intel does appear to be swapping out degraded chips, but there's no guarantee that the replacement CPUs will come with the microcode update installed, as Intel is only starting to apply it to products that are currently being produced. The company has also asked all of its OEM partners to apply the update before shipping out new products, but according to Intel this isn't likely to happen until sometime in early to mid-August. It's also unclear when the motherboard manufacturers will make BIOS/UEFI updates available to end users, which is the only way for a consumer to install the microcode update. Intel has not gone on record to say whether it'll extend the warranty of the affected products, nor did the company provide any details about what information consumers have to give customer support to be able to RMA a faulty CPU. Intel will not halt sales of the affected CPUs either, which means that if you're planning to build, or are in the middle of building, a system using said CPUs, you might want to hold off on using it until a BIOS/UEFI with the microcode update is available for your motherboard. There are more details over at The Verge for those who want to read the full questions and answers, but it's clear that Intel isn't treating the issue as anything more than a regular support issue at this point in time.
Source:
The Verge
270 Comments on Intel Will Not Recall Failing 13th and 14th Gen CPUs
Yes it will cost them a fortune, but it's all their fault. They insist on pretending a new CPU needs a new MB chipset, that's the same as the last.
The very interesting thing is that it's also affecting Laptops, and OEM systems. This is going to be very expensive.
They probably thought they could get their wafer cost down by getting more out of the lower bins, pumping absurd voltage through them to hit clocks. There are informal reports that the replaced chips are coming down from 1.45-1.5 V vcore to the mid 1.2s when idling in the BIOS, probably from lower VID tables on the proper bins all around.
And what does the contact frame have to do with this? A contact frame improves mounting pressure and temps, but this issue does not seem temperature related based on Wendell's data, as none of the server CPUs exceeded 83 °C. If ESD occurred it would kill the motherboard first, not the CPU.
We have a bunch of 13th gen CPUs deployed in our fleet - roughly 200 laptops on 13th gen i7s that have been deployed for a while now, plus the 2 14700K dev boxes, and my own custom builds for friends - and none of them are crashing (so far). Which is why I think a lot of this issue is being used to drive clicks - it's an overbinning (greed) problem at Intel for sure, but it's obvious that it's being sensationalized. Same thing happened with AMD 7900XTX coolers, Nvidia power connectors, 7800X3Ds exploding... it's "THE END OF THE WORLD" and then everyone forgets about it and moves on, and in the end the issue turns out to be a fraction of the apocalypse that it was made out to be.
A lot of people said/implied the fail rate is 100% or close to it, in this news there are multiple people claiming things of the kind, some youtubers and "devs" have also claimed similar things. The issue is real, but how widespread the degradation is, continues to be unknown for the moment.
And not just one studio or developer is claiming this. Multiple are claiming this.
Also the return rate for 13th and 14th gen remains 3-4x higher than other models. I very much doubt all these people returning them are doing so because of the news.
The issues you brought up are all very different. One is a stock HSF defect on one model that was solved quickly. The other is a power connector issue that is going on even today but still mainly affects the 4090. The third was an issue, again with one CPU model, that affected mostly ASUS boards and was fixed quickly by AMD with clear communication to affected users. All of these issues mainly affected a single model. Intel's problem, whatever it turns out to be, affects multiple series of models. Not just some balls to the wall enthusiast models.
I agree that 100% failure rate seems very unlikely. Based on current reporting it seems to be 25-50% in some instances.
The microcode patch is going to be revealing, IMO. If we see some drastic changes in the voltage and boost behavior, then I think that's a telling sign that these chips were indeed pushed way too hard. If they really need to hobble TVB, it's literally damage control, as they needed to do something drastic to reduce the failure rates. Their first step is trying to cut down on replacements and warranty service. The BYO crowd isn't even the biggest problem--they need to worry about upset OEM partners that are dealing with a lot of warranty claims. The US isn't the issue, but in many other countries, warranties have a 2-3 year minimum. And now, Intel can lose OEM volume to not just AMD, but now Qualcomm.
The developer that's claiming a 50% failure rate is using a farm that put 14900ks and 13900ks in blades that were designed for Alder Lake Xeons and had to be flashed with a custom bios to support RPL-S, the desktop variant. And when those blades started chewing through 13900ks and 14900ks - they put in more 13900ks and 14900ks to replace them. If a system is consistently having problems with a certain model of processor, and you keep putting said processor in that system, then you can reach a 50% failure rate fairly easily.
For sure there is a massive problem - but I would bet based on past issues that it's unlikely to be even in the double digits. A 3-4x return rate on Intel processors, when return rates are 1-2% for processors as a whole, means there is a 3-8% return rate, which is MASSIVE -- but nowhere near the 25% - 100% being pushed. The overbinned i9s (14900Ks, 13900Ks) are probably the most affected.
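The back-of-the-envelope math in this comment can be spelled out as a quick sketch. The 1-2% baseline and the 3-4x multiplier are the commenter's own figures, not confirmed data, and the function name is made up purely for illustration.

```python
def implied_return_rate(baseline_pct, multiplier):
    """Scale a (low, high) baseline return rate by a (low, high) multiplier.

    Both arguments are hypothetical ranges taken from the comment above,
    not measured values.
    """
    low = baseline_pct[0] * multiplier[0]
    high = baseline_pct[1] * multiplier[1]
    return low, high


# Commenter's assumptions: 1-2% baseline CPU return rate, 3-4x elevated.
low, high = implied_return_rate((1.0, 2.0), (3.0, 4.0))
print(f"Implied return rate: {low:.0f}% to {high:.0f}%")
```

With those inputs the range works out to 3% to 8%, which matches the figure quoted in the comment; a different baseline would shift the result proportionally.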
Return rates for CPUs are probably really low normally, so even a 4x increase might not be that much in absolute numbers. Still a major issue, but without knowing the number it's being compared against, it's hard to understand how big the issue is.
On 12th gen and AMD CPUs return rates are around 1.5 to 3. It's one of those "the boy who cried wolf" situations. The massive Intel hatred leads a lot of content creators to over-dramatize the situation for the clicks. If there was less Intel hatred in general I might take these reports more seriously, but as is, I just don't believe the numbers are anywhere near those 50 and 100% claims.
All the hysteria up until now about any other intel issue has been a big nothing burger (needs W11 to work, ecores cause megastutters in games, background task performance drops like a rock and what have you). It's really hard to trust the techtubers on this one because up until now they haven't earned my trust. Intel has. Maybe less clickbaiting and more facts would go a long way towards actually protecting me and the rest of the consumers. Cause if this is an actual big issue I'd really like to know and avoid intel but alas - it's about the clicks.
25% would be good enough, money and reputation wise, but 50% i doubt it can be called good.
IMO not offering a recall isn't good enough, even if the failure rate is only around 7%. Who knows how damaged these CPUs are and how much their lifespan is affected, even with the supposed microcode fix? I definitely wouldn't buy a used 13th or 14th gen chip.
Whether Intel makes lots of small bins that go into a single SKU, or one large bin with a lot of variance ranging from lower quality samples to golden samples, I don't know.
But they do most certainly have this data for every single chip, and they would also know if they changed the binning at some time and it had unintentional effects on RMAs. So if it were the case that the lower quality xx% of i9-14900Ks were overrepresented in RMAs, but the rest were normal return rates, they would be able to identify this problem very quickly and recall those serial numbers. Similarly, if there were certain batches or production lines which had extremely high failure rates, they could have identified this a long time ago. Remember, these chips were run through qualification two years ago, but the widespread problems only showed up fairly recently (this year?).
I don't think the problem is really that simple. While silicon quality probably plays a role in when/if the symptoms arrive, their binning is likely not the main cause, but rather the aggressive application of voltage combined with all the various design considerations. Most have long forgotten that the great Sandy Bridge was a horrible disaster in the beginning. Bad chips, defective chipsets and even many motherboards with major issues, and it kept on going for like six months or so. So many swore not to buy Intel ever again…
When they eventually got the issues sorted out, the platform turned out pretty great. Considering how low RMA rates for CPUs normally are compared to other PC parts, it wouldn't take much to "dominate" those statistics. If the return rates were like 20% or 50% like some have suggested, the big systems integrators would have stopped selling these a long time ago, that overhead would simply just kill their margins.
The contact frame is just a "snake-oil" product and only useful for extreme overclockers. Its usefulness was debunked by Gamers Nexus.
The contact frame can introduce instability itself if not ideally mounted, which is difficult to do in home conditions.
AFAIK, ESD can kill any electronic device, doesn't matter if it's a motherboard, CPU, RAM, or GPU. The voltage is high enough to do damage.
There is also the huge elephant in the room that by the time these launch the new X3D chips should be in the channel. I guess it makes sense though if you want to have stability or your CPU has degraded by the time these launch so that the expense is not as great. This makes them not recalling these chips make more sense in a world where the biggest tech tubers are now hated by Intel fans for exposing the truth.