Friday, July 26th 2024
Intel Will Not Recall Failing 13th and 14th Gen CPUs
It's official: Intel will not issue a recall for its failing 13th and 14th gen CPUs, despite the problem being much bigger than initially thought. The Verge approached the company, and the answers to its questions do not look great. First of all, it appears that at least all Intel 13th and 14th gen CPUs with a base power of 65 W or higher are affected by the so-called elevated voltage issue, regardless of SKU and lettering. To be clear, this doesn't mean all of these CPUs will start to fail, and Intel claims that its microcode update will solve the issue for CPUs that haven't shown any signs of instability. However, Intel is not promising that the microcode update will fix CPUs that are already experiencing problems; it only states that "It is possible the patch will provide some instability improvements", and it's asking those with stability issues to contact customer support. The patch is expected to prevent the issue on new CPUs, but that doesn't help those that are already experiencing stability issues.
Intel does appear to be swapping out degraded chips, but there's no guarantee that the replacement CPUs will come with the microcode update installed, as Intel is only starting to apply it to products that are currently being produced. The company has also asked all of its OEM partners to apply the update before shipping out new products, but according to Intel this isn't likely to happen until early to mid-August. It's also unclear when motherboard manufacturers will make BIOS/UEFI updates available to end users, which is the only way for a consumer to install the microcode update. Intel has not gone on record to say whether it will extend the warranty of the affected products, nor did the company provide any details about what information consumers have to give customer support in order to RMA a faulty CPU. Intel will not halt sales of the affected CPUs either, which means that if you're planning to build, or are in the middle of building, a system using one of these CPUs, you might want to hold off on using it until a BIOS/UEFI with the microcode update is available for your motherboard. There are more details over at The Verge for those who want to read the full questions and answers, but it's clear that Intel doesn't consider the issue to be anything more than a regular support issue at this point in time.
Source:
The Verge
270 Comments on Intel Will Not Recall Failing 13th and 14th Gen CPUs
www.intel.com/content/www/us/en/products/sku/236791/intel-core-i9-processor-14900t-36m-cache-up-to-5-50-ghz/specifications.html
Imo they should have published the timeframe and SNs that were affected by the oxidation and also made a statement on how to RMA all the affected CPUs. When the microcode update hits, they should also offer an easy way to RMA the CPUs that degraded.
Also, there are big rumors about the Ring Bus voltage being the problem... Overall I think they handled this very poorly.
See
eclypsium.com/blog/demystifying-cpu-microcode-vulnerabilities-updates-and-remediation/
en.wikipedia.org/wiki/Intel_Management_Engine
en.wikipedia.org/wiki/AMD_Platform_Security_Processor
You can even go with a 12 series or some aftermarket CPU just to boot the PC. Not sure if you need the power of a 14900HF for work reasons.
Sure, it shouldn't be this way, but if you have a job, you have income. Simple as that. I'm pretty sure that, like with every greedy corp, this is a result of cutting costs and outsourcing everything. I wonder how many of those having instability issues overclocked their CPUs beyond the already insanely high factory OC (Turbo 2.0, TVB, eTVB, Turbo 3.0), and how many of them used a third-party contact frame to mount their CPU. Because, you know, der8auer says it's good. :D
Probably even without ESD equipment. ;)
1. They knew about (or found out about) the problem in 13th gen, and it was not limited to high-tier, high-power CPUs either. Intel knew the "voltage" issue was most likely an outcome of the manufacturing problem.
2. They didn't disclose the particular dates and batch numbers of the failed silicon, which would have let them isolate the issue and recall the defective dies from the market/stores (separating the wheat from the chaff before bigger damage was done).
3. They still pushed aggressive advertising of these oxidized/failed CPUs, "snake oil" included.
4. They've been selling the entire 13th gen without saying a thing about the real (not even potential) chemical/physical disaster. Even worse, they deliberately refused to RMA/recall the broken CPUs while the fab issue was ongoing, despite having the internal database and proof that these client products were indeed broken and should be replaced ASAP.
5. Knowing all this, they derived the 14th gen from the 13th gen, advertising it as a "new" architecture, while it is just the same Raptor Lake with the same troubles.
6. They tried to limit the "accident" and downplay it as only a 13th gen "voltage" issue, claiming that the manufacturing issue was "quickly" rectified in "2023" and had only "limited impact" on "very few" products. At the same time the fabs were churning out defective chips, now for "two" generations. If it was "already" fixed on "early samples" of 13th gen, how the issue slipped into the 14th gen is unknown.
Intel has put itself in a critical financial and reputational position, and tried to diminish the issue and hide its scale, and thus reduce its expenses while maintaining the same money flow and its "top chip" manufacturer status. There's no way to tolerate this problem and this slap in the face, unless someone is lucky and has had no issues yet (the CPU might still be prone to self-destruction, nevertheless), or is a deliberate Intel fanboy, or is on Intel's staff. The issue will hunt down Intel 13th/14th gen users regardless of their stance and affection.
The only proper solution Intel can (should) come up with is to publicly acknowledge the problem and begin the "dumb" replacement of every CPU that is claimed for an RMA. Just cover them all. They simply were obligated to answer every RMA and replace every single CPU in question, even for scared, non-savvy customers, just to save their reputation, or at least to give the appearance of properly resolving the situation (since the scale is unknown to buyers). A satisfied customer whose CPU is quickly replaced without any hassle is the least likely to go and complain about the issue on forums and social networks, as they would already be busy using the product.
But Intel being Intel, they're so greedy that they decided to reduce the problem to just voltage and fix it with a software "bandage". They wanted all the money for themselves, and deliberately advertised and sold the defective dies. And it is unknown whether the entire 13th/14th generation Raptor Lake is flawed and will eventually degrade one way or another, regardless of voltage and clocks. Then it is just a matter of time, not "if". As was already mentioned, the scale of the problem might be so huge that it covers all RPL chips, which would force Intel to spend a huge amount of money just to swap one broken product for another that is going to break after some unknown period of time.
The huge problem is that it is unknown which revision, stepping or batch is bad and which is "okay". Manufacturing defects can happen to anyone at any time, especially with silicon chips. The problem is that the handling of the situation is unacceptable. Instead of losing just part of the money by replacing the CPUs that were already claimed for RMA, they've lost their "ever-reliable" image, and what's worse, lost it with the "big" clients (which might bolster AMD's Epyc and Ryzen positions, despite how clumsy and stubborn enterprise/business customers are about changing their vendor-oriented infrastructure). If the issue stays open a bit longer, it will run Intel's reputation completely into the ground. There's no way back.
The worst part of all this is that it could have been isolated and rectified very quickly and silently by recalling the defective batches, with nobody knowing about it except the supply channels and those connected to them. Now, however, it casts a shadow over any future Intel product (even if it turns out fine). There's no guarantee that the reviewed samples ("golden"?) will not be affected and degrade in the future, or that the "media outlets" will not be forced to redo all their exhaustive testing after countless microcode and UEFI updates, and still be able to rely on those results afterwards, since Intel did not disclose whether this was a batch issue or a problem with the entire fab process. Considering the problem covers the full range of CPUs, from locked to OC-oriented, from the i5 13600 non-K to the 14900KF, there's no confidence that future Core Ultra products will not be the same. The damage has been done. It doesn't matter whether one is a fanboy or a lucky silicon lottery winner, the confidence is lost.
And this is the horrible thing, since it opens the floodgates for all other chipmakers, AMD included. It would be in bad taste, but they might gladly use it to further increase their ever-rising profit margins. And there's no guarantee that AMD won't start doing the same. The only thing that limits the issue to Intel is that they use their own closed foundries for their own products, whereas AMD uses TSMC; if TSMC were the problem, the scale would be much bigger and affect far more clients and products. But let's see what AMD is going to make with Samsung.
For modern Intel platforms, every CPU ships with base microcode programmed in the factory, and that is not modifiable by the end user. Microcode can only be patched after the processor is running, and it reverts to the base microcode version after a reboot (and after sleep on some platforms [1]). This is probably done to prevent hard bricking of CPUs from botched microcode updates, and to ease CPU compatibility with different BIOS versions, for example when swapping CPUs between motherboards (assuming base support is present in both BIOSes).
Microcode updates can be applied at five levels - FIT, early BIOS, late BIOS, early OS and late OS [2]. Not every microcode update can be done by an operating system. That's why some security updates required a full BIOS update for the embedded microcode patches. Intel Management Engine is not really involved with this process as far as I can tell.
AMD does it the same way, with read-only microcode embedded in the CPU and loaded patches [3][4] that reset to base at every reboot. The PSP's involvement is hard to judge from the AMD documentation I looked at.
[1] - www.kernel.org/doc/html/next/x86/microcode.html: "The loader also saves the matching microcode for the CPU in memory. Thus, the cached microcode patch is applied when CPUs resume from a sleep state."
[2] - www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/best-practices/microcode-update-guidance.html
[3] - www.amd.com/content/dam/amd/en/documents/processor-tech-docs/specifications/44065_Arch2008.pdf
[4] - www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/55901_B1_pub_053.zip
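Since a patch is only active while it is loaded, the revision the CPU reports at runtime is the one that matters after a BIOS/UEFI or OS update. A minimal sketch for checking it, assuming a Linux x86 system where /proc/cpuinfo lists a "microcode" field per logical CPU; the function names and the sample revision value are made up for illustration and do not refer to any specific Intel or AMD patch:

```python
from pathlib import Path


def microcode_revisions(cpuinfo_text: str) -> set[str]:
    """Collect the distinct 'microcode' values in /proc/cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        # Each entry looks like "key<TAB>: value"; split on the first colon.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            revisions.add(value.strip())
    return revisions


def read_running_revisions(path: str = "/proc/cpuinfo") -> set[str]:
    """Return the revisions reported by the running system, or an empty set."""
    cpuinfo = Path(path)
    if not cpuinfo.exists():  # e.g. not running on Linux
        return set()
    return microcode_revisions(cpuinfo.read_text())


if __name__ == "__main__":
    # Hypothetical sample: two logical CPUs reporting the same revision.
    sample = (
        "processor\t: 0\nmodel name\t: Example CPU\nmicrocode\t: 0x123\n\n"
        "processor\t: 1\nmodel name\t: Example CPU\nmicrocode\t: 0x123\n"
    )
    print(microcode_revisions(sample))
    print(read_running_revisions() or "no microcode field found")
```

More than one value in the result would suggest that not every core is running the same patch. Comparing the value before and after a BIOS/UEFI update is one way to see whether the new microcode was actually picked up.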
However, I will never understand buying, for gaming, a 220-300W peak load CPU just to rival a max 162W 7800X3D.
Ryzen is faster in most games, cheaper than the high end, and runs cooler while consuming much less…
Even if they stopped selling them today and recalled all i5s, i7s and i9s, it would take them nearly two years to make new ones. And if the actual failure rate was anywhere near 100%, Intel would have stopped selling them a long time ago.
As I said in post #44, doing a (complete) recall would only make those with defective chips suffer much longer. It only makes sense to RMA those that are actually failing, and it's not very considerate of those having seemingly fine samples to overload the testing labs. Let's jump aboard on every unfounded rumor.
Right now there is a great surplus of wild speculation and a deficit of facts. Don't fall for certain click-bait YouTube channels, they are just riding the hype and chasing every rumor. Wait for the more serious ones doing actual deep-dives in real data.
forums.guru3d.com/threads/geforce-560-70-whql-driver-download-discussion.453025/page-9#post-6250840
Also, without the rumors, Intel wouldn't have done anything publicly for the customers imo. They only (like many other big companies) started talking to “outside” people after the bubble got bigger and they needed to address it. Before that it was always "naah, these game devs are just fishing for PR" or "these guys just have no clue", yadda yadda.
So yeah, rumors are rumors, but they tend to keep things moving onward.