Tuesday, July 23rd 2024
Intel Statement on 13th and 14th Gen Core Instability: Faulty Microcode Causes Excessive Voltages, Fix Out Soon
Long-term reliability issues continue to plague Intel's 13th Gen and 14th Gen Core desktop processors based on the "Raptor Lake" microarchitecture, with users complaining that their processors have become unstable with heavy processing workloads, such as games. This includes the chips that have minor levels of performance tuning or overclocking. Intel had earlier isolated many of these stability issues to faulty CPU core frequency boosting algorithms, which it addressed through updates to the processor microcode that it got motherboard- and prebuilt manufacturers to distribute as UEFI firmware updates. The company has now come out with new findings of what could be causing these issues.
In a statement Intel posted on its website on Monday (22/07), the company said that it has been investigating the processors returned to it by users under warranty claims (which it has been replacing under the terms of its warranty). It has found that faulty processor microcode has been causing the processors to operate under excessive core voltages, leading to their structural degradation over time. "We have determined that elevated operating voltage is causing instability issues in some 13th/14th Gen desktop processors. Our analysis of returned processors confirms that the elevated operating voltage is stemming from a microcode algorithm resulting in incorrect voltage requests to the processor."
Modern processor power management runs on an intricate clockwork of collaboration between software, firmware, and hardware, with the software constantly telling the hardware what levels of performance it wants, and the hardware managing its power- and thermal budgets by rapidly altering the power and clock speeds of the various components, such as CPU cores, caches, fabric, and other on-die components. A faulty collaboration between any of the three key components could break this clockwork, as has happened in this case.
Intel is releasing yet another microcode update to its 13th- and 14th Gen Core processors, which will address not just the faulty boosting algorithm issue the company unearthed in June, but also the faulty voltage management the company discovered now. This new microcode should be released some time around mid-August to partners (motherboard manufacturers and PC OEMs), who will then need to validate it on their machines, before passing it along to end-users as UEFI firmware updates.
Meanwhile, an interesting issue has come to light: some of Intel's processors built on the Intel 7 node are experiencing chemical oxidation of the die as they age. Intel responded to this, stating that it had discovered the oxidation manufacturing issue in 2023 and addressed it. The company also stated that die oxidation is not related to the stability issues it is embattled with.
Sources:
Intel Community, Intel (Reddit)
"Intel is delivering a microcode patch which addresses the root cause of exposure to elevated voltages. We are continuing validation to ensure that scenarios of instability reported to Intel regarding its Core 13th/14th Gen desktop processors are addressed. Intel is currently targeting mid-August for patch release to partners following full validation. Intel is committed to making this right with our customers, and we continue asking any customers currently experiencing instability issues on their Intel Core 13th/14th Gen desktop processors reach out to Intel Customer Support for further assistance," the company stated.
It's important to note here that the microcode update won't fix the issues on processors already experiencing instability, but will prevent them on chips that aren't. The instability is caused by irreversible physical degradation of the chip. These chips will, of course, be covered under warranty.
"We can confirm that the via Oxidation manufacturing issue affected some early Intel Core 13th Gen desktop processors. However, the issue was root caused and addressed with manufacturing improvements and screens in 2023. We have also looked at it from the instability reports on Intel Core 13th Gen desktop processors and the analysis to-date has determined that only a small number of instability reports can be connected to the manufacturing issue," the company stated.
If you feel your chip might be affected, you can file for an RMA.
387 Comments on Intel Statement on 13th and 14th Gen Core Instability: Faulty Microcode Causes Excessive Voltages, Fix Out Soon
But nobody is that stupid. You pushed your silicon beyond its reasonable limits to compete with AMD, and it came back to bite you.
now you have sold a bunch of cpus to people that are suddenly going to get a lot slower
I hope you enjoy class actions because there is one headed your way
5800: Jan 12th, 2021, OEM
5800 was released several months later as 65W OEM exclusive model.
When problems arise with 15th generation Intel CPUs, just buy a new PC with a 16th or 17th generation Intel CPU and the problem will be solved.
Do it exactly this way and you won't have any problems.
The worst part is, Intel knew about it and still went with it. I only feel sorry for those who have not received RMAs for the product even though they deserved them. Not informing partners about the issue is literally speaking disgusting.
I'm gonna go out on a limb and say "to be like Intel" meaning very dishonest. Not a shocker either.
I am not even sure if Intel truly knows which units are safe.
I'm sure they know, but that's beside the point because maybe all of them are bad and it's just hard to admit it. Sometimes it is better to play dumb and look for a solution to the mess they have made. The microcode reasoning is just ridiculous here. They just beat around the bush to buy more time. Or, I really don't know what they are doing, to be honest. I also don't care. Lost interest in Intel a long time ago.
Temperature limit of 110 degree hotspot or core.
Boosting beyond 6 GHz at the toes of the silicon,
NICs, NUCs (Atom) failing - same degradation happening
Power targets of well over 244W to even 350W for the higher end models.
AMD on the other hand has a certain trust in the products they bring out and PBO for example is going to last you years.
I still have a 2700X at this point, PBO for years and slightly undervolted. No issues at all.
So, Intel ... eating insane amounts of Watts (300+) in benchmarks just to break every record and to dominate over AMD by a few digits of % ... Was it all worth it? Now you have the answer.
After years of Intel product paper releases and product launch postponing, they said they'll accelerate the arrival of upcoming generations. They said they'll deliver on time, maybe even 2 generations per year. Now you have it. Lack of time for proper product development and testing caused all these desperate, voltage-hungry, benchmark-record-breaking, unreliable CPUs to exist in the first place. It's about goddamn time Intel realized that the proper way of progress (in terms of increasing IPC) is to modify (improve) the architecture. But you can't do that without enough time, right? Touching the architecture was AMD's approach with Zen, Zen 2, Zen 3 and Zen 5 (Zen 4 excluded). [Many don't see it, but Zen 5 is not a minor architectural improvement over Zen 4.]
This headless approach of dropping HT so that they can push the core clocks (& voltages of course) even higher than before was pointless in the first place. Now they're planning on dropping the e-cores after they had made claims that the current e-core has IPC comparable to the Raptor Lake P-core? Why the hell would you want to drop such a good core? Perhaps to free up the resources for another round of brute-force freq/voltage pushing?
They invented the HT, they invented the e-cores for the desktop. What amount of money was put into the development of these ... They managed to win the battle with core scheduling problems and the battle with e-cores being much less powerful than P-cores. Now Arrow Lake will be stripped of HT, Bartlett Lake will be stripped of e-cores. This kind of mess seems like a trial & error approach to me.
Admitting the oxidation issues more than a year after the 13th Gen products were released (and not telling anything to anyone) really needs to see a court. They need to be fined an amount that is of a considerable loss to them - like 10% worth of their year's revenue or so.
It's precisely why AMD released their EPYC 4004 on AM5.
Also, it's not like they just kept putting more 14900K CPUs in new racks. Some devs initially built racks with 13900K/14900K CPUs, but many have failed twice, and the second time around was with an underclock because they thought that stock clocks might be the reason the first batch of CPUs degraded. Makes sense to do that because they're getting the CPUs through RMA, so it's 'free' other than the downtime and lost time through debugging. It's finally come to the point now that they're just replacing all the racks with AMD because, well, enough is enough.
Edit: autocorrect on mobile sucks sorry
No problems with my setup, and the old one, a 2700X, serves my brother to this day with a 5600 XT.
There are too many unknowns with Intel at this point for me, and my concerns extend to the current way of things around the company.
Every technology has its limits, and the problematic Intel 10nm process, even after all the improvements, may not be able to handle high frequencies, due to its higher tendency to degrade at high temperatures (or high current density, or both).
It seems that Intel threw any cautious and responsible behavior out of the window and they just cranked the frequency to the maximum that the CPUs will survive in the hands of the reviewers.