Tuesday, July 23rd 2024

Intel Statement on 13th and 14th Gen Core Instability: Faulty Microcode Causes Excessive Voltages, Fix Out Soon
Long-term reliability issues continue to plague Intel's 13th Gen and 14th Gen Core desktop processors based on the "Raptor Lake" microarchitecture, with users complaining that their processors have become unstable under heavy processing workloads, such as games. This includes chips that have seen only minor performance tuning or overclocking. Intel had earlier isolated many of these stability issues to faulty CPU core frequency boosting algorithms, which it addressed through updates to the processor microcode that it got motherboard and prebuilt-PC manufacturers to distribute as UEFI firmware updates. The company has now come out with new findings on what could be causing these issues.
In a statement Intel posted on its website on Monday (22/07), the company said that it has been investigating the processors returned to it by users under warranty claims (which it has been replacing under the terms of its warranty). It has found that faulty processor microcode has been causing the processors to operate under excessive core voltages, leading to their structural degradation over time.
"We have determined that elevated operating voltage is causing instability issues in some 13th/14th Gen desktop processors. Our analysis of returned processors confirms that the elevated operating voltage is stemming from a microcode algorithm resulting in incorrect voltage requests to the processor."
Modern processor power management runs on an intricate clockwork of collaboration between software, firmware, and hardware, with the software constantly telling the hardware what levels of performance it wants, and the hardware managing its power and thermal budgets by rapidly altering the power and clock speeds of the various components, such as CPU cores, caches, fabric, and other on-die components. A fault in any of these three layers can break this clockwork, as has happened in this case.
Intel is releasing yet another microcode update for its 13th and 14th Gen Core processors, which will address not just the faulty boosting algorithm the company unearthed in June, but also the faulty voltage management it has discovered now. This new microcode should be released to partners (motherboard manufacturers and PC OEMs) around mid-August; they will then need to validate it on their machines before passing it along to end-users as UEFI firmware updates.
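Since the fix arrives as microcode bundled into a UEFI firmware update, users may want to confirm which microcode revision their processor is actually running after updating. The following is a minimal sketch, assuming a Linux system where `/proc/cpuinfo` exposes `model name` and `microcode` fields. The function names and the model-name heuristic are illustrative assumptions, the sample revision is made up, and no particular "fixed" revision is implied, since Intel's statement does not name one.

```python
import os
import re


def parse_cpuinfo(text):
    """Return (model_name, microcode_revision) from /proc/cpuinfo-style text.

    The revision is returned as an int, or None if the field is absent.
    Only the first logical CPU's entries are used.
    """
    model = None
    revision = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        value = value.strip()
        if key == "model name" and model is None:
            model = value
        elif key == "microcode" and revision is None:
            # The kernel prints the revision as hex, e.g. "0x123".
            revision = int(value, 16)
    return model, revision


def looks_like_13th_or_14th_gen(model):
    """Rough heuristic (an assumption, not an official check): match
    model names such as 'i9-13900K' or 'i7-14700KF'."""
    return re.search(r"\bi[3579]-1[34]\d{3}", model or "") is not None


if __name__ == "__main__":
    path = "/proc/cpuinfo"
    if os.path.exists(path):
        with open(path) as f:
            model, revision = parse_cpuinfo(f.read())
        print("CPU:", model)
        if revision is not None:
            print("Microcode revision:", hex(revision))
        if looks_like_13th_or_14th_gen(model):
            print("This looks like a 13th/14th Gen Core part; "
                  "check your board vendor for the latest UEFI update.")
    else:
        print("No /proc/cpuinfo found; this sketch only covers Linux.")
```

Comparing the reported revision before and after a UEFI update is a simple way to verify that the new microcode was actually applied; the revision to expect should come from the motherboard vendor's release notes.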
Meanwhile, another issue has come to light: some of Intel's processors built on the Intel 7 node are experiencing chemical oxidation of the die as they age. Intel responded to this, stating that it had discovered the oxidation manufacturing issue in 2023 and addressed it. The company also stated that die oxidation is not related to the stability issues it is currently dealing with.
Sources:
Intel Community, Intel (Reddit)
"Intel is delivering a microcode patch which addresses the root cause of exposure to elevated voltages. We are continuing validation to ensure that scenarios of instability reported to Intel regarding its Core 13th/14th Gen desktop processors are addressed. Intel is currently targeting mid-August for patch release to partners following full validation. Intel is committed to making this right with our customers, and we continue asking any customers currently experiencing instability issues on their Intel Core 13th/14th Gen desktop processors reach out to Intel Customer Support for further assistance," the company stated.
It is important to note that the microcode update won't fix processors that are already experiencing instability; it only prevents the problem on chips that aren't yet affected. The instability is caused by irreversible physical degradation of the chip. These chips will, of course, be covered under warranty.
On the oxidation issue, the company stated: "We can confirm that the via Oxidation manufacturing issue affected some early Intel Core 13th Gen desktop processors. However, the issue was root caused and addressed with manufacturing improvements and screens in 2023. We have also looked at it from the instability reports on Intel Core 13th Gen desktop processors and the analysis to-date has determined that only a small number of instability reports can be connected to the manufacturing issue."
If you feel your chip might be affected, you can file for an RMA.
387 Comments on Intel Statement on 13th and 14th Gen Core Instability: Faulty Microcode Causes Excessive Voltages, Fix Out Soon
I've just completed a long drive and have some catching up to do. Back to page one.
Then it says Intel update fix is August??? MSI gave that fix June 20th!
My 7 months old Motherboard.
ca.msi.com/Motherboard/MEG-Z790-ACE-MAX/support
I'm a proud owner of both 14900K & 14900KS systems with zero issues. The real problem is people have no clue what they're doing and it's just easier to play the blame game.
13900K have been out for 18 months until the bad MAY Bios upgrade.
Very unfortunate what's happening in this falling-apart world.
Cheers
So if the excessive core voltage degraded the CPU, then HTF is a microcode update going to fix those degraded chips like..?
The instability referenced is described as "...reported system instability issues such as OS/Application errors, crashes, hangs, and BSOD on boards and systems with 13th and 14th generation Intel K SKU (unlocked) processors are the signs/symptoms of the affected units". The 'affected units' means "...all K SKUs of 13th and 14th Generation Core Processors (i5, i7 and i9)" source: Intel.
I find it very suspicious that Intel suddenly had 4 different issues. I think they are trying to act like it's 4 different issues, most of them fixable, while they release patches that limit the damage and hope users think it's fixed until it's past the warranty. I think they are going to limit boost clock times a lot, so the boost is only there for a much shorter time; that way they can say "peak performance" was not reduced.
Cause : Greed (as always), but how did we get here ?
Intel HQ some years ago :
Boss : How to increase profitability/margins ?
...
Director : Why bother limiting Vcore to "reasonable" levels ?
Push sucker to max and get as many dies validated as "OK" as possible (less wasted dies = better profits), top die quality stuff may even do 6GHz !
Which means there is the possibility of even higher prices, and a very good marketing opportunity.
Engineer #1 : Sure, that's doable. There will be higher power usage (than necessary), which will mean more expensive cooling and VRMs will be required for end user... but what about longevity ?
Director : Any safeguards and mitigations we can do to mitigate those cons on our part ?
Engineer #2 : We could limit power consumption artificially in BIOS by adding long term and short term power limits (no more power hungry CPUs [s but not really /s]), and prevent very fast CPU degradation by limiting current going through it (decreasing chances of worst scenario occurring when very high voltage and max. temperature are being experienced by CPUs at the same time).
Director : Genius idea - I like it. Our technology process is best in industry, and we never failed to deliver new process technology on time.
So, with those safeguards in place we should be set. Let's do it !
Engineer #3 : But... is this... is this enough safeguards ?
Engineer #2 : Both power and temperatures are covered, so I guess... they all still need to be implemented by MB manufacturers ?
Director #2 : That's their job. If someone wants to run without them we can add those options (being OC friendly company we are and all).
Boss : Enough chit-chat. Do not forget to put everything important into datasheets, so that we are covered if any motherboard manufacturer does anything stupid.
All : Sure thing boss !
Disclaimer : This is a fictional story, the above does not represent how things work at Intel, and I'm not and never was an Intel employee.
- CrowdStrike
- Intel 13/14 Gen
- AMD Ryzen 7000
We are being overwhelmed by complexity; some say that AI is the solution.
I'm Only Human | Clone Wars (youtube.com)
I'd rather not pay for AI.... besides, who is the basis for the programming for all the AI? That makes it kind of scary, because there's no groundwork in place. Everybody's doing their own thing, making AI a nightmare.
NPU on all new chips by 2025
Cheers
This overvoltage issue is somewhat similar to Dieselgate. In that case, recalls have taken place, but also countless law suits, and VW was also fined by various governments. That would probably be too extreme in this case, but I still feel Intel needs to do a bit more than just warranty replacements. Maybe giving the option to get refunds instead of replacements. Maybe extending the warranty for all 13th and 14th gen CPUs to 5 or 6 years. Maybe a discount when buying another Intel CPU in the next 3 years.
You make it sound like they built server farms with CPU's that weren't designed for the task. Well, Intel should've warned them earlier that these will only last for a few months, which is absolutely positively unheard of when it comes to CPU's.
The CPU's were requesting those voltages from the motherboard, and that's what Intel plans to solve in that microcode update with performance regressions a likely scenario. Let's see.
Yes it is shocking. CPU's are supposed to be designed to run at full load for a number of years minimum and not degrade, end of. Making a CPU that only lasts a few months at their rated speed is at best false advertising and has resulted in a boatload of money lost by all these game devs and server farms, most of which is unrecoverable. Couple that with reports of denied RMA requests due to these companies putting out statements mentioning that their Intel CPU's are faulty only makes it worse for them. Most of these servers are switching over to their competitor because they actually run at their advertised speeds for, well, a long time. To top that off, one server farm and one game dev blatantly said they're faster anyway so it's a win win for them.
Having seen the voltages a 14900KS requests from the motherboard, I really want to see one at full load for a few months. I can bet that without a microcode update, >90% of those will degrade over a couple of years. With the update, it can't really run at the speeds it was rated at with full stability. Let's see what the supposed August update brings, 8 months after the initial reports started to come and almost 2 years after the launch of the CPU. Lame.
All I know is, I ain't touching any of those 13th and 14th gen chips on the second hand market with a ten foot pole even if they depreciate heavily. Which sucks, because over the (many) years I've had pretty good luck buying used CPU's as they get cheap a generation prior.
Zen 5 desktop chips don't have the NPU, it's only the laptop chips.
Engineer: Cap'n, I canna give you any more voltage without causing a warp core breach
Pat: I don't care man, we've got the AMD fleet bearing down on us and they're about to launch Zen 5 quantum torpedoes, we need more voltage man.
Happy faces all around.
Also by Intel :)
So uh..sell dodgy chips to server farms, deny RMA even though they knew about the oxidization issue at the time then release a half assed statement two years after the chips launch that the chips have an issue, sorry wait multiple issues.
So if intel apparently 'fixed' this oxidization issue which apparently plagued early 13th gen batches, obviously they knew about it. And then they did....nothing? For years? They release a statement saying that right when third party analysts start mentioning it? Also, they conveniently fail to mention what batches were affected by the issue. Still trying to figure that out after years eh?
Something smells funny.
edit2: Just putting this out there as well, he rambles for a whole hour but the first two minutes are pretty informative lol
But now apparently when oxidation enters the discussion, it's just a factor again. Seriously? Trust -1. And it was already getting pretty low.
It seems even setting Intel's default settings in the Bios isn't good enough.
Of course these are not designed to burn out in a few months - but if you have blades burning out your 14900Ks ... why... put... more... 14900Ks... in those blades? Not saying that the chip is good, but when you're putting a yolked 14900K into a blade to save money this is kind of exactly the downside.