Friday, June 14th 2024
Intel Isolates Root Cause of Raptor Lake Stability Issues to a Faulty eTVB Microcode Algorithm
Intel has identified the root cause of the stability issues observed with certain high-end 13th and 14th Gen Core "Raptor Lake" processor models, which were causing games and other compute-intensive applications to randomly crash. When the issues were first identified, Intel recommended a workaround that reduced core voltages and restricted the boost headroom of these processors, at the cost of performance. The company has now apparently discovered the root cause of the problem, as Igor's Lab learned from confidential documents.
Source: Igor's Lab
The documents say Intel isolated the problem to a faulty value on the microcode side of the eTVB (Enhanced Thermal Velocity Boost) algorithm. "Root cause: an incorrect value in a microcode algorithm associated with the eTVB feature. Implication: increased frequency and corresponding voltage at high temperature may reduce processor reliability. Observed: found internally," the document says, listing "Raptor Lake-S" (13th Gen) and "Raptor Lake Refresh-S" (14th Gen) as the affected products. The company goes on to elaborate on the issue in its Failure Analysis (FA) document:
Failure Analysis (FA) of 13th and 14th Generation K SKU processors indicates a shift in minimum operating voltage on affected processors, resulting from cumulative exposure to elevated core voltages. Intel analysis has determined a confirmed contributing factor for this issue is elevated voltage input to the processor due to previous BIOS settings which allow the processor to operate at turbo frequencies and voltages even while the processor is at a high temperature. Previous generations of Intel K SKU processors were less sensitive to these types of settings due to lower default operating voltage and frequency.

Identifying the root cause isn't the only good news: Intel also has new microcode ready for 13th and 14th Gen Core processors (version 0x125), for motherboard manufacturers and PC OEMs to encapsulate into UEFI firmware updates. The new microcode corrects the issue, which should restore the stability of these processors at their normal performance. Be on the lookout for UEFI firmware (BIOS) updates from your motherboard vendor or prebuilt OEM.
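Intel's actual microcode is not public, but the failure mode it describes, a wrong value letting the chip request eTVB frequency and voltage even at high temperature, can be illustrated with a toy gate function. The following is a hypothetical sketch only: the function name, thresholds, and frequencies are invented for illustration and are not Intel's eTVB logic.

```python
# Hypothetical sketch of an eTVB-style boost gate. All names, thresholds,
# and frequencies below are illustrative assumptions, not Intel's microcode.

ETVB_TEMP_LIMIT_C = 70   # assumed temperature gate for the extra eTVB bin
BASE_BOOST_MHZ = 5800    # assumed regular boost ceiling for an i9 K SKU
ETVB_EXTRA_MHZ = 200     # assumed additional eTVB headroom

def etvb_boost_ceiling(temp_c: float, temp_limit_c: float = ETVB_TEMP_LIMIT_C) -> int:
    """Return the permitted boost ceiling in MHz for a given die temperature."""
    if temp_c <= temp_limit_c:
        # Cool enough: the extra eTVB bin (and its higher voltage) is allowed.
        return BASE_BOOST_MHZ + ETVB_EXTRA_MHZ
    # Too hot: fall back to the base ceiling at a lower voltage.
    return BASE_BOOST_MHZ

# An "incorrect value" of the kind Intel describes would be a threshold set
# far too high, so the high-frequency, high-voltage state is granted even at
# elevated temperatures, degrading the silicon over time:
print(etvb_boost_ceiling(95))                    # correct gate: 5800
print(etvb_boost_ceiling(95, temp_limit_c=115))  # faulty gate: 6000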
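Once your board vendor ships a BIOS carrying the fix, you can confirm which microcode revision the OS actually loaded. Here is a minimal sketch for Linux, assuming the standard "microcode" field that x86 systems expose in /proc/cpuinfo (on Windows, vendor tools report the same information):

```python
# Minimal sketch: report the microcode revision the kernel sees, so you can
# verify a BIOS update actually shipped a newer revision (e.g. 0x125 or later).
# Linux-only; parses the "microcode" field from /proc/cpuinfo on x86.

def current_microcode_revision(path: str = "/proc/cpuinfo") -> str | None:
    with open(path) as f:
        for line in f:
            if line.startswith("microcode"):
                return line.split(":", 1)[1].strip()  # e.g. "0x125"
    return None

if __name__ == "__main__":
    rev = current_microcode_revision()
    print(f"Loaded microcode revision: {rev}" if rev else "No microcode field found")
```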
107 Comments on Intel Isolates Root Cause of Raptor Lake Stability Issues to a Faulty eTVB Microcode Algorithm
Does being ahead or behind by a couple of percent in gaming performance really matter?
Their power/boost guidelines have been unclear for years at this point.
If you buy a CPU, they promise maximum specified turbo clocks, but those are not assured; turbo clocks have never been guaranteed on any GPU or CPU product in my lifetime. Review results are not a promise either.
Personally, I wouldn't be buying a Raptor Lake (or Refresh) i9 chip.
You sound ridiculous.
All these replies to my comment do is show the true anti-Intel sentiment on TPU.
If you look at my main system specs, you'll find all AMD because I find it to be better value than the competition and better suited for my needs these days. My two HTPCs and my netbook don't need to be cutting edge, so they're all Intel + Nvidia (as it was better value at that time). If you still get anti-Intel vibes from me, that's your imagination I'm afraid.
EIGHTEEN months after the launch of the 13900K, Intel has yet to come up with a solution for a problem that end users had to point out.
Still, running Intel CPUs at that high a power consumption counts as innovation, compared to adding cache?
If you want to defend Intel, this is not the thread for it. They've taken zero responsibility so far. Don't act hurt; you brought this on yourself.
If you want to use that stupid argument, then neither of them does anything; they're all just using what ASML makes possible with their machines. It's a ridiculous idea.
Also, it's not as if Intel makes everything in house; they don't. And how is that working out for Intel? Not very well, considering how heavily their next desktop and mobile series lean on TSMC's process nodes.
Not to mention their dGPUs, which are made exclusively by TSMC.

And AMD can't? Look at some of GN's videos from AMD's labs; they most certainly do in-house testing. It depends on the issue, though; obviously not even Intel can produce a new revision or a re-spin in house if it's TSMC-made silicon.

What do you mean by "what else?" How about better security, lower power consumption, better platform longevity, fewer restrictions on cheaper chipsets, etc.

At what cost and power? And by how much? Single-digit percentages, mostly at the expense of 2-3x the power. I wasn't aware that X3D chips were no good for anything outside gaming. You make it sound like they're Bulldozers when it comes to non-gaming tasks; in reality, I doubt most people would notice the difference in a blind test between an X3D and a 14th Gen chip in boot times, application performance, etc.

It's anti-BS sentiment. Don't think people here haven't criticized AMD (justly) when they deserved it.
This thread is about Intel's screwup.
Hope Intel learns and that they won't make the same mistake again with Arrow Lake going forward. Fingers crossed!
Otherwise it'll be like the GPU market share. By that I mean, it'll be 90% AMD and 10% Intel if this behavior continues.
And that does NOT bode well, for competition's sake. We need competition to drive innovation and of course, for better prices.
Intel needs a 3-5x performance-per-watt increase for me to go back (seriously, just look at the 7800X3D benchmarks right here at TPU; Intel is appalling in efficiency). A new architecture and moving away from their ancient lithography to Intel 20A might do it.
tpucdn.com/review/intel-core-i7-14700k/images/efficiency-gaming.png
I don't really get why W1zzard tested that at 1080p, though, when in other cases 720p is used to better represent a CPU bottleneck. I don't think it would change things much, but it would probably push CPU core and thread usage higher in some scenarios. Anyway, we need to transition away from 8c/16t consoles before we see forward progress beyond that become standard. You can find examples where developers have targeted better hardware resources, but it won't become common until we see a shift in the largest audience developers target, which is the console market.
This really isn't about which is better for which use case under which testing scenario, however. This is about whether Intel made a bad decision or blunder, and "yes and/or maybe" is kind of what we've gathered on the matter to this point.