Thursday, September 26th 2024

Intel Isolates "Raptor Lake" Vmin Shift Instability Root Cause, New Microcode Update Coming

Back in August, Intel started shipping its 0x129 microcode update for 13th and 14th generation "Raptor Lake" and "Raptor Lake Refresh" processors. This update fixed incorrect voltage requests to the processor that were causing elevated operating voltage. Intel's analysis showed that the stability problems come from voltage levels that are too high during processor operation. The elevated voltage causes degradation that raises the minimum voltage required for stable operation, which Intel calls "Vmin." Today, the company announced that it has identified the root cause of this instability issue and informed users that a new microcode patch is on the way. As explained by Intel, the Vmin Shift instability problem stems from a clock tree circuit in the IA core. When exposed to elevated voltage and temperature, this circuit is vulnerable to reliability degradation. Intel's research has shown that these conditions can cause a shift in the duty cycle of the clocks, resulting in system instability.

There are four scenarios that can cause Vmin Shift: increased motherboard power delivery, the eTVB microcode algorithm running at higher performance operating states even at higher temperatures, the microcode SVID algorithm requesting higher voltages at higher frequencies and longer durations, and finally microcode and BIOS requesting elevated core voltages. For motherboard power settings, the mitigation is to switch back to default settings. For the eTVB issue, the fix is the 0x125 microcode update. The 0x129 patch fixes the SVID algorithm, and the fourth condition, where microcode and BIOS request elevated core voltage, is fixed by the upcoming 0x12B microcode update. Intel is reportedly working with OEMs to start rolling out the 0x12B update, with no apparent performance degradation. While the timeframe for shipping this update is unknown, we expect to see it soon. Additionally, Intel once again confirmed that the upcoming "Arrow Lake" CPUs don't have these issues.
Source: Intel
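As a rough illustration of the microcode mitigations listed above, the following Python sketch reads a microcode revision from `/proc/cpuinfo`-style text (the Linux format, which reports a `microcode : 0x...` line) and lists which of the three fixes named in the article would still be missing. The function names and the `MITIGATIONS` table are hypothetical, and the sketch assumes that revisions are ordered and that each later revision includes the earlier fixes, which the article does not state.

```python
import re

# Microcode revisions and the fixes the article attributes to them.
# Assumption: a later revision also contains the earlier fixes.
MITIGATIONS = [
    (0x125, "eTVB algorithm fix"),
    (0x129, "SVID voltage-request fix"),
    (0x12B, "elevated core-voltage request fix"),
]


def parse_microcode_revision(cpuinfo_text):
    """Return the first microcode revision in /proc/cpuinfo-style text, or None."""
    match = re.search(
        r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.MULTILINE
    )
    return int(match.group(1), 16) if match else None


def missing_mitigations(revision):
    """List the fixes whose revision is newer than the installed one."""
    return [name for required, name in MITIGATIONS if revision < required]


if __name__ == "__main__":
    # Hypothetical sample text; on Linux this could come from /proc/cpuinfo.
    sample = "processor\t: 0\nmicrocode\t: 0x129\n"
    revision = parse_microcode_revision(sample)
    if revision is not None:
        print(hex(revision), missing_mitigations(revision))
```

The motherboard power delivery scenario is not covered here, since the article describes its mitigation as a BIOS settings change rather than a microcode revision.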

46 Comments on Intel Isolates "Raptor Lake" Vmin Shift Instability Root Cause, New Microcode Update Coming

#26
Nater
BoggledBeagle: You may never be able to degrade 13600K with its low stock frequency.

I also believe it is not correct (fair) to torture the CPUs at higher than stock speeds and with substandard cooling.
I don't think mine has ever run at stock with the coolers I've thrown at it along with either Asus AI overclocking or Intel's XTU app. It's always been jacked with voltage and moar MEGERHERTZ.
#27
TheinsanegamerN
Daven: Another confirmation that motherboard makers were not to blame whatsoever. This was an Intel chip problem. It happened. Time to move on.

Edit: Oh and the simple answer presented by other commenters is the best answer…Intel ran their chips at too high of frequency.
OCed 12th gen chips are not degrading, so nope. That's not the cause. But don't worry, I'm sure the armchair engineers at techpowerup are on the case : )
BoggledBeagle: You may never be able to degrade 13600K with its low stock frequency.

I also believe it is not correct (fair) to torture the CPUs at higher than stock speeds and with substandard cooling.
There are i5s from the 13/14th gen that have degraded. So what's your answer there?
#28
BoggledBeagle
TheinsanegamerN: OCed 12th gen chips are not degrading, so nope. That's not the cause. But don't worry, I'm sure the armchair engineers at techpowerup are on the case : )
Nobody said that 12th and 13th/14th gen CPUs can handle the same frequencies. The latter can have a sensitive bit in them. When somebody says too high frequency, they mean too high for what the chips can handle.
TheinsanegamerN: There are i5s from the 13/14th gen that have degraded. So what's your answer there?
My answer is that they are unlocked CPUs and many people may have used them at far higher than stock frequencies. If Intel knew what they know now, they may have never released unlocked versions of these CPUs.
#29
Redwoodz
The real bad thing about this whole fiasco is that it was not necessary. If they had just released a slower and cheaper cpu they would have paved the way for their eventual return to glory.

Intel was more worried about its ego than actually fixing what was wrong: inferior processes.
#30
dyonoctis
BoggledBeagle: Nobody said that 12th and 13th/14th gen CPUs can handle the same frequencies. The latter can have a sensitive bit in them. When somebody says too high frequency, they mean too high for what the chips can handle.



My answer is that they are unlocked CPUs and many people may have used them at far higher than stock frequencies. If Intel knew what they know now, they may have never released unlocked versions of these CPUs.
I say bullshit, Intel listed the locked 13700 as part of the problematic CPUs, and it's only running at 5.2 GHz on two cores and 5.1 GHz on all cores. That's not much higher than the 5.1 GHz of the 13600K.
#31
Darmok N Jalad
dyonoctis: I say bullshit, Intel listed the locked 13700 as part of the problematic CPUs, and it's only running at 5.2 GHz on two cores and 5.1 GHz on all cores. That's not much higher than the 5.1 GHz of the 13600K.
You could both be right. They very likely applied the same microcode algorithm to all SKUs, and that microcode had to have a voltage curve sufficient to hit the max clock of the highest bin. Basically, their aim to hit 6.0-6.2 GHz set in motion settings that ended up being to the detriment of lesser SKUs, which is why these patches apply to so much of the 13/14 series product line. So no, it’s not technically the frequency, but rather the max frequency target of the product line that likely created the real issue: too much voltage being applied, even on SKUs that should have never needed it.
#32
LittleBro
BoggledBeagle: Degradation is caused by high temperature and high electric current density, which is caused by high voltage, which is required to run the CPUs at too high frequencies, which are specified by Intel execs.

The root cause for the mess are Intel execs who chose to ignore all the good industry practices and precautionary principles, which are in place to deliver customers a long term reliable product.
But there is a reason even for those frequencies ...

The real root cause is AMD. Had there not been AMD with competitive products, Intel would have kept frequencies much lower.

Shame on you, AMD, once again, for your competitiveness! Where are those Bulldozer times ... Aaaah.
TheinsanegamerN: OCed 12th gen chips are not degrading, so nope. That's not the cause. But don't worry, I'm sure the armchair engineers at techpowerup are on the case : )


There are i5s from the 13/14th gen that have degraded. So what's your answer there?
12th gen chips had far fewer E-cores and only boosted to clocks like 5.2 GHz or so.

When clocks go even higher, voltage and power draw scale almost exponentially. So you may need 40 mV to get from 5.2 GHz to 5.3 GHz, but you may need 100 mV or even more to get from 5.9 GHz to 6.0 GHz.

I recall the good old Sandy Bridge days when I OCed my 2600K, got from stock 3.4 GHz to 4.1 GHz all core without raising voltage, then the frequency and voltage scaled almost linearly until 4.8 GHz. Btw, my 2600K still works, but the frequency had to be lowered to 4.6 GHz due to instabilities caused by degradation over 10+ years.
#33
Macro Device
LittleBro: had to be lowered to 4.6 GHz
You might consider yourself lucky. Usually these dudes fall off to ~4.2 GHz zone after being constantly OCed to their maximum. Or die if that's not the only way they're abused.
#34
Visible Noise
BoggledBeagle: It is funny how they called degradation "reliability aging" :D

The real reason for quick degradation - too high frequency, which is the underlying cause for the elevated temperature and voltage causing high electric current density, is missing from their list of causes.

I am not convinced that even a brand new CPU running the 12B microcode will reliably work for long years at those extreme frequencies.
Have you considered getting a job at Intel? They could probably have saved 100,000 engineering hours with you on the job.
#35
randomTPUreader
When do owners of 13th and 14th Gen CPUs get some sort of free official software tool to determine if their CPU has been damaged or not? Or do they have to rely solely on random crashes before submitting a warranty claim?

Without such a tool I wouldn't touch a used 13th or 14th gen CPU with a 10 foot pole!
#36
InVasMani
Visible Noise: Have you considered getting a job at Intel? They could probably have saved 100,000 engineering hours with you on the job.
It would probably save them about 100,000,000 marketing engineer hours. That said, they fired thousands of them. Honestly, those layoffs, while sad, are necessary; they need more people in actual R&D, not fluff n spin.
#37
BoggledBeagle
Visible Noise: Have you considered getting a job at Intel? They could probably have saved 100,000 engineering hours with you on the job.
In fact, a little common sense can save a lot of work.

Speaking about the engineers - I wonder what the engineers who design the chips and the manufacturing process think about the frequencies the management decided to run these chips at.
randomTPUreader: When do owners of 13th and 14th Gen CPUs get some sort of free official software tool to determine if their CPU has been damaged or not? Or do they have to rely solely on random crashes before submitting a warranty claim?
If the problem is some part of the internal clock generator/distributor (or whatever a "clock tree" is) getting tired and the clocks getting out of sync, there may be some way to sense the different clock signals and check how far out of sync they are.

I am not entirely sure if there really IS a way to do it in a normal customer motherboard.

The time Intel took to identify the issue (if this is the real culprit?) is not a good sign. It may be impossible to detect it in an intact CPU package.
#38
Ruru
S.T.A.R.S.
We enthusiasts and hobbyists are just a drop in the ocean when it comes to PC users. A typical gamer has no idea about BIOS updating and other stuff which is normal for us.

There will be a LOT of problems in the future for those who have an affected CPU.
#39
InVasMani
BoggledBeagle: In fact, a little common sense can save a lot of work.

Speaking about the engineers - I wonder what the engineers who design the chips and the manufacturing process think about the frequencies the management decided to run these chips at.

If the problem is some part of the internal clock generator/distributor (or whatever a "clock tree" is) getting tired and the clocks getting out of sync, there may be some way to sense the different clock signals and check how far out of sync they are.

I am not entirely sure if there really IS a way to do it in a normal customer motherboard.

The time Intel took to identify the issue (if this is the real culprit?) is not a good sign. It may be impossible to detect it in an intact CPU package.
I doubt they're at liberty to even say.
#40
seccentral
This is such a confusing rollercoaster. So, if I'm on 0x129 now, I should wait for 0x12B and then I can go back to encoding, compiling and gaming at actual normal advertised speeds?
#41
Lewzke
Another microcode? This is a joke. Incompetent company; the engineering team was probably pushed to release 13th/14th gen to compete with AMD performance, and they messed up the validation phase. Utterly rubbish leadership. It is like the Boeing incident with the new engine. You want profit and you lose big. Safety validation and testing is the no. 1 priority in engineering, to release a reliable product. Go back to school, Intel managers.
#42
BoggledBeagle
seccentral: ... and then I can go back to encoding, compiling and gaming at actual normal advertised speeds?
Are you sure that the "advertised speeds" are not a great mistake, which will kill your CPU after some time?

Intel now cares about just a very short horizon of time, they are minimising damage IN THE NEXT FEW MONTHS. Honestly they do not care if your CPU dies after a year or two, and even if they gave you a new CPU, you would still need to deal with the warranty hassle, getting a temporary CPU for the time you do not have yours, etc.

Lowering the speed is not a big deal, really; your CPU will lose just a bit of performance and will reward you with longer life, lower temps and higher efficiency.

I am running a new 13900KS with a 5200 MHz limit for P-cores and I am confident it will work reliably for at least the next three years. I do not believe I will resist upgrading after such time.
#43
Visible Noise
BoggledBeagle: Are you sure that the "advertised speeds" are not a great mistake, which will kill your CPU after some time?

Intel now cares about just a very short horizon of time, they are minimising damage IN THE NEXT FEW MONTHS. Honestly they do not care if your CPU dies after a year or two, and even if they gave you a new CPU, you would still need to deal with the warranty hassle, getting a temporary CPU for the time you do not have yours, etc.

Lowering the speed is not a big deal, really; your CPU will lose just a bit of performance and will reward you with longer life, lower temps and higher efficiency.

I am running a new 13900KS with a 5200 MHz limit for P-cores and I am confident it will work reliably for at least the next three years. I do not believe I will resist upgrading after such time.
You keep claiming to know a lot of information that is internal to Intel. Such as the “real” cause of the chip degradation, or what “they” (whoever they are) care about.

Are you Pat’s mistress to get all this insider info?
#44
:D:D
BoggledBeagle: (or whatever a "clock tree" is)
That's it! Remember all those problems with branches (Spectre/Meltdown) and now a root problem. Got to get rid of those damn trees.
#45
Visible Noise
:D:D: That's it! Remember all those problems with branches (Spectre/Meltdown) and now a root problem. Got to get rid of those damn trees.
I didn’t notice that! He doesn’t know the basics of how a CPU works but knows what’s wrong with these. Holy smokes!
#46
SOS_Earth
dyonoctis: I say bullshit, Intel listed the locked 13700 as part of the problematic CPUs, and it's only running at 5.2 GHz on two cores and 5.1 GHz on all cores. That's not much higher than the 5.1 GHz of the 13600K.
The list includes all the full Raptor Lake processors (13th and 14th gen), with the exception of the non-K 14600.
From the i5-13600/14500 down, the cores are 12th gen, with the i5-13400/14400 being practically an underclocked 12600/K.