Thursday, September 26th 2024
Intel Isolates "Raptor Lake" Vmin Shift Instability Root Cause, New Microcode Update Coming
Back in August, Intel started shipping its 0x129 microcode update for 13th and 14th generation "Raptor Lake" and "Raptor Lake Refresh" processors. This update fixed incorrect voltage requests to the processor that were causing elevated operating voltage. Intel's analysis showed that the root cause of the stability problems is voltage that is too high while the processors are operating. These elevated voltages cause degradation that raises the minimum voltage required for stable operation, which Intel calls "Vmin."
Today, the company said it has identified the root cause of this instability issue and informed users that a new microcode patch is on the way. As explained by Intel, the Vmin Shift instability problem stems from a clock tree circuit in the IA core. When exposed to high voltage and temperature, this circuit is vulnerable to reliability degradation. Intel's research has shown that these conditions can shift the duty cycle of the clocks, resulting in system instability.
There are four scenarios that can cause Vmin Shift: increased motherboard power delivery, eTVB microcode algorithm running at higher performance operating states even at higher temperatures, microcode SVID algorithm requesting higher voltages at higher frequencies and longer durations, and finally microcode and BIOS requesting elevated core voltages. For motherboard power settings, mitigation is switching back to default settings. For the eTVB issue, the fix is a 0x125 microcode update. The 0x129 patch fixes the SVID algorithm, and the fourth condition, where microcode and BIOS request elevated core voltage, is fixed by the upcoming 0x12B microcode update. Intel is reportedly working with OEMs to start rolling out the 0x12B update with no apparent performance degradation. While the timeframe for shipping this update is unknown, we expect to see it soon. Additionally, Intel once again confirmed that the upcoming "Arrow Lake" CPUs don't have these issues.
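As a rough illustration of how the three microcode revisions line up with the fixes described above, here is a minimal Python sketch. It assumes a Linux system, where `/proc/cpuinfo` reports a `microcode` field, and it assumes that each later revision also carries the earlier fixes, which the article does not state explicitly. The names `FIXES`, `parse_microcode`, and `missing_fixes` are hypothetical, and the comparison is only meaningful on an affected 13th or 14th generation desktop CPU, since revision numbers are specific to a processor model.

```python
import re

# Microcode revisions named in the article, in release order.
# Assumption (not confirmed by the article): later revisions include earlier fixes.
FIXES = [
    (0x125, "eTVB algorithm"),
    (0x129, "SVID voltage requests"),
    (0x12B, "elevated core voltage requests from microcode and BIOS"),
]


def parse_microcode(cpuinfo_text):
    """Return the first microcode revision in /proc/cpuinfo-style text, or None."""
    match = re.search(
        r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", cpuinfo_text, re.MULTILINE
    )
    return int(match.group(1), 16) if match else None


def missing_fixes(revision):
    """List the fixes whose minimum revision is newer than the given revision."""
    return [name for minimum, name in FIXES if revision < minimum]


# Example usage on Linux (hypothetical, adjust to your system):
#   with open("/proc/cpuinfo") as f:
#       rev = parse_microcode(f.read())
#   if rev is not None:
#       print(hex(rev), missing_fixes(rev))
```

The motherboard power delivery scenario is not covered here, because its mitigation is a BIOS setting rather than a microcode revision.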
Source:
Intel
46 Comments on Intel Isolates "Raptor Lake" Vmin Shift Instability Root Cause, New Microcode Update Coming
Intel was more worried about its ego than actually fixing what was wrong: inferior processes.
The real root cause is AMD. Had there not been AMD with competitive products, Intel would have kept frequencies much lower.
Shame on you, AMD, once again, for your competitiveness! Where are those Bulldozer times ... Aaaah. 12th gen chips had far fewer E-cores and also boosted to clocks like 5.2 GHz or so.
When clocks go even higher, voltage and power draw scale almost exponentially. So you may need 40 mV to get from 5.2 GHz to 5.3 GHz, but you may need 100 mV or even more to get from 5.9 GHz to 6.0 GHz.
I recall the good old Sandy Bridge days when I overclocked my 2600K: I got from stock 3.4 GHz to 4.1 GHz all-core without raising voltage, then the frequency and voltage scaled almost linearly until 4.8 GHz. Btw, my 2600K still works, but the frequency had to be lowered to 4.6 GHz due to instabilities caused by degradation over 10+ years.
Without such a tool I wouldn't touch a used 13th or 14th gen CPU with a 10 foot pole!
Speaking of the engineers, I wonder what the engineers who design the chips and the manufacturing process think about the frequencies management decided to run these chips at. If the problem is some part of the internal clock generator/distributor (or whatever a "clock tree" is) wearing out and the clocks drifting out of sync, there may be some way to sense the different clock signals and check how far out of sync they are.
I am not entirely sure if there really IS a way to do it in a normal customer motherboard.
The time Intel took to identify the issue (if this is the real culprit?) is not a good sign; it may be impossible to detect it in an intact CPU package.
There will be a LOT of problems in the future for those who have an affected CPU.
Intel now cares about just a very short time horizon; they are minimising damage IN THE NEXT FEW MONTHS. Honestly, they do not care if your CPU dies after a year or two, and even if they give you a new CPU, you will still need to deal with the warranty hassle, getting a temporary CPU for the time you do not have yours, etc.
Lowering the speed is not a big deal, really. Your CPU will lose just a bit of performance and will reward you with longer life, lower temps and higher efficiency.
I am running a new 13900KS with a 5200 MHz limit for the P-cores and I am confident it will work reliably for at least the next three years. I do not believe I will resist upgrading after that time.
Are you Pat’s mistress to get all this insider info?
From the i5-13600/14500 down, the cores are 12th gen, with the i5-13400/14400 being practically an underclocked 12600/K.