
Intel Isolates Root Cause of Raptor Lake Stability Issues to a Faulty eTVB Microcode Algorithm

btarunr

Editor & Senior Moderator
Staff member
Joined
Oct 9, 2007
Messages
47,166 (7.56/day)
Location
Hyderabad, India
System Name RBMK-1000
Processor AMD Ryzen 7 5700G
Motherboard ASUS ROG Strix B450-E Gaming
Cooling DeepCool Gammax L240 V2
Memory 2x 8GB G.Skill Sniper X
Video Card(s) Palit GeForce RTX 2080 SUPER GameRock
Storage Western Digital Black NVMe 512GB
Display(s) BenQ 1440p 60 Hz 27-inch
Case Corsair Carbide 100R
Audio Device(s) ASUS SupremeFX S1220A
Power Supply Cooler Master MWE Gold 650W
Mouse ASUS ROG Strix Impact
Keyboard Gamdias Hermes E2
Software Windows 11 Pro
Intel has identified the root cause of the stability issues observed with certain high-end 13th and 14th Gen Core "Raptor Lake" processor models, which were causing games and other compute-intensive applications to crash randomly. When the issues were first identified, Intel recommended a workaround that reduces core voltages and restricts the boost headroom of these processors, at the cost of reduced performance. The company has apparently now discovered the root cause of the problem, according to confidential documents seen by Igor's Lab.

The documents say that Intel isolated the problem to a faulty value in the microcode's end of the eTVB (enhanced thermal velocity boost) algorithm. "Root cause is an incorrect value in a microcode algorithm associated with the eTVB feature. Implication: Increased frequency and corresponding voltage at high temperature may reduce processor reliability. Observed: Found internally," the document says, mentioning "Raptor Lake-S" (13th Gen) and "Raptor Lake Refresh-S" (14th Gen) as the affected products.



The company goes on to elaborate on the issue in its Failure Analysis (FA) document:
Failure Analysis (FA) of 13th and 14th Generation K SKU processors indicates a shift in minimum operating voltage on affected processors resulting from cumulative exposure to elevated core voltages. Intel analysis has determined a confirmed contributing factor for this issue is elevated voltage input to the processor due to previous BIOS settings which allow the processor to operate at turbo frequencies and voltages even while the processor is at a high temperature. Previous generations of Intel K SKU processors were less sensitive to these type of settings due to lower default operating voltage and frequency.

Identifying the root cause of the problem isn't the only good news; Intel also has a new microcode ready for 13th Gen and 14th Gen Core processors (version: 0x125), for motherboard manufacturers and PC OEMs to integrate into UEFI firmware updates. This new microcode corrects the issue, which should restore the stability of these processors at their normal performance. Be on the lookout for UEFI firmware (BIOS) updates from your motherboard vendor or prebuilt OEM.
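For readers on Linux who want to see which microcode revision is loaded after a firmware update, here is a minimal sketch. It assumes an x86 Linux system where `/proc/cpuinfo` lists a `microcode` field per logical CPU. The helper names (`parse_microcode_revisions`, `has_fixed_microcode`) are hypothetical, and treating 0x125 as a minimum revision is an assumption that only makes sense on the affected Raptor Lake parts, since revision numbers are not comparable across CPU models.

```python
import re
from pathlib import Path

# Revision named in the article. Treating it as a minimum is an assumption,
# and it is only meaningful on the affected 13th/14th Gen desktop parts.
FIXED_REVISION = 0x125


def parse_microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in /proc/cpuinfo-style text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        # Lines look like "microcode<TAB>: 0x125" on x86 Linux.
        match = re.match(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)\s*$", line)
        if match:
            revisions.add(int(match.group(1), 16))
    return revisions


def has_fixed_microcode(cpuinfo_text, minimum=FIXED_REVISION):
    """True if every reported revision is at least `minimum`."""
    revisions = parse_microcode_revisions(cpuinfo_text)
    return bool(revisions) and min(revisions) >= minimum


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        text = cpuinfo.read_text()
        found = sorted(hex(r) for r in parse_microcode_revisions(text))
        print("Loaded microcode revisions:", found)
        print("At or above 0x125:", has_fixed_microcode(text))
    else:
        print("/proc/cpuinfo not found; this sketch only applies to Linux.")
```

On Windows, the loaded revision is usually easier to read from a hardware information tool or from the BIOS setup screen.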

 
Joined
Jul 30, 2019
Messages
3,234 (1.68/day)
System Name Still not a thread ripper but pretty good.
Processor Ryzen 9 7950x, Thermal Grizzly AM5 Offset Mounting Kit, Thermal Grizzly Extreme Paste
Motherboard ASRock B650 LiveMixer (BIOS/UEFI version P3.08, AGESA 1.2.0.2)
Cooling EK-Quantum Velocity, EK-Quantum Reflection PC-O11, D5 PWM, EK-CoolStream PE 360, XSPC TX360
Memory Micron DDR5-5600 ECC Unbuffered Memory (2 sticks, 64GB, MTC20C2085S1EC56BD1) + JONSBO NF-1
Video Card(s) XFX Radeon RX 5700 & EK-Quantum Vector Radeon RX 5700 +XT & Backplate
Storage Samsung 4TB 980 PRO, 2 x Optane 905p 1.5TB (striped), AMD Radeon RAMDisk
Display(s) 2 x 4K LG 27UL600-W (and HUANUO Dual Monitor Mount)
Case Lian Li PC-O11 Dynamic Black (original model)
Audio Device(s) Corsair Commander Pro for Fans, RGB, & Temp Sensors (x4)
Power Supply Corsair RM750x
Mouse Logitech M575
Keyboard Corsair Strafe RGB MK.2
Software Windows 10 Professional (64bit)
Benchmark Scores RIP Ryzen 9 5950x, ASRock X570 Taichi (v1.06), 128GB Micron DDR4-3200 ECC UDIMM (18ASF4G72AZ-3G2F1)
Sounds like good news.
 
Joined
Mar 16, 2017
Messages
2,067 (0.74/day)
Location
Tanagra
System Name Budget Box
Processor Xeon E5-2667v2
Motherboard ASUS P9X79 Pro
Cooling Some cheap tower cooler, I dunno
Memory 32GB 1866-DDR3 ECC
Video Card(s) XFX RX 5600XT
Storage WD NVME 1GB
Display(s) ASUS Pro Art 27"
Case Antec P7 Neo
Previous generations of Intel K SKU processors were less sensitive to these type of settings due to lower default operating voltage and frequency.
Terminal Velocity Boost.
 
Joined
Jan 17, 2018
Messages
64 (0.03/day)
TPU left out a part covered on a few other sites: the problem will cause processor degradation, depending on how long the chip was exposed to it. No one knows if it will fall under warranty or not.
 
Joined
Dec 31, 2020
Messages
968 (0.69/day)
Selecting an XMP profile raises the CPU OC flag and voids the warranty. But other than that
Restore stability at their nominal performance, sure. Lower tolerances and you get same performance. Make it make sense.
 
Joined
Apr 24, 2021
Messages
276 (0.21/day)
So a microcode issue from the vendor was the cause of the stability woes. At least the power of the internet and social media put pressure on Intel, and it worked to find a solution…

But the key point is that Raptor Lake was factory-pushed further to the limit (of stability) than previous Lakes.
 
Joined
Jun 3, 2008
Messages
719 (0.12/day)
Location
Pacific Coast
System Name Z77 Rev. 1
Processor Intel Core i7 3770K
Motherboard ASRock Z77 Extreme4
Cooling Water Cooling
Memory 2x G.Skill F3-2400C10D-16GTX
Video Card(s) EVGA GTX 1080
Storage Samsung 850 Pro
Display(s) Samsung 28" UE590 UHD
Case Silverstone TJ07
Audio Device(s) Onboard
Power Supply Seasonic PRIME 600W Titanium
Mouse EVGA TORQ X10
Keyboard Leopold Tenkeyless
Software Windows 10 Pro 64-bit
Benchmark Scores 3DMark Time Spy: 7695
Intel states, "While this issue is potentially contributing to instability, it is not the root cause."

And Intel states that it is "still investigating" to find a root cause.
 
Joined
Jan 18, 2020
Messages
804 (0.46/day)
TPU left out a part covered on a few other sites: the problem will cause processor degradation, depending on how long the chip was exposed to it. No one knows if it will fall under warranty or not.

If it's a manufacturing fault it should do, or a class action lawsuit from anyone with a damaged processor will be incoming.

The bottom line is Intel pushed these chips too hard due to the fundamental design not matching AMD chips.
 
Joined
Jan 14, 2019
Messages
12,201 (5.75/day)
Location
Midlands, UK
System Name Nebulon B
Processor AMD Ryzen 7 7800X3D
Motherboard MSi PRO B650M-A WiFi
Cooling be quiet! Dark Rock 4
Memory 2x 24 GB Corsair Vengeance DDR5-4800
Video Card(s) AMD Radeon RX 6750 XT 12 GB
Storage 2 TB Corsair MP600 GS, 2 TB Corsair MP600 R2
Display(s) Dell S3422DWG, 7" Waveshare touchscreen
Case Kolink Citadel Mesh black
Audio Device(s) Logitech Z333 2.1 speakers, AKG Y50 headphones
Power Supply Seasonic Prime GX-750
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Bazzite (Fedora Linux) KDE
In an older article:

It looks like Intel owes a lot of apologies. Oops. :slap:
 
Joined
Mar 21, 2016
Messages
2,508 (0.80/day)
While it's Intel's fault, writing faulty microcode certainly wasn't intentional. That's just an innocent **** up by someone that went unnoticed.
 
Joined
Sep 6, 2013
Messages
3,308 (0.81/day)
Location
Athens, Greece
System Name 3 desktop systems: Gaming / Internet / HTPC
Processor Ryzen 5 5500 / Ryzen 5 4600G / FX 6300 (12 years later got to see how bad Bulldozer is)
Motherboard MSI X470 Gaming Plus Max (1) / MSI X470 Gaming Plus Max (2) / Gigabyte GA-990XA-UD3
Cooling Νoctua U12S / Segotep T4 / Snowman M-T6
Memory 32GB - 16GB G.Skill RIPJAWS 3600+16GB G.Skill Aegis 3200 / 16GB JUHOR / 16GB Kingston 2400MHz (DDR3)
Video Card(s) ASRock RX 6600 + GT 710 (PhysX)/ Vega 7 integrated / Radeon RX 580
Storage NVMes, ONLY NVMes/ NVMes, SATA Storage / NVMe boot(Clover), SATA storage
Display(s) Philips 43PUS8857/12 UHD TV (120Hz, HDR, FreeSync Premium) ---- 19'' HP monitor + BlitzWolf BW-V5
Case Sharkoon Rebel 12 / CoolerMaster Elite 361 / Xigmatek Midguard
Audio Device(s) onboard
Power Supply Chieftec 850W / Silver Power 400W / Sharkoon 650W
Mouse CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Keyboard CoolerMaster Devastator III Plus / CoolerMaster Devastator / Logitech
Software Windows 10 / Windows 10&Windows 11 / Windows 10
This is the value in the microcode that was creating the problem

"Win_Benchmarks_At_Any_Cost=YES"

And this is the fix

"Win_Benchmarks_At_Any_Cost=NO"
 
Joined
Jan 14, 2019
Messages
12,201 (5.75/day)
Location
Midlands, UK
System Name Nebulon B
Processor AMD Ryzen 7 7800X3D
Motherboard MSi PRO B650M-A WiFi
Cooling be quiet! Dark Rock 4
Memory 2x 24 GB Corsair Vengeance DDR5-4800
Video Card(s) AMD Radeon RX 6750 XT 12 GB
Storage 2 TB Corsair MP600 GS, 2 TB Corsair MP600 R2
Display(s) Dell S3422DWG, 7" Waveshare touchscreen
Case Kolink Citadel Mesh black
Audio Device(s) Logitech Z333 2.1 speakers, AKG Y50 headphones
Power Supply Seasonic Prime GX-750
Mouse Logitech MX Master 2S
Keyboard Logitech G413 SE
Software Bazzite (Fedora Linux) KDE
While it's Intel's fault, writing faulty microcode certainly wasn't intentional. That's just an innocent **** up by someone that went unnoticed.
...and was blamed on motherboard makers before they finally noticed it.
 
Joined
Mar 21, 2016
Messages
2,508 (0.80/day)
If they weren't doing dubious, questionable stuff with regard to MB defaults, the finger wouldn't have been pointed in their direction in the first place. In any case, it was a problem regardless. It's an issue for both AMD and Intel, and MB makers shouldn't be doing that type of thing with defaults. Nearly everyone with even a shred of sense and integrity agrees with that. It's a bad judgement call by MB makers to eke out more performance to win benchmarks at any cost, as john_ satirically puts it about the microcode. MB makers pushing new BIOSes with proper defaults that weren't dodgy AF probably saved a number of chips from dying of degradation.
 

Solaris17

Super Dainty Moderator
Staff member
Joined
Aug 16, 2005
Messages
26,847 (3.82/day)
Location
Alabama
System Name RogueOne
Processor Xeon W9-3495x
Motherboard ASUS w790E Sage SE
Cooling SilverStone XE360-4677
Memory 128gb Gskill Zeta R5 DDR5 RDIMMs
Video Card(s) MSI SUPRIM Liquid X 4090
Storage 1x 2TB WD SN850X | 2x 8TB GAMMIX S70
Display(s) Odyssey OLED G9 (G95SC)
Case Thermaltake Core P3 Pro Snow
Audio Device(s) Moondrop S8's on schitt Modi+ & Valhalla 2
Power Supply Seasonic Prime TX-1600
Mouse Lamzu Atlantis mini (White)
Keyboard Monsgeek M3 Lavender, Akko Crystal Blues
VR HMD Quest 3
Software Windows 11 Pro Workstation
Benchmark Scores I dont have time for that.
Joined
Feb 1, 2019
Messages
3,520 (1.67/day)
Location
UK, Midlands
System Name Main PC
Processor 13700k
Motherboard Asrock Z690 Steel Legend D4 - Bios 13.02
Cooling Noctua NH-D15S
Memory 32 Gig 3200CL14
Video Card(s) 4080 RTX SUPER FE 16G
Storage 1TB 980 PRO, 2TB SN850X, 2TB DC P4600, 1TB 860 EVO, 2x 3TB WD Red, 2x 4TB WD Red
Display(s) LG 27GL850
Case Fractal Define R4
Audio Device(s) Soundblaster AE-9
Power Supply Antec HCG 750 Gold
Software Windows 10 21H2 LTSC
Interesting interpretation of the news article. My interpretation is that running the BIOS at spec isn't a workaround but a fix, while this microcode fix allows the out-of-spec configuration to run stable again (or at least makes it more likely). I also disagree with the article that the old performance levels are "normal" for the product. Not only were they out of spec, but they were using a faulty microcode that enabled TVB too frequently. I would like to see new reviews on this new microcode. :)

I expect the push for better BIOS defaults will now suddenly vanish, as Intel will want this out of the news ASAP, and the board vendors will want to continue pushing BIOSes that default to out of spec. The solution for both of those is to let the BIOS situation drop.

...and was blamed on motherboard makers before they finally noticed it.
They are still not innocent; running at spec does stabilise chips. It just looks like there are 2 triggers to the problem.
 
Joined
May 3, 2019
Messages
2,039 (1.01/day)
System Name BigRed
Processor I7 12700k
Motherboard Asus Rog Strix z690-A WiFi D4
Cooling Noctua D15S chromax black/MX6
Memory TEAM GROUP 32GB DDR4 4000C16 B die
Video Card(s) MSI RTX 3080 Gaming Trio X 10GB
Storage M.2 drives WD SN850X 1TB 4x4 BOOT/WD SN850X 4TB 4x4 STEAM/USB3 4TB OTHER
Display(s) Dell s3422dwg 34" 3440x1440p 144hz ultrawide
Case Corsair 7000D
Audio Device(s) Logitech Z5450/KEF uniQ speakers/Bowers and Wilkins P7 Headphones
Power Supply Corsair RM850x 80% gold
Mouse Logitech G604 lightspeed wireless
Keyboard Logitech G915 TKL lightspeed wireless
Software Windows 10 Pro X64
Benchmark Scores Who cares
This is the value in the microcode that was creating the problem

"Win_Benchmarks_At_Any_Cost=YES"

And this is the fix

"Win_Benchmarks_At_Any_Cost=NO"

Nice non-biased comment there, well done.
 
Joined
Oct 6, 2021
Messages
1,605 (1.43/day)
Let's get into conspiracy theory mode. I find it quite convenient to release a fix that degrades RPL performance now, months before the release of Arrow Lake. Now I understand the +/- 10% margin of error in Intel's slides.

Pat is a criminal genius. lol
 