
Computer rebooting with black screen while idle

Loubey

New Member
Joined
May 17, 2024
Messages
9 (0.53/day)
I have been getting random reboots on my PC for a while now. I opened Event Viewer and found the same error logged every time it happens.
I don't suppose that anybody can tell me what is causing these issues? I have been looking over the internet for answers and I am seeing no definitive one that I can use to my advantage in fixing this problem.

For system specs, I have an AMD Ryzen 7 1700X (8 cores) and a GeForce GTX 1060 6GB. If any other information is needed to understand what is going on, I can provide it.

Thanks in advance!

-------------------------------------------------------------------------------------

Log Name: System
Source: Microsoft-Windows-WHEA-Logger
Date: 19/05/2024 18:34:26
Event ID: 18
Task Category: None
Level: Error
Keywords:
User: LOCAL SERVICE
Computer: DESKTOP-148MDFF
Description:
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 8

-------------------------------------------------------------------------------------

Log Name: System
Source: Microsoft-Windows-WHEA-Logger
Date: 19/05/2024 18:34:26
Event ID: 18
Task Category: None
Level: Error
Keywords:
User: LOCAL SERVICE
Computer: DESKTOP-148MDFF
Description:
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 6

-------------------------------------------------------------------------------------

Log Name: System
Source: Microsoft-Windows-WHEA-Logger
Date: 19/05/2024 18:34:26
Event ID: 18
Task Category: None
Level: Error
Keywords:
User: LOCAL SERVICE
Computer: DESKTOP-148MDFF
Description:
A fatal hardware error has occurred.

Reported by component: Processor Core
Error Source: Machine Check Exception
Error Type: Cache Hierarchy Error
Processor APIC ID: 6
 

tabascosauz

Moderator
Supporter
Staff member
Joined
Jun 24, 2015
Messages
7,792 (2.39/day)
Location
Western Canada
System Name ab┃ob
Processor 7800X3D┃5800X3D
Motherboard B650E PG-ITX┃X570 Impact
Cooling NH-U12A + T30┃AXP120-x67
Memory 64GB 6400CL32┃32GB 3600CL14
Video Card(s) RTX 4070 Ti Eagle┃RTX A2000
Storage 8TB of SSDs┃1TB SN550
Case Caselabs S3┃Lazer3D HT5
See:

RMAd every single component in my PC, and it's still crashing | TechPowerUp Forums

I am seeing no definitive one that I can use to my advantage in fixing this problem.

Because there isn't one. A Cache Hierarchy Error usually just means unstable cores. But that could be because of any of the following:
  • Straight-up defective CPU
  • Cores with a bad factory V-F curve that is inherently unstable because it is more aggressive than the core can handle
  • Bad board/firmware that makes the cores unstable, since there is so much board-to-board variation even with the exact same CPU
No one knows which; each CPU is different.

There are some things you can try on your end: disable global C-states, and set power supply idle to Typical instead of Low current idle. You can also try a bit of an overvolt in the BIOS (positive Vcore offset), or just a conservative all-core static OC at a higher Vcore.

But those are not fixes either and don't have much chance of doing anything - the only certain fix is to RMA the CPU, which unfortunately doesn't sound like an option for something as old as a 1700X. At that point, if the quick fixes don't work, just pick up another cheap AM4 CPU.
 

Loubey

New Member
Joined
May 17, 2024
Messages
9 (0.53/day)
See:

RMAd every single component in my PC, and it's still crashing | TechPowerUp Forums



Because there isn't one. Cache Hierarchy usually just means unstable cores. But that could be because
  • Defective CPU
  • Cores with a bad factory V-F curve that is inherently unstable
  • Bad board/firmware that makes the cores unstable, since there is so much board-to-board variation even with the exact same CPU
no one knows, each CPU is different.

There are some things you can try on your end: disable global Cstates, set power supply idle to Typical instead of Low current idle. But those are not fixes either and don't have much chance of doing anything - the only certain fix is to RMA the CPU. Which unfortunately sounds like it's not an option for something as old as a 1700X.
It is an old computer, so could it just be that the CPU has come to its end? It's not as if this was an issue right from when I bought the PC; it only started within the past two years, and the computer is around 6 years old, perhaps even older. I'll give your suggestions a go and see how I get on from there. Thanks for the reply!

It appears I don't have a lot of control over what I can do in the BIOS of this PC. I am using a Dell Inspiron and the options in its BIOS are ... limited. Unless I'm being thick and have somehow just missed all of these semi-fixes you have told me about, I am not sure there is a whole lot I can do.
 
Joined
Jul 25, 2006
Messages
12,304 (1.89/day)
Location
Nebraska, USA
System Name Brightworks Systems BWS-6 E-IV
Processor Intel Core i5-6600 @ 3.9GHz
Motherboard Gigabyte GA-Z170-HD3 Rev 1.0
Cooling Quality case, 2 x Fractal Design 140mm fans, stock CPU HSF
Memory 32GB (4 x 8GB) DDR4 3000 Corsair Vengeance
Video Card(s) EVGA GEForce GTX 1050Ti 4Gb GDDR5
Storage Samsung 850 Pro 256GB SSD, Samsung 860 Evo 500GB SSD
Display(s) Samsung S24E650BW LED x 2
Case Fractal Design Define R4
Power Supply EVGA Supernova 550W G2 Gold
Mouse Logitech M190
Keyboard Microsoft Wireless Comfort 5050
Software W10 Pro 64-bit
With Event ID: 18, Cache Hierarchy Error problems, I've seen fixes that include replacing the PSU, replacing the CPU, replacing the motherboard, replacing the CPU and the motherboard, updating the graphics card driver, and replacing the RAM. So I agree with tabascosauz. There is no one cause; therefore, there is no one solution.

As a hardware guy, I always like to start with power. Since EVERYTHING inside the case depends on good, clean stable power, I would swap in a known good PSU - if for no other reason than to eliminate that from the equation.

If you have more than one stick of RAM, try one at a time.

If you have or can borrow a different graphics card I would try that too - preferably going with an AMD graphics solution this time. Changing families (NVIDIA to AMD, or AMD to NVIDIA) forces a complete reinstall, reset and overwrite of all the graphics settings and parameters. Going with the same family (NVIDIA in this case) often does not force a complete reset.
 
Joined
Jan 1, 2012
Messages
128 (0.03/day)
The PC is restarting itself because the CPU is identifying a machine check exception condition. It's more often a hardware issue than a software one, and anything could be causing it.

If you have multiple Event ID 18 logs, check the "APIC ID" of them all. If it's always different, then that highly suggests that at the very least it's probably not a bad core. If it's always the same, it suggests that it might be.
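If you have a lot of these events, counting the APIC IDs by hand gets tedious. Here is a minimal sketch in Python that tallies them, assuming you have copied the event text out of Event Viewer into a string. The function name `tally_apic_ids` is made up for illustration, and the sample text only reuses the three events posted at the top of this thread, which is far too small a sample to conclude anything from.

```python
import re
from collections import Counter

# Matches the "Processor APIC ID: N" line that appears in the text of
# WHEA-Logger Event ID 18 entries, as pasted earlier in this thread.
APIC_RE = re.compile(r"Processor APIC ID:\s*(\d+)")


def tally_apic_ids(event_text: str) -> Counter:
    """Count how often each Processor APIC ID appears in pasted event text.

    Hypothetical helper: it only scans plain text and does not read the
    Windows event log itself.
    """
    return Counter(int(match) for match in APIC_RE.findall(event_text))


# Trimmed copy of the three events from the first post.
sample = """
Error Type: Cache Hierarchy Error
Processor APIC ID: 8

Error Type: Cache Hierarchy Error
Processor APIC ID: 6

Error Type: Cache Hierarchy Error
Processor APIC ID: 6
"""

counts = tally_apic_ids(sample)
for apic_id, n in sorted(counts.items()):
    print(f"APIC ID {apic_id}: {n} event(s)")
```

If one ID dominates across many separate crashes, that leans toward a single bad core. If the IDs are spread around, it leans toward something shared, such as power, the board, or firmware.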
With Event ID: 18, Cache Hierarchy Error problems, I've seen fixes that include replacing the PSU, replacing the CPU, replacing the motherboard, replacing the CPU and the motherboard, updating the graphics card driver, and replacing the RAM. So I agree with tabascosauz. There is no one cause; therefore, there is no one solution.
And just for good measure, I'll add one more. I had this exact issue driving me up the wall for two months after replacing a GTX 1060 with a 7800 XT, and then doing an RMA on the 7800 XT (sort of) resolved it. So even the graphics card can seemingly be a variable.

The only thing that still has me questioning it is that the issue got worse (not better) when I disabled my RAM profile and ran at lower JEDEC speeds. So maybe it was platform-side after all. I read a theory, and "theory" is the key word, that the Infinity Fabric can be borderline unstable or touchy about heat, and that this may cause chipset stuff, like the graphics subsystem, to get dropped, which would seemingly explain this. And yet it was replacing the GPU that got rid of the issue for me in all but one reproducible case, and I'm not sure what to make of that one remaining case. I want to say it's a game issue, but that shouldn't be causing a machine check exception. I sort of gave up on it and accepted "good enough" stability after wasting too much time and money on it.

If you're getting Event ID 18, look in the following directories.

Windows/LiveKernelReports/WHEA
Windows/LiveKernelReports/WATCHDOG

If any logs exist, use WinDbg to open and analyze them and see if there are any clues.

In my case, the WHEA logs were 0x124 errors without much else as a clue, and the WATCHDOG logs pointed to the GPU drivers (which likely weren't causing it, but simply crashing as a cascading effect of the GPU itself dropping out).
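To quickly check whether either of those two directories has anything in it, something like the following Python sketch could work. It assumes Windows is installed at `C:\Windows`, and the helper name `find_kernel_reports` is made up for illustration. Reading these folders may need an elevated prompt. It only lists the files; you would still open them in WinDbg.

```python
from pathlib import Path
from typing import List

# Subfolders of LiveKernelReports named in the post above.
REPORT_SUBDIRS = ("WHEA", "WATCHDOG")


def find_kernel_reports(windows_dir: Path) -> List[Path]:
    """Return the files under <windows_dir>/LiveKernelReports/{WHEA,WATCHDOG}.

    Hypothetical helper: folders that do not exist are skipped silently.
    """
    found: List[Path] = []
    for sub in REPORT_SUBDIRS:
        folder = windows_dir / "LiveKernelReports" / sub
        if folder.is_dir():
            found.extend(sorted(p for p in folder.iterdir() if p.is_file()))
    return found


if __name__ == "__main__":
    # Assumed default install location; adjust if Windows lives elsewhere.
    for report in find_kernel_reports(Path(r"C:\Windows")):
        print(report)
```

An empty result just means no live kernel reports were written, not that the hardware is fine.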

I'd also set platform-side stuff (RAM speed, Infinity Fabric speed, etc.) to stock/BIOS defaults. In my case this made it worse, but as a rule, reset to stock to see if that changes anything.
 