PC randomly reboots while playing specifically Steam games

chopinanopolis · Feb 14, 2024

This is my first PC, and I'll start off by saying I'm not tech savvy when it comes to PC stuff. I've been having this issue basically since I built the PC a bit over two weeks ago: while I'm playing Steam games, my PC randomly restarts. At first all I saw in Event Viewer were DCOM errors, which do still show up, but I think I never scrolled far enough down (like I said, I'm basically a toddler when it comes to PC stuff, I know nothing). But now fatal hardware errors come up, and they are:

A fatal hardware error has occurred.

Reported by component: Processor Core

Error Source: Machine Check Exception

Error Type: Cache Hierarchy Error

Processor APIC ID: 0

The details view of this entry contains further information.

and:

A fatal hardware error has occurred.

Reported by component: Processor Core

Error Source: Machine Check Exception

Error Type: Cache Hierarchy Error

Processor APIC ID: 6

The details view of this entry contains further information.

From what I've seen those can have something to do with Core utilization, but I have no clue how to change any of that stuff, and I'm honestly too scared to brick something accidentally, so I'd rather not poke around on my own. Thank you for the help in advance

I should probably add, my specs are:
Radeon 6650XT
Ryzen 5600
16GB (2x 8GB) Kingston FURY Beast DDR4-3200
1TB Lexar NM620 NVMe
ASRock B450M Pro4
560 Watt LC-Power LC6560GP4

Bill_Bright · Feb 14, 2024

Since this happened from day one, I would take everything out of the case and assemble it on a large, wooden bread/cutting board and see if the problem happens there.

A common mistake by the less experienced and distracted pros alike is to insert one or more extra standoffs in the case under the motherboard. Any extra standoff creates the potential for an electrical “short” in one or more circuits. The results range from "nothing" (everything works perfectly) to odd problems, to "nothing" (as in nothing works at all). To add to the confusion, these issues may be intermittent, depending on heat, expansion/contraction of materials, as well as continuity/resistance through the contact point. Therefore, you need to ensure the case only has an inserted standoff where there is a corresponding motherboard mounting hole.

Note the latest version of the ATX Form Factor standard hopes to eliminate these issues by dictating where standoffs will go, not just where they may go. But not all existing boards or cases comply with those latest standards - yet. So, you still should verify you only inserted a standoff where there is a corresponding motherboard mounting hole.

Taking everything out and testing on the bread board will give you the opportunity to double check your standoffs.

chopinanopolis · Feb 14, 2024

Bill_Bright said:
Since this happened from day one, I would take everything out of the case and assemble it on a large, wooden bread/cutting board and see if the problem happens there.

A common mistake by the less experienced and distracted pros alike is to insert one or more extra standoffs in the case under the motherboard. Any extra standoff creates the potential for an electrical “short” in one or more circuits. The results range from "nothing" (everything works perfectly) to odd problems, to "nothing" (as in nothing works at all). To add to the confusion, these issues may be intermittent, depending on heat, expansion/contraction of materials, as well as continuity/resistance through the contact point. Therefore, you need to ensure the case only has an inserted standoff where there is a corresponding motherboard mounting hole.

Note the latest version of the ATX Form Factor standard hopes to eliminate these issues by dictating where standoffs will go, not just where they may go. But not all existing boards or cases comply with those latest standards - yet. So, you still should verify you only inserted a standoff where there is a corresponding motherboard mounting hole.

Taking everything out and testing on the bread board will give you the opportunity to double check your standoffs.

On the case I have (DeepCool CC360 mATX) all of the standoffs were already installed out of the box. So I didn't put any in myself. And I should probably add, it doesn't happen consistently. Sometimes it happens 3 minutes after launching a game, and sometimes it happens after multiple hours. When it crashes, and I re-register the DCOM and APPIDs it usually doesn't happen again until I turn off my PC. One DCOM ID I managed to stop from popping up, but the one that still comes up is:

{2593F8B9-4EAF-457C-B68A-50F6B8EA6B54} and the APPID is:
{15C20B67-12E7-4BB6-92BB-7AFF07997402}

And like I said, it only happens in steam games specifically. No other launcher has caused this for me

Bill_Bright · Feb 14, 2024

chopinanopolis said:
On the case I have (DeepCool CC360 mATX) all of the standoffs were already installed out of the box.

I would still want to verify my motherboard had a mounting hole where each of those were.

As a technician, I also always want to verify I am supplying good power. This is especially true with computers since everything inside the case depends on good, clean, stable power. I have never heard of LC-Power or their supplies. But I do like to say to buy a quality supply from a reputable maker because, "You would not want to buy a brand new Porsche then fill it up with unknown fuel at the corner Tobacco and Bait shop".

Regardless, even the best of the best can have a unit that does not meet specs. So I would swap in a known good supply to see what happens.

You might also try 1 stick of RAM at a time.

And you need to make sure you are supplying an adequate supply of cool air flowing through the case. And monitor your temps. I use and recommend Core Temp to monitor CPU temps in real time. Under Options > Settings > Notification Area, I have mine set to display "Highest temperature" only.

KedsDead · Feb 14, 2024

and remove any over clocks... Then retest each one for stability, separately..

Splinterdog · Feb 14, 2024

Have you checked your CPU temps?

HWMonitor (CPUID): www.cpuid.com

HWMonitor for Windows x86/x64 is a hardware monitoring program that reads a PC's main health sensors: voltages, temperatures, power, currents, fan speeds, utilization, clock speeds...

Try one mem stick at a time and possibly an alternative power supply for testing purposes, if you have access to one.
I realise that you're new to this, but those are the things I would try first, not to mention what @Bill_Bright suggested.

Princess Garnet · Feb 14, 2024

In addition to the other replies, which have good ideas to check first: not all that long ago I spent a long two months diagnosing an issue where I suspected one part as the cause but had doubts, so I decided to do the "let's play with every possible solution and see if it fixes it" routine, and nope, it was what I thought in the end. In my adventure, I had that same Event ID (18, right?), and from what I figured out while diagnosing mine, this is the one created on AMD systems (not that only AMD systems can have this issue) when they encounter a machine check exception. I mean, it's right there in the log, so it's not a stretch or anything. Modern PCs have a machine check architecture, and they usually respond to such situations in certain ways; restarting seems to be a common one.

These are almost always a hardware issue, not a software one. The faulty part can be... anything. Do note that machine check exceptions can be caught by the CPU (core) or in memory. I say this because a lot of people seeing that log see "reported by CPU core" and suspect a faulty CPU. It can be anything, not just the CPU. The CPU (in this case) is just what caught it. If the "Processor APIC ID" value is always the same logical core then it might suggest a bad CPU core, but if that value varies, it just means a different core is catching it and I would then think the issue is elsewhere.
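
To make the "same core every time vs. a different core each time" check less of an eyeball job, here is a minimal Python sketch that tallies the "Processor APIC ID" values from event text pasted out of Event Viewer. The sample text is the two entries from the first post; the function names and the `min_events` threshold are made up for illustration and are not a documented rule.

```python
import re
from collections import Counter

# Sample text copied from the two Event Viewer entries in the first post.
EVENT_TEXT = """
Reported by component: Processor Core
Error Type: Cache Hierarchy Error
Processor APIC ID: 0

Reported by component: Processor Core
Error Type: Cache Hierarchy Error
Processor APIC ID: 6
"""

def tally_apic_ids(text):
    """Count how often each 'Processor APIC ID' value appears in pasted event text."""
    ids = re.findall(r"Processor APIC ID:\s*(\d+)", text)
    return Counter(int(i) for i in ids)

def single_core_suspected(counts, min_events=5):
    """Hypothetical rule of thumb: suspect one core only if every event names it.

    min_events is an arbitrary illustration value, not a documented threshold.
    """
    total = sum(counts.values())
    return total >= min_events and len(counts) == 1

counts = tally_apic_ids(EVENT_TEXT)
print(dict(counts))
print(single_core_suspected(counts))
```

With only a handful of events this proves nothing either way; the point is just that IDs spread over several cores argue against one bad core.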

In my case it was the graphics card, but you're going to need to rule parts out as good to proceed.

The field "Error Type: Cache Hierarchy Error" does match what I had too. Sometimes this says "Error Type: Bus/Interconnect" and that might help narrow it down?

Since this happens while playing games, I would probably be looking in the broad direction of the graphics card or PSU first (in no particular order).

You might want to download OCCT and use some of its stress tests (it has a suite to test different ways to stress different parts) and see if that narrows it down. I did that and found it was the "GPU variable" that liked to crash. No others ever crashed for me. Prime95+ Furmark + a web browser or a game plus a web browser (active in the latter) or just a game did it, in order of most likely/fastest to least likely/longer to occur. You might get a clue to the part causing it based on what test fails, but your own real world use so far suggests graphics card or PSU I think.

Additionally, check these directories...

Windows/LiveKernelReports/WHEA

Windows/LiveKernelReports/WATCHDOG

Are there logs present here? At least ones that correspond to the time of this restart? If so, either upload them, or if you want to check yourself, a program called WinDbg can open and analyze them. With that Event ID log, I suspect WHEA logs (if present) might be the generic 0x124 error and not as helpful, and the Watch Dog logs might be more specific. Or maybe I'm presuming that since that was how it went for me. Either way, check for those. Windows can be helpful if it's reserving any clues but people usually skip straight past it.

Launcestonian · Feb 15, 2024

Princess Garnet said:
... (snip)

Since this happens while playing games, I would probably be looking in the broad direction of the graphics card or PSU first (in no particular order).

You might want to download OCCT and use some of its stress tests (it has a suite to test different ways to stress different parts) and see if that narrows it down. I did that and found it was the "GPU variable" that liked to crash. No others ever crashed for me. Prime95+ Furmark + a web browser or a game plus a web browser (active in the latter) or just a game did it, in order of most likely/fastest to least likely/longer to occur. You might get a clue to the part causing it based on what test fails, but your own real world use so far suggests graphics card or PSU I think.

... (snip)

This is exactly what I would be doing to ascertain whether the hardware can run smoothly & properly under load. That PSU is suspicious imo. Running the system under various load thresholds will bring up any weaknesses on the power supply side of the whole system, including the VRMs on the motherboard. But start with that PSU to rule out reliability problems. A quick search reveals only one review of it on Amazon, from 8 yrs ago!

chopinanopolis · Feb 21, 2024

Bill_Bright said:
I would still want to verify my motherboard had a mounting hole where each of those were.

As a technician, I also always want to verify I am supplying good power. This is especially true with computers since everything inside the case depends on good, clean, stable power. I have never heard of LC-Power or their supplies. But I do like to say to buy a quality supply from a reputable maker because, "You would not want to buy a brand new Porsche then fill it up with unknown fuel at the corner Tobacco and Bait shop".

Regardless, even the best of the best can have a unit that does not meet specs. So I would swap in a known good supply to see what happens.

You might also try 1 stick of RAM at a time.

And you need to make sure you are supplying an adequate supply of cool air flowing through the case. And monitor your temps. I use and recommend Core Temp to monitor CPU temps in real time. Under Options > Settings > Notification Area, I have mine set to display "Highest temperature" only.

Hey, sorry for the late reply. I checked the standoffs, and there are no extra ones. On Mindfactory (one of Germany's biggest PC part suppliers) the PSU has good reviews, and sadly I don't have an extra one to test with. I ran OCCT tests on the CPU, GPU, RAM, and Power in general (not 100% sure what that means, that's what the test option is called in OCCT) and nothing came up. I haven't tried running it with one stick of RAM yet; I still have to do that. Temperatures are fine as well: the CPU idles at 30° and maxes out at around 48-50°, and the GPU idles at 30° and maxes out at around 75°. The only thing I'm not sure about right now is that MSI Afterburner shows two different GPU temperatures. One matches the temperature shown in OCCT and other programs, but the second one maxes out at 98°, and I have no clue what that means, since nothing else gives me two different temperatures.

It said online that Cache Hierarchy errors can be caused by drivers, and one recommendation I saw was to uninstall and reinstall the USB drivers, which I did. It ran fine after that for a few days, but yesterday it crashed again. I'm not comfortable messing around with other drivers since I saw online that it could possibly mess up your PC, so without some sort of instructions, I'd rather avoid that for now. I thought about checking the mobo BIOS as well, but I'm not sure if I can do that without messing stuff up. When I bought the mobo and CPU, it said that they're only compatible after a certain BIOS version, but I'd assume the CPU wouldn't work at all if that was the case rn?

Also, I've been getting this Kernel error: The driver \Driver\WudfRd failed to load for the device HID\VID_0951&PID_16A4&MI_03&Col02\8&22afd592&0&0001.
Ok, so when I copied and pasted this error it took me to a reddit thread saying that this is apparently caused by the exact headset that I'm using right now, so I changed the driver in Device Manager. Not sure yet if it helped or not. Since the crashes aren't consistent, it's hard to unplug the headset and test for crashes, since it sometimes doesn't happen for a few days.
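
The device path in that error carries the USB vendor ID (VID), product ID (PID), and interface number (MI), which is what makes it searchable and lets you match it to a specific device. A minimal Python sketch of pulling those fields out, using the path quoted above (the function name `parse_usb_ids` is hypothetical):

```python
import re

# Device instance path copied from the WudfRd error quoted above.
DEVICE_PATH = r"HID\VID_0951&PID_16A4&MI_03&Col02\8&22afd592&0&0001"

def parse_usb_ids(path):
    """Return the vendor ID, product ID, and interface number found in a device path."""
    pattern = re.compile(
        r"VID_([0-9A-Fa-f]{4})&PID_([0-9A-Fa-f]{4})(?:&MI_([0-9A-Fa-f]{2}))?"
    )
    match = pattern.search(path)
    if match is None:
        return None  # not a USB-style hardware ID
    vid, pid, interface = match.groups()
    return {"vid": vid.upper(), "pid": pid.upper(), "interface": interface}

print(parse_usb_ids(DEVICE_PATH))
```

The same VID/PID pair shows up under the device's Details tab (Hardware Ids) in Device Manager, so it can be used to confirm which entry the error refers to.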

Princess Garnet said:
In addition to the other replies, which have good ideas to check first: not all that long ago I spent a long two months diagnosing an issue where I suspected one part as the cause but had doubts, so I decided to do the "let's play with every possible solution and see if it fixes it" routine, and nope, it was what I thought in the end. In my adventure, I had that same Event ID (18, right?), and from what I figured out while diagnosing mine, this is the one created on AMD systems (not that only AMD systems can have this issue) when they encounter a machine check exception. I mean, it's right there in the log, so it's not a stretch or anything. Modern PCs have a machine check architecture, and they usually respond to such situations in certain ways; restarting seems to be a common one.

These are almost always a hardware issue, not a software one. The faulty part can be... anything. Do note that machine check exceptions can be caught by the CPU (core) or in memory. I say this because a lot of people seeing that log see "reported by CPU core" and suspect a faulty CPU. It can be anything, not just the CPU. The CPU (in this case) is just what caught it. If the "Processor APIC ID" value is always the same logical core then it might suggest a bad CPU core, but if that value varies, it just means a different core is catching it and I would then think the issue is elsewhere.

In my case it was the graphics card, but you're going to need to rule parts out as good to proceed.

The field "Error Type: Cache Hierarchy Error" does match what I had too. Sometimes this says "Error Type: Bus/Interconnect" and that might help narrow it down?

Since this happens while playing games, I would probably be looking in the broad direction of the graphics card or PSU first (in no particular order).

You might want to download OCCT and use some of its stress tests (it has a suite to test different ways to stress different parts) and see if that narrows it down. I did that and found it was the "GPU variable" that liked to crash. No others ever crashed for me. Prime95+ Furmark + a web browser or a game plus a web browser (active in the latter) or just a game did it, in order of most likely/fastest to least likely/longer to occur. You might get a clue to the part causing it based on what test fails, but your own real world use so far suggests graphics card or PSU I think.

Additionally, check these directories...

Windows/LiveKernelReports/WHEA

Windows/LiveKernelReports/WATCHDOG

Are there logs present here? At least ones that correspond to the time of this restart? If so, either upload them, or if you want to check yourself, a program called WinDbg can open and analyze them. With that Event ID log, I suspect WHEA logs (if present) might be the generic 0x124 error and not as helpful, and the Watch Dog logs might be more specific. Or maybe I'm presuming that since that was how it went for me. Either way, check for those. Windows can be helpful if it's reserving any clues but people usually skip straight past it.

For me the APIC ID is always 0 and most of the time 6. I ran OCCT tests for the PSU (It's called Power in OCCT, so I assume it's the PSU), GPU, CPU and RAM and no errors came up.

I'm gonna be honest, I have no clue how to get to those directories, is it through CMD?

MarsM4N · Feb 21, 2024

Princess Garnet said:
Are there logs present here? At least ones that correspond to the time of this restart? If so, either upload them, or if you want to check yourself, a program called WinDbg can open and analyze them. With that Event ID log, I suspect WHEA logs (if present) might be the generic 0x124 error and not as helpful, and the Watch Dog logs might be more specific. Or maybe I'm presuming that since that was how it went for me. Either way, check for those. Windows can be helpful if it's reserving any clues but people usually skip straight past it.

This.

Before ripping apart your rig download WinDbg (Windows Debugger) and analyse your crash dumps.

chopinanopolis said:
Also, I've been getting the Kernel error ; The driver \Driver\WudfRd failed to load for the device HID\VID_0951&PID_16A4&MI_03&Col02\8&22afd592&0&0001.
Ok, so when I copy and pasted this error it took me to a reddit thread, that this is apparently caused by the exact headset that I'm using right now, so I changed the driver in the device manager, not sure yet if it helped or not. Since the crashes aren't consistent it's hard to unplug the headset and test for crashes since it sometimes doesn't happen for a few days

For me the APIC ID is always 0 and most of the time 6. I ran OCCT tests for the PSU (It's called Power in OCCT, so I assume it's the PSU), GPU, CPU and RAM and no errors came up.

I'm gonna be honest, I have no clue how to get to those directories, is it through CMD?

There is your problem, "WudfRd". A driver error. :oops:

Now you only need to narrow down which device needs a new driver. Check if you can find more clues in the crash report. Check Device Manager for a device showing a warning or a question mark. Did you install all the latest drivers/BIOS from your mainboard manufacturer? Latest GPU driver?

Auto-Detect and Install Driver Updates for AMD Radeon™ Series Graphics and Ryzen™ Chipsets (first install all drivers from your mainboard manufacturer)

You can also run CHKDSK & SFC, doesn't hurt. But what I found on Google & YouTube points more in the direction of a driver problem with a USB device.

Princess Garnet · Feb 21, 2024

chopinanopolis said:
For me the APIC ID is always 0 and most of the time 6. I ran OCCT tests for the PSU (It's called Power in OCCT, so I assume it's the PSU), GPU, CPU and RAM and no errors came up.

I'm gonna be honest, I have no clue how to get to those directories, is it through CMD?

If the APIC ID is sometimes different, it starts to rule out the issue is a (single) bad CPU core.

You get to those directories through Windows' file explorer.

C > Windows > LiveKernelReports

The WHEA and WATCHDOG directories will be in there. Check both directories for any log files. If they have them, then these will hold clues to the restarts. You can use WinDbg to open and analyze them, or if you can't figure it out or don't want to try, upload two or three of the latest logs from each directory.
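
If clicking through folders is a hassle, the same check can be sketched in Python: list every .dmp file in the two folders, newest first, so the timestamps can be lined up with the restarts. This is only a sketch; the function name `list_live_dumps` is made up, and I'm assuming Windows is on C: and that you may need to run it from an administrator prompt to be allowed to read those folders.

```python
from datetime import datetime
from pathlib import Path

def list_live_dumps(root=r"C:\Windows\LiveKernelReports", subdirs=("WHEA", "WATCHDOG")):
    """Return (subdir, file name, modified time) for every .dmp file, newest first."""
    found = []
    for sub in subdirs:
        folder = Path(root) / sub
        if not folder.is_dir():
            continue  # the folder may not exist if no report was ever written
        for dump in folder.glob("*.dmp"):
            modified = datetime.fromtimestamp(dump.stat().st_mtime)
            found.append((sub, dump.name, modified))
    return sorted(found, key=lambda row: row[2], reverse=True)

if __name__ == "__main__":
    for sub, name, modified in list_live_dumps():
        print(f"{modified:%Y-%m-%d %H:%M}  {sub:<9} {name}")
```

Compare the printed times against the Event ID 18 entries in Event Viewer; a dump written within a minute or so of a restart is the one worth opening in WinDbg.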

I admit I'm making a guess here, but I think your problem is a possible video card issue. If not, PSU is pretty likely too. It's restarting in games but holding stable in most stress tests, and you're also describing symptoms to the letter of what a Ryzen CPU will do when it encounters a machine check exception. And those are almost always a hardware issue and rarely a software issue. Bridging that all together, I'm suspecting video card or maybe PSU behind that.

That's not to say you can't try alternatives. That's not to say it isn't a driver or software issue. Not everyone likes jumping to conclusions, after all. I don't either, and that's why I also made sure to exhaust everything else when I ran into a similar (if not the same) issue you are now, but in the end it was the thing that I suspected all along. And your symptoms are quite telling. Follow the steps (check those logs, namely), but my guess is on video card.

If you want to check further, run a game (or run FurMark), and set it to windowed mode/Alt+Tab/whatever you need to do to keep the game open, and then also open a hardware accelerated browser, and browse for... as long as it takes. Does this restart also eventually occur in this game or FurMark plus hardware accelerated browser scenario? This was a way for me to encourage it to occur, since games were hit or miss and might take longer, and most stress tests passed (OCCT's "GPU variable" was the only one that ever failed for me, and even that passed more often than it didn't). Since games might be fine for upwards of days for you (just like me), that might be a way to try and force it almost on demand. Then, if logs are created, you can check those.

chopinanopolis · Feb 21, 2024

Princess Garnet said:
If the APIC ID is sometimes different, it starts to rule out the issue is a (single) bad CPU core.

You get to those directories through Windows' file explorer.

C > Windows > LiveKernelReports

The WHEA and WATCHDOG directories will be in there. Check both directories for any log files. If they have them, then these will hold clues to the restarts. You can use WinDbg to open and analyze them, or if you can't figure it out or don't want to try, upload two or three of the latest logs from each directory.

I admit I'm making a guess here, but I think your problem is a possible video card issue. If not, PSU is pretty likely too. It's restarting in games but holding stable in most stress tests, and you're also describing symptoms to the letter of what a Ryzen CPU will do when it encounters a machine check exception. And those are almost always a hardware issue and rarely a software issue. Bridging that all together, I'm suspecting video card or maybe PSU behind that.

That's not to say you can't try alternatives. That's not to say it isn't a driver or software issue. Not everyone likes jumping to conclusions, after all. I don't either, and that's why I also made sure to exhaust everything else when I ran into a similar (if not the same) issue you are now, but in the end it was the thing that I suspected all along. And your symptoms are quite telling. Follow the steps (check those logs, namely), but my guess is on video card.

If you want to check further, run a game (or run FurMark), and set it to windowed mode/Alt+Tab/whatever you need to do to keep the game open, and then also open a hardware accelerated browser, and browse for... as long as it takes. Does this restart also eventually occur in this game or FurMark plus hardware accelerated browser scenario? This was a way for me to encourage it to occur, since games were hit or miss and might take longer, and most stress tests passed (OCCT's "GPU variable" was the only one that ever failed for me, and even that passed more often than it didn't). Since games might be fine for upwards of days for you (just like me), that might be a way to try and force it almost on demand. Then, if logs are created, you can check those.

It just crashed again, and there's a log in the WHEA directory. There are some in the WATCHDOG directory as well, but those are older. This is the WHEA log:

WHEA_UNCORRECTABLE_ERROR (124)
A fatal hardware error has occurred. Parameter 1 identifies the type of error
source that reported the error. Parameter 2 holds the address of the
nt!_WHEA_ERROR_RECORD structure that describes the error condition. Try !errrec Address of the nt!_WHEA_ERROR_RECORD structure to get more details.
Arguments:
Arg1: 0000000000000000, Machine Check Exception
Arg2: ffffb78e9de6eb90, Address of the nt!_WHEA_ERROR_RECORD structure.
Arg3: 00000000bea00000, High order 32-bits of the MCi_STATUS value.
Arg4: 0000000001000108, Low order 32-bits of the MCi_STATUS value.

KEY_VALUES_STRING: 1

Key : Analysis.CPU.mSec
Value: 3327

Key : Analysis.Elapsed.mSec
Value: 10734

Key : Analysis.IO.Other.Mb
Value: 17

Key : Analysis.IO.Read.Mb
Value: 10

Key : Analysis.IO.Write.Mb
Value: 30

Key : Analysis.Init.CPU.mSec
Value: 562

Key : Analysis.Init.Elapsed.mSec
Value: 28806

Key : Analysis.Memory.CommitPeak.Mb
Value: 86

Key : Bugcheck.Code.LegacyAPI
Value: 0x124

Key : Dump.Attributes.AsUlong
Value: 18

Key : Dump.Attributes.KernelGeneratedTriageDump
Value: 1

Key : Failure.Bucket
Value: LKD_0x124_0_AuthenticAMD_PROCESSOR__UNKNOWN_IMAGE_AuthenticAMD.sys

Key : Failure.Hash
Value: {f59f17e7-f24e-04f5-3f16-e9425b2acba5}

BUGCHECK_CODE: 124

BUGCHECK_P1: 0

BUGCHECK_P2: ffffb78e9de6eb90

BUGCHECK_P3: bea00000

BUGCHECK_P4: 1000108

FILE_IN_CAB: WHEA-20240221-2248.dmp

DUMP_FILE_ATTRIBUTES: 0x18
Kernel Generated Triage Dump
Live Generated Dump

PROCESS_NAME: smss.exe

STACK_TEXT:
ffff8800`8b207150 fffff806`2ff6089f : ffffb78e`9de6eb70 00000000`00000000 ffffb78e`9de6eb90 00000000`00000022 : nt!LkmdTelCreateReport+0x13e
ffff8800`8b207690 fffff806`2ff60796 : ffffb78e`9de6eb70 fffff806`00000000 00000078`00000000 00000078`f2dff690 : nt!WheapReportLiveDump+0x7b
ffff8800`8b2076d0 fffff806`2fdd3d8d : 00000000`00000001 ffff8800`8b207b40 00000078`f2dff690 00000000`00000218 : nt!WheapReportDeferredLiveDumps+0x7a
ffff8800`8b207700 fffff806`2fc88327 : 00000000`00000000 ffffb78e`99da4030 00000000`00000103 00000000`00000000 : nt!WheaCrashDumpInitializationComplete+0x59
ffff8800`8b207730 fffff806`2fa11138 : ffffb78e`9f120000 ffffb78e`9f11fb00 ffff8800`8b207b40 ffffb78e`00000000 : nt!NtSetSystemInformation+0x1f7
ffff8800`8b207ac0 00007ffd`0f410554 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KiSystemServiceCopyEnd+0x28
00000078`f2dff638 00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : 0x00007ffd`0f410554

MODULE_NAME: AuthenticAMD

IMAGE_NAME: AuthenticAMD.sys

STACK_COMMAND: .cxr; .ecxr ; kb

FAILURE_BUCKET_ID: LKD_0x124_0_AuthenticAMD_PROCESSOR__UNKNOWN_IMAGE_AuthenticAMD.sys

OSPLATFORM_TYPE: x64

OSNAME: Windows 10

FAILURE_ID_HASH: {f59f17e7-f24e-04f5-3f16-e9425b2acba5}

Followup: MachineOwner

I'm still getting the WudfRd driver error though. That has popped up every single time the PC restarts, and doesn't change. Could it be that my headset is just fucked? Cause googling the error message from Event Viewer brings me to a reddit thread for basically the same error caused by the exact same headset I'm using. I just switched to the third driver option for the headset in Device Manager, maybe that works.

And the APIC ID was different this time: it was 10, and there was only one APIC entry this time.

KedsDead · Feb 22, 2024

If you're running the 24.1.1 drivers for your GPU, roll them back. I had to roll them back so a few games would not stutter and a few others would actually load.

LabRat 891 · Feb 22, 2024

Princess Garnet said:
If the APIC ID is sometimes different, it starts to rule out the issue is a (single) bad CPU core.

This was where I was about to go.

My 5600 was 'solid' for the better part of a year, and then started BSODing and giving WHEA and execution errors. Set everything to out of box stock and the issues persisted until it wouldn't boot an OS any longer. Immediately after turning on core leveling (FIVE CORE), all the problems went away.

It took a few days of mucking w/ AMD over e-mail and convincing them I didn't steal the CPU from Newegg but, I did end up w/ a totally BNIB (w/ free cooler) 5600 after about 2 weeks.

Princess Garnet · Feb 22, 2024

chopinanopolis said:
It just crashed again, and there's a log in the WHEA directory. There are some in the WATCHDOG directory as well, but those are older.

You say they are older, but does that just mean in relation to the most recent occurrence? Do they still correspond to any of the prior instances?

As expected, the WHEA entry is an 0x124 log which is a bit generic as far as I know (it seems to be the WHEA equivalent of the Event Viewer's Event ID 18). The Watch Dog log might be a bit more specific here.

I notice these arguments...

chopinanopolis said:
Arg3: 00000000bea00000, High order 32-bits of the MCi_STATUS value.
Arg4: 0000000001000108, Low order 32-bits of the MCi_STATUS value.

chopinanopolis said:
BUGCHECK_CODE: 124

BUGCHECK_P1: 0

BUGCHECK_P2: ffffb78e9de6eb90

BUGCHECK_P3: bea00000

BUGCHECK_P4: 1000108

...Match mine exactly, but I'm not informed enough on this to know if that means it's the same thing. I just recognized them right away as identical to mine (the Event ID 18 in Event Viewer will show these too), and when I was researching those while having my own issue like this, they did lead me to others having similar symptoms.
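
For anyone curious what those two arguments actually encode, here is a minimal Python sketch that joins Arg3 and Arg4 into the 64-bit MCi_STATUS value and reads a few fields out of it. I'm assuming the status-flag positions (bits 63 to 57) and the 16-bit error code patterns as described in the AMD and Intel architecture manuals; the function names are made up, and vendor-specific fields are deliberately left alone.

```python
# Bit positions of the architectural status flags in MCi_STATUS (assumed layout).
FLAG_BITS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting for this error was enabled
    "MISCV": 59,  # MCi_MISC holds extra information
    "ADDRV": 58,  # MCi_ADDR holds an address
    "PCC": 57,    # processor context may be corrupt
}

def combine(high32, low32):
    """Join the high and low 32-bit halves (Arg3, Arg4) into one 64-bit value."""
    return (high32 << 32) | low32

def decode_flags(status):
    """Return each status flag as True/False."""
    return {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}

def error_class(status):
    """Roughly classify the low 16-bit error code by its bit pattern."""
    code = status & 0xFFFF
    if code & 0x0800:                 # 0000 1PPT RRRR IILL
        return "bus/interconnect"
    if (code & 0x0F00) == 0x0100:     # 0000 0001 RRRR TTLL
        return "cache hierarchy"
    if (code & 0x0FF0) == 0x0010:     # 0000 0000 0001 TTLL
        return "TLB"
    return "other"

status = combine(0xBEA00000, 0x01000108)  # Arg3 and Arg4 from the dump above
print(hex(status))
print(decode_flags(status))
print(error_class(status))
```

For these values the class comes out as cache hierarchy, which lines up with the "Error Type: Cache Hierarchy Error" text in Event Viewer. It still does not say which part is at fault, only what kind of error the core reported.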

If you have Watch Dog logs (that relate to any of the times of these crashes), please post those. They may be more helpful.

Again, I hate jumping to conclusions, but if they are present, I'm going to hazard a guess you might have entries like VIDEO_TDR_TIMEOUT_DETECTED (117), VIDEO_ENGINE_TIMEOUT_DETECTED (141), VIDEO_DXGKRNL_BLACK_SCREEN_LIVEDUMP (1a8), and/or VIDEO_MINIPORT_BLACK_SCREEN_LIVEDUMP (1b8)?

If so, that would suggest the GPU to me (though definitely play around with different driver versions too, but in my case it made zero difference). I had those, did an RMA, and it was only then that the issues went away. I tried, for two months before doing the RMA, way too many other things before then and it got me nowhere.
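
A quick way to check a pasted WinDbg analysis against that list is a small lookup like the Python sketch below. The four codes and names are the ones I listed above; the function name and the fallback message are made up for illustration.

```python
import re

# The video-related bugcheck codes named above (hex), with their WinDbg names.
GPU_RELATED = {
    0x117: "VIDEO_TDR_TIMEOUT_DETECTED",
    0x141: "VIDEO_ENGINE_TIMEOUT_DETECTED",
    0x1A8: "VIDEO_DXGKRNL_BLACK_SCREEN_LIVEDUMP",
    0x1B8: "VIDEO_MINIPORT_BLACK_SCREEN_LIVEDUMP",
}

def classify_bugcheck(analysis_text):
    """Find the BUGCHECK_CODE line in pasted WinDbg output and look it up."""
    match = re.search(r"^BUGCHECK_CODE:\s*([0-9A-Fa-f]+)", analysis_text, re.MULTILINE)
    if match is None:
        return None  # no bugcheck line in the pasted text
    code = int(match.group(1), 16)  # WinDbg prints the code in hex without 0x
    return code, GPU_RELATED.get(code, "not one of the listed video codes")

print(classify_bugcheck("BUGCHECK_CODE: 124"))
print(classify_bugcheck("BUGCHECK_CODE:  141"))
```

A hit on one of these in the WATCHDOG folder is a hint toward the graphics card or its driver, not proof.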

chopinanopolis said:
I'm still getting the WudfRd driver error though. That has popped up every single time the PC restarts, and doesn't change. Could it be that my headset is just fucked? Cause googling the error message from Event Viewer brings me to a reddit thread for basically the same error caused by the exact same headset I'm using. I just switched to the third driver option for the headset in Device Manager, maybe that works.

That might be a cascading crash as a result of the underlying one, or it could be the underlying cause itself. Hard to pinpoint. But it's worth investigating.

If you can, try going without it connected and with any associated drivers out of the picture. Maybe use another headset or speakers, or just go without sound if you can live without it, until it happens again. Not an attractive idea, I know.

If I am correct that the cause may be the video card, and you want to speed up the occurrence, maybe run Furmark and spend time in a browser with hardware acceleration enabled and see if that triggers it sooner. If it still happens without the headphones or their software/drivers in the picture, then you can rule them out and figure those are simply crashing as a result of the real crash. But if the issue goes away with them out of the picture, it might point to the headphones or drivers.

chopinanopolis said:
And the APIC ID was different this time, this time it was 10, and only one APIC this time

It's only going to be one APIC ID, but the fact that it differs suggests your CPU itself is likely fine and not the cause.

chopinanopolis · Feb 23, 2024

I have two WATCHDOG logs, but they don't line up with the WHEA logs. One of them is from the same day as a WHEA log, but there's an hour time difference between the two. I can still post the log here if you want to see it. I have multiple AMD WATCHDOG logs as well, but they don't line up with the WHEA logs either. I'm not sure if they line up with the crashes though, since I didn't write down every single time the PC restarted.

I did use another headset, but it still restarted anyways, and the driver issue didn't show up in event viewer, so it's probably a result of the bigger issue, like you said.

I posted the WHEA log on the AMD help forums, and the general consensus over there was that my PSU is too weak. Could that be the cause? I can't borrow one to test, so I'd have to RMA the old one and get a new one, paying about 20-30 bucks more for a 700-750 Watt PSU.

I just realized, not every crash generates a WHEA or Watchdog log. Don't know if that's normal or not. My PC just restarted again, but there's no log

MarsM4N · Feb 23, 2024

chopinanopolis said:
I did use another headset, but it still restarted anyways, and the driver issue didn't show up in event viewer, so it's probably a result of the bigger issue, like you said.

You need to uninstall the headset/headphones drivers in the device manager, then unplug it & restart.

Btw, you didn't install some experimental drivers like "ASIO", did you? I installed them once and they were not compatible with my DAC/AMP; I got random blue screens when playing sound.

chopinanopolis said:
I just realized, not every crash generates a WHEA or Watchdog log. Don't know if that's normal or not. My PC just restarted again, but there's no log

If it only happens when gaming, it could be an overloaded PSU. You can undervolt/underclock the card & the CPU and see if it still crashes.

Princess Garnet · Feb 24, 2024

I'm not sure about WHEA or Watch Dog logs, but the Event Viewer stuff often gets created on the next startup, and not during the crash itself. The times might not exactly match up as a result, but if they are sufficiently far apart, they might be different issues. I'd probably take a look at them anyway if they're falling on the same day and you don't know what other issue they would be linked to.

An insufficient PSU is a possible cause. If this only happens during games (or high GPU load stress tests/benchmarks), it would be a likely one. That's sort of why I floated the idea of testing with a lightly demanding game in the background and a hardware accelerated browser. I had my issues even under such conditions, and that's what made me doubt it was the PSU. But you may very well have a PSU that is lacking (or just faulty or aged). The GPU or PSU are my top two guesses at any rate (unless your idea on the headphones or their drivers leads you anywhere, and that would be something). As the above post states, to test without the headphones you may also need to remove the drivers/associated software.
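
If you start noting down the restart times, lining them up with the log timestamps is easy to do by hand, or with a small Python sketch like this one. The times, the `match_logs_to_crashes` name, and the 15-minute window are all made up for illustration; the window is just a guess at how long a reboot plus log creation might take.

```python
from datetime import datetime, timedelta


def match_logs_to_crashes(crashes, logs, window=timedelta(minutes=15)):
    """Map each crash time to the logs stamped between the crash and `window` later."""
    matches = {}
    for crash in crashes:
        matches[crash] = [
            name for name, stamp in logs
            if timedelta(0) <= stamp - crash <= window
        ]
    return matches


# Hypothetical example data, not taken from the actual logs in this thread.
crashes = [datetime(2024, 2, 22, 20, 5), datetime(2024, 2, 23, 18, 40)]
logs = [
    ("WHEA", datetime(2024, 2, 22, 20, 7)),       # written shortly after the reboot
    ("WATCHDOG", datetime(2024, 2, 22, 21, 10)),  # over an hour later, likely unrelated
]

result = match_logs_to_crashes(crashes, logs)
for crash, names in result.items():
    print(crash, names or "no log in window")
```

A crash with an empty list would be one of those restarts that left no WHEA or Watch Dog log behind.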

chopinanopolis · Feb 24, 2024

MarsM4N said:
You need to uninstall the headset/headphones drivers in the device manager, then unplug it & restart. Btw, you didn't install some experimental drivers like "ASIO", did you? I installed them once and they were not compatible with my DAC/AMP; I got random blue screens when playing sound.

If it only happens when gaming, it could be an overloaded PSU. You can undervolt/underclock the card & the CPU and see if it still crashes.

I did uninstall the drivers afterwards. I don't think I did that the first time. It didn't make a difference though; it still crashed.

I undervolted the GPU as well through the AMD Software's auto undervolting. I didn't do the CPU, because I don't know how to tbh. But it just crashed again. I did plug in my headset again earlier today, because the missing driver error didn't come up the last time it crashed and the headset wasn't connected. I'll have to try to maybe get it to crash again with the drivers fully uninstalled, I just never know how long it takes to crash, that's the issue. Sometimes it crashes when launching the game, and sometimes it takes a few days, even if games are run on it every day

I'm not sure how RMAing something works, could I just RMA the GPU and PSU at the same time? I bought both from the same retailer. Or do you need a definitive answer to what is causing an issue to RMA?

MarsM4N · Feb 25, 2024

There is some more info in Microsoft's documentation about the error "0x124: WHEA_UNCORRECTABLE_ERROR" & more ways to decode the logs:

https://learn.microsoft.com/en-us/w...er/bug-check-0x124---whea-uncorrectable-error
I was checking ASRock's memory compatibility list for your board, but it looks like they stopped updating it (memory for Ryzen 5600 "Vermeer" is not listed). You could run a MemTest86 stability test to see if there is an issue. Just install it to a USB stick and let it run overnight. Btw, what BIOS version are you on? (P.S.: do not update if the system isn't 100% stable)

nomdeplume · Feb 25, 2024

Not familiar with AMD graphics in the least.

Is there any chance you have inadvertently applied a graphics program's setting to the Steam executable that could be inducing this issue? That would sure be an easier resolution than an RMA if all components are in fact working properly. But again, I know nothing about AMD.

A Computer Guy · Feb 25, 2024

chopinanopolis said:
I should probably add, my specs are:
Radeon 6650XT
Ryzen 5600
16GB (2x 8GB) Kingston FURY Beast DDR4-3200
1TB Lexar NM620 NVMe
ASRock B450M Pro4
560 Watt LC-Power LC6560GP4

I'm surprised no one asked you to run ZenTimings (https://zentimings.protonrom.com/) and post a screenshot. There are situations where boosting SOC voltage can help with memory related issues. For troubleshooting, you can typically set the SOC voltage to 1.1 V, for example, and see if that helps stabilize the system.

I have this motherboard with a 3950X CPU. Make sure your RAM is installed in slots A2 and B2!

What UEFI/BIOS version are you running? You can run CPU-Z (https://www.cpuid.com/softwares/cpu-z.html) to find out. One of the tabs there will tell you, or you can go into UEFI/BIOS (top of 1st screen). I can recommend being on version P5.30 (ASRock notes fixes for memory compatibility issues); however, DO NOT UPDATE if you suspect your current memory is unstable, as this board does NOT have BIOS flashback.

Reset UEFI/BIOS to defaults, save, then run PassMark MemTest86 (free version is ok). If that reports no errors in 4 passes, then there is a good chance you can update your UEFI/BIOS safely. Before and after updating UEFI/BIOS, reset to defaults, save, and reboot. Do not use previously saved memory profiles from a prior UEFI/BIOS version. Sometimes ASRock forgets to clear any you may have saved.

If MemTest86 reports errors, then I would try only one memory stick at a time in memory slot A2 and retest. Use whichever one does not fail for updating the UEFI/BIOS. If both memory sticks fail, you can go into UEFI/BIOS and downgrade the speed to 2933, 2666, or 2400 and try again. If you still get errors, then you need to try different RAM.
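
The memory test steps above could be summed up roughly like this in Python. It's only a sketch of the order I'd go in; `next_step` and its arguments are hypothetical names, and 3200 is taken from the kit's rated speed in the specs.

```python
# Speeds to step down through; 3200 is the kit's rated speed, the rest are
# the fallback speeds mentioned in the post.
SPEED_STEPS = [3200, 2933, 2666, 2400]


def next_step(both_pass, single_stick_pass=None, speed=3200):
    """Suggest the next troubleshooting step from the memory test results so far.

    both_pass: True if the test with both sticks installed reported no errors.
    single_stick_pass: per-stick results when tested alone in A2, or None if not done yet.
    speed: memory speed used for the most recent tests.
    """
    if both_pass:
        return "Memory looks stable; updating UEFI/BIOS is probably safe."
    if single_stick_pass is None:
        return "Test each stick alone in slot A2."
    if any(single_stick_pass):
        good = single_stick_pass.index(True) + 1
        return f"Use stick {good} alone in A2 for the UEFI/BIOS update."
    slower = [s for s in SPEED_STEPS if s < speed]
    if slower:
        return f"Drop the memory speed to {slower[0]} and retest."
    return "Try different RAM."


print(next_step(False, [False, False], speed=3200))
```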

If I remember correctly, the 5600 is a 5600G with the iGPU disabled, so you would probably want to be on UEFI/BIOS P5.00 or later, judging from the ASRock webpage (https://www.asrock.com/mb/AMD/B450M Pro4/index.asp#BIOS). (Edit: that was the 5700, never mind.)
I've been running version P5.70 since release with no issues.

Note that because your OS has crashed multiple times, it may be damaged. If you have another drive you can use for testing, that might be helpful so you can do a vanilla install with only the latest chipset drivers from AMD, GPU drivers from AMD, audio and network drivers from ASRock, and Steam.

chopinanopolis · Feb 26, 2024

MarsM4N said:
There is some more info in Microsoft's documentation about the error "0x124: WHEA_UNCORRECTABLE_ERROR" & more ways to decode the logs:

https://learn.microsoft.com/en-us/w...er/bug-check-0x124---whea-uncorrectable-error
I was checking ASRock's memory compatibility list for your board, but it looks like they stopped updating it (memory for Ryzen 5600 "Vermeer" is not listed). You could run a MemTest86 stability test to see if there is an issue. Just install it to a USB stick and let it run overnight. Btw, what BIOS version are you on? (P.S.: do not update if the system isn't 100% stable)

The BIOS version I'm running is P5.70.

I honestly have no clue what I'm looking at when I open that link, does that dictate where the issue is coming from? Or is that just general information?

I ran MemTest86 for 9 hours and it came up with zero errors, so it's probably not the RAM. I also ran two hours of Unigine Heaven, and no errors came up; it didn't crash once. Since no errors came up in the OCCT CPU test either, I'm kinda betting that it's actually the PSU that's the issue. There weren't any errors in the PSU test either, but since the crashes aren't consistent and sometimes happen only once every few days, I might've just gotten lucky.

Splinterdog · Feb 26, 2024

It may be costly, but I would definitely buy a new, quality PSU now.

GerKNG · Feb 26, 2024

chopinanopolis said:
560 Watt LC-Power LC6560GP4

Lowest End PSU with weird wattages = almost always the issue.
