# WHEA Logger Fatal Hardware error



## Derek12 (Mar 8, 2021)

Hi
I've had this new build since January, and it works fine stability- and performance-wise.

But checking Event Viewer I saw two WHEA-Logger events. Both say this

*A fatal hardware error has occurred. A record describing the condition is contained in the data section of this event.*


They happened on Feb 4 and Feb 24.
I don't remember any freezing, BSOD, reboot or other stability issue when they happened, but the first one came right after a (seemingly normal) boot, following a Kernel-Boot event (the boot type was 0x1).

The XML view for the Feb 4 event is this
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
  <Provider Name="Microsoft-Windows-WHEA-Logger" Guid="{c26c4f3c-3f66-4e99-8f8a-39405cfed220}" />
  <EventID>1</EventID>
  <Version>0</Version>
  <Level>2</Level>
  <Task>0</Task>
  <Opcode>0</Opcode>
  <Keywords>0x8000000000000002</Keywords>
  <TimeCreated SystemTime="2021-02-04T05:27:30.5093180Z" />
  <EventRecordID>39430</EventRecordID>
  <Correlation ActivityID="{509fe7ac-fe26-4480-98e5-b63f7a4e4921}" />
  <Execution ProcessID="4976" ThreadID="6444" />
  <Channel>System</Channel>
  <Computer>DESKTOP-xxxxxx</Computer>
  <Security UserID="S-1-5-19" />
  </System>
<EventData>
  <Data Name="Length">462</Data>
  <Data Name="*RawData*">435045521002FFFFFFFF01000100000007000000CE0100001E1B0500040215143C60C1835215A74887D114D9467D7765000000000000000000000000000000008D7C2157665EFB4480339B74CACEDF5B03F83300702E884E992C6F26DAF3DB7ACBF63F9BE1F7D601080000000000000000000000000000000000000000000000C8000000060100000003020001000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000053544F52504F525401000601A4000000010003000B000000B06873F5B0486387EF62D0A1B6918DD0730074006F00720061006800630069000000000000000000000000000000000057444320202020200057443230455A525A2D30305A3548423000A4000000A40000007DE050020000000016000000000100001D69070000000000000000000000000000000000630000000000000000000000000000000000000041000000FFFFFFFF00000000000000000000000000200000010000000000000000000000000000000100000000000000310000000000000065000000640000000000000000000000000000003300000031000000310000000000000000000000D7890000</Data>
  </EventData>
  </Event>



Does anyone have a clue about what happened? I tried to find how to decode that string but nada!
Many thanks
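
The `RawData` string is a WHEA/CPER error record, and it starts with a fixed little-endian header. Below is a minimal Python sketch of decoding the first 32 bytes, assuming the field order of the documented `WHEA_ERROR_RECORD_HEADER` structure and assuming the timestamp bytes are plain binary values (on this record that assumption reproduces the event's `TimeCreated`). The names `parse_cper_header` and `HEADER_HEX` are made up for illustration; `HEADER_HEX` is simply the first 64 hex digits of the Feb 4 record.

```python
import struct
from datetime import datetime

# First 32 bytes (64 hex digits) of the Feb 4 RawData string.
HEADER_HEX = "435045521002FFFFFFFF01000100000007000000CE0100001E1B050004021514"

# Severity codes used by WHEA/CPER records.
SEVERITY = {0: "recoverable", 1: "fatal", 2: "corrected", 3: "informational"}


def parse_cper_header(data: bytes) -> dict:
    """Decode the fixed-size start of a WHEA error record (illustrative helper)."""
    (signature, revision, _signature_end, section_count, severity,
     _valid_bits, length, sec, minute, hour, _flags, day, month, year,
     century) = struct.unpack_from("<4sHIHIII8B", data, 0)
    return {
        "signature": signature.decode("ascii"),
        "revision": hex(revision),
        "section_count": section_count,
        "severity": SEVERITY.get(severity, f"unknown ({severity})"),
        "length": length,
        # Assumption: the timestamp fields are binary, not BCD.
        "timestamp": datetime(century * 100 + year, month, day, hour, minute, sec),
    }


header = parse_cper_header(bytes.fromhex(HEADER_HEX))
for key, value in header.items():
    print(f"{key}: {value}")
```

On this record the sketch reports the `CPER` signature, one section, a severity code of 1 (fatal), a length of 462 bytes (the same as the event's `Length` field) and a timestamp of 2021-02-04 05:27:30. The rest of the record (GUIDs and the section data) is not decoded here.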


----------



## PooPipeBoy (Mar 8, 2021)

They normally happen for me after a system sleep or restart. Very uncommon though, at most they only appear once a week or so in the event logs. Many Ryzen owners get WHEA errors, but generally they're only a cause of concern when they create instability or BSODs.


----------



## Derek12 (Mar 8, 2021)

PooPipeBoy said:


> They normally happen for me after a system sleep or restart. Very uncommon though, at most they only appear once a week or so in the event logs. Many Ryzen owners get WHEA errors, but generally they're only a cause of concern when they create instability or BSODs.


Thank you
One of them happened, as you said, right after boot; the other happened while the computer was idle (I think I was away and connected to it over RDP, but I don't remember well).

As you said, I didn't get any stability issues; the computer is rock solid.***

***I just remembered I got a BSOD once. Searching Event Viewer, the BSOD was 0x00000139 (0x000000000000001d, 0xffffd08b40562eb0, 0xffffd08b40562e08, 0x0000000000000000) (KERNEL_SECURITY_CHECK_FAILURE), and it happened on 24 January right after a boot. I haven't had any BSODs since.

I also updated UEFI, just in case...


----------



## PaulieG (Mar 8, 2021)

Are you overclocked at all?


----------



## Derek12 (Mar 8, 2021)

PaulieG said:


> Are you overclocked at all?


No, I am at stock with PBO enabled and power limited to 65W (due to overheating with stock cooler)


----------



## PaulieG (Mar 8, 2021)

So, no curve optimizer settings either?


----------



## Derek12 (Mar 8, 2021)

PaulieG said:


> So, no curve optimizer settings either?


I suppose not, or it's at stock settings (I didn't even know what that is, so it's unlikely I've touched it).


----------



## PaulieG (Mar 8, 2021)

What about your memory? What kit is it, and are you using the XMP profile?


----------



## Derek12 (Mar 8, 2021)

PaulieG said:


> What about your memory? What kit is it, and are you using the XMP profile?


16GB (2x8GB) Corsair Vengeance LPX 2400MHz (it's not a kit; it's two separately bought modules, but they are the same model).
No, I am not using XMP (DOCP is disabled); I am using the UEFI's stock settings (I checked, and the RAM is running at its stock timings; screenshot in attachment 191534).

I just noticed that one stick's chips are made by Micron and the other's by SK Hynix. Could this be bad?


----------



## PaulieG (Mar 8, 2021)

Derek12 said:


> 16GB 2x8GB Corsair Vengeance LPX 2400MHz (is not a kit, it's 2 separate bought modules but they are the same model)
> No I am not using XMP (DOCP is disabled), I am using UEFI's stock settings (I did checked and RAM is running at its stock timings on
> View attachment 191534
> 
> I just noticed one stick chips is made by Micron and the other is made by SK Hynix could this be bad?


Unlikely at those conservative stock settings. If you were running XMP, though, that is technically overclocking, and Hynix and Micron chips will respond differently to timing and voltage adjustments. It is highly recommended to buy kits rather than individual sticks, so that mismatched memory can be ruled out as a culprit in many scenarios.


----------



## Derek12 (Mar 19, 2021)

Wow, I'd forgotten about this thread... until yesterday, when I got another brand new WHEA-Logger event.
This time it seemed to happen right after waking the computer from sleep. Again, no weird behavior (I noticed it by chance).
The BIOS is on the latest version.
It seems to repeat roughly every 20 days.
Isn't there any way to know what happened and what component caused the event?

Maybe I will follow @PaulieG's advice and get a kit (or try removing a stick, but I would have to wait a month or so to see if the event repeats lol).


----------



## PooPipeBoy (Mar 19, 2021)

Derek12 said:


> Wow I've forgotten about this thread... until yesterday I got another brand new WHEA-Logger event.
> This time seemed to happen right after awaking the computer from sleep, again no weird behaviors (I noticed it by chance)
> BIOS is latest version
> Seems to repeat it on a 20-day basis more or less.
> ...



Are they being reported by a local service like mine are?

You should be able to look at the WHEA event details and find a Process ID. Then go into the Details tab of Task Manager and find the active process that reported the error. Note that if you restart the computer, all the process IDs will be different and you won't be able to find it.

It's not much to go by (mine are reported by svchost.exe) but it helps gather a bit more information at least.

EDIT: I just learned that if you right-click a process in the Details tab, you can click "Go To Service(s)" and it will take you to the actual service running in the Services tab. It looks like my WHEAs were reported by "Diagnostic Policy Service". Interesting.
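
For anyone who would rather pull the Process ID out of the event's XML view than read it by eye, here is a minimal Python sketch using only the standard library. It assumes the XML has been copied out of Event Viewer as text; `SAMPLE_EVENT` is a trimmed copy of the Feb 4 event from the first post, and `reporting_process_id` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <Event> element in Event Viewer's XML view.
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the Feb 4 event from the first post.
SAMPLE_EVENT = """<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <System>
    <Provider Name="Microsoft-Windows-WHEA-Logger" />
    <EventID>1</EventID>
    <Execution ProcessID="4976" ThreadID="6444" />
  </System>
</Event>"""


def reporting_process_id(event_xml: str) -> int:
    """Return the ProcessID recorded in the event's Execution element."""
    root = ET.fromstring(event_xml)
    execution = root.find("e:System/e:Execution", NS)
    if execution is None:
        raise ValueError("no Execution element in event XML")
    return int(execution.get("ProcessID"))


pid = reporting_process_id(SAMPLE_EVENT)
print(pid)
```

The number printed is the PID to look for in the Details tab. As noted above, it is only meaningful until the next restart.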


----------



## Derek12 (Mar 19, 2021)

PooPipeBoy said:


> Are they being reported by a local service like mine are?
> 
> You should be able to look at the WHEA event details and find a Process ID. Then go into Details tab of Task Manager and find the active process that reported the error. Note that if you restart the computer, all the process IDs will be different and you won't be able to find it.
> 
> ...


Thanks!
Same as you, the process is svchost and service is DPS (luckily I didn't reboot or shutdown since yesterday, I usually put it on sleep).

I see that many users getting this event have Ryzen like you and me, maybe it's something "normal"?

Do you have stability issues? I don't (except a one-time BSOD a month ago).


----------



## PooPipeBoy (Mar 19, 2021)

Derek12 said:


> Thanks!
> Same as you, the process is svchost and service is DPS (luckily I didn't reboot or shutdown since yesterday, I usually put it on sleep).
> 
> I see that many users getting this event have Ryzen like you and me, maybe it's something "normal"?
> ...



Huh, no way! Maybe there's some rhyme or reason to these WHEA errors after all....

Nope I haven't had even one sign of stability issues for months now. Perfectly reliable. See, one would think that if there were a serious hardware flaw that a whole multitude of different processes would be spitting out errors, but instead it's just this one jerkbutt process in Windows. But it would explain why my system posts two WHEA errors every week like clockwork. Maybe it's just a software issue and it's not a "fatal" failure at all.

I figured I'd just disable the DPS process entirely and see if that fixes the problem. That would be nice if it does lol


----------



## Zareek (Mar 19, 2021)

I had WHEA errors all the time with my Ryzen 3800x until I updated to Windows 10 20H2 and then they all went away. They never caused an issue anyway. It's one of the lovely gifts we get from Microsoft for being their BETA testers for Windows 10!


----------



## Derek12 (Mar 19, 2021)

PooPipeBoy said:


> Huh, no way! Maybe there's some rhyme or reason to these WHEA errors after all....
> 
> Nope I haven't had even one sign of stability issues for months now. Perfectly reliable. See, one would think that if there were a serious hardware flaw that a whole multitude of different processes would be spitting out errors, but instead it's just this one jerkbutt process in Windows. But it would explain why my system posts two WHEA errors every week like clockwork. Maybe it's just a software issue and it's not a "fatal" failure at all.
> 
> I figured I'd just disable the DPS process entirely and see if that fixes the problem. That would be nice if it does lol


Maybe I will try that too!


Zareek said:


> I had WHEA errors all the time with my Ryzen 3800x until I updated to Windows 10 20H2 and then they all went away. They never caused an issue anyway. It's one of the lovely gifts we get from Microsoft for being their BETA testers for Windows 10!


I currently have 20H2 but I have a cumulative update pending, I will install it. Also I'm not on insider channels


----------



## Zareek (Mar 19, 2021)

Derek12 said:


> Maybe I will try that too!
> 
> I currently have 20H2 but I have a cumulative update pending, I will install it. Also I'm not on insider channels


Yeah, no insider channels. I was being sarcastic about the BETA testing for Microsoft, I do that too sometimes but not on my main rig. 

I'm not on the latest AGESA either. I'm running the last BIOS before they added Ryzen 5xxx support. I think I'm on the latest AMD chipset driver. 

I also haven't installed the Cumulative Update(KB5001649) yet. I'll run a full back-up either today or this weekend and then install the update.

I seriously wouldn't worry about the WHEA errors unless you are seeing something else too. I know it's hard to do, once you see it you want to fix it. I started ignoring them and then at some point they went away and haven't come back.


----------



## Derek12 (Mar 19, 2021)

Zareek said:


> Yeah, no insider channels. I was being sarcastic about the BETA testing for Microsoft, I do that too sometimes but not on my main rig.
> 
> I'm not on the latest AGESA either. I'm running the last BIOS before they added Ryzen 5xxx support. I think I'm on the latest AMD chipset driver.
> 
> ...


Oh OK, I thought you were on an Insider channel. I use Insider builds on a VM lol.
I did update to the latest UEFI before the last event, so I've ruled that out (unless a new update comes out).
So maybe that's the best option: ignore it until I get stability issues. I will try to keep an eye on it, and maybe try taking a memory stick out, but I use the machine for work and 8 GB is too little lol.


----------



## Zareek (Mar 19, 2021)

Derek12 said:


> So maybe that's the best option, ignore it until getting stability issues, but I will try to keep an eye and maybe try taking a memory stick out but I use it for work and 8 GB is too little lol.


Not to mention you'll lose even more performance because Ryzen 3xxx is memory bandwidth starved with even dual channel DDR4.


----------



## thesmokingman (Mar 19, 2021)

> 16GB 2x8GB Corsair Vengeance LPX 2400MHz


----------



## Deleted member 205776 (Mar 19, 2021)

Ryzen and LPX is a bad idea...


----------



## PooPipeBoy (Mar 19, 2021)

Alexa said:


> Ryzen and LPX is a bad idea...



Why? Source? LPX is one of the most popular RAM kits on the market and I don't hear many complaints.

That said, I do know with Corsair that they love to make sure that no two kits are identical. You can buy two kits with the same product code, one will be Samsung and the other will be Hynix modules because the version numbers are likely to be different. So it's hard to narrow down precisely which kit has incompatibility issues because they're all bloody different. That makes it great fun when you're trying to buy an additional matched kit for a future upgrade.


----------



## Deleted member 205776 (Mar 19, 2021)

Too many forum posts like this with LPX being involved, and my own personal experience with two Ryzen builds. I heard they released specific SKUs for Ryzen but I'm not sure. I actively avoid it on Ryzen builds. Works just fine with Intel builds.


----------



## PooPipeBoy (Mar 19, 2021)

Alexa said:


> Too many forum posts like this with LPX being involved, and my own personal experience with two Ryzen builds. I heard they released specific SKUs for Ryzen but I'm not sure. I actively avoid it on Ryzen builds. Works just fine with Intel builds.



It's too early to call right now if the RAM kit is at fault, and frankly not enough indications that it would be. The WHEA error says "fatal hardware error" but I would take that with a grain of salt. It's not the first time that Windows has lied about what the real problem is.


----------



## Derek12 (Mar 20, 2021)

Alexa said:


> Too many forum posts like this with LPX being involved, and my own personal experience with two Ryzen builds. I heard they released specific SKUs for Ryzen but I'm not sure. I actively avoid it on Ryzen builds. Works just fine with Intel builds.


What symptoms did you have? Did you get WHEA-Logger events?



PooPipeBoy said:


> It's too early to call right now if the RAM kit is at fault, and frankly not enough indications that it would be. The WHEA error says "fatal hardware error" but I would take that with a grain of salt. It's not the first time that Windows has lied about what the real problem is.


Yeah, could bad RAM or a RAM incompatibility cause ONLY those events, while the computer is stable when gaming, running Prime95, etc.? (Except for the one-time BSOD, which sounded more like drivers than anything else; I have updated them since.)



Zareek said:


> Not to mention you'll lose even more performance because Ryzen 3xxx is memory bandwidth starved with even dual channel DDR4.


I used the computer for a week with only one stick (I thought 8 GB would be enough) and it performed relatively well, except that 8 GB wasn't enough.


------------

*I forgot to say that I am using the same Windows install as with my previous motherboard (H110M-K-D3) and CPU (Intel i3 6100). Could that cause this event? (I work on this machine, and formatting would be difficult.) I did uninstall the old drivers.*


----------



## Zareek (Mar 20, 2021)

Derek12 said:


> *I forgot to say that I am using the same Windows install as with previous MB (H110M-K-D3) and CPU (Intel i3 6100), but could that cause this event? (I work on it and formatting would be difficult) I did Uninstall old drivers*


It's not supposed to with Windows 10. In the old days major hardware changes would cause massive problems.



PooPipeBoy said:


> Why? Source? LPX is one of the most popular RAM kits on the market and I don't hear many complaints.
> 
> That said, I do know with Corsair that they love to make sure that no two kits are identical. You can buy two kits with the same product code, one will be Samsung and the other will be Hynix modules because the version numbers are likely to be different. So it's hard to narrow down precisely which kit has incompatibility issues because they're all bloody different. That makes it great fun when you're trying to buy an additional matched kit for a future upgrade.


There are certain LPX kits that have major issues with Ryzen CPUs. The older the CPU, the worse it is. I have a Corsair Vengeance LPX 3200 kit (CMK16GX4M2B3200C16) that will not even POST regularly with a Ryzen 1700X. With every new Ryzen revision, memory support has gotten better. That kit will POST fine at defaults with my 3800X, but it still will not run at XMP 3200 settings. If I mess with sub-timings and bump the voltages, I can get it to run at 3200. My current kit (G.Skill FlareX DDR4-3200C14 16GB (2x8)) runs out of the box with XMP on both systems.


----------



## PooPipeBoy (Mar 29, 2021)

PooPipeBoy said:


> Huh, no way! Maybe there's some rhyme or reason to these WHEA errors after all....
> 
> Nope I haven't had even one sign of stability issues for months now. Perfectly reliable. See, one would think that if there were a serious hardware flaw that a whole multitude of different processes would be spitting out errors, but instead it's just this one jerkbutt process in Windows. But it would explain why my system posts two WHEA errors every week like clockwork. Maybe it's just a software issue and it's not a "fatal" failure at all.
> 
> I figured I'd just disable the DPS process entirely and see if that fixes the problem. That would be nice if it does lol



Just a PSA that this method was a fail. Disabling DPS doesn't prevent WHEAs at all, just the reporting of them. Back to square one, unfortunately.

I'm back to testing some voltage settings to see if that does anything to help.


----------



## Deleted member 205776 (Mar 29, 2021)

Mess around with VDDG voltages like I did. Turning down VDDG IOD a notch completely eliminated every WHEA.


----------



## Derek12 (Mar 29, 2021)

Alexa said:


> Mess around with VDDG voltages like I did. Turning down VDDG IOD a notch completely eliminated every WHEA.


I will check it and report back. Last week I got another WHEA lol. This time it was not related to a reboot. Again, no signs of instability.
Also I saw a new BIOS update so I will install it too.
Thanks!


UPDATE
I got something interesting:

I converted that hex string to text (I hadn't thought of that before lol):
435045521002FFFFFFFF01000100000007000000CE01000037030800180315143C60C1835215A74887D114D9467D7765000000000000000000000000000000008D7C2157665EFB4480339B74CACEDF5B03F83300702E884E992C6F26DAF3DB7ADF7A8010C41DD701080000000000000000000000000000000000000000000000C8000000060100000003020001000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000053544F52504F525401000601A4000000010003000B000000B06873F5B0486387EF62D0A1B6918DD0730074006F00720061006800630069000000000000000000000000000000000057444320202020200057443230455A525A2D30305A3548423000A4000000A40000007DE050020000000016000000000100002CFC040000000000000000000000000000000000630000000000000000000000000000000000000049000000FFFFFFFF000000000000000000000000000000200100000000000000000000000000000001000000000000003100000000000000650000006400000000000000000000000000000033000000310000003100000000000000000000003E0E0000

And Got this
CPER�����7<`��R�H���F}we�|!Wf^�D�3�t���[�3p.�N�,o&���z�z����STORPORT��hs��Hc��bС����storahciWDC     *WD20EZRZ-00Z5HB0*��}�P,�cI���� 1ed311>

The highlighted text is my data HDD. Could it be the issue??? It is present in all the events. The drive seems to be fine per SMART. The orange text seems to point to a SATA port (?)

EDIT 2
Saw this thread
Solved: Ryzen 9 5900x whea-logger ID:1 - AMD Community
And the OP's string, when decoded, also shows a WD drive!!!!
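
A generic hex-to-text converter prints every byte, which is why most of the output is unreadable. A rough alternative is to keep only the runs of printable characters, in the style of the Unix `strings` tool. The sketch below is only an illustration under assumptions: it is not an official CPER decoder, `FRAGMENT_HEX` reuses byte sequences that appear in the record above with the binary fields and zero padding between them shortened, and `printable_strings` is a made-up helper name.

```python
import re

# Byte sequences taken from the RawData string above; the binary fields
# and zero padding between them are shortened for readability.
FRAGMENT_HEX = (
    "53544F52504F5254"                  # ASCII text
    "01000601A4000000"                  # binary fields (shortened)
    "730074006F0072006100680063006900"  # UTF-16LE text
    "00000000"                          # zero padding (shortened)
    "5744432020202020" "00"             # ASCII text padded with spaces, then NUL
    "57443230455A525A2D30305A35484230"  # ASCII text
    "00"
)

ASCII_RUN = re.compile(rb"[\x20-\x7e]{4,}")           # 4+ printable bytes
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")   # 4+ printable UTF-16LE chars


def printable_strings(data: bytes) -> list:
    """Return printable ASCII runs, then UTF-16LE runs, found in data."""
    ascii_runs = [m.decode("ascii").strip() for m in ASCII_RUN.findall(data)]
    utf16_runs = [m.decode("utf-16-le").strip() for m in UTF16_RUN.findall(data)]
    return ascii_runs + utf16_runs


found = printable_strings(bytes.fromhex(FRAGMENT_HEX))
print(found)
```

Run on this fragment, it yields `STORPORT`, `WDC`, `WD20EZRZ-00Z5HB0` and `storahci`. The same function can be given `bytes.fromhex(...)` of a full `RawData` string, although short accidental runs of printable bytes may then show up as well.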


----------



## FireFox (Mar 29, 2021)

I solved the WHEA errors by increasing VCCIO and VCCSA, but you're using AMD, so I don't know anything about it.


----------



## Derek12 (Mar 29, 2021)

FireFox said:


> I solved the WHEA errors increasing VCCIO and VCCSA but you're using AMD so i don't know anything about it.


Probably related to @Alexa's post.


----------



## Deleted member 205776 (Mar 29, 2021)

VCCIO (Voltage to I/O) is quite literally Intel's version of AMD's VDDG IOD (Voltage to I/O Die). No wonder it fixed things.


----------



## Derek12 (Mar 29, 2021)

In light of the decoded text pointing to my data drive and/or its port, could this still be caused by the RAM, CPU or motherboard? Or could it be one of these:

1. Data drive (SMART data is fine, and the drive doesn't show signs of failing)
2. SATA port (I could try changing ports; again, no signs of drive malfunctions)
3. SATA cable (I could try another cable if I have one around, but I don't notice any clear symptoms; I'd expect them to be similar to 1, 2 and maybe 3)
4. SATA driver (I am using the generic standard AHCI driver, as AMD recommends)
5. SATA controller, aka the motherboard, though all events point to the same drive
6. Some Windows bug or something?

I will try 2, 3, and maybe 4 and see if it repeats, and I'll update the UEFI.


----------



## PooPipeBoy (Mar 29, 2021)

Derek12 said:


> I will check it and update, Last week I got another WHEA lol. This time not related to a reboot. Again no signs of instability.
> Also I saw a new BIOS update so I will install it too.
> Thanks!
> 
> ...



It's funny you mention a Western Digital hard drive....


----------



## Derek12 (Mar 29, 2021)

PooPipeBoy said:


> It's funny you mention a Western Digital hard drive....
> 
> View attachment 194300


WTF!!

Could it be some compatibility issue with Ryzen and some WD drives???


----------



## Deleted member 205776 (Mar 29, 2021)

Interesting...


----------



## PooPipeBoy (Mar 29, 2021)

Derek12 said:


> WTF!!
> 
> Could it be some compatibility issue with Ryzen and some WD drives???



It might be!

I looked at the other errors and I got some from the other WD drive as well. So I guess that means I'm getting two WHEA errors per week because I've got two WD drives?

Madness


----------



## Deleted member 205776 (Mar 29, 2021)

Are these uncorrectable (BSOD) or correctable (event viewer) WHEAs? If they're the latter, and the hard drive theory is to be believed, then I wouldn't worry about them. Seems like just another bug or incompatibility. Might get ironed out in a future AGESA update, who knows. Could be worth reporting this to AMD.


----------



## PooPipeBoy (Mar 29, 2021)

Alexa said:


> Are these uncorrectable (BSOD) or correctable (event viewer) WHEAs? If they're the latter, and the hard drive theory is to be believed, then I wouldn't worry about them. Seems like just another bug or incompatibility. Might get ironed out in a future AGESA update, who knows. Could be worth reporting this to AMD.



Yeah they're not BSODs, they're WHEA errors posted without any instability or hard resets. It would seem like there's some kind of scheduled maintenance process that happens every five days or so on the drives and that's what is causing the issues. Some kind of file indexing thing or something? Not sure.

Our storage be playin' mind games with us, fellas.

Edit: One of the things I've noticed with these WHEA errors is that they don't specifically mention anything about the processor cores reporting the error. Just very generic information: "A fatal hardware error has occurred". That's all, nothing mentioned about processor cores reporting the error like you get with a serious WHEA that crashes the computer.


----------



## Deleted member 205776 (Mar 29, 2021)

I turned off file indexing as I just use Everything to search for files. Then again, my system is all SSDs. Didn't bother checking the logs like you guys to see what exactly caused it. Thankfully just lowering VDDG IOD solved my issues.


----------



## PooPipeBoy (Mar 29, 2021)

Alexa said:


> I turned off file indexing as I just use Everything to search for files. Then again, my system is all SSDs. Didn't bother checking the logs like you guys to see what exactly caused it. Thankfully just lowering VDDG IOD solved my issues.



Next time I get a WHEA posted I'll need to check the events that happen around that time. I haven't cross-checked with the Application events at all, which I have a suspicion might provide some more information. I've looked at other events in the System section and haven't noticed anything odd.


----------



## Derek12 (Mar 29, 2021)

PooPipeBoy said:


> Yeah they're not BSODs, they're WHEA errors posted without any instability or hard resets. It would seem like there's some kind of scheduled maintenance process that happens every five days or so on the drives and that's what is causing the issues. Some kind of file indexing thing or something? Not sure.
> 
> Our storage be playin' mind games with us, fellas.
> 
> ...


Yeah, exactly the same here too


Alexa said:


> I turned off file indexing as I just use Everything to search for files. Then again, my system is all SSDs. Didn't bother checking the logs like you guys to see what exactly caused it. Thankfully just lowering VDDG IOD solved my issues.


I will try the BIOS voltage and also updating UEFI, and wait another couple of weeks


PooPipeBoy said:


> Next time I get a WHEA posted I'll need to check the events that happen around that time. I haven't cross-checked with the Application events at all, which I have a suspicion might provide some more information. I've looked at other events in the System section and haven't noticed anything odd.


Same here, there is no pattern, some happen right after a boot/reboot, and sometimes they happen while idling or something, for example, this is the last of the events in system events and application events:


----------



## Mussels (Mar 30, 2021)

It could be that the drives are asked to report something and fail to do so; they could be waking too slowly from a sleep state. Either way, this is interesting, and someone needs to poke AMD and WD.


----------



## PooPipeBoy (Mar 30, 2021)

Mussels said:


> it could be the drives are asked to report something and fail to do so, they could be waking too slow from a sleep state - but this is interesting and someone needs to poke AMD and WD



It's devilishly hard to report bugs to manufacturers, but I sent off information to both AMD and WD. I can't post to the AMD Community forum although I did send off a service request to customer support. Even if it takes a while to get resolved at least it's not a major issue and we don't need to worry about it.


----------



## unknown_VS (Mar 30, 2021)

I am very sure this is just a bug in the Ryzen BIOS, and as long as you don't get regular crashes or other issues it's not really a cause for concern.


----------



## KasaiTaka (Apr 29, 2021)

Came here to say I also have a WHEA event ID 1 reported by DPS. I did the same thing you guys did with that hex code, and the problem was my WD40EZRZ-00GXCB0 HDD. I am on a Z490 Aorus Elite AC and a 10900, so this is definitely not an AMD-only problem.


----------



## KasaiTaka (May 4, 2021)

So yesterday I got another WHEA event ID 1, and this time the hex code, when converted to text, revealed that the problem might be the storport driver.


----------



## kaan0550 (May 26, 2021)

Hello, my computer just locked up 10 minutes ago. I found the issue was a WHEA error. I checked the hex, and my problem was the WD10EZEX-22MFCA0 HDD. I use Windows 10 20H2 with the latest updates and drivers.


----------



## Gmr_Chick (Jun 1, 2021)

Would like to report this very same issue as well. Running a Ryzen 5 3600 and B550 Asrock PG Velocita board. I've only gotten 3 of these WHEA errors, but they all mention my WD Blue HDD (WDC WD10EZEX-00RKKA0). HWInfo, however, isn't showing any actual errors with the drive, nor imminent failures. 

This is bizarre


----------



## Mussels (Jun 1, 2021)

I am really glad I don't have any WD drives. Has anyone heard back from AMD or WD yet?


----------



## FireFox (Jun 1, 2021)

Gmr_Chick said:


> I've only gotten 3 of these WHEA errors, but they all mention my WD Blue HDD (WDC WD10EZEX-00RKKA0).


Where/how do you find the info that it shows it's your HDD?


----------



## Gmr_Chick (Jun 1, 2021)

FireFox said:


> Where/how do you find the info that it shows it's your HDD?



By translating the raw data hex into text.

435045521002FFFFFFFF01000100000007000000CE01000024090600190515143C60C1835215A74887D114D9467D7765000000000000000000000000000000008D7C2157665EFB4480339B74CACEDF5B03F83300702E884E992C6F26DAF3DB7A53721E69854CD701080000000000000000000000000000000000000000000000C8000000060100000003020001000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000000053544F52504F525401000601A4000000010003000B00000060D20B4AFF861736C778947F0DAD08EA730074006F00720061006800630069000000000000000000000000000000000057444320202020200057443130455A45582D3030524B4B413000A4000000A40000007DE05002000000001600000000010000D74A000000000000000000000000000000000000630000000000000000000000000000000000000031000000FFFFFFFF0000000000000000000000000000000101000000000000000000000000000000010000000000000031000000000000006500000064000000000000000000000000000000330000003100000031000000000000000000000028060000 

In that mess of numbers, when converted to plain text, my WD drive is mentioned.


----------



## FireFox (Jun 1, 2021)

Gmr_Chick said:


> In that mess of numbers, when converted to plain text, my WD drive is mentioned.


And I don't know how to do that.


----------



## Gmr_Chick (Jun 1, 2021)

Oh, neither do I, but I googled "hex to text converter" and found out that way
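
For reference, what those online converters do can be reproduced locally in a few lines. This is a minimal Python sketch, not the code of any particular converter: each pair of hex digits becomes one byte, and bytes that are not printable ASCII are shown as a placeholder, which is where the unreadable characters in the converted output come from. `hex_to_text` and `SNIPPET_HEX` are made-up names, and `SNIPPET_HEX` is a short slice of the raw data posted above.

```python
def hex_to_text(hex_string: str, placeholder: str = ".") -> str:
    """Convert a hex string to text, replacing non-printable bytes."""
    data = bytes.fromhex(hex_string)
    return "".join(chr(b) if 0x20 <= b <= 0x7E else placeholder for b in data)


# Short slice of the raw data posted above (vendor and model fields).
SNIPPET_HEX = "57444320202020200057443130455A45582D3030524B4B413000"

text = hex_to_text(SNIPPET_HEX)
print(text)
```

The whole `RawData` value from Event Viewer can be passed in the same way; the drive model shows up as readable text among the placeholder characters.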


----------



## FireFox (Jun 1, 2021)

Will give it a try, got a WHEA error yesterday.


----------



## pajic (Jul 31, 2021)

Wow, I am so glad I stumbled upon this forum thread. I've been getting WHEA events since June and was infuriated because I didn't know what the problem was. Such a relief to find out it's the hard drive (by converting the raw hex data to a string) and not something more vital... but there is nothing wrong with my hard drive whatsoever; I use it every day. I assume this is a bug of some sort? I saw that some people fixed it with a BIOS update, but the latest BIOS for my mobo is still in beta, so I don't wanna bother. I replaced the SATA cable for the hard drive, swapping a regular one for one with a clip, just to be safe.

Here's the string I got by converting the raw hex data if someone is interested, and big thanks to OP for figuring this out. 


CPERÿÿÿÿÎ,<`ÁƒR§H‡ÑÙF}we|!Wf^ûD€3›tÊÎß[ø3p.ˆN™,o&ÚóÛz=d<à‚×ÈSTORPORT¤ôõÈž¡(¹óßPqß˜storahciWDC     WD10EZEX-08WN4A0¤¤}àP¶žc2ÿÿÿÿ1ed311“


----------



## Boombastik (Aug 4, 2021)

ASRock Z97 Anniversary plus a PCIe SATA controller with a Marvell 9120 chipset (installed in the first PCIe slot).
I get the WHEA error only if I connect an old *Western Digital hard drive WD1600BEVT-22A23* to the *9120 SATA/eSATA* controller.
The only difference is that the 9120 is reported as external-SATA capable (from the HWiNFO64 details).


----------



## SpazShark (Aug 6, 2021)

Hi all - first post here....I signed up just to contribute to this conversation.

I'm seeing the exact same thing (WHEA Logger Fatal Hardware error - Event ID 1).

Interestingly my raw data translates to this:

CPER�����1,<`��R�H���F}we�|!Wf^�D�3�t���[�3p.�N�,o&���z��i���STORPORT��O:�~'��v�Cstorahci*Samsung SSD 860 EVO 1TB*��}�Pr[dc����cedecc�

So in my case my Samsung 860 EVO SSD (my D: drive) is identified as the culprit. This is a new build < 2 weeks old so I've only seen this error come up once.

My C: drive is actually a Western Digital NVMe drive (WDS100T1X0E) and it hasn't reported this error yet.

Is anyone seeing this error with non-WD drives? Is it always a drive plugged in via SATA? Is it always a secondary drive and not the drive Windows is installed in (C: drive)?


----------



## Hachi_Roku256563 (Aug 6, 2021)

SpazShark said:


> Hi all - first post here....I signed up just to contribute to this conversation.
> 
> I'm seeing the exact same thing (WHEA Logger Fatal Hardware error - Event ID 1).
> 
> ...


My system is almost all WD, so I only have one other drive, a Seagate,
and it does not do this.


----------



## Logan7 (Aug 7, 2021)

Interesting that this came up again, I just posted about this in another thread a few days ago.

I had this issue on an older system recently, the hex to text converter showed my Samsung 860 EVO SSD as the culprit. I had no real problems, except that this error would show up once per week or so and I'd BSOD maybe once a month. It ended up being an incompatibility between the ~2010 AMD chipset and the SSD. To solve it I had to turn off Native Command Queuing (NCQ). 

If this is your issue you'll see a "CRC Error Count" in SMART data above 0, even though it shows it as being Good or Healthy. If you run something like CrystalDiskMark, that error count will probably go up - mine went up by 95 in one run. I'm now sitting at an error count of 10,481 and it hasn't increased since I disabled NCQ months ago. Note that if you're disabling NCQ, the process differs depending on if you have the generic Windows 10 driver or if you have the manufacturer (AMD/Intel) driver.

Not sure if it will help anyone in this thread or if it's even applicable outside of SSDs, but check your CRC error counts.


----------



## SpazShark (Aug 8, 2021)

Logan7 said:


> Interesting that this came up again, I just posted about this in another thread a few days ago.
> 
> I had this issue on an older system recently, the hex to text converter showed my Samsung 860 EVO SSD as the culprit. I had no real problems, except that this error would show up once per week or so and I'd BSOD maybe once a month. It ended up being an incompatibility between the ~2010 AMD chipset and the SSD. To solve it I had to turn off Native Command Queuing (NCQ).
> 
> ...



@Logan7 thanks for the info! Does my screenshot of my Samsung SSD 860 EVO look similar to what you saw in the "CRC Error count" section? If so, do you recommend that I disable NCQ?


----------



## Logan7 (Aug 8, 2021)

SpazShark said:


> @Logan7 thanks for the info! Does my screenshot of my Samsung SSD 860 EVO look similar to what you saw in the "CRC Error count" section? If so, do you recommend that I disable NCQ?


Yours looks to be fine in that regard actually, since it's showing all 0s under "Raw Values", so I don't know that disabling NCQ would do anything.
This is what mine looks like in Speccy:





(28F1 in a hexadecimal to decimal converter shows 10,481)
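If you'd rather not use an online converter for the raw SMART value, the same conversion is a one-liner in Python. This is just a small sketch; the helper name `smart_raw_to_int` is made up for illustration.

```python
def smart_raw_to_int(raw_hex):
    """Convert a hexadecimal SMART raw value to a decimal count."""
    return int(raw_hex, 16)

print(smart_raw_to_int("28F1"))  # 10481
```

Leading zeros in the raw value don't matter, so a padded value such as `000000000028F1` gives the same result.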


----------



## SpazShark (Aug 8, 2021)

Logan7 said:


> Yours looks to be fine in that regard actually, since it's showing all 0s under "Raw Values", so I don't know that disabling NCQ would do anything.
> This is what mine looks like in Speccy
> 
> ...


Ok thanks for the explanation! I guess I'll leave it for now and see....


----------



## Boombastik (Aug 27, 2021)

I also have an old P5Q with a Q6600.
That PC has two DVD drives, one Seagate HDD, two Western Digital HDDs and one SSD.
I see these WHEA errors for the two WD hard drives about once a week.
The WD drives are Green models from the 2010 era.


----------



## Lukiez22 (Oct 29, 2021)

Hi, I get the same WHEA error, but with an i7-4700MQ and a Samsung SSD 860 EVO 1TB. If I convert the hex, I get this.
For me, I don't get a blue screen; my monitor just starts glitching a lot, with different colours (like an old TV with a weak signal). After this I can't move the mouse or do anything for about a minute, and then it crashes. I have to restart 3 or 4 times before I can use the PC again. (Sometimes after the restart my PC can't even boot because it can't find the boot file.)


----------



## not_a_stable_genius (Nov 7, 2021)

PooPipeBoy said:


> Are they being reported by a local service like mine are?
> 
> You should be able to look at the WHEA event details and find a Process ID. Then go into Details tab of Task Manager and find the active process that reported the error. Note that if you restart the computer, all the process IDs will be different and you won't be able to find it.
> 
> ...


Thanks for the advice. I have this problem too, with a Ryzen 7 5700U in a convertible device. I think this machine has some kind of power state problem, since the error shows up shortly after a restart or startup of the system. It was the DPS process in my case too; I figured I'd just kill that permanently and fight the symptoms instead of the root problem. This issue is very rare in my case as well and very hard to reproduce because it doesn't happen on every startup. I'll give it a few more days, and if it continues to annoy me I'll have to make use of the warranty... I already tested the system stringently: Cinebench, long multitasking sessions, a bootable RAM testing program which fills the RAM and checks its integrity, Windows storage diagnostics, and the SMART logs of the SSD. I didn't find any hardware errors with my tests so far, and when it doesn't BSOD after startup it runs very stable and fine. I already updated the BIOS to the newest version and updated all Ryzen and chipset drivers to their respective newest versions.
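For the Process ID step in the quoted advice, here is a rough Python sketch of a scripted lookup instead of scrolling through Task Manager. It assumes Windows and the built-in `tasklist` command with its `/FI`, `/FO CSV` and `/NH` options. The helper names are made up, 4976 is just the ProcessID from the event in the first post, and the sample output line is hypothetical. As the quote says, it only works before the next reboot.

```python
import csv
import io
import subprocess

def tasklist_command(pid):
    """Build a tasklist call that filters on one PID and prints CSV without a header."""
    return ["tasklist", "/FI", f"PID eq {int(pid)}", "/FO", "CSV", "/NH"]

def parse_image_name(csv_text, pid):
    """Return the image name for pid from tasklist CSV output, or None if absent."""
    for row in csv.reader(io.StringIO(csv_text)):
        # Expected columns: image name, PID, session name, session number, memory.
        if len(row) >= 2 and row[1] == str(pid):
            return row[0]
    return None

def image_name_for_pid(pid):
    """Run tasklist (Windows only) and return the process image name, or None."""
    done = subprocess.run(tasklist_command(pid), capture_output=True,
                          text=True, check=True)
    return parse_image_name(done.stdout, pid)

# Hypothetical output line, parsed without actually running tasklist:
sample = '"svchost.exe","4976","Services","0","12,345 K"\n'
print(parse_image_name(sample, 4976))  # svchost.exe
```

A service such as DPS runs inside `svchost.exe`, so the image name alone may not be enough to tell which service logged the event.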


----------



## J_45 (Dec 2, 2021)

Has anyone found a conclusive resolution to this?

I have exactly the same issue. The hex converter mentions a WD Blue drive each time on the WHEA error (I have 3 of the same drives in my system though, so I'm just at the stage of testing one at a time).

I too am running a Ryzen 3000 X470 build, system is completely stable. Errors appear every 3 or 4 days, most of the time when the system is idle. Since monitoring it over the last month I've noticed a lot of times the error comes up around the 36 hour mark.
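To check a pattern like the 36-hour mark, the gaps between events can be computed from the `TimeCreated SystemTime` values in the XML view. This is a minimal Python sketch under the assumption that the timestamps look like the one in the first post (UTC, 7 fractional digits). The function names and the two example timestamps are made up for illustration, not taken from anyone's log.

```python
from datetime import datetime, timezone

def parse_system_time(stamp):
    """Parse an Event Viewer SystemTime value such as 2021-02-04T05:27:30.5093180Z."""
    body = stamp.rstrip("Z")
    if "." in body:
        whole, frac = body.split(".")
        # Event Viewer prints 7 fractional digits; datetime keeps 6, so trim.
        body = f"{whole}.{frac[:6]}"
        fmt = "%Y-%m-%dT%H:%M:%S.%f"
    else:
        fmt = "%Y-%m-%dT%H:%M:%S"
    return datetime.strptime(body, fmt).replace(tzinfo=timezone.utc)

def gaps_in_hours(stamps):
    """Return the time between consecutive events, in hours."""
    times = sorted(parse_system_time(s) for s in stamps)
    return [(later - earlier).total_seconds() / 3600
            for earlier, later in zip(times, times[1:])]

# Hypothetical timestamps, only to show the calculation:
example = ["2021-12-01T08:00:00.0000000Z", "2021-12-02T20:00:00.0000000Z"]
print(gaps_in_hours(example))  # [36.0]
```

Note this measures time between events, not uptime; to compare against uptime you would need the boot time as the first entry.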


----------



## AlwaysHope (Dec 31, 2021)

J_45 said:


> Has anyone found a conclusive resolution to this?
> 
> I have exactly the same issue. The hex converter mentions a WD Blue drive each time on the WHEA error (I have 3 of the same drives in my system though, so I'm just at the stage of testing one at a time).
> 
> I too am running a Ryzen 3000 X470 build, system is completely stable. Errors appear every 3 or 4 days, most of the time when the system is idle. Since monitoring it over the last month I've noticed a lot of times the error comes up around the 36 hour mark.


I'd like to know this too. I have WHEA errors only when I'm gaming, not when benching at the same clocks. Very weird! Might try downgrading from PCIe 4 to 3 & see if it still happens with my GPU.


----------



## AlwaysHope (Jan 8, 2022)

This is not exactly a fatal hardware error but nonetheless it's a WHEA error logged by the system. After some digging in my system, I discovered the device causing this. Funny thing is it's not consistent when producing errors whilst gaming.


----------



## buyukbang (Jan 23, 2022)

AlwaysHope said:


> This is not exactly a fatal hardware error but nonetheless it's a WHEA error logged by the system. After some digging in my system, I discovered the device causing this. Funny thing is it's not consistent when producing errors whilst gaming.


I have exactly the same error alongside the "WHEA Logger Fatal Hardware error". Can you share how you found the root cause? I tried to find which device is connected to the PCIe Root Port mentioned in my error log, but couldn't find a way.


----------



## GerKNG (Jan 23, 2022)

Derek12 said:


> the other happened while the computer was idle ( I think I was away RDPing it but I don't remember quite well).
> 
> As you said I didn't get any stability issues, the computer is rock solid.


Disable Power Down mode in the memory settings and set the PSU Idle Control from low current idle to typical current idle (normally found within the overclocking settings, either in or near AMD CBS).

In 8/10 cases that's the problem.


----------



## AlwaysHope (Jan 24, 2022)

buyukbang said:


> I have exactly the same error alongside the "WHEA Logger Fatal Hardware error". Can you share how you found the root cause? I tried to find which device is connected to the PCIe Root Port mentioned in my error log, but couldn't find a way.


I got the device hardware ID & systematically went through all the devices listed in Device Manager to marry up the names & then WHAM! you find them. It's tedious work, but if you're determined you can track it down. Funny thing is though, I'm not on that motherboard anymore, & since I changed to another Z590 board I have had no WHEA errors, so far at least with gaming.


----------



## buyukbang (Jan 24, 2022)

AlwaysHope said:


> I got the device hardware ID & systematically went through all the devices listed in Device Manager to marry up the names & then WHAM! you find them. It's tedious work, but if you're determined you can track it down. Funny thing is though, I'm not on that motherboard anymore, & since I changed to another Z590 board I have had no WHEA errors, so far at least with gaming.


Thank you very much for the hint. After reading your comment I noticed there is an option that makes it easier to see: Device Manager --> View --> Resources by connection. I found that the PCIe Root Port I'm getting error logs for is connected to a SATA AHCI controller. So it seems to be a SATA controller or HDD issue, as most people here have already reported.


----------



## AlwaysHope (Jan 25, 2022)

Another way with Device Manager is to right-click the device > Properties > Details > Property > scroll down to Hardware Ids. Like a lot with computers, there is always more than one way to skin a cat.


----------



## favtony (Apr 23, 2022)

I recently got a lot of WHEA ID 1 errors, each time with a sudden reboot.
I searched a lot and found out Microsoft has an example program called "dumprec" for dumping the WHEA data from the event log. But I couldn't find anywhere to download it, so I built one.
Then I found its output can be cross-validated against my WHEA dmp, which may indicate that it's the memory doing all this.
sources of Microsoft's sample program

Just run it in a cmd window; here's the output from my computer. Be aware that "Primary" means the section that is used for error recovery. I don't know if it's the reason, but I think that's a hint.


> ============================================================
> 2022/4/23 8:54:17 - Machine Check Exception
> Fatal (Previous Error)
> ------------------------------------------------------------
> ...



I attached the program I built. I don't know what the output will look like when analyzing other types of error, but I hope it helps.


----------



## zffx (May 31, 2022)

5/31/2022 0:29:18 - Machine Check Exception
                    Fatal (Previous Error)
------------------------------------------------------------
  0 - Memory Error Section (Primary)
  1 - Processor Generic Error Section
      Processor type:   x86/x64
      Instruction set:  x64
      Error type:       Cache
      Operation:        Data Read
      Flags:
      Level:            1
      CPU Version:      0x0000000000a20f12
      Processor ID:     0x2
  2 - XPF MCA Section
      CPU Vendor:       AMD
      Processor Number: 0x2
      MCG_STATUS:       0x0000000000000000 ()
      Instruction Ptr:  0x0000000000000000
      MCA Bank:         0x0
      MCi_STATUS:       0xbaa00000000c0135 (VAL UC EN MISCV PCC)
          Other info:   0x0200000
          Model error:  0x000c
          MCA error:    0000 0001 0011 0101 (binary)
      Misc:             0xd01a0ffe00000000
  3 - {c34832a1-02c3-4c52-a9f1-9f1d5d7723fc}



CPER�������ω�N��s,�q1�o�蜑�L��e��I�R4Pv�t��P��do�N�c>��|������v��G�K�^�����$��BWE�3V^\����'�2H��RL��]w#���hCw�t�5����
�M�';�

Can't figure out if it's a memory issue or a WD issue.

I'm running XMP at 3200 MHz and Curve Optimizer at negative 20 on all cores. It happens only when I'm asleep and the PC is idle.
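As a cross-check on the dump above, the MCi_STATUS value can be split into its fields by hand. The Python sketch below assumes the standard x86 machine-check layout as I understand it (flag bits 63 to 57, model-specific code in bits 31:16, MCA error code in bits 15:0) and the compound "cache hierarchy" code format `000F 0001 RRRR TTLL`. The function and table names are made up for illustration, and this is not how dumprec itself is implemented.

```python
# Architectural MCi_STATUS flag bits (bit position, name).
STATUS_FLAGS = [(63, "VAL"), (62, "OVER"), (61, "UC"), (60, "EN"),
                (59, "MISCV"), (58, "ADDRV"), (57, "PCC")]

# Cache hierarchy error sub-fields (assumed layout: 000F 0001 RRRR TTLL).
REQUESTS = {0: "generic error", 1: "generic read", 2: "generic write",
            3: "data read", 4: "data write", 5: "instruction fetch",
            6: "prefetch", 7: "eviction", 8: "snoop"}
TRANSACTIONS = {0: "instruction", 1: "data", 2: "generic"}
LEVELS = {0: "L0", 1: "L1", 2: "L2", 3: "generic"}

def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    flags = [name for bit, name in STATUS_FLAGS if (status >> bit) & 1]
    mca_code = status & 0xFFFF
    model_code = (status >> 16) & 0xFFFF
    bits = f"{mca_code:016b}"
    result = {
        "flags": flags,
        "model_code": model_code,
        "mca_code": mca_code,
        "mca_code_bits": " ".join(bits[i:i + 4] for i in range(0, 16, 4)),
        "cache_error": None,
    }
    # Bit 12 (the F bit) is ignored; bits 15:13 and 11:8 must read 000 and 0001.
    if mca_code & 0xEF00 == 0x0100:
        result["cache_error"] = {
            "request": REQUESTS.get((mca_code >> 4) & 0xF, "unknown"),
            "transaction": TRANSACTIONS.get((mca_code >> 2) & 0x3, "unknown"),
            "level": LEVELS[mca_code & 0x3],
        }
    return result

decoded = decode_mci_status(0xBAA00000000C0135)
print(decoded["flags"])          # ['VAL', 'UC', 'EN', 'MISCV', 'PCC']
print(decoded["mca_code_bits"])  # 0000 0001 0011 0101
print(decoded["cache_error"])    # {'request': 'data read', 'transaction': 'data', 'level': 'L1'}
```

The decoded fields agree with the "Cache / Data Read / Level 1" lines in the dump, which describe a cache error reported by the processor rather than anything naming a drive.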


----------



## eidairaman1 (May 31, 2022)

Full system specs required


----------



## zffx (May 31, 2022)

GPU is a 3060 Ti Strix.


----------



## Mussels (May 31, 2022)

Raise your SoC voltage to 1.1 V, that should help.

And since we've got no context to go on, remove any overclocks, PBO settings and undervolts you're running - that type of error can be from incorrect RAM or CPU settings (but seeing dual-rank memory on Ryzen, my bet's on the SoC voltage).


----------



## zffx (May 31, 2022)

Mussels said:


> Raise your SoC voltage to 1.1 V, that should help.
> 
> And since we've got no context to go on, remove any overclocks, PBO settings and undervolts you're running - that type of error can be from incorrect RAM or CPU settings (but seeing dual-rank memory on Ryzen, my bet's on the SoC voltage).


My PC would reboot on its own, with no BSOD or dump files, just a WHEA-Logger entry in the Windows event log.
Will try raising SoC and see how it goes.
Thanks.


----------



## Mussels (May 31, 2022)

zffx said:


> My PC would reboot on its own, with no BSOD or dump files, just a WHEA-Logger entry in the Windows event log.
> Will try raising SoC and see how it goes.
> Thanks.


Crashing to a black screen and random reboots are very common for an overworked Ryzen memory controller; raising the SoC voltage helps.

While you're on Zen 3 and not Zen 1, this image explains it really well:




From your screenshots you're running two dual-rank 16GB sticks (so 32GB, 4 ranks total) - the second-easiest setup to run on Ryzen (meaning that for 3200 MHz, which is technically overclocking, a small boost to the memory controller's voltage is often all anyone needs to get it stable).

People always get this wrong, but your 5950X (like all Zen 3 CPUs) is only officially rated for up to 3200 MHz with two single-rank DIMMs. The moment you add dual-rank DIMMs or four sticks of RAM in any combination, you are not guaranteed 3200 MHz... but with minor tweaking, or a lower clock speed, it can be made to work.

(For comparison, I run 2x32GB sticks that are dual rank like your setup, and need 1.10 V for 3600 MHz and 1.15 V for 3800 MHz. I have a higher-quality motherboard designed for overclocking, so you may need slightly higher voltages for the same setup.)


----------



## thesmokingman (May 31, 2022)

zffx said:


> I'm running XMP at 3200 MHz and Curve Optimizer at negative 20 on all cores. It happens only when I'm asleep and the PC is idle.


Negative 20 on all cores... yea crashy crashy makes sense.


----------



## zffx (May 31, 2022)

thesmokingman said:


> Negative 20 on all cores... yea crashy crashy makes sense.


I was going on what Ryzen Master recommended, which was negative 30 on all cores. Found out that it's not stable by running CoreCycler.
I just disabled Curve Optimizer and increased SoC voltage to 1.1 V like Mussels recommended; hopefully it works this time.


----------



## thesmokingman (May 31, 2022)

zffx said:


> I was going on what Ryzen Master recommended which was negative 30 all cores. Found out that it's not stable by running CoreCycler.
> I just disabled curve optimizer and increased SoC voltage to 1.1v like Mussels has recommended hopefully it works this time.


You don't need to touch SOC at 3200mhz DRAM.


----------



## zffx (May 31, 2022)

thesmokingman said:


> You don't need to touch SOC at 3200mhz DRAM.


Can you elaborate more?


----------



## thesmokingman (May 31, 2022)

zffx said:


> Can you elaborate more?


At your RAM speed, you don't need to do anything; it's base speed. If your chip cannot handle 3200 MHz without increasing voltage, you have bigger problems. It's likely the issue was a too-aggressive negative curve on voltage.


----------



## zffx (May 31, 2022)

thesmokingman said:


> At your RAM speed, you don't need to do anything; it's base speed. If your chip cannot handle 3200 MHz without increasing voltage, you have bigger problems. It's likely the issue was a too-aggressive negative curve on voltage.


I see. Well, according to Mussels' post, "5950x (and all Zen 3 CPU's) are only officially rated UPTO 3200Mhz, with two single rank DIMMs. The moment you add dual rank dimms or four sticks of ram in any combination, you are not guaranteed 3200Mhz", so it's not guaranteed to run normally. I guess I'll just leave SoC at 1.1 V for a week, then set it back to default and test again to see if it's stable.

Thanks for your input.


----------



## thesmokingman (May 31, 2022)

zffx said:


> I see. Well, according to Mussels' post, "5950x (and all Zen 3 CPU's) are only officially rated UPTO 3200Mhz, with two single rank DIMMs. The moment you add dual rank dimms or four sticks of ram in any combination, you are not guaranteed 3200Mhz", so it's not guaranteed to run normally. I guess I'll just leave SoC at 1.1 V for a week, then set it back to default and test again to see if it's stable.
> 
> Thanks for your input.


I'll say it again: you don't need to touch SOC. Your issue was most likely your high negative offset. You can believe it or not. That pic is base speeds at maxed memory support; it's a whole different thing.


----------



## zffx (May 31, 2022)

thesmokingman said:


> I'll say it again: you don't need to touch SOC. Your issue was most likely your high negative offset. You can believe it or not. That pic is base speeds at maxed memory support; it's a whole different thing.


Okay. I'll set SoC back to default and will report back if the pc crashes.


----------



## Mussels (Jun 1, 2022)

zffx said:


> I was going on what Ryzen Master recommended which was negative 30 all cores. Found out that it's not stable by running CoreCycler.
> I just disabled curve optimizer and increased SoC voltage to 1.1v like Mussels has recommended hopefully it works this time.


I missed the -20 

Honestly, that's likely far too low. Heck, most of my cores don't even go below -10, with a few that crash at idle around -6.


----------



## eidairaman1 (Jun 1, 2022)

zffx said:


> GPU is 3060 ti strix


Not detailed enough


----------

