
Nvidia driver keeps failing - W11 24H2 - bad GPU?

izy

Joined
Jun 30, 2022
Messages
1,052 (1.13/day)
I keep getting this error on my AMD system:
The description for Event ID 153 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
1724865546615.png

I upgraded to W11 24H2 and now I'm getting this error a lot. I had similar trouble before, where the game would occasionally freeze for a few seconds, but I never checked the log. I'm 99% sure it was the same error, just much rarer.

I had my CPU and RAM OCed and my GPU on a good curve, and everything was fine in general.

I went back to default settings in the BIOS and for the GPU: same error.

The odd things about this system: I have 2 different RAM kits (4x4), which were fine until some time ago and are now running at default, plus an old power supply that had a bad fan, which I replaced some time ago.

This happens mostly when I alt-tab from Visual Studio Code to World of Warcraft and back. It may happen in other situations that I haven't noticed so far (it can also happen randomly if I have both open without alt-tabbing), and it happens more often when I'm using the PTR/beta version of WoW (I'm coding some addons).

Some time ago I had a problem that made this system reboot. I lowered the CO on the CPU, and the GPU and RAM OC, and it was OK, but I feel that it is the same problem.

The GPU is at about 50% usage at most when this happens; I have a frame limiter in WoW, and the CPU can't max out the GPU with my current settings anyway. All stress tests pass fine (at least 6 hours), even with the OC on everything.

What I didn't try is removing the other RAM kit (the one that kind of sucks) or replacing the PSU. I will try changing cables and things.

I am not sure if the problem is caused by WoW, VS Code, or just something in Windows. I also did a clean install of the Nvidia Studio drivers, with the same result.

If anybody has some ideas, please let me know. I think something is failing, but I'm not sure what, as in general it works without any problems.
 
Joined
Feb 18, 2005
Messages
5,847 (0.80/day)
Location
Ikenai borderline!
System Name Firelance.
Processor Threadripper 3960X
Motherboard ROG Strix TRX40-E Gaming
Cooling IceGem 360 + 6x Arctic Cooling P12
Memory 8x 16GB Patriot Viper DDR4-3200 CL16
Video Card(s) MSI GeForce RTX 4060 Ti Ventus 2X OC
Storage 2TB WD SN850X (boot), 4TB Crucial P3 (data)
Display(s) 3x AOC Q32E2N (32" 2560x1440 75Hz)
Case Enthoo Pro II Server Edition (Closed Panel) + 6 fans
Power Supply Fractal Design Ion+ 2 Platinum 760W
Mouse Logitech G602
Keyboard Razer Pro Type Ultra
Software Windows 10 Professional x64
Are you running the latest drivers? If not, can you try the latest? If you are running the latest, can you try rolling back to the previous version?
 

izy

Joined
Jun 30, 2022
Messages
1,052 (1.13/day)
Are you running the latest drivers? If not, can you try the latest? If you are running the latest, can you try rolling back to the previous version?
I was using the latest drivers and tried both the normal and Studio ones: same thing. It felt fine with the older drivers before the update, but I think it was still happening.
I also upgraded the MB BIOS to the latest version.

Hardware: B450 TH Max, 5700X, RTX 2060 Super, 600W PSU.

Thinking: bad PSU (but it doesn't fail under load), bad GPU (but it doesn't fail under load), these 4 sticks hate each other so much (they pass all tests), the CPU went wrong somehow (passes all tests), maybe some driver bug with my system (I didn't try an older driver, just the last 2 Studio and normal ones), or the MB failing somehow? It seems way worse with the new Windows (but maybe I wasn't doing the same things on the old one; it still happened, I think).

Any tips much appreciated, thanks.

Edit: I think it's the PSU or the GPU. The GPU 8-pin sometimes goes under 11 V; under a GPU stress test the computer rebooted in 2-4 minutes, and this is what OCCT saved before the crash. I tried with a 75% GPU power limit and the same thing happened in the OCCT 3D GPU stress test.


1724947190536.png

1724947378405.png


Second crash power limited:
1724947996767.png
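
For context on the "under 11 V" reading: the ATX spec allows ±5% on the +12 V rail, i.e. 11.40 V to 12.60 V. Below is a minimal Python sketch of that check. The sample readings are made up for illustration (they are not taken from the OCCT screenshots), and software sensor readings can be inaccurate, so a low value is a hint rather than proof.

```python
# ATX +12 V rail tolerance is +/-5 % of nominal.
NOMINAL_12V = 12.0
TOLERANCE = 0.05

def rail_in_spec(volts: float) -> bool:
    """Return True if a +12 V reading is inside the ATX +/-5 % window."""
    low = NOMINAL_12V * (1 - TOLERANCE)   # about 11.40 V
    high = NOMINAL_12V * (1 + TOLERANCE)  # about 12.60 V
    return low <= volts <= high

# Hypothetical readings, for illustration only.
for reading in (12.10, 11.45, 10.90):
    status = "in spec" if rail_in_spec(reading) else "OUT OF SPEC"
    print(f"{reading:.2f} V -> {status}")
```

A reading that only dips out of spec during load spikes would fit a PSU or cabling problem better than a driver bug, but a multimeter on the connector is the only way to be sure.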
 

Attachments

  • 1724947601030.png

izy

Joined
Jun 30, 2022
Messages
1,052 (1.13/day)
OK, so I swapped the PSU, a Thermaltake Smart SE 530W Bronze, for the one from my other computer, a Thermaltake 700W Bronze SPG-700DH2CC, but I have a few concerns.

1. The voltages look almost the same.
2. The 8-pin PCIe cable only has 6 pins on the PSU side (both cables), while both PSUs have an 8-pin connection. Do I need to buy new cables?

I am just testing it right now (I just installed it). So far so good after more than 20 minutes, with no reboots and no driver error; with the other PSU it was rebooting after 2-5 minutes at most. With the other PSU it seemed that I was getting reboots only if the GPU spiked, in the OCCT 3D Adaptive Switch Test, plus some driver errors when I was alt-tabbing the game (even though the frame rate was capped and the GPU was only at 60-70% usage), but in CPU and RAM tests with max OC I never had a problem.

Now the questions: is it possible that the GPU VRM is going bad, given that it shows such a low 12 V pin voltage?
Is the other PSU broken, or is its wattage just too low, given that it shows the same voltages as this one? Or maybe the fan that I replaced isn't cooling well?
Are both PSUs bad? The 700W one was rarely used and never for gaming or intensive stuff.
Maybe the MB is going bad or is reporting lower voltages? I need a USB cable to connect this PSU to the PC, as it has some internal sensors.
I guess the GPU temps are OK for a 2060 Super; that couldn't cause a reboot.

700W PSU during this test.
1725105406422.png


1725105779925.jpeg

Full Load Power Test OCCT
1725108713044.png
 
Joined
Jun 3, 2008
Messages
801 (0.13/day)
Location
Pacific Coast
System Name Z77 Rev. 1
Processor Intel Core i7 3770K
Motherboard ASRock Z77 Extreme4
Cooling Water Cooling
Memory 2x G.Skill F3-2400C10D-16GTX
Video Card(s) EVGA GTX 1080
Storage Samsung 850 Pro
Display(s) Samsung 28" UE590 UHD
Case Silverstone TJ07
Audio Device(s) Onboard
Power Supply Seasonic PRIME 600W Titanium
Mouse EVGA TORQ X10
Keyboard Leopold Tenkeyless
Software Windows 10 Pro 64-bit
Benchmark Scores 3DMark Time Spy: 7695
Your previous "600 watt" (530 Watt) power supply was below the minimum recommendation for that GPU.
The PSU you installed in its place is a low-end unit which should be avoided or replaced as soon as possible.
I would try a better power supply. One that is well-reviewed and not a sub $100 bronze throwaway.
 

izy

Joined
Jun 30, 2022
Messages
1,052 (1.13/day)
Your previous "600 watt" (530 Watt) power supply was below the minimum recommendation for that GPU.
The PSU you installed in its place is a low-end unit which should be avoided or replaced as soon as possible.
I would try a better power supply. One that is well-reviewed and not a sub $100 bronze throwaway.
I don't know, I paid about 100E for it back in the day, I think (for the 700W). I was intending to use it when I build my new system with a 7800X3D and a 4070/Super, which is pretty low wattage, and the 530W PSU kept an 1800X + OC'd Radeon 280X alive in its prime.
 
Joined
Jun 3, 2008
Messages
801 (0.13/day)
I don't know, I paid about 100E for it back in the day, I think (for the 700W). I was intending to use it when I build my new system with a 7800X3D and a 4070/Super ..
oof
 

izy

Joined
Jun 30, 2022
Messages
1,052 (1.13/day)
Joined
Feb 18, 2005
Messages
5,847 (0.80/day)
I don't know, I paid about 100E for it back in the day, I think (for the 700W)
Not our fault if you got ripped off buying a piece of junk years ago. All that matters is that the PSU is old and crappy and you should replace it.
 
Joined
Jun 3, 2008
Messages
801 (0.13/day)
I didn't find anything on that specific model either, but everything I found makes me not trust it. It is among the company of "avoid" and "replace immediately" on the cultist PSU list, and all forum threads for similar power supplies I see are "is it really THAT bad?". Bad signs. And it is an EOL product which never got much attention. That's usually bad.
The review you linked is not comprehensive and really has no effect on my opinion.

I would not use that power supply with a new build. I think that is crazy. You can find really good power supplies for not much more money.
 
Joined
Aug 29, 2024
Messages
57 (0.41/day)
Location
Shanghai
System Name UESTC_404
Processor AMD R9 7950x3d
Motherboard Gigabyte B650E Aorus Elite X Ax Ice
Cooling Deepcool AG620 digital
Memory Asgard ValkyrieII 6800c34->6400 16g*2
Video Card(s) Yeston 7900xt SAKURA SUGAR
Storage Zhitai TiPlus7100 1T
Case RANDOMLY PICKED TRASHBIN
Power Supply Thermalright TGFX850W
Don't use a bomb PSU.
 

izy

Joined
Jun 30, 2022
Messages
1,052 (1.13/day)
I understand that you need a good PSU. At the same time, I have had both of these PSUs for a long time and they still work just fine after 6-7 years (and they were not exactly no-name or cheap in their time). The 530W one also went through many power drops, because there was a problem with the power lines at this PC's last location. So some PSUs are decent with entry-level hardware (I am not saying that you should ever buy cheap/bad PSUs; this is just what I had on hand). Anyway, back on topic, maybe this will help someone else:

- It seems that the 530W PSU couldn't handle the power spikes of my GPU, but it wasn't the reason for the driver error. After I swapped to the 700W PSU (I even tested with a 1000W Corsair PSU) the driver error was still there, but the PC no longer reboots under the OCCT 3D Adaptive Switch Test. (It was only rebooting under that test; otherwise it was fine. Maybe it happened because of the bug? I don't know, I haven't retested it since the fix.)

- What seems to solve the driver error is disabling MPO. (I am going to paste what I did from another post so people can try it if they have the same problem.)

Multi-Plane Overlay (MPO, a Windows feature) can cause freezing, flickering, stuttering and driver crashes, mostly in games.
I am not sure why it affects only some people and not everybody, but you guys can do your own research.

To disable it, download this reg file from Nvidia (it works for AMD cards too; it's a Windows bug):

or edit with regedit yourself (use straight quotes, or the import will fail):
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Dwm]
"OverlayTestMode"=dword:00000005
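
If you'd rather build the .reg file yourself, here is a minimal Python sketch that writes the same key and value shown above, plus a second file that undoes it. The file names and function names are my own (hypothetical). The `Windows Registry Editor Version 5.00` header and the `"Name"=-` syntax for deleting a value are standard .reg conventions. Importing still needs admin rights, and restarting afterwards is the safe bet.

```python
from pathlib import Path

KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Dwm"
HEADER = "Windows Registry Editor Version 5.00"

def mpo_reg_text(disable: bool = True) -> str:
    """Build .reg text that sets (or removes) OverlayTestMode."""
    value = '"OverlayTestMode"=dword:00000005' if disable else '"OverlayTestMode"=-'
    # .reg files use CRLF line endings and straight double quotes.
    return "\r\n".join([HEADER, "", f"[{KEY}]", value, ""])

def write_reg(path: str, disable: bool = True) -> Path:
    """Write the .reg file; import it on Windows by double-clicking it."""
    out = Path(path)
    # newline="" keeps the explicit CRLFs untouched on every platform.
    with out.open("w", encoding="utf-8", newline="") as fh:
        fh.write(mpo_reg_text(disable))
    return out

if __name__ == "__main__":
    write_reg("mpo_disable.reg", disable=True)   # hypothetical file name
    write_reg("mpo_restore.reg", disable=False)  # hypothetical file name
```

Keeping a restore file next to the disable file makes it easy to turn MPO back on if this doesn't help.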

You can also try this tool with multiple fixes:

What else I did (Nvidia only) was give the SYSTEM user full permissions on nvlddmkm.sys in system32.

---------------------
The errors I was getting before using this "fix":

From Windows Event Viewer when the video driver crashes:

The description for Event ID 153 from source nvlddmkm cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.

From WoW log when it crashed:

Exception: ACCESS_VIOLATION - The instruction at "0x00007ff6bcc3df1f" referenced memory at "0xffffffffffffffff".
The memory could not be "read".
ProcessID: 14700
ThreadID: 7820

From WoW gx.log:

9/5 18:16:07.606 Error WaitForSingleObjectEx Timeout: The wait operation timed out. (0x80070102).
9/5 18:16:07.622 Device context was lost. Attempting recovery. Occurrence: 4
9/5 18:16:07.622 GxRestart
9/5 18:16:07.622 D3d12 Device Destroy
9/5 18:16:07.756 NotifyOnDeviceDestroy
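
To see how often this happens, the gx.log lines above can be counted with a small Python sketch. It only assumes the line layout visible in the excerpt (`M/D HH:MM:SS.mmm message`); the function name is hypothetical, and other WoW versions may format the log differently.

```python
import re

# Matches lines like:
# "9/5 18:16:07.622 Device context was lost. Attempting recovery. Occurrence: 4"
LOST_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) "
    r"Device context was lost\..*?Occurrence: (?P<n>\d+)"
)

def device_lost_events(lines):
    """Return (date, time, occurrence) for every device-lost line."""
    events = []
    for line in lines:
        m = LOST_RE.match(line.strip())
        if m:
            events.append((m["date"], m["time"], int(m["n"])))
    return events

sample = [
    "9/5 18:16:07.606 Error WaitForSingleObjectEx Timeout: The wait operation timed out. (0x80070102).",
    "9/5 18:16:07.622 Device context was lost. Attempting recovery. Occurrence: 4",
    "9/5 18:16:07.622 GxRestart",
]
print(device_lost_events(sample))
```

For a real log, pass an open file instead of the sample list, e.g. `device_lost_events(open(path, errors="replace"))`, and compare the count before and after disabling MPO.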

From DxDiag:

Windows Error Reporting:
+++ WER0 +++:
Fault bucket , type 0
Event Name: LiveKernelEvent
Response: Not available
Cab Id: 0

Problem signature:
P1: 141
P2: ffff970721dd9460
P3: fffff80141c40d80
P4: 0
P5: ffff9707254a60c0
P6: 10_0_26100
P7: 0_0
P8: 256_1
P9:
P10:
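
For reading the WER block above: in a LiveKernelEvent, P1 is the event code in hex, and 141 corresponds to VIDEO_ENGINE_TIMEOUT_DETECTED, meaning the GPU took too long and Windows reset the driver without a blue screen. Here is a small Python sketch that pulls the fields out of a pasted block. The lookup table is deliberately tiny and only lists codes I'm sure of, and the function names are my own.

```python
# Small lookup of GPU-related live kernel event / bug check codes (hex).
KNOWN_CODES = {
    0x117: "VIDEO_TDR_TIMEOUT_DETECTED",
    0x141: "VIDEO_ENGINE_TIMEOUT_DETECTED",
}

def parse_signature(text: str) -> dict:
    """Turn 'P1: 141' style lines into a {'P1': '141', ...} dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key.startswith("P") and key[1:].isdigit():
            fields[key] = value.strip()
    return fields

def describe(text: str) -> str:
    """Name the event code found in the P1 field."""
    fields = parse_signature(text)
    code = int(fields["P1"], 16)  # P1 is written in hex without the 0x prefix
    return f"0x{code:X}: {KNOWN_CODES.get(code, 'unknown code')}"

report = """P1: 141
P2: ffff970721dd9460
P6: 10_0_26100
P9:
"""
print(describe(report))
```

This only names the event. It doesn't say whether the timeout came from the driver, MPO, or hardware, which is why the PSU swap and the MPO change had to be tested separately.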
 