I had the same issue again today with my Asus ROG Strix OC RTX 3090, and it is certainly not random here. It occurs roughly twice a month, only on the desktop, never in games, and always right after a startup or reboot. I have never had it once the computer has been up for longer than a minute.
It starts with the Nvidia driver crashing and automatically reinitializing, sometimes several times in a row, after which small black/grey artifacts start appearing on the screen.
At first these stay tied to the foreground application (e.g. Google Chrome), so they disappear when I minimize it. But if I wait longer, the artifacts start to appear in every application and even on the Windows desktop, after which my screen goes completely black and the PC freezes if I did not shut it down in time. Other times I get a BSOD instead of the freeze (it makes no difference whether I use newer or the latest drivers after running DDU).
After resetting the PC, I always find a .dmp file in C:\Windows\LiveKernelReports\WATCHDOG that points to the nvlddmkm.sys Nvidia driver. I have no OC active on my CPU, RAM or GPU; it's just a factory-overclocked 3090, set to Gaming Mode in GPU Tweak and to Quiet Mode on the physical DIP switch.
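For anyone who wants to check whether their dumps line up with boot times the same way, here is a minimal Python sketch that lists the .dmp files in that folder, newest first. The folder path is the one from my system and the helper name `list_dumps` is just made up for this example; reading the folder may require running from an elevated prompt.

```python
from datetime import datetime
from pathlib import Path

# Folder from the post; adjust if Windows is installed on another drive.
WATCHDOG_DIR = Path(r"C:\Windows\LiveKernelReports\WATCHDOG")


def list_dumps(folder):
    """Return (modified time, size in bytes, file name) per .dmp file, newest first."""
    folder = Path(folder)
    if not folder.is_dir():
        # Folder missing (or not on Windows): nothing to report.
        return []
    entries = []
    for path in folder.iterdir():
        if path.is_file() and path.suffix.lower() == ".dmp":
            info = path.stat()
            modified = datetime.fromtimestamp(info.st_mtime)
            entries.append((modified, info.st_size, path.name))
    # Tuples sort by modification time first, so reverse=True puts the newest on top.
    return sorted(entries, reverse=True)


if __name__ == "__main__":
    for modified, size, name in list_dumps(WATCHDOG_DIR):
        print(f"{modified:%Y-%m-%d %H:%M:%S}  {size:>12,d} bytes  {name}")
```

Comparing the printed timestamps with the times the PC was started should show whether every dump was written within the first minute or so after boot. The script only lists the files; it does not open or analyze the dumps.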
I have already tried the following, none of which helped:
- Updated GPU to the latest driver after using DDU in safe mode
- Updated AMD chipset to the latest driver after using DDU in safe mode
- Updated motherboard to the latest BIOS
- Cleaned GPU, fans, slots and the mainboard itself until completely dust-free
- Disabled Hardware-accelerated GPU scheduling in Windows and in all running applications
- Removed Asus GPU Tweak v2 and v3
- Removed Geforce Experience
- Removed Asus AI Suite + scheduled tasks for it
- Removed Armoury Crate with the official uninstall tool
- Disabled Resizable BAR in BIOS
- Replaced PSU (now Corsair HX1200)
- Configured GPU in Nvidia Control Panel to Prefer Maximum Performance
- Ran GPU driver in Nvidia Control Panel Debug Mode to reset it to the reference clock speed
- Switched between Performance and Quiet Mode on the DIP switch of the GPU
- No OC on CPU, RAM or GPU; also tried underclocking the GPU
- Memtest RAM: 0 errors
- Tried tool TDR Manipulator.exe
- Replaced GPU (I first had the same model as an RTX 3080, which showed the same issue)
- Tried all mode options for the PCIe x16 slot: Auto, Gen1, Gen2, Gen3 and Gen4
- Moved GPU to another PCIe slot on the mainboard
- Set GPU to line-based interrupt mode instead of the default MSI (message-signaled interrupts) using MSI_util_v3.exe
- Turned off Enable Plugins in iCUE and deleted the ASUS and NvidiaPlugin folders, but did not remove the application completely as I need it for the G-key function
- Upgraded from Windows 10 to the latest Windows 11 build, and also tried a clean install
- Disabled all startup programs like Brother software, Asus AI Suite, GPU Tweak, Razer Synapse, Actual Multiple Monitors, iCUE, ...
- Irrelevant here, but also tested with other monitors and DP cables
To me, it still looks like something in the Nvidia kernel driver occasionally goes horribly wrong at startup, caused by some conflict, which makes my PC freeze or even BSOD, but I still have not been able to find out what exactly.
I'm now close to buying an AMD 7900 XTX just to get rid of this issue.