Just had the artifacts again, as usual after a cold startup this morning. The last time was on 6 Dec 2022, with the previous driver version.
When I started recording them with GeForce Experience, they disappeared. When I stopped the recording, they reappeared seconds later (the recording itself shows nothing unusual). A few seconds after that the screen went black and Windows 11 started to freeze, so I briefly pressed the power button and the PC managed to shut down properly just in time.
Again, in the event log as usual, first:
\Device\Video3
Graphics Exception: ILLEGAL_OPCODE
\Device\Video3
Graphics Exception: ESR 0x40a790=0x80000024
... and as a result of that, Windows TDR kicks in (you can see that this is only a consequence, not the cause, as it happens seconds later):
Display driver nvlddmkm stopped responding and has successfully recovered.
It started happening 5 minutes after booting up, and without opening any application/browser.
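To check the ordering described above (driver exception first, TDR seconds later) on an exported event list, something like the following could be used. It is only a rough sketch: it assumes the events are already available as (timestamp, message) pairs sorted by time, and the function names are made up for illustration. The message patterns are taken from the log lines quoted above.

```python
import re
from datetime import datetime

# Matches lines like "Graphics Exception: ESR 0x40a790=0x80000024"
ESR_RE = re.compile(r"Graphics Exception: ESR (0x[0-9a-fA-F]+)=(0x[0-9a-fA-F]+)")


def classify(message):
    """Label an nvlddmkm message as one of the two kinds quoted in the post."""
    if "Graphics Exception" in message:
        return "gpu_exception"
    if "stopped responding and has successfully recovered" in message:
        return "tdr"
    return "other"


def parse_esr(message):
    """Return (register, value) as integers, or None if this is not an ESR line."""
    match = ESR_RE.search(message)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


def exception_to_tdr_gap(events):
    """Seconds between the first graphics exception and the first TDR after it.

    events: iterable of (datetime, message) pairs, sorted by time.
    Returns None if no exception is followed by a TDR.
    """
    first_exception = None
    for timestamp, message in events:
        kind = classify(message)
        if kind == "gpu_exception" and first_exception is None:
            first_exception = timestamp
        elif kind == "tdr" and first_exception is not None:
            return (timestamp - first_exception).total_seconds()
    return None
```

A positive gap of a few seconds would match the reading that the TDR is a consequence of the graphics exception rather than its cause.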
These actions were already performed one or many times, but did not fix the issue:
- Updated GPU to latest driver after using DDU in safe mode
- Updated chipset to latest driver after using DDU in safe mode
- Updated mainboard to latest BIOS + loaded defaults
- Experimental Features of Geforce Experience disabled
- Hardware-accelerated GPU scheduling in Windows disabled/did not matter so afterwards re-enabled
- Disabled hardware acceleration individually when possible for all used applications
- iCue Enable Plugins disabled/re-enabled + deleted folders ASUS and NvidiaPlugin
- GPU Tweak 2/3 disabled/did not matter so afterwards re-enabled, with Gaming as default profile
- Set GPU in Line-Based Mode instead of default MSI using "E:\Bin\MSI_util_v3.exe" and rebooted/now reverted to MSI-mode
- Disabled/re-enabled Resizeable Bar in BIOS
- Moved GPU to another PCIe slot on the mainboard, and changed the PCIe x16 slot mode (options: Auto, Gen1, Gen2, Gen3 or Gen4): if it is on Auto/Gen4, try Gen3; if it is on Gen3, try Auto
- Switched to the other vBIOS (dual-vBIOS GPU), then powered off/rebooted
- Disabled OC on CPU and RAM
- Memtest RAM: 0 errors
- Underclocked GPU
- Run GPU in Nvidia Control Panel Debug Mode (when supported) to reset it to reference clock speed: cannot be saved so no valid option for driver crashes at system startup
- Cleaned startup programs/DLLS with AutoRuns
- Disable Auto-grow for page file size, as this is limited to 1/8th of volume, and set it to 2 times the RAM size + 10 MB
- Disable Hibernation and thus Fast Startup, using "powercfg /hibernate off" or "powercfg -h off"
- Asus AI Suite disabled/re-enabled: disabled unwanted scheduled tasks for it, including disabling both Performance and DIP Away Mode and their "vCore Downgrade" settings in EPU
- Removed Armoury Crate with the uninstall tool: unrelated as issue also happened when it was never installed
- Reconfigured TDR in registry using "E:\Bin\TDR Manipulator.exe": values are now set to Enabled, 10, 9, 120 and 10 + apply and reboot --> did not fix it
- Disabled Brother Printer software autolaunching after startup
- For further troubleshooting only, use Process Monitor --> Options --> Enable Boot Logging to check what happens in background on startup
- Set in E:\Bin\nvidiaProfileInspector\nvidiaProfileInspector.exe - 5. Common; CUDA - Force P2 state on OFF --> now re-enabled as did not work
- Removed Shader Caches after closing all related applications. Disabled the option in the NVIDIA driver, rebooted and re-enabled it. Also cleaned %LocalAppdata%\D3DSCache and %LocalAppdata%\NVIDIA (both DXCache and GLCache folders) and Direct X-cache using cleanmgr
- MODS/MATS VRAM Test: all memory banks of current GPU are fine
- Replaced PSU: no fix
- Replaced GPU: did not fix it either
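For the TDR step in the list above, the values that actually ended up in the registry can be double-checked without a third-party tool. This is a minimal sketch: the value names and defaults are the ones Microsoft documents under HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers, but that the five numbers in the list correspond to exactly these five values, in this order, is an assumption on my side. The function names are made up.

```python
import sys

# Documented TDR values under HKLM\SYSTEM\CurrentControlSet\Control\GraphicsDrivers
# together with their documented defaults (used when a value is absent).
TDR_KEY = r"SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
TDR_DEFAULTS = {
    "TdrLevel": 3,        # 3 = recover on timeout
    "TdrDelay": 2,        # seconds
    "TdrDdiDelay": 5,     # seconds
    "TdrLimitTime": 60,   # seconds
    "TdrLimitCount": 5,
}


def read_tdr_settings():
    """Return the effective TDR values; missing values fall back to the defaults."""
    settings = dict(TDR_DEFAULTS)
    if sys.platform != "win32":
        return settings  # no registry available, report defaults only
    import winreg
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, TDR_KEY) as key:
            for name in TDR_DEFAULTS:
                try:
                    value, _value_type = winreg.QueryValueEx(key, name)
                    settings[name] = value
                except FileNotFoundError:
                    pass  # value not set, keep the default
    except OSError:
        pass  # key not readable, keep the defaults
    return settings


def diff_from_defaults(settings):
    """Return only the values that differ from the documented defaults."""
    return {k: v for k, v in settings.items() if TDR_DEFAULTS.get(k) != v}


if __name__ == "__main__":
    print(diff_from_defaults(read_tdr_settings()))
```

If the output is empty after a reboot, the tool's changes were not applied (or were reverted), which would be worth ruling out before concluding that longer timeouts do not help.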
These actions are for troubleshooting and can each time be performed:
- Check folders C:\Windows\Temp\WER-XXX, C:\ProgramData\NVIDIA Corporation\CrashDumps, C:\Windows\Livekernelreports\Watchdog and C:\Programdata\Microsoft\Windows\WER for DMP-files + debug in WinDbg, to be sure the cause cannot be narrowed down to something else
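Checking those folders by hand after every crash gets tedious, so a small script can list the dump files, newest first. A rough sketch only: the folder list is copied from the line above, except that C:\Windows\Temp is scanned as a whole because the exact WER-XXX folder names differ per crash (that substitution is my assumption). Run it elevated, otherwise some folders may not be readable.

```python
from pathlib import Path

# Folders from the post. C:\Windows\Temp is scanned recursively as the parent
# of the WER-XXX folders, since their exact names are not known in advance.
DUMP_DIRS = [
    r"C:\Windows\Temp",
    r"C:\ProgramData\NVIDIA Corporation\CrashDumps",
    r"C:\Windows\Livekernelreports\Watchdog",
    r"C:\Programdata\Microsoft\Windows\WER",
]


def find_dumps(dirs=DUMP_DIRS):
    """Return (path, mtime) for every .dmp file under dirs, newest first."""
    found = []
    for directory in dirs:
        root = Path(directory)
        if not root.is_dir():
            continue  # folder missing on this machine
        for path in root.rglob("*"):
            try:
                if path.suffix.lower() == ".dmp" and path.is_file():
                    found.append((path, path.stat().st_mtime))
            except OSError:
                continue  # unreadable entry, skip it
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for path, _mtime in find_dumps():
        print(path)
```

The newest files can then be opened in WinDbg as described.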
These actions can still be done as a workaround:
- Set "Always 3D Clock" in GPU Tweak, however this disables 0dB Mode and consumes more power
- Set Power Management Mode to "Prefer Max Performance" in Nvidia Driver Control Panel - however, this disables 0dB Mode and consumes more power
These actions can still be performed:
- Flash this particular GPU with new vBIOS (there is a newer one for my card but can be tricky and I don't believe in hardware issue, so won't do that)
- Update mainboard BIOS (there is again a newer one for my mobo but I don't believe in hardware issue, so won't do that)
- Replace mainboard, memory, CPU, everything again! (but I don't believe in hardware issue, so won't do that)
I'm thinking about giving up on repeating the steps above and instead implementing workaround 1 now: enabling "Always 3D Clock" in GPU Tweak, which is enabled at startup.
View attachment 289423
This is certainly unrelated to this thread:
1. You have an RTX 40xx card
2. It's only in a game, not in desktop 2D mode
3. The particular game, FS 2020, is known for artifacting (even on Xbox there appear to be some glitches), due to temporal AA, nvidia settings/drivers and bugs in the game itself
As you can see in the video, every now and then when panning, out of nowhere, these artifacts will start appearing on the top part of my screen. I am not sure why it does it but it'll do it out of nowhere, randomly and doesn't go away until I restart the sim. All my temps are fine, GPU around 65.
www.avsim.com
Here also a factory-OC'ed ASUS card. But I don't think the hardware is related, as it happens with many types of GPUs, and so rarely.
Unstable VRAM? Nah, I don't believe so: in that case it would happen at least once in games too, and above all, the MODS/MATS results would have shown it.
For instance, the black artifacts I saw today were parts of something that had appeared on the screen seconds before. In fact, I could still visually recognize parts of my CMD window, which launches a custom script at startup and was still being displayed from the memory cache.
That's why I still think something goes wrong between Windows and the Nvidia driver at startup, just can't explain why this does not happen all the time.
Video Memory Management and GPU Scheduling
learn.microsoft.com
For me, the cause still has to be located in one of the following phases:
Handling Memory Segments
Handling Command and DMA Buffers
GDI Hardware Acceleration
Video memory offer and reclaim
GPU preemption
Especially the last one seems very likely to be related (though not necessarily as the cause; it might also be a consequence), at least for my particular problem. Quoted from that section:
If long-running packets cannot be successfully preempted, high-priority GPU work, such as work required by the Desktop Window Manager (DWM), can be delayed, resulting in glitches during window transitions and animations. Also, long-running GPU packets that cannot be preempted can cause a TDR process to repeatedly reset the GPU, and eventually a system bugcheck can occur.
As I have mentioned here before, when starting to tinker with GPU PreEmption using the QuickBoost software, the artifacts started to appear.