Friday, December 4th 2020
PSA: AMD's Graphics Driver will Eat One CPU Core when No Radeon Installed
While I was messing around with an older SSD test system (not benchmarking anything), I wondered why the machine was SO sluggish with the NVIDIA card I had just installed. Windows startup, the desktop, the Internet, everything in Windows was incredibly slow. This is an old dual-core machine, but it ran perfectly fine with the AMD Radeon card I used before.
At first I blamed NVIDIA, but when I opened Task Manager I noticed one of my cores sitting at 100%. That can't be right.
Digging a bit further, it looks like RadeonSettings.exe is loading one processor core to 100%. Ugh, but there is no AMD graphics card installed right now.
Once that process was terminated manually (right click, select "End task"), performance was restored to expected levels and CPU load was normal again. This confirms that the AMD driver software is the reason for the high CPU load. Ideally, before changing graphics cards, you should uninstall the current graphics driver, swap the hardware, then install the new driver, in that order. But for a quick test that's not what most people do, and others are simply not aware that a thing called "graphics card driver" exists, or what it does. Windows is smart enough not to load any drivers for devices that aren't physically present.
Looks like AMD is doing things differently and just pre-loads Radeon Settings in the background every time your system is booted and a user logs in, no matter whether AMD graphics hardware is installed or not. It would be trivial to add a check "If no AMD hardware found, then exit immediately", but ok. Also, do we really need six entries in Task Scheduler?
I got curious and wondered how it is possible in the first place that a utility like the Radeon Settings control panel uses 100% CPU load constantly. That's something you might expect when a mining virus gets installed to use your electricity to mine cryptocurrency without you knowing. By the way, all this was verified to be happening on the Radeon 20.11.2 WHQL driver, 20.11.3 Beta, and the press driver for an upcoming Radeon review.
Unless you're a computer geek you'll probably want to skip over the following paragraphs, but I still found the details interesting enough to share with you.
I attached my debugger and looked for the thread that's causing all the CPU load. The disassembly is hard to read, but translated into C code it makes more sense.
If you're a programmer you'd have /facepalm'd by now, let me explain. In a multi-threaded program, Events are often used to synchronize concurrently running threads. Events are a core feature of the Windows operating system. Once created, they can be set to "signaled", which will notify every other piece of code that is waiting on this event, instantly and even across process boundaries. In this case the Radeon Settings program will wait for an event called "DVRReadyEvent" to get created before it continues with initialization. This event gets created by a separate, independent driver component that's supposed to get loaded on startup, too, but apparently never does. The Task Scheduler entries in the screenshot above do show "StartDVR". The naming suggests it's related to the ReLive recording feature that lets you capture and stream gameplay. I guess that part of the driver does indeed check if Radeon hardware is present, and will not start otherwise. Since Windows has no WaitForEventToGetCreated() function, the usual approach is to try to open the event until it can be opened, at which point you know that it does exist.
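To make that pattern concrete, here is a minimal sketch of such a polling loop in C. It is not the code recovered from RadeonSettings.exe. The names `probe_fn`, `fake_event`, `fake_probe` and `wait_until_created_busy` are made up for illustration, and the probe is only a stand-in for one attempt to open the event (on Windows, presumably a call such as `OpenEventW` returning a non-NULL handle once the event exists).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "try to open the named event once".
   Returns true once the event exists. */
typedef bool (*probe_fn)(void *ctx);

/* Fake event for demonstration: the probe succeeds on the N-th call.
   ready_after == 0 means "never gets created". */
struct fake_event {
    unsigned long calls;
    unsigned long ready_after;
};

static bool fake_probe(void *ctx)
{
    struct fake_event *ev = ctx;
    ev->calls++;
    return ev->ready_after != 0 && ev->calls >= ev->ready_after;
}

/* The problematic pattern: retry as fast as possible, with no delay
   and no way out other than success. */
static void wait_until_created_busy(probe_fn probe, void *ctx)
{
    assert(probe != NULL);
    while (!probe(ctx)) {
        /* empty body: this spins and keeps one core at 100% */
    }
}
```

With a probe that never succeeds, this function never returns and keeps one core fully busy, which matches what Task Manager showed.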
You're probably asking now, "what if the event never gets created?" Exactly: your program will hang forever, caught in an infinite loop. The correct way to implement this code is to either set a time limit for how long the loop should run, or count the number of attempts and give up after 100, 1000, 1 million, you pick a number. What matters is that there is a reasonable limit.
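A capped version of the loop could look roughly like the sketch below. Again, this is only an illustration under assumptions: `probe_fn`, `fake_event`, `fake_probe` and `wait_until_created_bounded` are hypothetical names, the probe stands in for a single attempt to open the event, and the right limit depends on the application.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for "try to open the named event once". */
typedef bool (*probe_fn)(void *ctx);

/* Fake event for demonstration: the probe succeeds on the N-th call.
   ready_after == 0 means "never gets created". */
struct fake_event {
    unsigned long calls;
    unsigned long ready_after;
};

static bool fake_probe(void *ctx)
{
    struct fake_event *ev = ctx;
    ev->calls++;
    return ev->ready_after != 0 && ev->calls >= ev->ready_after;
}

/* Try at most max_attempts times.
   Returns true if the event showed up, false if we gave up. */
static bool wait_until_created_bounded(probe_fn probe, void *ctx,
                                       unsigned long max_attempts)
{
    assert(probe != NULL);
    for (unsigned long i = 0; i < max_attempts; i++) {
        if (probe(ctx)) {
            return true;
        }
    }
    return false; /* caller can skip the feature or exit cleanly */
}
```

The caller now gets a clear "gave up" result and can decide what to do, for example continue without the recording feature or exit.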
A more subtle effect of this kind of busy waiting is that it will run as fast as the processor can, loading one core to 100%. While that might be desirable if you have to be able to react VERY quickly to something, there's no reason to do that here. The typical approach is to add a short bit of delay inside the loop, which tells the operating system and processor "hey, I'm waiting on something and don't need CPU time, you may run another application now or reduce power". Modern processors will adjust their frequency when lightly loaded, and even power down cores completely, to conserve energy and reduce heat output. Even a delay of one millisecond will make a huge difference here.
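As a rough illustration of the delay idea, the sketch below sleeps between attempts and also gives up after a time budget. All names (`probe_fn`, `fake_event`, `fake_probe`, `sleep_ms`, `wait_until_created_polite`) are hypothetical, the probe stands in for a single attempt to open the event, and the elapsed time is only estimated from the requested sleep intervals rather than measured with a clock.

```c
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Hypothetical stand-in for "try to open the named event once". */
typedef bool (*probe_fn)(void *ctx);

/* Fake event for demonstration: the probe succeeds on the N-th call.
   ready_after == 0 means "never gets created". */
struct fake_event {
    unsigned long calls;
    unsigned long ready_after;
};

static bool fake_probe(void *ctx)
{
    struct fake_event *ev = ctx;
    ev->calls++;
    return ev->ready_after != 0 && ev->calls >= ev->ready_after;
}

/* Give the CPU back to the OS for roughly ms milliseconds. */
static void sleep_ms(unsigned ms)
{
#ifdef _WIN32
    Sleep(ms);
#else
    struct timespec ts;
    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
#endif
}

/* Poll every poll_ms milliseconds and give up once about timeout_ms
   of sleeping has been requested. Returns true if the event showed up. */
static bool wait_until_created_polite(probe_fn probe, void *ctx,
                                      unsigned poll_ms, unsigned timeout_ms)
{
    assert(probe != NULL);
    assert(poll_ms > 0);
    unsigned waited_ms = 0;
    for (;;) {
        if (probe(ctx)) {
            return true;
        }
        if (waited_ms >= timeout_ms) {
            return false;
        }
        sleep_ms(poll_ms); /* the thread is idle here, not spinning */
        waited_ms += poll_ms;
    }
}
```

While the thread sleeps it uses essentially no CPU time, so the core is free for other startup work or can drop into a low-power state.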
This is especially important during system startup, when a lot of things that need processor time are happening at the same time. It's why you feel you're waiting forever for your desktop to become usable when you start the computer. With Radeon Settings taking over one core completely, there's obviously less performance left for other startup programs to complete.
I did some quick and dirty performance testing in actual gameplay on an 8-core/16-thread CPU and found a small FPS loss, especially in CPU-limited scenarios, of around 1%, on the order of 150 FPS vs 151 FPS. This confirms that this can be an issue on modern systems, too, even though only about 6% of CPU capacity is lost (one logical core out of 16). The differences will be minimal though, and it's unlikely you'll subjectively notice them.
Waiting on synchronization signals is a very basic programming skill, and most midterm students would be able to implement it correctly. That's why I'm so surprised to see such low-quality code in a graphics driver component that gets installed on hundreds of millions of computers. Modern software development practices catch these mistakes through code reviews, where one or more colleagues read your source code and point out potential issues. There's also "unit testing", which requires developers to write testing code that's separate from the main code. These unit tests can then be executed automatically, and "code coverage" measures what percentage of the program code is actually exercised by them. Let's just hope AMD fixes this bug; it should be trivial.
If you are affected by this issue, just uninstall the AMD driver from Windows Settings - Apps and Features. If that doesn't work, use DDU (Display Driver Uninstaller). It's not a big deal anyway; what's most important is that you are aware of it, in case your system feels sluggish after a graphics hardware change.
277 Comments on PSA: AMD's Graphics Driver will Eat One CPU Core when No Radeon Installed
We see this with many software suppliers... and it fits well with us doing beta testing for free or even at cost, if you consider early access.
Ridiculous, and I'm not accepting it. Shit code? No buy.
more seriously, hummm not a good thing indeed but nothing to scream at AMD who is doing quite fine this year and compared to Nv i will write it again : on AMD i had never to rollback a driver but once (the famous "ZOMGZOMG TEH NEW DIRVER WIL KEEL JOR R9 290!!!!" although not really a rollback ... as usual with drivers i let 3-5 weeks before updating, so, if any issues pop up, i stay on the previous one) on Nvidia? CTD TDR whatever you name it... always using a 6 month older driver than the current new one ... cool right?
well once the price madness will settle down ... i still think a 5600X (or above) and a RX 6800 (XT or heck even a RX 5700 XT would be an upgrade in my case ) are still in order ...
(for the fanboy calls ... please read my sys spec before blurting ineptness )
I doubt QA at any of the three (Intel, AMD, Nvidia) would bother checking that weird use case, which mostly applies to a small group called reviewers (and no, not to PC enthusiasts, as those tend to reinstall the entire Windows system, and certainly the drivers).
Let's bitch about "AMD drivers" now, shall we, green FUD was not as intense recently, let's make up for it.
Something something, DLSS is better than 4k if you ignore blur and detail loss, something, something but RT in a handful of games is so very important, something. That stuff was always from "way too expensive" territory for me (on top of lack of reverse-engi-ng x86 code skills).
I know people who wrote decompilers of ECMAScript like language by just examining ARM code disassembly... to me it is some unholy magic.
sure, if drivers are beta, they are not ready, they are under testing and being finalized...
also don't waste your time trying to get your 3DMark score onto the HOF table or any table there. can't, the driver must be WHQL certified.
just thinking, why release beta drivers at all? why??
They have excellent hardware engineers, but very poor software developers. Easy. You had a Radeon card (e.g. an old 480) and upgraded the system with an Nvidia GPU (e.g. a GTX 1660 Super).
And so that a driver change can be further tested by a larger demographic. You aren't forced to use betas, and of course AMD releases WHQL drivers too, like wut the fff.
So a few external devs lose work over this hyperbolic tension? Just uninstall the drivers, it's not rocket science.
Not every user is aware of this problem. But AMD developers should be.
This is the kind of thing that average users do in large numbers every year. So yes, this information is useful, worth reading, and hopefully educating some of those many users who will do internet searches for the exact same type problem.
Hell no, most, like 90% of users, couldn't tell you what GPU is in the prebuilt they bought.
Out of the 10% that do, how many are getting tripped up by this issue?! Not many at all, like 0.0001%, probably less. Drammaaa.
AMD need to fix this shit, no doubt, but get a grip: of that 10%, those that do upgrade their GPU typically do so correctly.
Not like this.
I took issues with your hyperbolic statement.
Why do you feel the need to throw hyperbolic statements out?
And wtaf anyway I don't even post in every AMD thread unlike your neg AMD pro Intel self.
This is an amateurish mistake. There is no "hyperbolic statements" pointing this out.
Everyone is able to do that. Most of the new cases are tools free.
But an uninformed user could think "Windows will take care of the software, once I install the right drivers", since Windows has been plug & play for a while now. And in this case Windows really is, but AMD developers messed things up nevertheless...
Your time's yours, use it how you wish.
1) Uninstall drivers/software for removed hardware
AND one that few also realized:
2) Hardware programmers (looking at you AMD), fix your shit so others don't get blamed when average Joe doesn't know WTH (1) is