Friday, July 14th 2017

Windows 10 Process-Termination Bug Slows Down Mighty 24-Core System to a Crawl

So, you work for Google. Awesome, right? Yeah. You know what else is awesome? Your 24-core, 48-thread Intel build system with 64 GB of RAM and a nice SSD. Life is good, man. So, you've done your code work for the day on Chrome, because that's what you do, remember? (Yeah, that's right, it's awesome.) Before you go off to collect your Google check, you click "compile" and expect a speedy result from your wicked-fast system.

Only you don't get it... Instead, your system comes grinding to a lurching halt, and even mouse movement becomes difficult. Fighting against what appears to be an impending system crash, you hit your trusty CTRL-ALT-DELETE and bring up Task Manager... only to find 50% CPU/RAM utilization. Why, then, was everything stopping?

If your reaction would be to throw up your arms and walk out of the office, that's why you don't work for Google. For Google programmer Bruce Dawson, there was only one logical way to handle this: "So I did what I always do - I grabbed an ETW trace and analyzed it. The result was the discovery of a serious process-destruction performance bug in Windows 10."
This is an excerpt from a long, detailed blog post by Bruce titled "24-core CPU and I can't move my mouse" on his WordPress blog, randomascii. In it, he details a serious bug that is present only in Windows 10 (not other versions): process destruction appears to be serialized.

What does that mean, exactly? It means that when a process "dies" or closes, its teardown must pass through a single lock, so only one process can be destroyed at a time. In this critical part of the OS, which every process must eventually pass through, Windows 10 is effectively single-threaded.

To be fair, this is not an issue a normal end user would encounter. But developers often spawn lots of processes and close them just as often, and they use high-end multi-core CPUs to speed this along. Bruce notes that in his case, his 24-core CPU only made things worse, as it allowed the build to spawn more build processes, and thus even more had to close. And because they all go through the same single-threaded queue, the OS grinds to a halt during this operation, and peak performance is never realized.
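The scenario is easy to approximate with a short sketch (Python here, not the author's actual build tooling; the process count and sleep times are arbitrary): spawn a batch of do-nothing child processes, then terminate them all at once and time the teardown phase, which is where an affected Windows 10 machine hitches.

```python
import subprocess
import sys
import time

N = 50  # arbitrary batch size; the blog post dealt with ~1000 processes

# Spawn N children that just sleep, so they stay alive until we kill them.
children = [
    subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    for _ in range(N)
]

# Kill them all at once and time the teardown phase.
start = time.perf_counter()
for child in children:
    child.terminate()
for child in children:
    child.wait()
elapsed = time.perf_counter() - start
print(f"terminated {N} processes in {elapsed:.2f}s")
```

On an unaffected OS the terminate loop finishes almost instantly; on an affected Windows 10 build, scaling N up is where the serialized teardown starts to bite.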

As for whether this is a big bug if you aren't a developer: Well that's up for debate. Certainly not directly, I'd wager, but as a former user of OS/2 and witness to Microsoft's campaign against it back in the day, I can't help but be reminded of Microsoft FUD surrounding OS/2's SIQ issue that persisted even years after it had been fixed. Does this not feel somewhat like sweet, sweet karma for MS from my perspective? Maybe, but honestly, that doesn't help anyone.

Hopefully a fix will be out soon, and unlike the OS/2 days, the memory of this bug will be short lived.
Source: randomascii WordPress blog

107 Comments on Windows 10 Process-Termination Bug Slows Down Mighty 24-Core System to a Crawl

#26
Athlonite
trparky said: Please don't tell me you like Windows XP. :fear:
HaHa LOL he probably still uses DOS 6.22 and Windows 3.11
#27
springs113
R-T-B said: Maybe I just failed to see the sarcasm from a guy with an Intel CPU. :p

Or I woke up grumpy because my internet is only now back after Comcast had me netless for almost a week. Take your pick. ;)

Yes. It's not even broken in 7.

Ouch, I feel your pain. I have Crapcast too; they have some lazy shards working for them, and it's like playing Russian roulette to find a decent rep that's knowledgeable and coherent at the same time. Charge that one to the house, I guess. Him having that Intel CPU is his counter for not being labeled a fanboy, ya know, ehehe.
#28
RejZoR
That's what happens when your browser shits out a gazillion processes for every little thing it does...
#29
dj-electric
Windows sucks... The only reason I use this shitty OS is to game.
#30
newtekie1
Semi-Retired Folder
The thing that bugs me most about his entire blog post is that he never mentions which version of Windows 10 he's using.

Is he using the Creators Update? Is he still on the original Windows 10 release? Has this already been fixed? We don't know, because he didn't tell us which build of Windows 10 he is on.

And I'd expect a developer to know about the different builds of Windows 10...
#31
OneMoar
There is Always Moar
Anybody running an outdated version of Windows deserves to have their PC soaked in gasoline, lit on fire, and then thrown off a tall building.

This is an edge-case bug that's only an issue when you have 40 fucking cores and need to terminate 30 processes, and I bet you Windows Server doesn't have the issue.
#32
FordGT90Concept
"I go fast!1!11!1!"
The only time there's mass process termination in a server environment is before shutting down.
#33
OneMoar
There is Always Moar
FordGT90Concept said: Only time there's mass process termination in a server environment is before shutting down.
Which is why I said I bet Windows Server doesn't have the issue.

This is just a case of somebody using a desktop OS on a server and then wondering why stuff is broken.
#34
rtwjunkie
PC Gaming Enthusiast
OneMoar said: Anybody running an outdated version of Windows deserves to have their PC soaked in gasoline, lit on fire, and then thrown off a tall building.
I'm gonna say what everyone else who sees that is thinking but is afraid to say:
WTF is WRONG with you?
#35
OneMoar
There is Always Moar
rtwjunkie said: I'm gonna say what everyone else who sees that is thinking but is afraid to say: WTF is WRONG with you?
_EVERYTHING_
But that's neither here nor there. If you need an example of the damage an outdated OS can do: *cough* Windows 7 and WannaCry *cough cough*.
#36
trparky
I wrote a quick and dirty console program in .NET that spawns 1000 child processes that simply sleep and do nothing (an infinite loop), then proceeds to kill said child processes. Getting Windows to respond to just about anything while it was killing the child processes was difficult at best. This is on a fully updated version of Windows 10, Build 15063.483. So yeah, this seems like an issue at the Windows kernel level. :ohwell:
#37
FordGT90Concept
"I go fast!1!11!1!"
Try one process with 1000 threads. You know, like any sane program would do. I bet there isn't a problem, yet the amount of work that can be accomplished is more or less the same.
#38
OneMoar
There is Always Moar
trparky said: I wrote a quick and dirty console program in .NET that spawns 1000 child processes that simply sleep and do nothing (an infinite loop), then proceeds to kill said child processes. Getting Windows to respond to just about anything while it was killing the child processes was difficult at best. This is on a fully updated version of Windows 10, Build 15063.483. So yeah, this seems like an issue at the Windows kernel level. :ohwell:
Try it on Windows Server.
#39
trparky
I don't have a Windows Server machine lying around. If you have one, I can give you the source code for the console app I wrote.
#40
Ferrum Master
trparky said: I wrote a quick and dirty console program in .NET that spawns 1000 child processes that simply sleep and do nothing (an infinite loop), then proceeds to kill said child processes. Getting Windows to respond to just about anything while it was killing the child processes was difficult at best. This is on a fully updated version of Windows 10, Build 15063.483. So yeah, this seems like an issue at the Windows kernel level. :ohwell:
Send it over; I'll test on a recent Insider build.
#41
FordGT90Concept
"I go fast!1!11!1!"
trparky said: I don't have a Windows Server machine lying around. If you have one, I can give you the source code for the console app I wrote.
Hit me with it. I'll see about modifying it to be multithreaded too.

I think what it's going to boil down to is that it's like having 1000 files on a file system versus having 1000 files archived into one library. It isn't abundantly obvious, but those 1000 files on the file system have 4 MiB of overhead and are generally slow to access, because the HDD/SSD has to read from the file system to find out where the data is, then move to the file and read it. If they're all in one library, you're moving a pointer within one single file that the file system already looked up. Access time is much faster because the overhead is far less.

Processes are the same way. A process has to keep a record of all of its memory use. Threads can quickly request and release memory because there isn't much in the way of overhead (e.g. security/NX bit/etc.). Processes need to not only get permission to execute, they have to load all of their modules (10-30 is not uncommon). All of that needs to be unloaded from memory while the process winds down.
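That per-process overhead is easy to see with a rough micro-benchmark (a hypothetical Python sketch, not code from this thread; counts are arbitrary): start and finish N do-nothing threads versus N do-nothing child processes and compare wall time.

```python
import subprocess
import sys
import threading
import time

N = 20  # arbitrary small count

def timed(fn):
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start

def run_threads():
    # Threads share the parent's address space: cheap to start and tear down.
    threads = [threading.Thread(target=lambda: None) for _ in range(N)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

def run_processes():
    # Each child loads a whole interpreter and its modules, then unwinds it all.
    procs = [subprocess.Popen([sys.executable, "-c", "pass"]) for _ in range(N)]
    for p in procs:
        p.wait()

thread_time = timed(run_threads)
process_time = timed(run_processes)
print(f"{N} threads:   {thread_time:.3f}s")
print(f"{N} processes: {process_time:.3f}s")
```

Exact numbers vary wildly by OS and hardware, but the process run should be slower by orders of magnitude, which is the overhead being described above.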
#42
trparky
Here it is, it's quick and dirty.
#44
trparky
I updated the code to include a threaded example where it creates 1000 child threads as part of the main executable process. The system does not suffer the same performance penalty that it endures when killing 1000 child processes. Killing threads is not nearly as taxing on the system as killing whole processes.
#45
FordGT90Concept
"I go fast!1!11!1!"
Here's my own threads program on the same Windows 10 machine above, using Thread.Abort():

Here's the same program, but instead using _Stop = true (peaceful termination):


using System;
using System.Threading;

namespace LotsOThreads
{
    class Program
    {
        private static Thread[] _Threads = new Thread[1000];
        private static volatile bool _Stop = false;

        static void Main(string[] args)
        {
            Console.WriteLine("Start threads...");
            for (int i = 0; i < _Threads.Length; i++)
            {
                _Threads[i] = new Thread(Worker);
                _Threads[i].Start();
            }
            Console.WriteLine("Waiting a second...");
            Thread.Sleep(1000);
            Console.WriteLine("Killing threads...");
            //for (int i = 0; i < _Threads.Length; i++) // painful
            //    _Threads[i].Abort();
            _Stop = true; // peaceful
            Console.WriteLine("Done.");
        }

        private static void Worker()
        {
            while (!_Stop)
                Thread.Sleep(1000);
        }
    }
}


As I said, whoever wrote gomacc.exe is an idiot.

Like one big file is better than lots of small files, one big process with lots of threads is better than lots of processes with one thread each. The latter in both cases creates a lot of hidden overhead.


To recap: left is bad (spam processes), right is good (spam threads)...
#46
trparky
Killing a thread is more "violent" than a graceful exit, which is what we would like to simulate, since killing a process is what causes this issue to begin with.
#47
FordGT90Concept
"I go fast!1!11!1!"
Actually, the best way to simulate thread killing is Process.Kill.

Process.GetCurrentProcess().Kill() = instant close, no hitch
Environment.Exit(0) = instant close, no hitch
_Stop = true = moderate close, no hitch
Thread.Abort() = slow close, no hitch
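For comparison, the "peaceful" _Stop = true pattern above maps to a cooperative-shutdown idiom in most languages. A hypothetical Python sketch (CPython has no Thread.Abort equivalent at all): workers poll a shared threading.Event and exit on their own.

```python
import threading
import time

stop = threading.Event()  # plays the role of the volatile _Stop flag

def worker():
    # Poll the shared flag and exit cooperatively once it is set.
    while not stop.is_set():
        time.sleep(0.05)

threads = [threading.Thread(target=worker) for _ in range(100)]
for t in threads:
    t.start()

time.sleep(0.2)  # let the workers spin briefly
stop.set()       # request a graceful shutdown
for t in threads:
    t.join()
print("all workers exited cleanly")
```

Because every worker unwinds inside the still-living parent process, none of this touches the serialized process-destruction path the article describes.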
#48
OneMoar
There is Always Moar
It has nothing to do with memory anyway; the problem is that termination is serial, not parallel. It has nothing to do with memory management or disk I/O.
from the original post:

Well, what do you know. Process creation is CPU bound, as it should be. Process shutdown, however, is CPU bound at the beginning and the end, but there is a long period in the middle (about a second) where it is serialized – using just one of the eight hyperthreads on the system, as 1,000 processes fight over a single lock inside of NtGdiCloseProcess. This is a serious problem. This period represents a time when programs will hang and mouse movements will hitch – and sometimes this serialized period is several seconds longer.
FordGT90Concept said: Threads can't be killed though, only aborted. Aborted threads are still gracefully unwound by the parent process.
The fuck they can't.
That's exactly the issue: teardown is getting stuck in a lock, as NtGdiCloseProcess can only work through one thread at a time, and it's hanging.

The process needs to cooperate, yes, but you can most certainly force-terminate a thread.
#49
FordGT90Concept
"I go fast!1!11!1!"
OneMoar said: It has nothing to do with memory anyway; the problem is that termination is serial, not parallel. It has nothing to do with memory management or disk I/O. From the original post:

Well, what do you know. Process creation is CPU bound, as it should be. Process shutdown, however, is CPU bound at the beginning and the end, but there is a long period in the middle (about a second) where it is serialized – using just one of the eight hyperthreads on the system, as 1,000 processes fight over a single lock inside of NtGdiCloseProcess. This is a serious problem. This period represents a time when programs will hang and mouse movements will hitch – and sometimes this serialized period is several seconds longer.
It's memory. CPU usage is all over the place; memory is a straight ramp. The hitch/hang occurs when freeing all the memory that was just consumed by 1000 processes.

Hmm, out of curiosity, I'm going to try killing a single process that uses over 10 GiB of memory...

Edit: No perceived hitch:

That is one process consuming over 12 GiB of RAM (probably some page file too). I switched over to the Processes tab, hit End Task, then switched back to Performance and watched it unwind while moving the mouse in a circle. I never saw it stop.
#50
trparky
FordGT90Concept said: Out of curiosity, I'm going to try killing a single process that uses over 10 GiB of memory...
That would be interesting. I would like to see the source code used to do that.