Friday, July 14th 2017

Windows 10 Process-Termination Bug Slows Down Mighty 24-Core System to a Crawl

So, you work for Google. Awesome, right? Yeah. You know what else is awesome? Your 24-core, 48-thread Intel build system with 64 GB of RAM and a nice SSD. Life is good, man. So, you've done your code work for the day on Chrome, because that's what you do, remember? (Yeah, that's right, it's awesome). Before you go off to collect your Google check, you click "compile" and expect a speedy result from your wicked fast system.

Only you don't get it... Instead, your system comes grinding to a lurching halt, and mouse movement becomes difficult. Fighting against what appears to be an impending system crash, you hit your trusty "CTRL-ALT-DELETE" and bring up Task Manager... to find only 50% CPU/RAM utilization. Why, then, was everything stopping?

If you would throw up your arms and walk out of the office, this is why you don't work for Google. For Google programmer Bruce Dawson, there was only one logical way to handle this: "So I did what I always do - I grabbed an ETW trace and analyzed it. The result was the discovery of a serious process-destruction performance bug in Windows 10."
This is an excerpt from a long, detailed blog post by Bruce titled "24-core CPU and I can't move my mouse" on his WordPress blog randomascii. In it, he details a serious new bug that is only present in Windows 10 (not other versions). Process destruction appears to be serialized.

What does that mean, exactly? It means that when a process "dies" or closes, its teardown has to pass through a single serialized path, one process at a time. In this critical part of the OS, which every process must eventually go through, Windows 10 effectively behaves as if it were single-threaded.

To be fair, this is not an issue a typical end user would encounter. But developers often spawn lots of processes and close them just as often, and they use high-end multi-core CPUs to speed this along. Bruce notes that in his case, his 24-core CPU only made things worse: it caused the build to spawn more build processes, and thus even more had to close. Because they all go through the same single-threaded queue, the OS grinds to a halt during this operation, and peak performance is never reached.
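
For readers who want a feel for the workload being described, here is a minimal Python sketch that starts a batch of idle child processes and times how long it takes to destroy them all. This is not Bruce's test program or his ETW-based method; the function name `spawn_and_kill` and the process counts are made up for illustration, and the timings will vary by OS and machine.

```python
import subprocess
import sys
import time


def spawn_and_kill(count):
    """Start `count` idle child processes, then time how long killing them takes."""
    # Each child only sleeps, so any cost measured below is teardown, not work.
    children = [
        subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
        for _ in range(count)
    ]
    start = time.perf_counter()
    for child in children:
        child.kill()  # ask the OS to terminate the process
    for child in children:
        child.wait()  # block until the process has actually gone away
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical batch sizes; raise them with care on a machine you are using.
    for count in (10, 50):
        print(f"{count} processes: {spawn_and_kill(count):.3f} s to destroy")
```

If destruction is serialized, the time should grow roughly in step with the number of processes instead of staying flat as more cores become available.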

As for whether this is a big bug if you aren't a developer: Well that's up for debate. Certainly not directly, I'd wager, but as a former user of OS/2 and witness to Microsoft's campaign against it back in the day, I can't help but be reminded of Microsoft FUD surrounding OS/2's SIQ issue that persisted even years after it had been fixed. Does this not feel somewhat like sweet, sweet karma for MS from my perspective? Maybe, but honestly, that doesn't help anyone.

Hopefully a fix will be out soon, and unlike the OS/2 days, the memory of this bug will be short lived.
Source: randomascii WordPress Blog

107 Comments on Windows 10 Process-Termination Bug Slows Down Mighty 24-Core System to a Crawl

#51
FordGT90Concept
"I go fast!1!11!1!"
I used a program I wrote a year ago to create and attempt to solve number mazes. Grid sizes larger than 9x9 are almost guaranteed to fill 16 GiB of RAM with 8 threads (what I have installed). Every cell it explores, it saves in memory along with all the previous cells it explored. On top of that, it branches everywhere it can, making sure it never doubles back on itself. Memory use is often exponential compared to the grid size.
Posted on Reply
#52
OneMoar
There is Always Moar
It's because when killing one process there is no delay; the issue is when you have 1,000 waiting on the same lock.
Posted on Reply
#53
trparky
FordGT90Concept: I used a program I wrote a year ago to create and attempt to solve number mazes. Grid sizes larger than 9x9 are almost guaranteed to fill 16 GiB of RAM with 8 threads (what I have installed). Every cell it explores, it saves in memory along with all the previous cells it explored. On top of that, it branches everywhere it can, making sure it never doubles back on itself. Memory use is often exponential compared to the grid size.
You could in theory do that by creating several IO.MemoryStreams and loading huge files into them thus keeping the data in RAM.

I have used IO.MemoryStreams to keep data in RAM until the data is ready to be written to disk.
Posted on Reply
#54
OneMoar
There is Always Moar
Think of it this way: you have 1,000 people waiting to jump off a bridge. At the middle of the bridge there is a man with a you-may-jump stamp. He can only stamp one person at a time, and nobody may jump until they get their stamp.

So everything grinds to a halt while one person at a time gets stamped, backing up traffic and such for miles.

Once everybody gets their stamp, they are free to leap to their impending entertainment/doom.
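
The bridge analogy can be sketched as a small Python simulation. This is only a toy model using threads and `time.sleep`, not how Windows actually tears down processes; `exit_all`, the worker count, and the hold time are made-up names and numbers for illustration.

```python
import threading
import time


def exit_all(workers, hold_seconds, shared_lock):
    """Time how long `workers` threads take to finish a simulated teardown step."""
    lock = threading.Lock() if shared_lock else None

    def teardown():
        if lock is not None:
            with lock:  # the man with the stamp: one at a time
                time.sleep(hold_seconds)
        else:
            time.sleep(hold_seconds)  # nobody queues, everyone goes at once

    threads = [threading.Thread(target=teardown) for _ in range(workers)]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    serialized = exit_all(workers=50, hold_seconds=0.01, shared_lock=True)
    parallel = exit_all(workers=50, hold_seconds=0.01, shared_lock=False)
    print(f"shared lock: {serialized:.2f} s, no shared lock: {parallel:.2f} s")
```

With the shared lock, total time grows with the number of workers (about workers times hold time), while without it the total stays close to a single hold time.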
Posted on Reply
#55
FordGT90Concept
"I go fast!1!11!1!"
OneMoar: its because killing one process there is no delay the issue is when you have a 1000 waiting on the same lock
That's the point: no one should ever be trying to kill 1000 processes at once. Microsoft is likely going to tell Google they need to multithread their process instead of spawning lots of processes. It's an all-around better approach.


I think what is likely happening is that there's a process destruction queue. Every time Process.Kill is called, a lock is put in place as the new kill order is enqueued. The queue can't carry out its work (because it keeps getting locked) until all of the kill orders have been enqueued. At which point, the lock clears and the queue is executed, killing all of the processes in less than a second. The issue isn't all that; the issue is that the mouse hangs in the process (probably user32.dll). I think only Microsoft knows why the two are related.


I kind of want to drag out my old Vista laptop and see what happens on there. Problem is, it only has 2 GiB of RAM.
Posted on Reply
#56
OneMoar
There is Always Moar
FordGT90Concept: That's the point: no one should ever be trying to kill 1000 processes at once. Microsoft is likely going to tell Google they need to multithread their process instead of spawning lots of processes. It's an all-around better approach.
good luck with that

This is an edge case anyway, something that doesn't come up in normal operation unless you are A: a really terrible programmer or B: attempting to force the issue.

If you need another example of this: GTA:V. Alt-tab out of it and watch the entire system die.
Posted on Reply
#57
HopelesslyFaithful
FordGT90Concept: That's the point: no one should ever be trying to kill 1000 processes at once. Microsoft is likely going to tell Google they need to multithread their process instead of spawning lots of processes. It's an all-around better approach.


I think what is likely happening is that there's a process destruction queue. Every time Process.Kill is called, a lock is put in place as the new kill order is enqueued. The queue can't carry out its work (because it keeps getting locked) until all of the kill orders have been enqueued. At which point, the lock clears and the queue is executed, killing all of the processes in less than a second. The issue isn't all that; the issue is that the mouse hangs in the process (probably user32.dll). I think only Microsoft knows why the two are related.


I kind of want to drag out my old Vista laptop and see what happens on there. Problem is, it only has 2 GiB of RAM.
lol...so the answer to bad coding/design is...well you shouldnt do that anyways so no reason to fix a bad design since it should normally not be happening anyways.

genius. Since no one should have more than 10 programs open we don't need 64 bit. or multi CPUs...moron.
Posted on Reply
#58
OneMoar
There is Always Moar
HopelesslyFaithful: lol...so the answer to bad coding/design is...well you shouldnt do that anyways so no reason to fix a bad design since it should normally not be happening anyways.

genius. Since no one should have more than 10 programs open we don't need 64 bit. or multi CPUs...moron.
do you have any idea what you are talking about ?
do you even understand what the problem is and why/when it can happen

OR
you just gonna keep spewing windowz is teh suckz all day ?

the only moron here is you
Posted on Reply
#59
FordGT90Concept
"I go fast!1!11!1!"
HopelesslyFaithful: lol...so the answer to bad coding/design is...well you shouldnt do that anyways so no reason to fix a bad design since it should normally not be happening anyways.

genius. Since no one should have more than 10 programs open we don't need 64 bit. or multi CPUs...moron.
This 1000-process test is dangerously close to a forkbomb, which is used to execute denial of service attacks and otherwise shut down a system (any system). Processes, even doing virtually nothing, require a significant amount of memory. Threads, on the other hand, require very little.

Let's say average memory consumption for a simple process in a modern OS is 20 MiB. 1000 processes translates to 20,000 MiB, or roughly 20 GiB. Most systems don't have that much RAM installed. Spawning another thread, on the other hand, takes maybe 0.5 MiB per thread--40 times less. You can accomplish the same amount of work with 500 MiB of RAM using threads versus 20,000 MiB of RAM using processes. It really isn't a choice.
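
The arithmetic above can be checked with a few lines of Python. The 20 MiB and 0.5 MiB figures are the rough per-process and per-thread guesses from this post, not measurements, and the variable names are made up for illustration.

```python
MIB_PER_GIB = 1024

workers = 1000
mib_per_process = 20   # assumed figure from the post, not a measurement
mib_per_thread = 0.5   # assumed figure from the post, not a measurement

process_total_mib = workers * mib_per_process          # memory if every worker is a process
thread_total_mib = workers * mib_per_thread            # memory if every worker is a thread
ratio = process_total_mib / thread_total_mib           # how many times more the processes need
process_total_gib = process_total_mib / MIB_PER_GIB    # same process total, in GiB

print(f"processes: {process_total_mib} MiB ({process_total_gib:.2f} GiB)")
print(f"threads:   {thread_total_mib} MiB")
print(f"ratio:     {ratio:.0f}x")
```

Strictly, 20,000 MiB works out to about 19.5 GiB, so "20 GiB" is a round-up.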

That program I demonstrated, with a little tweaking, could literally make a 1000-core system run at 100%. It won't cause the OS to hang when closing either.
Posted on Reply
#60
OneMoar
There is Always Moar
FordGT90Concept: This 1000-process test is dangerously close to a forkbomb, which is used to execute denial of service attacks and otherwise shut down a system (any system). Processes, even doing virtually nothing, require a significant amount of memory. Threads, on the other hand, require very little.

Let's say average memory consumption for a simple process in a modern OS is 20 MiB. 1000 processes translates to 20,000 MiB or 20 GiB. Most systems don't have that much RAM installed. Spawning another thread, on the other hand, takes maybe 0.5 MiB per thread--40 times less. You can accomplish the same amount of work with 500 MiB of RAM using threads versus 20,000 MiB of RAM using processes. It really isn't a choice.

That program I demonstrated, with a little tweaking, could literally make a 1000-core system run at 100%. It won't cause the OS to hang when closing either.
you are trying to reason with somebody that doesn't have a clue :(
Posted on Reply
#61
zlobby
So much 'telemetry' in google's stuff that even Windoze can't shut it down. Keks! :D
Posted on Reply
#62
dalekdukesboy
R-T-B: Maybe I just failed to see the sarcasm from a guy with an Intel CPU. :p

Or I woke up grumpy because my internet is only now back after comcast had me netless for almost a week. Take your pick. ;)



Yes. It's not even broken in 7.
Hence....why I say FU to windows 10. I'll keep my windows 7 tyvm!
Posted on Reply
#63
sweet
OneMoar: if you need anouther example of this: GTA:V alt-tab it and watch the entire system die
Just because of your weak-ass 4 cores CPU mate. Buy a Ryzen 7 and you can alt-tab all days.

The problem in this thread is different from your GTA example anyway. Also, this Google guy provided too little info. If he really did try to kill 1000 processes with consumer Win 10, the stupidity is his, not the OS. No consumer PC with Win 10 will ever need to kill 1000 processes. It's the job of servers, with a proper server OS to pair with.
Posted on Reply
#64
OneMoar
There is Always Moar
sweet: Just because of your weak-ass 4 cores CPU mate. Buy a Ryzen 7 and you can alt-tab all days.

The problem in this thread is different from your GTA example anyway.
exactly its the same issue relating to how locks function on desktop windows

I would't buy a AMD ryzen cpu if it was the last cpu on earth

and i got news for you it happens on 6 core cpu's as well

my 4670k may be a generation old but it will still run circles around your chip in every gaming benchmark
Posted on Reply
#65
sweet
OneMoar: exactly its the same issue relating to how locks function on desktop windows

I would't buy a AMD ryzen cpu if it was the last cpu on earth

and i got news for you it happens on 6 core cpu's as well

my 4670k may be a generation old but it will still run circles around your chip in every gaming benchmark
So salty lol. Also Ryzen 7 are 8 core so your point is invalid.
Posted on Reply
#66
OneMoar
There is Always Moar
sweet: So salty lol. Also Ryzen 7 are 8 core so your point is invalid.
Nope, still valid. It happens regardless of CPU; GTA:V is one-thread bound anyway.
It can use up to 6 threads (one per logical CPU), but they will only execute as fast as the master thread (cpu0).
Posted on Reply
#67
R-T-B
sweet: No consumer PC with Win 10 will ever need to kill 1000 processes.
Developers do this a lot in builds.

Developers use consumer OSes on occasion too.
Posted on Reply
#68
OneMoar
There is Always Moar
It's less about the number of processes and more about the time needed for it to release the lock, move on to the next process, kill it, release the lock again, and start over.
It's stalling out on releasing the lock, and that's causing something like a 900 ms delay, which is an eternity for a CPU.
Posted on Reply
#69
sweet
OneMoar: nope still vaild happens regardless of cpu GTA:V is one-thread bound anyway
it can use up to 6 threads(one per logical cpu) but they will only execute as fast as the master thread (cpu0)
Lol. We are talking about alt-tabbing, why are you trying to prove GTA:V is one thread?? To be honest, given that you are on a 4-core/4-thread CPU, it's hard to describe the feel of an 8-core/16-thread one. Go grab yourself one and enjoy its smoothness, mate.

And where did you pull the 900ms from lol? You are really lightening my day, mate.
Posted on Reply
#70
OneMoar
There is Always Moar
sweet: Lol. We are talking about alt-tabing, why are you trying to prove GTA:V is one thread?? To be honest given that you are on a 4 core/ 4 thread CPU, it's hard to describe the feel of an 8 core/ 16 thread. Go grab yourself one and enjoy its smoothness, mate.

And where did you pull the 900ms from lol? You are really lightening my day, mate.
think what you want kid
Posted on Reply
#71
sweet
OneMoar: think what you want kid
Oh, so it's name calling time? :)


So much for "running circles in every gaming benchmarks", lol.

Take my advice. Next time if you alt tab and get massive slow-down, please think about changing your CPU. Even the console peasants have 8 cores, why your master race machine has only 4? Think about it and if you have money to spend and don't like AMD, no one would stop you from buying the shiny 4 cores i7, at least it is better than your current 4670k :)
Posted on Reply
#72
jaggerwild
Why is this on "HeadLines"? Look like a bunch of bozos, but hey......
Posted on Reply
#73
Boosnie
I don't get why some people have called BS on the Goma compiler.
You can't hope to run a build process relying on threads only.
When the linker starts reading all the stuff it has to, threads will start to compete heavily with each other, no matter how much effort you put into careful programming.
That said, Goma is a distributed compiler; the supervisor needs processes to keep track of various networked machines compiling at once, not threads.
There are no other ways really.
Posted on Reply
#74
FordGT90Concept
"I go fast!1!11!1!"
sweet: The problem in this thread is different from your GTA example anyway. Also, this Google guy provided too little info. If he really did try to kill 1000 processes with consumer Win 10, the stupidity is his, not the OS. No consumer PC with Win 10 will ever need to kill 1000 processes. It's the job of servers, with a proper server OS to pair with.
Server 2012 R2 does the same.
R-T-B: Developers do this a lot in builds.

Developers use consumer OSes on occasion too.
My guess is he had 48 processes (one per logical core) running that were killed at once. He noticed the 125 ms hitch and investigated by trying 1000. 48 is a lot; 1000 is crazy. Visual Studio only uses 4 or 5 processes usually.
Posted on Reply
#75
FR@NK
Who the hell builds a workstation with one 24 core processor? These processors are meant to be used in multi-socket boards....you would never see just one of these in a system as it would be cheaper to just get 2P 20 core chips.
Posted on Reply