
Faulty Windows Update from CrowdStrike Hits Banks and Airlines Around the World

Not sure if this has been mentioned since I couldn't bring myself to read all 199 comments. At least for the first couple of pages, the uninformed seem to blame Microsoft for this. As much as Microsoft screws up, this particular issue isn't on them... in any way, shape or form.

The reason this happened is that the CrowdStrike agent is a boot-level driver. That means it gets loaded before pretty much anything else, except when you boot into Safe Mode, where only the absolutely necessary drivers are loaded. You also need Safe Mode to be able to delete the offending file, since in a regular session (if the PC didn't crash) the file would be in use and thus locked.

I must admit, when I read about the fix I couldn't believe my eyes. A file with the .sys extension is usually a driver, which means actual executable code. Anti-malware and HIPS applications usually work with some form of pattern file, but CrowdStrike really does distribute its "signature" updates as executable code. And therein lies the problem. I don't know how many of you know about coding, and pointers in particular, but here goes: CrowdStrike tried to call some code in that update (C-00000291*.sys). The problem was, the file CrowdStrike had pushed contained nothing but zeros. When you try to call or dereference a pointer that is 0 (nullptr), that just won't fly. Usually, to get around a potential nullptr, you check for it before using the pointer; you can also use try/catch statements. Apparently, someone at CrowdStrike didn't think this was necessary. And... BOOOM!
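For anyone who doesn't write C or C++, here's roughly what that kind of guard looks like. This is only a user-mode sketch of the general idea with made-up names (CrowdStrike's actual driver internals aren't public), not their code:

```cpp
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

// Hypothetical record type, purely for illustration.
struct ThreatRecord {
    uint32_t id;
    uint32_t flags;
};

// Stand-in for "parse the freshly downloaded update file".
// Returns nullptr when the file is too short or contains only zeros.
const ThreatRecord* findRecord(const uint8_t* data, size_t size) {
    if (size < sizeof(ThreatRecord)) return nullptr;
    for (size_t i = 0; i < size; ++i)
        if (data[i] != 0)
            return reinterpret_cast<const ThreatRecord*>(data);
    return nullptr;  // all zeros: nothing usable in here
}

void processUpdate(const uint8_t* data, size_t size) {
    const ThreatRecord* rec = findRecord(data, size);
    // The guard that apparently wasn't there: check before dereferencing.
    if (rec == nullptr) {
        std::cerr << "Update file produced no usable record, skipping.\n";
        return;
    }
    std::cout << "Record " << rec->id << ", flags " << rec->flags << '\n';
}

int main() {
    std::vector<uint8_t> bogus(1024, 0);        // simulates the all-zero channel file
    processUpdate(bogus.data(), bogus.size());  // warns instead of crashing
}
```

In kernel mode there's no std::cout and the failure is a bugcheck (blue screen) rather than a polite message, but the principle is the same: if a pointer can be null, test it before you touch it.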

At the company I work for, we also got hit pretty hard by this issue. While our company is actually on the smaller side, the corporation that owns us uses CrowdStrike. A lot of us are tech-savvy, being developers. Still, we weren't able to help ourselves, because these days you're not allowed to have admin permissions on your own workstation. Our consultants are issued laptops which, because they're used both on- and off-site, are BitLocker-encrypted. That's not necessarily a problem, because each consultant has their key. What they don't have is the recovery key, which for some reason is needed when you actually manage to get into Repair Mode. We had to have our sysadmin take a break from his vacation to help get us up and running again. Many systems are still down, because there was only time to bring the most important ones back online.

And yes, this could just as easily have hit *nix and macOS. But the majority of businesses out there use Windows. Like it or not.
 
Is it just me, or do others also think critical IT and societal infrastructure services need to switch from Windows to Linux?

I don’t want this to be the last thing I see before I die.
(attachment 355670)

That should have happened years ago, but Microsoft people are deeply entrenched in organisations nowadays. I'm just glad I don't have to deal with this mess.
 
What they don't have is the recovery key, which for some reason is needed when you actually manage to get into Repair Mode. We had to have our sysadmin take a break from his vacation to help get us up and running again. Many systems are still down, because there was only time to bring the most important ones back online.
This part is insane. I would be PISSED if I'd worked all year and then had my vacation interrupted for some global emergency. Why isn't there a substitute for that guy while he's on vacation? That is probably going to get looked at.
 
Not sure if this has been mentioned since I couldn't bring myself to read all 199 comments
Next time just read the comments.
 
As a result of all of this, I reckon Microsoft should make it a requirement to display the faulting file and its owner more prominently on a blue screen. I saw someone throw a pic up on Twitter where the blue screen had the CrowdStrike logo on it. It's such a simple thing that I can't believe it hasn't been done already. It would also have avoided this getting blamed on Microsoft by all of the uneducated people out there, and the media incorrectly reporting it as a "Faulty Windows Update" or "Microsoft Issue".
 
This part is insane. I would be PISSED if I'd worked all year and then had my vacation interrupted for some global emergency. Why isn't there a substitute for that guy while he's on vacation? That is probably going to get looked at.
He was at home, only a couple of kilometers away. Like I said, ours is a small company (30 people or so), albeit owned by a large corporation. In the olden days some of us had administrative access to our systems (me included), but this is no longer policy. Still, I did what I could to help on Friday.
Next time just read the comments.
Thanks for that insightful comment. Well, I did go back and read (most of) the comments. I did see lots of Microsoft bashing. And the familiar "this wouldn't have happened on Linux" trope. What I didn't see was an actual explanation of what went wrong, so I believe my comment wasn't completely unwarranted.
 
Thanks for that insightful comment. Well, I did go back and read (most of) the comments. I did see lots of Microsoft bashing. And the familiar "this wouldn't have happened on Linux" trope. What I didn't see was an actual explanation of what went wrong, so I believe my comment wasn't completely unwarranted.
I love it when a lazy person comes in after multiple comments, admits they couldn't be bothered reading before commenting, then assumes they know more than everyone else and is going to "inform" us all of the cause. :slap:
Like I said, next time just read the comments.
 

What I get from the video is that CrowdStrike created a device driver that can dynamically load updated modules from a specified directory. This effectively creates an engine that runs untrusted and unapproved code in kernel land. If that doesn't scare the shit out of you, I really don't know what will.

What makes it even scarier is that CrowdStrike did not include any input validation in their code, which is why this whole fiasco happened. They failed to check for the most basic of issues: a file full of null data. OOPS!
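To make that concrete, here's a minimal C++17 sketch of the kind of cheap sanity check a loader could run before handing a content file to the parser. The names and the 4-byte "magic" header are invented, since the real channel-file format isn't public:

```cpp
#include <algorithm>
#include <cstdint>
#include <fstream>
#include <iostream>
#include <iterator>
#include <vector>

// Hypothetical 4-byte header; the real format is not public.
constexpr uint8_t kExpectedMagic[4] = { 'C', 'S', 'C', 'F' };

// Cheap sanity checks: reject empty, truncated, all-zero or header-less files.
bool looksLikeValidChannelFile(const std::vector<uint8_t>& buf) {
    if (buf.size() < 16)
        return false;                                   // truncated / empty
    if (std::all_of(buf.begin(), buf.end(),
                    [](uint8_t b) { return b == 0; }))
        return false;                                   // the "file full of nulls" case
    return std::equal(std::begin(kExpectedMagic), std::end(kExpectedMagic), buf.begin());
}

int main(int argc, char** argv) {
    if (argc < 2) { std::cerr << "usage: checkfile <path>\n"; return 2; }
    std::ifstream in(argv[1], std::ios::binary);
    std::vector<uint8_t> buf((std::istreambuf_iterator<char>(in)),
                              std::istreambuf_iterator<char>());
    if (!looksLikeValidChannelFile(buf)) {
        std::cerr << argv[1] << ": rejected, not a plausible channel file\n";
        return 1;
    }
    std::cout << argv[1] << ": passed basic checks\n";
    return 0;
}
```

Real validation would obviously go much further (length fields, checksums, a signature), but even this much would have thrown out a file full of zeros before it ever reached the parsing code.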

Honestly, I'd be very damned surprised if CrowdStrike survives this whole mess.
 
imagine being a global company and not using staged rollouts of updates
then imagine NOT testing those updates before they went live
Then imagine why your company no longer exists
 
imagine being a global company and not using staged rollouts of updates
then imagine NOT testing those updates before they went live
Then imagine why your company no longer exists
Well, the CEO *did* say he was sorry. That has to count for something...
(this post may contain traces of sarcasm)
 
What I get from the video is that CrowdStrike created a device driver that can dynamically load updated modules from a specified directory. This effectively creates an engine that runs untrusted and unapproved code in kernel land. If that doesn't scare the shit out of you, I really don't know what will.
Yup. This is something way more familiar in USER space. Updating dictionary tools, updating software packages, the Microsoft/Steam/Epic stores deciding something is out of date and it's time to overwrite it with something carrying a newer time stamp... all of those are perfectly normal. If something glitches or bugs out, we find ways around it while waiting for a fix, or end up fixing things ourselves.

That doesn't fly at kernel level, especially when something is installed as a BOOT-level driver. Those can be anything that interfaces with hardware: CPUs, GPUs, even accelerators like PCI-E/SAS storage. There tend to be a lot of them, but I usually reconfigure them to behave a bit differently on my systems before and after the "first" boot.

[Screenshot: registry entries for two boot-start drivers, showing their Start and ErrorControl values]


Usually you'll see something flagged differently in ErrorControl than these two examples. Something like Critical - Log error & fail boot. In that situation when it fails, you'd get CrowdStrike'd hard.
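If you want to poke at those values yourself, they live under HKLM\SYSTEM\CurrentControlSet\Services\<driver>. Here's a small user-mode Win32 C++ sketch that reads Start and ErrorControl for a given service/driver name and prints the documented meanings ("disk" is just a harmless default to query; pick whatever driver you're curious about):

```cpp
#include <windows.h>
#include <cstring>
#include <iostream>
#include <string>
#pragma comment(lib, "advapi32.lib")

// Reads a DWORD value from HKLM\SYSTEM\CurrentControlSet\Services\<svc>.
// Returns false if the key or value doesn't exist.
bool readServiceDword(const std::wstring& svc, const wchar_t* valueName, DWORD& out) {
    std::wstring path = L"SYSTEM\\CurrentControlSet\\Services\\" + svc;
    HKEY hKey = nullptr;
    if (RegOpenKeyExW(HKEY_LOCAL_MACHINE, path.c_str(), 0, KEY_READ, &hKey) != ERROR_SUCCESS)
        return false;
    DWORD type = 0, size = sizeof(out);
    LSTATUS rc = RegQueryValueExW(hKey, valueName, nullptr, &type,
                                  reinterpret_cast<LPBYTE>(&out), &size);
    RegCloseKey(hKey);
    return rc == ERROR_SUCCESS && type == REG_DWORD;
}

int main(int argc, char** argv) {
    std::wstring svc = (argc > 1) ? std::wstring(argv[1], argv[1] + strlen(argv[1]))
                                  : L"disk";
    DWORD value = 0;
    if (readServiceDword(svc, L"Start", value))
        std::wcout << svc << L" Start = " << value
                   << L"  (0=Boot 1=System 2=Automatic 3=Manual 4=Disabled)\n";
    if (readServiceDword(svc, L"ErrorControl", value))
        std::wcout << svc << L" ErrorControl = " << value
                   << L"  (0=Ignore 1=Normal 2=Severe 3=Critical)\n";
}
```

A boot-start driver with ErrorControl set to Critical (3) is the "log error and fail the boot" situation described above.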

......

I was here on day 0 and this thread was lit up with 4 pages by the time I got in. One whole ass page per hour and this isn't even a security forum.
You can bet every single one of those sites went wildly spinning themselves into orbit over this one.
This had so much reach that even the solar observer YouTuber guys had to chime in about it:
"If it was the sun, trust me, I would tell you."


Identify faulty driver(s) located in the one suspicious subdir where all boot critical drivers are located on the system, delete and reboot.
Simple as.
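In code, the "identify" step could look something like this C++17 sketch. It assumes the widely reported pattern (C-00000291*.sys under the CrowdStrike driver directory) and only flags files that are entirely zeros; it prints candidates rather than deleting anything, and in practice you'd do the actual deleting from Safe Mode or the recovery environment anyway, since the file is locked during a normal session:

```cpp
#include <filesystem>
#include <fstream>
#include <iostream>
#include <string>

namespace fs = std::filesystem;

// True if every byte in the file is zero (the reported failure signature).
bool isAllZero(const fs::path& p) {
    std::ifstream in(p, std::ios::binary);
    char c;
    while (in.get(c))
        if (c != 0) return false;
    return true;
}

int main() {
    // Directory and filename pattern as given in the public remediation steps.
    const fs::path dir = R"(C:\Windows\System32\drivers\CrowdStrike)";
    if (!fs::exists(dir)) {
        std::cout << "No CrowdStrike driver directory here, nothing to do.\n";
        return 0;
    }
    for (const auto& entry : fs::directory_iterator(dir)) {
        const std::string name = entry.path().filename().string();
        if (name.rfind("C-00000291", 0) == 0 && entry.path().extension() == ".sys")
            std::cout << entry.path()
                      << (isAllZero(entry.path()) ? "  <- all zeros, candidate for deletion\n"
                                                  : "  <- has content, leave it alone\n");
    }
}
```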

I didn't want to hammer that message home because, one, I'm not a CrowdStrike customer and, like most people here, I can't identify anybody who is a customer of this LITERAL WHO. This isn't a security risk to anyone here, and judging from half the threads it looks like I'm one of maybe three people reading who are fully equipped to deal with such a crisis in the first place. That part on its own is WILD. If deleting the null drivers wasn't enough, you'd have to go thumbing for some CrowdStrike service and hard-delete that too.

Do you... you guys like fishing around in remote-mounted hives for boot-level drivers under CurrentControlSet and taking those risks?
Again, I'm equipped to deal with it and even I don't like doing it. This is exactly how we end up with the kinds of trust issues that lead to developing these emergency skills in the first place.

Anyway you're not going to like this but CrowdStrike will survive this flub and that's pretty much the basis for why the software even exists. What does that mean? I'll get to it. They have enough customers, obviously. Would I want CrowdStrike software running on any of my systems? Maybe if I had some highly targeted (lol no) mission critical (double LOL) VM or baremetal that's susceptible to Day 0 AI driven attacks or some absolutely insane pre-historic malware like Blaster that gets into every networked Windows box faster than a ninja sex party. Unlike those customers with ~8 million bricked machines, I don't subscribe to the kind of philosophy that permits these types of problems to reconstitute. I avoid updates on personal snowflake servers. I don't even like rebooting the server.

The software exists on the idea of rapid response to emerging threats, which is kind of along the lines of antivirus.
The problem started with one squirrely update that didn't ship correctly and people quickly applied it because they trust the vendor like that.
The fix was shipped out just over half an hour later but 8 million boxes rebooted before they could receive it.
Those 8 million boxes went offline and didn't need the protection anymore, which is a fail for production but NOT a fail for security.
It inconvenienced a bunch of IT pros and devops with a surprise recovery key audit to perform a fix because a lot of those systems had BitLocker/encryption and other complications involved.
So what I want to know is how many of those CrowdStrike systems that didn't go down, are still out in the wild and how often do they reboot after updates?
That might be something to check out.
imagine being a global company and not using staged rollouts of updates
then imagine NOT testing those updates before they went live
Then imagine why your company no longer exists
Honestly this right here should be the majority response. It won't happen because those subscribers have a completely separate philosophy and an entire other universe of problems to go with it. It might shake a few of them out of it though. Enough of these guys need to start asking some deep questions like "is this worth it?"
 
According to one YouTube commenter who says he works at a company that runs CrowdStrike, he has staged updates for the systems that run the software. He said he has three stages. The first stage gets updates immediately, but as he said, that stage is reserved for devops or test systems, to make sure a CrowdStrike update doesn't mess anything up. He then went on to say that he has two additional stages: stage two, where updates are pushed out more readily but not to every system, and stage three, which is reserved for mission-critical systems, where CrowdStrike updates are only pushed out after SERIOUS amounts of testing.

That sounds like a good policy. Great. So, if this guy has that kind of staged update setup, how did his company get hit by this whole damn mess? Oh yeah... CrowdStrike delivered the faulty update as one that gets pushed out regardless of which update stage you have a particular system in. It didn't matter if a system was in the stage-three update ring; it got the update too. YIKES!!!
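Just to make the ring idea concrete, staged gating usually boils down to something like this toy sketch (not anyone's actual product): ring 0 gets a release immediately, later rings only after it has soaked for a while without incident. The complaint above is precisely that the bad channel file bypassed this mechanism entirely:

```cpp
#include <iostream>
#include <string>

// Toy model of staged rollout: ring 0 = test/devops, ring 1 = general fleet,
// ring 2 = mission critical. Higher rings require a longer incident-free soak.
struct Host {
    std::string name;
    int ring;               // 0, 1 or 2
};

struct Release {
    int hoursSincePublish;  // how long the release has been out
    bool incidentReported;  // pulled the moment anything goes wrong
};

// Hypothetical policy: ring 0 immediately, ring 1 after 24 h, ring 2 after 72 h.
bool shouldDeploy(const Host& h, const Release& r) {
    if (r.incidentReported) return false;
    const int soakRequired[3] = { 0, 24, 72 };
    return r.hoursSincePublish >= soakRequired[h.ring];
}

int main() {
    Release rel{ 2, false };   // two hours old, no incidents yet
    for (const Host& h : { Host{"build-agent", 0}, Host{"office-laptop", 1},
                           Host{"checkin-kiosk", 2} })
        std::cout << h.name << ": " << (shouldDeploy(h, rel) ? "deploy" : "hold") << '\n';
}
```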
 
Were they still able to pull up OTIS?
 
So now, with this whole CrowdStrike thing, we have a few things that are like a juicy steak for hackers:

1) A list of clients that use CS
2) The methods that the software uses
3) A way to push an infected content file that gets run through the kernel driver

Someone is already selling a COMPLETE list of CS users as well.
 
So now, with this whole CrowdStrike thing, we have a few things that are like a juicy steak for hackers:

1) A list of clients that use CS
2) The methods that the software uses
3) A way to push an infected content file that gets run through the kernel driver

Someone is already selling a COMPLETE list of CS users as well.
What makes you think #2 and #3 were not already known?
 
What makes you think #2 and #3 were not already known?
Number 3 really scares me. I hope to God that CrowdStrike included some kind of digital signature verification to make sure that whatever modules the kernel driver loads are valid modules. Then again, I'm not willing to put any money on it.
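For what it's worth, this is roughly what checking an Authenticode signature on a file looks like from user mode on Windows. Whether CrowdStrike's content files carry any signature, and how (or whether) the driver verifies them, isn't public, and a kernel driver would do this differently, so treat it purely as an illustration of the concept:

```cpp
#include <windows.h>
#include <wintrust.h>
#include <softpub.h>
#include <cstring>
#include <iostream>
#include <string>
#pragma comment(lib, "wintrust.lib")

// Returns true if the file has a valid, trusted Authenticode signature.
bool hasValidAuthenticodeSignature(const wchar_t* path) {
    WINTRUST_FILE_INFO fileInfo = {};
    fileInfo.cbStruct = sizeof(fileInfo);
    fileInfo.pcwszFilePath = path;

    GUID action = WINTRUST_ACTION_GENERIC_VERIFY_V2;
    WINTRUST_DATA wd = {};
    wd.cbStruct = sizeof(wd);
    wd.dwUIChoice = WTD_UI_NONE;
    wd.fdwRevocationChecks = WTD_REVOKE_NONE;
    wd.dwUnionChoice = WTD_CHOICE_FILE;
    wd.pFile = &fileInfo;
    wd.dwStateAction = WTD_STATEACTION_VERIFY;

    LONG status = WinVerifyTrust(nullptr, &action, &wd);

    wd.dwStateAction = WTD_STATEACTION_CLOSE;   // release verifier state
    WinVerifyTrust(nullptr, &action, &wd);

    return status == ERROR_SUCCESS;
}

int main(int argc, char** argv) {
    if (argc < 2) { std::cout << "usage: checksig <file>\n"; return 2; }
    std::wstring path(argv[1], argv[1] + strlen(argv[1]));
    std::wcout << path << (hasValidAuthenticodeSignature(path.c_str())
                           ? L": valid Authenticode signature\n"
                           : L": no valid/trusted signature\n");
}
```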
 
Who in their right mind actually releases an update on a Friday?

I thought that was a definite no-go unless the systems didn't work in the first place.
 
Why does this remind me of squirrels? :laugh:
 
Was this posted already?


Also, fun fact:
An interesting side note pointed out by The Register is that CrowdStrike's current CEO, George Kurtz, was also CTO of McAfee during an infamous 2010 update that caused PCs to be stuck in an endless boot loop. That likely makes him the first executive in history to preside over two major global PC outages caused by bad security software updates.
 
I just watched this YT video of a guy with a bit more knowledge of MS Windows than the average Joe.
Part of his story might as well have been in Chinese, as I had no idea what he was talking about :eek:, but for some of us here it will make sense.

 