143 votes

CrowdStrike code update bricking Windows machines around the world

102 comments

  1. [11]
    sparkle
    (edited )
    Link

    Supposedly pre-market trading already has Crowdstrike down 21%

    I've been up for the past 6 hours now after getting a beautiful 45 minutes of sleep. Been pulled into four major incident calls, written up some quick documents on fixes for various environments, and trying to walk people through how to fix our thousands of servers because I don't want to stay up any longer lol

    Absolutely gobsmacked this made it through any level of QA since it is affecting all flavours of Windows, not just specific combinations. And didn't we learn from the big Meta outage (and literally every other Friday outage) that you don't push shit on a Friday??

    At least I can probably take the rest of the day off...

    Edit: Still coaching support teams on what to do. I did find that somebody has already started a wikipedia page for this :D - https://en.wikipedia.org/w/index.php?title=2024_CrowdStrike_incident&useskin=vector

    72 votes
    1. [5]
      AugustusFerdinand
      Link Parent

      Edit: Still coaching support teams on what to do. I did find that somebody has already started a wikipedia page for this :D - https://en.wikipedia.org/w/index.php?title=2024_CrowdStrike_incident&useskin=vector

      [insert that's a big list meme here]

      911 services of eleven (so far) different states, American, United and Delta flights grounded along with dozens of others across the world plus several major airports grounding all flights, US DOJ, DC and NYC mass transit systems, Oracle, Nokia, broadcasting stations, banks, railways, Singapore's stock exchange, Paris Olympics, entire supermarket chains, hospitals cancelling procedures, Mercedes, McLaren, Aston Martin, and Williams F1 teams, universities, law firms, pharmacies, casinos, train networks, petrol stations, stadiums, fire alarm systems...

      emphasis mine on some of the more interesting/frightening examples

      27 votes
      1. [2]
        TheFireTheft
        Link Parent

        Here's a comment from Hacker News that made me realize how serious this is (via user jmcgough in this post)

Took down our entire emergency department as we were treating a heart attack. 911 down for our state too. Nowhere for people to be diverted to because the other nearby hospitals are down. Hard to imagine how many millions if not billions of dollars this one bad update caused.

        30 votes
        1. redwall_hp
          Link Parent

          The Associated Press coverage mentioned that surgeries are being postponed because anesthesia is off the table without equipment to manage and monitor it. I'm sure radiology equipment is also affected. Never mind charting...

          12 votes
      2. [2]
        sparkle
        Link Parent

        Some lighthearted humour to counter all the countless hours being put in to fix this fuckduggery and the gravity of some of the outages:

        https://www.skysports.com/f1/news/12040/13180880/global-it-outage-impacts-mercedes-f1-teams-pit-wall-screens-at-hungarian-gp

        A certain irony that Crowdstrike is a sponsor for Mercedes F1...

        I may just be a tad tired - going on 12 hours now, but the end miiiiight be in sight

        14 votes
        1. AugustusFerdinand
          Link Parent

          Saw that and found it humorous.
          Gotta have something to balance out some of the horrors that are happening. Hearing from some of my old healthcare worker friends that things like their medication systems are locked out, can't get patient charts to know what anyone should be getting anyway ...oh... and the in-room panic buttons and heart monitors that alert staff when a patient codes (flatlines) are all linked to the same system and don't work. So unless someone is literally within earshot of a patient's room and hears the monitor start to scream...

          11 votes
    2. [5]
      Nivlak
      Link Parent

      An update to a configuration file (here called a channel file) issued at 04:09 UTC on 19 July 2024 conflicted with the Windows sensor client, causing affected machines to enter the blue screen of death with the stop code PAGE_FAULT_IN_NONPAGED_AREA.[10][11][12]

Was this something that was easy to catch? Or was this an advanced error that no one could have seen? I am a complete layman on this topic. I’m just trying to gauge how preventable this was or if there was no chance of seeing it coming.

      5 votes
      1. [4]
        Eji1700
        Link Parent

        It’s so absurdly easy it makes it extra weird it got out.

        The bare minimum catch here is “huh we deployed this on a windows machine and it nuked itself”.

        The best explanation I can think of is one patch got tested and a different one deployed but that should also be trivial to detect and stop.

        It’s why this whole thing is so mind blowing because it MUST involve serious policy, procedure, and culture issues for this to even occur

        25 votes
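(For illustration: the "bare minimum catch" described above is cheap to automate. The sketch below is a hedged example, not CrowdStrike's actual pipeline; the canary hostname, the deploy hook, and the settle time are all hypothetical.)

```python
import subprocess
import time

# Hypothetical disposable Windows VM used as a canary; not a real host.
CANARY_HOST = "canary-win11.example.test"


def canary_is_alive(host: str, timeout_s: int = 5) -> bool:
    """Return True if the canary still answers a single ping.

    Uses Linux-style ping flags (-c count, -W timeout), assuming the test
    runner itself is a Linux box.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        capture_output=True,
    )
    return result.returncode == 0


def smoke_test_update(deploy_fn, host: str = CANARY_HOST, settle_s: int = 120) -> bool:
    """Push a candidate update to one throwaway Windows VM and confirm it survives.

    deploy_fn is a stand-in for whatever ships the candidate content to the
    canary; CrowdStrike's real tooling is not public.
    """
    if not canary_is_alive(host):
        raise RuntimeError("canary was already unhealthy before the update")
    deploy_fn(host)
    time.sleep(settle_s)          # give the sensor time to load the new content
    return canary_is_alive(host)  # a boot-looping, bluescreened box fails this check
```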
        1. [3]
          tonyswu
          Link Parent

Yeah, a simple release verification would’ve caught this. There is someone alleging on Reddit that they have proof that Crowdstrike developers have the ability to build and deploy from their laptops; if that’s true then it’s concerning for sure.

          14 votes
          1. [2]
            Eji1700
            Link Parent

I hope it’s not, because this just shows that if a single one of those laptops was compromised you could deploy kernel-level malware to every Crowdstrike machine.

            15 votes
            1. Minori
              Link Parent

              I don't think that's necessarily hard to believe? It seems possible a developer could override safeties to create and push a release candidate without oversight. Ideally there would be some alarms going off to get someone to review the rush-release. Getting access to the right engineer's laptop has always been a nice attack vector.

              10 votes
  2. [5]
    Eji1700
    Link

    Got a call from the very top at 12:30 AM and have been in triage mode ever since. I'm less useful in this regard and mostly here as extra hands and testing, but jesus what a massive fuckup this is going to be.

    Even with the "fix" being quick this is jacked on so many levels:

    1. Kernel drivers strike again.
    2. Memory UNSAFE language drivers strike again.
    3. "Oops all crashes" pushed to PRODUCTION, possibly ignoring any configurations admins had up to prevent that.
    4. Pushed ON A FUCKING FRIDAY.

    They're HYPER lucky this is an "easy" fix, but even still that was hours upon hours of outage for half the fucking globe. The cost from this is incalculable, and just shoved down every C level's throat what a fucking mess all this single point of failure cloud stuff can be.

    65 votes
    1. [4]
      papasquat
      Link Parent

This doesn't have anything to do with cloud stuff. The outage was caused by locally running code; if Crowdstrike was entirely on prem you'd have the same issue.

This is an issue of auto-updating software with no possible way to test or have oversight of the update process.

      No matter what, your EDR platform is going to be running the same software on all of your agents across the enterprise. Crowdstrike isn't alone in this, but not even having the option of controlling how they're updated is crazy.

      14 votes
      1. [3]
        whs
        Link Parent

If this software had been designed in the pre-cloud era, when the internet could be costly and not always on, the silent update feature would not be a thing and there would not be an issue today.

        8 votes
        1. stu2b50
          Link Parent

          Cloud was about externalizing server based infrastructure to dedicated companies. If anything, if you were entirely cloud based you would not have this issue.

          The internet existed before “the cloud” was popular. If you look at the major impacts of these outages, they must be connected to the internet to work by definition.

          This really has nothing to do with “the cloud” and if anything mostly impacts on prem setups and edge devices - so, the opposite of “the cloud”.

          9 votes
        2. tonyswu
          Link Parent

          I will say if they had any sort of staggering deployment process (which any sizable companies big enough to shard their infrastructure should have) they could have drastically reduced the blast radius of this.

          7 votes
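(The staggered deployment idea above is easy to sketch: push to a small ring first, watch an error signal, and halt before the next ring. The ring sizes, error budget, and the push_to/error_rate callables below are assumptions for illustration, not how CrowdStrike actually ships channel files.)

```python
import time
from typing import Callable, Iterable

ROLLOUT_RINGS: list[float] = [0.001, 0.01, 0.10, 1.00]  # cumulative fraction of the fleet per wave (assumed)
ERROR_BUDGET = 0.002                                     # abort if >0.2% of a ring goes unhealthy (assumed)


def staged_rollout(
    hosts: list[str],
    push_to: Callable[[Iterable[str]], None],
    error_rate: Callable[[Iterable[str]], float],
    soak_s: int = 1800,
) -> bool:
    """Roll an update out ring by ring, halting if any ring blows the error budget."""
    done = 0
    for fraction in ROLLOUT_RINGS:
        target = hosts[done:int(len(hosts) * fraction)]
        if not target:
            continue
        push_to(target)
        time.sleep(soak_s)                 # let telemetry accumulate for this ring
        if error_rate(target) > ERROR_BUDGET:
            return False                   # halt: the remaining rings never receive the update
        done += len(target)
    return True
```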
  3. [9]
    CptBluebear
    Link

    I'm the major incident manager for my corp and have been dealing with this since early this morning.

    It's pretty bad.

The fix is easy but it's a local fix. Meaning that our stores need to have a manual fix applied by the staff themselves. They're being guided by our techies but good lord..

    46 votes
    1. [7]
      l_one
      (edited )
      Link Parent

      The fix is easy but it's a local fix.

      Oh yeah. When you have that kind of problem spread out across... everyone (edit, everyone on Windows) running this software (which to be fair I had not heard of until today but apparently it is widely used) and every machine needs a human physically present to be able to fix? No remote fix?

      Ooooohhhh.... ouch.

      another edit: Well, I'll call this a win for anyone running their machines on a Hypervisor platform. At least they can run a fix remotely. Um... unless the Hypervisor they are using on bare metal is... Windows based??? OOOOHHHH, ouch again.

      17 votes
      1. [6]
        overbyte
        Link Parent

        For servers you can reattach the disk to a new VM, take out the offending file, stick it back into the original VM and boot.

The fun part comes when everyone else is doing this and storage latency across the cloud providers has shot through the roof.

        15 votes
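(The "take out the offending file" step, once the broken VM's disk is attached to a rescue machine, amounted to deleting the channel files matching C-00000291*.sys under Windows\System32\drivers\CrowdStrike, per the widely circulated workaround. A rough sketch follows, assuming the disk is already mounted at a hypothetical /mnt/broken-vm; verify you're on the right volume before deleting anything.)

```python
from pathlib import Path

# Hypothetical mount point of the broken VM's OS disk on the rescue machine.
MOUNT_POINT = Path("/mnt/broken-vm")

# Location and pattern of the bad channel file, per the widely reported workaround.
DRIVER_DIR = MOUNT_POINT / "Windows" / "System32" / "drivers" / "CrowdStrike"
BAD_PATTERN = "C-00000291*.sys"


def remove_bad_channel_files(dry_run: bool = True) -> list[Path]:
    """Delete (or, when dry_run, just list) the offending channel files."""
    hits = sorted(DRIVER_DIR.glob(BAD_PATTERN))
    for path in hits:
        print(("would remove: " if dry_run else "removing: ") + str(path))
        if not dry_run:
            path.unlink()
    return hits


if __name__ == "__main__":
    remove_bad_channel_files(dry_run=True)  # flip to False only after checking the listing
```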
        1. [5]
          Luca
          Link Parent

Unless your disk is bitlocker encrypted, and your keys are backed up to another machine that also had crowdstrike installed...

          16 votes
          1. [4]
            Minori
            Link Parent

Manually entering a 48-digit recovery key is fun.

            8 votes
            1. [3]
              CptBluebear
              Link Parent

              I didn't have to do that but my direct colleagues did. We gave them and a couple of onsite techs emergency access to the vault. That sucked but it was better than being helpless.

              9 votes
              1. [2]
                Omnicrola
                Link Parent

                emergency access to the vault

                I hope you also gave them access to the emergency tequila cabinet as well. My sympathies to all such on-the-ground IT folks today.

                7 votes
                1. CptBluebear
                  Link Parent

                  The US folks got hit the hardest because they actually had to wake at 4. We were relatively fine waking up at normal hours and riding the coattails of Australia having done most of the investigative stuff.

                  Still not an easy day and I'm sure they've had (or are having) a stiff drink to start the weekend.

                  7 votes
    2. jredd23
      Link Parent

      I am just commiserating the pain with you. On a Friday!

      3 votes
  4. patience_limited
    (edited )
    Link

    I'd run out of fingers and toes if I had to count the number of hospital systems I work with that use Crowdstrike. It's lightweight, easy to manage, and a reasonably cost-effective part of mitigation strategies for the ransomware plague.

    A non-trivial number of surgeries, treatments, and patient visits are going to get postponed because of this. As /u/Eji1700 said, I'm baffled at how this update made it from DEV to TST, let alone PRD. The ugliest part is that since Crowdstrike Falcon is a vendor-managed client, the customers don't get to put its updates on a managed patching schedule with a test environment or spaced rollout. Let this be a lesson to us all.

    Update via The Guardian:

    CrowdStrike president George Kurtz said the problem was caused by a “defect found in a single content update for Windows hosts”.

    He wrote on X:

    CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts. Mac and Linux hosts are not impacted. This is not a security incident or cyberattack. The issue has been identified, isolated and a fix has been deployed.

    We refer customers to the support portal for the latest updates and will continue to provide complete and continuous updates on our website. We further recommend organizations ensure they’re communicating with CrowdStrike representatives through official channels. Our team is fully mobilized to ensure the security and stability of CrowdStrike customers.

    That may be one of the most laughable uses of the passive voice I've ever seen.

    38 votes
  5. [2]
    paris
    Link

    Minor unimportant question: I was under the impression that “brick” meant something more permanent than what this seems to be.

    35 votes
    1. Soggy
      Link Parent
      "Soft" bricking didn't use to be the common usage, and I too think it's a bit hyperbolic.

      "Soft" bricking didn't use to be the common usage, and I too think it's a bit hyperbolic.

      26 votes
  6. [4]
    winther
    Link

    Apparently the driver files were not correctly formatted

    If that is true, then it is wild it could make it past any sort of QA.

    29 votes
    1. [3]
      WiseassWolfOfYoitsu
      Link Parent

      Some random dev: "I'm gonna get chewed out if completing my PR is late again, so let's just do a little push to prod to get it checked off. What's the worst that could happen?"

      More seriously - I have to wonder, given the description of the incident, if the faulty file is even human generated. It sounds like it's a content update bundle of some sort. Perhaps it's being generated by the code deployment system and the bug is actually in the file generation/validation/deployment? It's possible the hole is in the test process itself - that the components of the update were being properly individually tested, but that something with the bundling and deployment was missed in the procedures.

      26 votes
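(Along the lines of the "bug in the file generation/validation/deployment" theory: even a cheap pre-ship gate on the generated bundle catches the grossest failures. This is a sketch only; the real channel-file format is proprietary, and the output directory and checks here are assumptions, not CrowdStrike's process.)

```python
import hashlib
from pathlib import Path


def validate_channel_file(path: Path) -> None:
    """Cheap sanity checks a deployment pipeline could run on a generated content file.

    Illustrative only; a real gate would also parse the file with the same code
    the sensor itself uses and refuse to ship anything the parser rejects.
    """
    data = path.read_bytes()
    if not data:
        raise ValueError(f"{path.name}: empty file")
    if data == b"\x00" * len(data):
        raise ValueError(f"{path.name}: file is all null bytes")
    digest = hashlib.sha256(data).hexdigest()[:16]
    print(f"{path.name}: {len(data)} bytes, sha256={digest} ok")


if __name__ == "__main__":
    # Hypothetical build output directory; not a real CrowdStrike path.
    for candidate in Path("build/output").glob("C-*.sys"):
        validate_channel_file(candidate)
```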
      1. Omnicrola
        Link Parent

        Whatever the actual details turn out to be, this is one of those case studies that will be used as an example in every IT and software dev class from now till the end of time.

        29 votes
      2. AugustusFerdinand
        Link Parent

There's a reply in that thread saying the files being sent to the poster are all different, so that's a good theory.

        7 votes
  7. g33kphr33k
    Link

I was in talks with CS not more than a month ago, and opted for Arctic Wolf + Egress to replace DarkTrace and ESET.

    Bullet dodged in this instance.

    It's showing the world just how fragile the interconnected "cloud" is though. One vendor can take down half the planet's first world services with a bad update.

    Echoing everyone else, how this managed to get through testing cycles to global roll out beggars belief.

    25 votes
  8. [13]
    sparksbet
    (edited )
    Link

    Luckily since my company only runs Linux and Mac, we've dodged this bullet ourselves. That said, I feel really bad for everyone who's directly affected by this. German news is reporting that some hospitals are cancelling all elective surgeries today because of this issue.

    EDIT TO ADD:
    My wife works as a data center technician at Microsoft, and weirdly enough she didn't even know this was happening until I told her. I guess she's lower-level than the stuff this affects.

    18 votes
    1. [2]
      overbyte
      Link Parent

      Microsoft would likely dogfood their own software and run Defender on their infrastructure instead of Crowdstrike.

      20 votes
    2. [7]
      Akir
      Link Parent

      Microsoft has made big strides in making Windows a lot more stable than it was when I was young, but it's this kind of thing that still gives me the impression that people who use it for mission-critical applications are completely nuts.

      The main reason I don't use it at home these days is because I feel that Windows Update is currently so intrusive that it has essentially turned it into a managed system, and it requires a degree of trust in Microsoft that I am very far from giving them.

      7 votes
      1. [2]
        redwall_hp
        Link Parent

        I very much appreciate that Apple forbids kernel extensions these days, and locks system folders down with SIP. Security happens at a design level, not by bolting on third-party malware. Third party software should never, ever be allowed to render a machine unbootable or have OS-level privileges.

        I have enough daily rage about corporate "security" software on my development machine, so I'm glad Apple keeps it locked inside userland, at least.

        16 votes
        1. dblohm7
          Link Parent

          Apple can afford to forbid third-party kernel extensions because they control their hardware.

          8 votes
      2. [4]
        dblohm7
        Link Parent

        It's not a Windows problem, it's a CrowdStrike problem.

        4 votes
        1. [3]
          Eji1700
          Link Parent

          Arguably both. As others have pointed out Apple and Linux do not allow the kind of access that made this mess possible.

          4 votes
          1. [2]
            whbboyd
            Link Parent

            Linux absolutely does allow this kind of access. Writing a broken kernel module that consistently panics the kernel when loaded is a rite of passage for budding kernel devs. In this case, however, Linux provides a separate, safer interface (eBPF) which Crowdstrike uses, rather than a full-privilege kernel module.

            (More generally, most Linux systems in the enterprise are not being operated by non-technical end users, so the need for heuristic security software like antivirus is a lot lower; John from sales isn't going to blindly open email attachments on the server.)

            20 votes
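(To make the eBPF point concrete: eBPF programs pass through an in-kernel verifier before they can attach, so a buggy program is rejected up front instead of panicking the machine the way a bad kernel module or driver can. Below is a minimal sketch using the classic BCC kprobe example; it needs root and the bcc Python bindings, just logs execve calls, and has nothing to do with Falcon's actual sensor code.)

```python
from bcc import BPF  # requires the bcc toolkit and root privileges

# A tiny eBPF program: the in-kernel verifier must accept it before it can
# attach, which is what keeps a buggy program from taking the kernel down.
program = r"""
int trace_exec(void *ctx) {
    bpf_trace_printk("execve observed\n");
    return 0;
}
"""

b = BPF(text=program)
b.attach_kprobe(event=b.get_syscall_fnname("execve"), fn_name="trace_exec")
print("Tracing execve... Ctrl-C to stop")
b.trace_print()
```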
            1. tonyswu
              Link Parent

This is true. In fact a similar thing happened to RedHat / Rocky 9.4 not too long ago, the difference being it was RedHat's fault and was fixed with a kernel patch, and it was not as widespread because you had to upgrade to 9.4 pretty early to have run into it.

              9 votes
    3. [3]
      stu2b50
      Link Parent

      It doesn’t really have anything to do with Microsoft so I’m not surprised she wouldn’t know about it.

      5 votes
      1. [2]
        sparksbet
        Link Parent

        When it first broke here people were calling it a Microsoft bug because it only affected Windows machines. It took a hot sec for everyone to realize it was actually Crowdstrike at fault.

        3 votes
        1. Minori
          Link Parent

          Also, Azure Central VMs went down for a few hours right before the Crowdstrike issue showed up. This is where some of the confusion stems from.

          8 votes
  9. crazydave333
    Link

    I work at a hotel and our systems have all been nuked since a half hour after I came in. This was especially difficult because we had to take in a bunch of guests who were stuck in my city because of the airline outage, and then they all get fucked by our outage as well.

    18 votes
  10. [13]
    gil
    Link

    This is insane! I've heard 8 airports are closed because of this so far. Some airlines already stopping operations. I feel sorry for the poor dev that introduced this bug.

    17 votes
    1. [12]
      Eji1700
      Link Parent

      Honestly the dev is going to eat shit but they're the last person who should.

      The fact this EVER got into production is insane, let alone this massively.

      33 votes
      1. [11]
        g33kphr33k
        Link Parent

        Shit rolls down hill though. Dev gonna get reprimanded. CEO will get a bonus for getting out ahead of the issue and talking everyone down, as long as the shares recover.

        13 votes
        1. [10]
          Eji1700
          Link Parent

This is enough shit I'm not sure the company makes it through as is. Granted the CEO probably still gets a golden parachute or some nonsense, but this is really quite special in its cascading and immediate effects.

          13 votes
          1. [5]
            papasquat
            Link Parent

The company will 100% make it through this. Crowdstrike is probably the market leader in EDR and very well respected in that arena. If this was a consistent pattern I might agree with you, but there are absolutely zero widely deployed software companies that have been around for any period of time that haven't had some sort of major outage or incident. SolarWinds is still around, Palo Alto is still around, and Crowdstrike will still be around.

            12 votes
            1. [2]
              Eji1700
              Link Parent

Palo Alto and SolarWinds didn’t take down half the globe in a way that will still be affecting companies for the next week, if not month.

There are a fuck ton of systems that now need to be physically rebooted and oops, that outsourced IT team won’t be doing that.

They might be fine, but this isn’t in the same category in the slightest.

              8 votes
              1. papasquat
                Link Parent

Palo Alto and SolarWinds were arguably worse, because they were cybersecurity incidents, not operational incidents, and both of them could and did result in attackers launching attacks inside private networks to get persistence, move laterally through the network, and exfiltrate confidential data, the effects of which are still being felt. Having to reboot computers is more painful in the short term, but definitely less damaging.

                10 votes
            2. Omnicrola
              Link Parent

Absolutely. It's likely a handful of people will get fired over this, but the company itself will 100% be just fine.

              5 votes
            3. stu2b50
              Link Parent

              Yep, and you can see this in the stock price. Did it go down? Yes. Is it at 0? Not even close. They’re actually still above where they were 6 months ago.

              They’ll lose customers for sure, but they’ll be around.

              3 votes
          2. [4]
            ThrowdoBaggins
            Link Parent

Looking at how drastically the share price dropped when markets opened but then climbed every single minute since, I reckon it will probably be at most a few days until the share price is back where it was, or even higher (because now they’ve had a bunch of free publicity worldwide).

            4 votes
            1. [3]
              Weldawadyathink
              Link Parent

              What stock market data are you looking at? On Yahoo finance, it appears that it has stabilized today on the exact same price it was going for during after hours trading. It definitely hasn’t been climbing constantly all morning.

              3 votes
              1. [2]
                ThrowdoBaggins
                Link Parent

                Yeah you’re absolutely right. I commented about half an hour after the market first opened, but the trend did not continue. It’s actually hilarious in retrospect just how perfectly poorly timed my comment was. Here are some screenshots showing the point at which I made the comment, compared to how it went for the rest of the day.

                4 votes
                1. boxer_dogs_dance
                  Link Parent

                  Lawyer here. Damages from this incident are a big unknown and haven't finished playing out. Just casually reading reddit I have seen claims that hospitals and 911 systems were down and patients died. Someone said that their city water purification system was down. Flights were cancelled and airlines are going to have to provide rescheduled flights at no additional cost. Global shipping and freight were impacted and resetting those schedules is not easy and not free.

Was CrowdStrike negligent? What do the contracts say about risk? CrowdStrike and investors don't yet know the company's costs for legal liability. They also don't know if customers will leave and what sales they will lose going forward.

                  4 votes
  11. [12]
    Luca
    Link

    The accelerationist deep inside me is lowkey smug about this. Maybe, just maybe, this will push for the adoption of more open source software in business critical applications.

    14 votes
    1. [11]
      papasquat
      Link Parent

      Open source wouldn't really have fixed this. There are plenty of examples of massive issues caused by open source packages as well (heartbleed, for example).

      The main cause of this sort of thing having such a wide impact is massive numbers of machines all running the same software. Unfortunately there's really no way around that. It's not like my company is going to write their own EDR platform.

      It's just one of those things that have to be factored in when deploying IT systems. All critical functions need a continuity plan in case absolutely nothing works, because sometimes, nothing works.

      26 votes
      1. [10]
        krellor
        Link Parent

        I haven't used crowdstrike, but don't enterprises have the ability to create patch windows? I used a competitors product, and we could schedule patches to hit dev before prod, which would have caught this.

        3 votes
        1. [4]
          TheJorro
          Link Parent

Yeah, every company I've been with built and deployed its own Windows patches. I believed this was the norm until this morning...

          Insane that CrowdStrike works by automatically and silently updating all computers in an enterprise immediately. That's a huge red flag. There should always be a test batch.

          15 votes
          1. [2]
            papasquat
            Link Parent

It's kind of the norm for this type of security software. Most software patches in an enterprise are handled by a centralized patch management system and are tested on the hardware that enterprise uses before someone manually kicks off a deploy. EDR software usually updates itself automatically with new data because cyber threats emerge very quickly, and a zero day can be used to exploit a lot of machines very, very quickly. There have been countless attacks that could have been prevented by up-to-date EDR software, and up until now, there hasn't really been a widespread issue with that kind of software causing massive outages.

            I imagine that risk analysis may yield a different outcome after taking this incident into account though.

            10 votes
            1. TheJorro
              Link Parent

Right, I assumed there would have been a practice of testing it quickly on a batch of machines with a 1-3 hour window for approval before deploying it out wide. I think everyone has a story where a single typo caused an issue with a release at some point; this kind of thing can happen, so it's good to have a just-in-case test step. It seems like a dangerous practice to simply launch to all machines at once without a preliminary test, which we're seeing the result of here.

              3 votes
          2. krellor
            Link Parent

Yeah. I understand the need to quickly get updates out to security software. But... some environments require every change to be pre-authorized and tested. So it seems like a feature to group your devices, force them to update in sequence with a delay between groups, and halt updates if needed would be a must-have in an enterprise offering.

            Move fast and break things is great when the things aren't people, airplanes, or stock markets.

            5 votes
        2. [2]
          WiseassWolfOfYoitsu
          Link Parent

Falcon is a threat sensor - it uses data bundles similar to an antivirus. The impression I've gotten from descriptions of the problem is that this is less an update with new code and more the equivalent of a bad antivirus threat definition package. These generally aren't something you want to delay. It does call into question Falcon's parsing of the files if it can't handle an error without a BSOD, though.

          4 votes
          1. krellor
            Link Parent

I get that it is similar to updating a fingerprint dictionary. But I've worked in environments where those had to be authorized as well. Anything involving a high-CVE remote exploit was instantly tested in dev and rolled to prod shortly after. If necessary and not already in place, we'd put an F5 in front of customer-facing services or apply other mitigations while we rolled the updates through the process.

            Anything less critical would make it through in 1-3 days.

            4 votes
        3. [3]
          MimicSquid
          Link Parent

          Crowdstrike doesn't allow for that, no.

          1 vote
          1. krellor
            Link Parent

            That seems like a feature with a story to market. 🙂

            3 votes
          2. g33kphr33k
            Link Parent

            Sure as shit they will in the very near future.

            1 vote
  12. arqalite
    Link

    Yeah, the company I work at was crippled by this, we were instructed to turn off all machines until further notice. Thankfully an update seems to have resolved the issue.

    9 votes
  13. [12]
    Jambo
    Link

My laptop was affected but I have the added bonus of somehow having bitlocker enabled on my machine.... None of my accounts show this machine has bitlocker and I've never set it up. I don't really know what I'll do; I guess just wipe the drive and cry.

    7 votes
    1. [3]
      xk3
      Link Parent

      None of my accounts show this machine has bitlocker and I've never set it up

      oof... wtf it's on by default in Windows 11 and they don't tell you?!

      9 votes
      1. Weldawadyathink
        Link Parent

I think it was on by default on most W10 machines too. Maybe not on a retail install, but basically all OEMs have it enabled. Microsoft stores the recovery key in your Microsoft account, which is one reason they push you away from a local account. As long as you used the Microsoft happy path you should be able to recover everything.

        And if you aren’t using a Microsoft account: you have backups right?

        4 votes
      2. Jambo
        Link Parent

        Apparently so... I'm not happy - I'm going to try to get around it with the link from Lapbunny but I think I'm screwed

        1 vote
    2. [6]
      sparksbet
      Link Parent

      Wait, are there people who have CrowdStrike on their laptops? I didn't know they did anything at the individual user level like that.

      5 votes
      1. [2]
        TurtleCracker
        Link Parent

        I believe my entire company has Crowdstrike on every single Windows laptop.

        11 votes
        1. sparksbet
          Link Parent

          Ohhh for like work laptop. I completely forgot about that usecase and assumed they meant their own personal computer.

          8 votes
      2. [2]
        papasquat
        Link Parent

Crowdstrike is an EDR platform. They deploy agents on all endpoints - servers, workstations, kiosks, signage, IoT devices that support it - which is why this thing is such a cluster.

        I drove past a billboard this morning that was BSODed

        8 votes
        1. sparksbet
          Link Parent

          oh yeah I know that Crowdstrike is on tons of machines, but I forgot that people have Windows laptops for work that could be affected. All the other systems that failed are the sorts of things that I expected, I was just confused why someone would have it on their personal laptop.

          Incidentally, my work uses Crowdstrike, though only on employee laptops, not on any servers. We just lucked out this time because we only use Macbooks and Linux.

          1 vote
      3. Jambo
        Link Parent

        Our entire company has crowdstrike (falcon sensor) on our workstations

        2 votes
  14. [4]
    ogre
    (edited )
    Link

    My company has several thousand employees working remote and maybe a thousand within driving distance of the corporate headquarters. I was on the phone with many different IT agents yesterday trying to get my bitlocker key to no avail. I’m sure this is a common story for many companies right now.

    The real kicker for me and my entire team is we’re contractors paid hourly. I don’t think any of us will be paid for the days we can’t work due to this outage (edit: we are being paid). I’m sure many hourly workers across the world are suffering the same fate. This won’t ruin me, but anyone living paycheck to paycheck that misses several days of pay because of this deserves some form of restitution.

    7 votes
    1. [3]
      chocobean
      Link Parent

I was under the impression that hourly and daily contractors remaining idle due to factors beyond their control are paid for days sitting idle??? How is it fair to be on standby and not making money? Yikes 😬

      7 votes
      1. papasquat
        Link Parent

        If you can't work they just won't schedule you. It obviously depends on your contract, but that's kind of the nature of the beast with contracting work. You don't work, you don't get paid.

        4 votes
      2. ogre
        Link Parent

        You’re correct, I let my anxiety get the best of me without being certain. We have been compensated for yesterday and fortunately IT has been working around the clock to get laptops fixed via phone calls.

        4 votes
  15. JCPhoenix
    Link

    Luckily, my org doesn't use Crowdstrike. Because I'm on vacation and outta town. Small org, so I'm the only guy who does any IT.

    Feel bad for the IT folks who are affected, though. Good luck everyone.

    6 votes
  16. streblo
    Link

    So glad my company doesn't have any Windows boxes...

    4 votes
  17. knocklessmonster
    (edited )
    Link

    It's a doozy. I'm currently repairing Azure VMs for a client that fell over, and most of my company is bluescreening with 40 minute waits for the service desk.

    3 votes
  18. [3]
    thecardguy
    Link

    I'm feeling dumb because I don't get this.

    Or rather, I understand the basic idea- a piece of software used by TONS of businesses like major airlines got a bug/glitch, and is shutting everything down.

But... I guess the question is, who's actually using this software - it's called Crowdstrike and runs on Microsoft, right? That means Windows OS, correct? Because I have a different brand of laptop, but it still runs Windows - 11, I believe - and I haven't had any issues in the past 24 hours. So is it just the big businesses that are running the software? I'll be wary of any update requests, certainly... But as the Internet freaks out, I'm not seeing any of it (so far).

    3 votes
    1. Grumble4681
      (edited )
      Link Parent

      It's used by IT professionals to monitor/manage security of computers under their control, whether it be IT employees of companies/corporations or MSPs (Managed Service Providers) that manage the IT for other companies.

      It's software that is meant for deployment at scale and managing security of devices at scale, so it's not like an anti-virus software you would download for your PC as an individual user like many people did back in the day with McAfee or Norton etc. This is why it has impacted so many computers, but not your individual computer because it's not something that a regular PC user would install.

      21 votes
    2. gravitas
      Link Parent

      CrowdStrike is software that businesses install on their computers (Windows, Mac, and Linux) to monitor and prevent malware (in short). If you don’t have it installed, you’re in the clear. It’s not a Windows component—although only Windows computers are affected by this bad update.

      12 votes
  19. [4]
    RustyRedRobot
    Link

What I find interesting is how this could have been done maliciously, i.e. not a hack as such, but a bad actor gaining access to a release and deliberately sabotaging it to bring systems down, rather than an accident.

    Quick question to Linux experts, could this happen there or is this intrinsically linked to the lack of true separation on Windows?

    Also could it happen to serverless applications on the cloud?

    3 votes
    1. [3]
      Minori
      Link Parent
      See the discussion here: https://tildes.net/~tech/1hp7/crowdstrike_code_update_bricking_windows_machines_around_the_world#comment-d8ed
      1. [2]
        RustyRedRobot
        Link Parent

Thank you. It seems more like a kernel issue could break Linux like this, rather than third parties?

        1 vote
        1. winther
          Link Parent

Certainly possible for a broken update to cause a similar issue on Linux servers, but I have a hard time seeing it could ever reach the scale of this. Linux servers are more diversified, with many different distributions and versions. Updates are usually run with some level of oversight by administrators. The reason Crowdstrike could break so many systems so fast is because it can update itself automatically, so a single global update broke everything all at once. I am not aware of any Linux systems that would run kernel-level updates on their own without any oversight. So even if a major Linux distribution released a broken update, it would likely be discovered pretty quickly before it was rolled out to thousands of servers worldwide at the same time.

          4 votes
  20. fraughtGYRE
    Link

    I'm just going to drop an "I was here" comment for watching the dress rehearsal to a complete collapse of the Internet, wherever that may yet come from... (let me know if you want your name added XD)

    3 votes