this post was submitted on 19 Jul 2024
1202 points (99.5% liked)

All our servers and company laptops went down at pretty much the same time. Laptops have been bootlooping to blue screen of death. It's all very exciting, personally, as someone not responsible for fixing it.

Apparently caused by a bad CrowdStrike update.

Edit: now being told we (who almost all generally work from home) need to come into the office Monday as they can only apply the fix in-person. We'll see if that changes over the weekend...

[–] [email protected] 14 points 2 months ago (2 children)

More than that: it's an IT security and infrastructure admin issue. How was this third-party software update allowed to go out to so many systems and break them all at once, with no one testing it?

[–] [email protected] 3 points 2 months ago* (last edited 2 months ago)

From what I understand, CrowdStrike doesn't have built-in functionality for that.

One admin was saying that they had to figure out which IPs were the update server vs the rest of the functionality servers, block the update server at the company firewall, and then set up special rules to let the traffic through to batches of their machines.

So... yeah. Lot of work, especially if you're somewhere where the sysadmin and firewall duties are split across teams. Or if you're somewhere that is understaffed and overworked. Spend time putting out fires, or jerry-rigging a custom way to do staggered updates on a piece of software that runs largely as a black box?
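
For anyone wondering what the "batches" part might look like, here's a rough sketch of just the batching logic: a small canary wave first, then progressively larger waves. This is not anything CrowdStrike provides and not what that admin actually ran. The function name `rollout_waves`, the `canary` and `growth` parameters, and the host names are all made up for illustration, and the firewall step is left as a comment because it depends entirely on your own gear.

```python
from __future__ import annotations

from typing import Iterator, Sequence


def rollout_waves(
    hosts: Sequence[str], canary: int = 2, growth: int = 2
) -> Iterator[list[str]]:
    """Yield waves of hosts: a small canary wave, then waves `growth`x larger each time."""
    if canary < 1 or growth < 1:
        raise ValueError("canary and growth must be >= 1")
    start, size = 0, canary
    while start < len(hosts):
        # Slice out the next wave; the last wave may be smaller than `size`.
        yield list(hosts[start:start + size])
        start += size
        size *= growth


if __name__ == "__main__":
    # Hypothetical inventory; a real list would come from your asset database.
    fleet = [f"host-{i:02d}" for i in range(1, 12)]
    for n, wave in enumerate(rollout_waves(fleet), start=1):
        # At this point you would let only these hosts reach the update server,
        # wait, confirm they are still healthy, and only then open the next wave.
        print(f"wave {n}: {wave}")
```

The point of starting small is that a bad update takes out a couple of machines instead of the whole fleet. The hard part in practice is everything the comment stands in for: the firewall rules and the health check.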

Edit: re-read your comment. My bad, I think you meant it was a failure of that on CrowdStrike's end. Yeah, absolutely.

[–] [email protected] 1 points 2 months ago (1 children)

Bingo. I work for a small software company, so I expect shit like this to go out to production every so often and cause trouble for our couple tens of thousands of clients... But I can't fathom how any company with worldwide reach can let it happen...

[–] [email protected] 3 points 2 months ago

That's because CrowdStrike likely has significantly worse leadership compared to your company.

They have a massive business development budget though.