What enterprises say the CrowdStrike outage really teaches (2024)


by Tom Nolle

Opinion

Aug 13, 2024 | 7 mins

Cloud Computing, Network Management Software, Network Security

After the CrowdStrike outage, some enterprise IT teams are rethinking their assumptions about how use of the cloud impacts application reliability.


Early on July 19, just minutes after data security giant CrowdStrike released what was supposed to be a security update, enterprises started losing Windows endpoints, and we ended up with one of the worst and most widespread IT outages of all time. There’s been a lot said about the why and the how. But how much of that reflects what enterprises think about the outage and what they believe they need to do? We’ve been told that enterprises are rethinking their cloud strategy. Is that true, and what are they planning to do?

One thing is clear: Enterprises believe this was CrowdStrike’s problem. Only 21 of the enterprises I contacted thought Microsoft was even a contributor, and none thought Microsoft was primarily to blame.

CrowdStrike made two errors, enterprises say. First, CrowdStrike didn’t account for the sensitivity of its Falcon client software for endpoints to the tabular data that described how to look for security issues. As a result, an update to that data crashed the client by exposing a latent condition that had existed before but hadn’t been properly tested. Second, rather than doing a limited release of the new data file, which would almost certainly have caught the problem and limited its impact, CrowdStrike pushed it out to its entire user base.

All program logic is data-dependent in that the paths through the software are determined by the data it’s processing. You can’t say you’ve tested unless you’ve exercised all these paths. Of 89 enterprise development managers who shared comments with me, all said they had to deal with this in their own testing, and they’d expect a software supplier to be even more careful than an end user. Still, they understand how it could happen. One said they’d heard the software bug had been in the Falcon client for over a year and just hadn’t been hit yet.

Where things get a bit murky is whether the CrowdStrike failure should have caused Windows systems (over eight million of them) to crash and resist remote recovery. All of the 21 enterprises who said they believed Microsoft had contributed to the problem thought Microsoft’s Windows should not have responded to the CrowdStrike error in the way it did. The 37 who didn’t hold Microsoft accountable pointed out that security software necessarily has a unique ability to interact with the Windows kernel software, and this means it can create a major problem if there’s an error.

But while enterprises aren’t convinced that Microsoft contributed to the problem, over three-quarters think Microsoft could contribute to reducing the risk of a recurrence. Nearly as many said that they believed Windows was more prone to the kind of problem CrowdStrike’s bug created. That view was held by 80 of the 89 development managers, many of whom said that Apple’s macOS and Linux didn’t pose the same risk and that neither was impacted by the problem.

Misjudging cloud’s impact on application reliability

But what does this all mean with regard to things like cloud usage?

Enterprises said they are looking at their use of the cloud as a means of improving application reliability. In fact, the number who said they believed they’d misjudged the cloud’s value in that area increased from less than 15% before the CrowdStrike event to 35% immediately after it, and to 55% by early August. The biggest factor in that growth was the realization that massive endpoint faults could take down their operation, and no cloud backup would be effective. Enterprises were forced by the fault to examine just how the cloud impacts application reliability.

Let’s say you have a data center application linked to a Windows PC device. Let’s say that each is likely to be down one percent of the time. You want to improve reliability with the addition of a cloud front-end, and let’s say that it’s also down one percent of the time. What’s your reliability? It depends on whether the cloud and data center are able to back each other up. If they can’t, the chance that all three will be up is 0.99 cubed, or about 97%, which is less than the roughly 98% (0.99 squared) it would have been without the cloud. But if the cloud and data center can back each other up, then both would have to fail to take your application down. The chance of both cloud and data center failing is 1% times 1%, or 0.0001, which is one in ten thousand, and application reliability is improved.
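The arithmetic in that example can be checked with a short sketch. It assumes, as the example implicitly does, that failures are independent of one another; the function names `series` and `parallel` are illustrative labels rather than terms from any enterprise's tooling.

```python
def series(*availabilities: float) -> float:
    """Every component must be up, so availabilities multiply."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result


def parallel(*availabilities: float) -> float:
    """Any one component is enough, so the app fails only if all fail."""
    all_fail = 1.0
    for a in availabilities:
        all_fail *= 1.0 - a
    return 1.0 - all_fail


UP = 0.99  # each component is down one percent of the time

# Windows endpoint + data center, no cloud at all.
baseline = series(UP, UP)

# Endpoint + cloud front-end + data center, none able to back up another.
chained = series(UP, UP, UP)

# Endpoint + (cloud OR data center), where either one can carry the app.
redundant = series(UP, parallel(UP, UP))

print(f"no cloud:              {baseline:.4f}")
print(f"cloud, no backup:      {chained:.4f}")
print(f"cloud and DC redundant: {redundant:.4f}")
```

Under these assumptions the redundant design comes out at roughly 99%, because the endpoint's one-percent downtime now dominates the result.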

The same thing has to be considered in multi-cloud. Of 110 enterprises who commented on the reliability impact of multi-cloud, 108 said it made applications more reliable. Does it? It depends. If two clouds back each other up, the risk of failure is indeed lower, just like in my cloud/data-center example above. But many enterprises admitted that at least some of their applications needed both clouds because components relied on features specific to each cloud. Now they both need to be up, and so multi-cloud actually reduced reliability!
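A rough sketch of the multi-cloud comparison follows. The enterprises gave no availability figures for their clouds, so it borrows the one-percent downtime figure from the cloud/data-center example as an assumption and again treats failures as independent; `either_cloud` and `both_clouds` are hypothetical names.

```python
CLOUD_UP = 0.99  # assumed: each cloud is down one percent of the time


def either_cloud(a: float, b: float) -> float:
    """Clouds back each other up: the app fails only if both fail."""
    return 1.0 - (1.0 - a) * (1.0 - b)


def both_clouds(a: float, b: float) -> float:
    """App relies on features specific to each cloud: both must be up."""
    return a * b


single_cloud = CLOUD_UP
mutual_backup = either_cloud(CLOUD_UP, CLOUD_UP)
split_features = both_clouds(CLOUD_UP, CLOUD_UP)

for label, value in [
    ("single cloud", single_cloud),
    ("two clouds, mutual backup", mutual_backup),
    ("two clouds, both required", split_features),
]:
    print(f"{label:<28}{value:.4%}")
```

With these numbers, mutual backup lifts availability to about 99.99%, while an application that needs both clouds drops to about 98.01%, below what a single cloud would deliver.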

What this proves is that enterprises may be deluding themselves about the cloud and reliability, overall. The cloud isn’t always going to improve reliability any more than it always lowers costs. There’s no substitute for knowing what you’re doing, especially in the area of managing reliability. Instincts are a poor substitute for a tutorial in probability and statistics.

But let’s go back to my cloud reliability calculation. Yes, the chance of both cloud and data center failing is one in ten thousand, but the chance of the endpoint failing in that example is one in a hundred. Endpoint risk is clearly more of a problem, so what can enterprises do about it?

Of the 138 enterprises who commented on the problem, the suggestion made most often was to teach key people at each location how to do a “safe boot” of their systems, because this was all that was really needed to quickly resolve the CrowdStrike problem. The second-place recommendation was “use a browser interface” on the endpoint device rather than an application. In fact, 44 enterprises said they used browser application access and were able to operate normally if they had something other than Windows endpoints to fall back on. Most often the other endpoint choice was a phone or tablet, but some (13) had Mac or Linux desktop systems they could use during the outage. In addition, you can use any number of simple devices to run a browser, like a Chromebook, and simple devices are less likely to fall prey to the sort of problem CrowdStrike had, or even to need specialized endpoint security tools.

So, should you be “rethinking your cloud strategy”? In fact, maybe what’s needed is rethinking the endpoint strategy. The second-place recommendation above could mean that doing more in the cloud would reduce risk, because the real problem here is that sophisticated devices as user on-ramps to applications are harder to fix remotely, and local people lack the skills to do the job themselves. Simplification of endpoints can lead to a multiplicity of available endpoint options as it did for many enterprises, and that would make the kind of failure that CrowdStrike created little more than an inconvenience. Don’t panic; properly used, the cloud is still your friend.

