As the dust settles from the CrowdStrike outages, we reflect on how taking a heterogeneous IT strategy to technology adoption can increase your organization’s resilience against outages and mitigate risks.
None of eG Innovations’ product lines were impacted by the CrowdStrike outages (a reasonable overview of cause and impact in: CrowdStrike outage: We finally know what caused it – and how much it cost | CNN Business) because we simply don’t rebrand other vendors’ driver technology within our products. We operate with a large government and enterprise customer base globally and such a model isn’t appropriate within the regulated environments eG Enterprise is often deployed in.
Some of our customers were unfortunately affected by what some are labelling the “worst IT outage the world has ever seen”, as they use CrowdStrike purchased and supported directly. We sympathize with them and support them as they return to normality. It is small comfort to the millions stranded in airports, but in practice it is currently estimated that less than 1% of Windows endpoints worldwide were affected. Endpoints running Linux, Mac and other OSs were unaffected.
Some organizations 100% reliant on Windows were completely wiped out whilst others could muddle through with their non-Windows systems. Many are now considering whether, as in human resources recruitment, a bit of “diversity” might not be a bad thing!
The Dutch Elm Disease Analogy – with Respect to Heterogeneous IT
Dutch Elm Disease serves as an effective analogy for why it may be prudent to diversify your IT operating system landscape beyond Microsoft Windows. Just as Dutch Elm Disease decimated elm tree populations due to their genetic uniformity, a homogeneous IT environment is similarly vulnerable to widespread disruption. When an entire network relies on a single operating system, a targeted malware or ransomware attack (or in CrowdStrike’s case a likely innocent but flaky QA process) can compromise every connected device, leading to catastrophic failures.
What is Dutch Elm Disease?
Dutch Elm Disease was first identified in the UK in the 1920s and became devastating in the 1970s, killing over 20 million elm trees. Spread by bark beetles and caused by the fungus Ophiostoma novo-ulmi, it led to significant losses in the elm population, transforming the British landscape and prompting extensive tree management efforts.
Diversifying your OS landscape introduces a variety of systems, akin to planting different tree species. This strategy enhances resilience, as not all systems will be susceptible to the same threats. For instance, while a Windows-specific virus may incapacitate systems running Windows, devices operating on Linux or macOS remain unaffected, ensuring continuity of critical operations. Bunging a few oaks amongst those elms may even slow the spread of the disease – making it a little harder for the hackers (or bark beetles) to hop between systems.
Moreover, a diverse IT environment may encourage innovation and offer advantages of “best-in-class” adoption. Different operating systems often excel in different areas — Linux for its security, open-source ecosystem and stability, macOS for its user-friendly interface (oh so beloved of earnest designers and architects), and Windows for its widespread application support. Leveraging these strengths allows organizations to tailor their IT infrastructure to meet varied needs and application workload demands more effectively.
In essence, just as biodiversity safeguards ecosystems against disease and environmental changes, OS diversity fortifies IT infrastructure against cyber threats and operational challenges, ensuring greater security, robustness, and adaptability.
Is Heterogeneous IT the Right Strategy for You?
Whilst a heterogeneous IT strategy offers resilience and flexibility by using diverse operating systems, it inevitably comes with downsides. It requires a wider range of staff skills, increasing training costs and complexity. Maintenance overheads are higher due to varied systems needing unique support. License management can also be a pain and collaboration can be challenging across different platforms.
Personally, I think it depends on the size of an organization, what its critical functions are, and how they are entwined and interdependent. For an organization standardized on Windows-based Outlook for email, a back-up system based on Linux may be a wise move. If email is a critical function, or all your communication mechanisms are based on one OS, this may be one to risk assess.
6 Ways eG Enterprise Helps Empower Resilience via a Heterogeneous IT Strategy
1. Breadth of technologies covered
eG Enterprise was designed as a single-pane-of-glass observability solution: it monitors each technology with domain-aware, technology-specific intelligence, but presents everything through a single, simple-to-use GUI that requires minimal training.
This means that if you want to add a bit of Linux VDI, as part of a heterogeneous IT strategy, alongside your Windows DaaS or VDI you need minimal effort to monitor it and your help desk operators on L1/L2 support will need little or perhaps no extra training to use eG Enterprise on a different genre of OS.
More than 500 different technologies are supported by eG Enterprise. For details of specific technologies, see: End-to-End Monitoring: Applications, Cloud, Containers (eginnovations.com).
2. AIOps automation
The powerful AIOps engine within eG Enterprise provides the intelligence to understand the end-to-end dependencies within IT systems, allowing automated deployment, discovery, root-cause analytics, monitoring, alerting, auto-remediation and more. This makes adopting new technologies to diversify an environment a lot easier by minimizing manual effort.
To learn more about why AIOps observability is now the de facto standard in enterprise IT monitoring, see: AIOps Powered Monitoring | eG Innovations.
3. Transferable licensing
We understand that many of our customers (especially those in regulated government or enterprise sectors) leverage a heterogeneous IT strategy for resilience, one that they tweak and evolve as their businesses change and as the wider risks and threats to their IT systems change.
This is why we offer transferable licensing – allowing customers to move their monitoring capabilities to the technologies they want to use. This significantly lowers the cost of a heterogeneous IT strategy vs using native tools or point solutions. See: eG Enterprise IT Monitoring Licensing – Cost-Effective & Flexible (eginnovations.com) for more information. Transferable licensing can also benefit Circular IT strategies; please see: Circular IT and Zero Waste Strategies with Flexible Licensing (eginnovations.com).
4. Designed for Multi Cloud, Hybrid Cloud, and Digital Transformation
Cloud is more than platform technology – it is a marketplace, and economics often dictate decisions as much as the core technological details. eG Enterprise is built from the ground up for Multi Cloud and Hybrid Cloud scenarios, which are inherently heterogeneous IT strategies in themselves. We assume you’ll probably be mixing technologies and that at some point you will need to replace certain technologies with newer, better options. Read more on our take on some of this:
- The Importance of a Cloud Exit Strategy: What It Is, Who Needs It, and How to Plan It | eG Innovations
- On-premises, Cloud First or Cloud Repatriation – What’s the Trend? Which is Best? | eG Innovations
- What is Supercloud? What to consider when monitoring and observing a Supercloud? | eG Innovations
5. Truly a Global Company – Asia, USA, ANZ, EMEA and More
A few years ago, I had a very interesting conversation with a Cambridge University materials scientist who was researching novel ceramic coatings for use in the aerospace industry. His research was focused on techniques from the former DDR (“East Germany”). It turns out that the low-quality “Brown Coal” in the area led to the evolution of very different techniques to the rest of the world, and the resulting ceramics have somewhat unusual yet potentially desirable characteristics – outside the box compared with what had become mainstream conventional wisdom.
Personally, I believe this is true of IT too – many regions operate in their own silo to a large extent. eG Innovations sales are worldwide and we have a much larger presence in Asia, the Middle East, even Africa than many EMEA & USA only companies. This means the range of technologies we support draws from a wider range of demands and more diverse customer base.
Some interesting reading around this:
- Why Asia dodged the worst of the CrowdStrike meltdown – ABC News – basically, far fewer businesses in the region are CrowdStrike customers.
I certainly find that I always learn something new from our customers in KSA, South Korea, Singapore, Indonesia and so on vs. the EMEA or USA Citrix sales I’ve had more experience with. A fresh perspective and a different range of technologies in use can be healthy for a product manager or solutions architect.
6. We Assume “everything will fail at some point”
Not very long ago, I wrote an article about how to evaluate whether a software vendor was flaky or trustworthy: Should I Trust a SaaS Vendor or Product? | eG Innovations. In it I used the phrase “everything will fail at some point”. It’s one that, as a monitoring vendor, we must keep at the forefront of our minds. How well CrowdStrike supports its customers as they recover from the incident will be very important to its future.
Conclusions on What We Should Learn from CrowdStrike (and Trees) and the Value of Heterogeneous IT Systems
The CrowdStrike outage wasn’t malicious, was reversible (if a bit of a pain to reverse), only hit <1% of Windows endpoints, and was mitigated for many by time zones… but it makes you think: if some of those factors had been different, it could have been a lot worse!
The bark beetles are a malicious force to the English Elm, but they are pretty rubbish at targeting anything else… when the Elms all died, their population plunged not too long after. If I was managing street trees or woodlands, I’d be looking to have more than homogeneous avenues of Elms.
Tree Diversity Strategies
The 10-20-30 rule (Santamour, F., 2002) is a method of ensuring your tree population remains sufficiently diverse:
- No more than 10% should be the same species (e.g. Prunus avium, wild cherry)
- No more than 20% should be the same genus (e.g. Prunus, cherry)
- No more than 30% should be the same family (e.g. Rosaceae, the rose family)
Perhaps the CrowdStrike incident suggests that in some use cases, a similar rule for hypervisors or OSs should be considered, limiting the % of servers or devices exposed to a specific risk or threat?
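To make the idea concrete, here is a minimal Python sketch of how a 10-20-30 style check might be run against a device inventory. Everything in it is an assumption made for illustration: the mapping of species, genus and family onto OS "release", "product" and "family", the field names, the sample inventory, and the reuse of the tree percentages as caps. It is not an eG Enterprise feature or API, and in practice the caps would need tuning to what is realistic for your estate.

```python
from collections import Counter

# Caps borrowed from the 10-20-30 tree rule. Applying them to IT, and the
# three level names below, are illustrative assumptions only.
CAPS = {"release": 0.10, "product": 0.20, "family": 0.30}


def diversity_breaches(devices, caps=CAPS):
    """Return (level, value, share) for every group whose share exceeds its cap."""
    total = len(devices)
    breaches = []
    if total == 0:
        return breaches
    for level, cap in caps.items():
        # Count how many devices share the same value at this level.
        counts = Counter(device[level] for device in devices)
        for value, count in counts.items():
            share = count / total
            if share > cap:
                breaches.append((level, value, share))
    return breaches


# Hypothetical inventory of 10 devices.
inventory = (
    [{"release": "Windows 11", "product": "Windows", "family": "Windows NT"}] * 6
    + [{"release": "Ubuntu 22.04", "product": "Ubuntu", "family": "Linux"}] * 3
    + [{"release": "macOS 14", "product": "macOS", "family": "Darwin"}] * 1
)

for level, value, share in diversity_breaches(inventory):
    print(f"{level}: {value} accounts for {share:.0%} of devices")
```

With this sample inventory, the Windows devices breach all three caps, which is the kind of concentration worth a risk assessment even if a strict 10-20-30 split is not achievable for operating systems.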
And so too, in IT, it is probably wise to adopt a heterogeneous IT strategy. It’s one I’ve long recommended in my articles advising on Cloud Outage resilience. It’s simply a case of identifying where your organization is wholly reliant on a single piece of networking, software or hardware and thinking let’s assume “everything will fail at some point”.
eG Enterprise is an Observability solution for Modern IT. Monitor digital workspaces,
web applications, SaaS services, cloud and containers from a single pane of glass.