More than two weeks after the global CrowdStrike outage, the cybersecurity company has published an in-depth report on exactly what went wrong. The analysis finds that a combination of factors caused the Falcon sensor to crash.
The Falcon sensor, the firm stated in the report, is meant to deliver AI/ML to “protect customer systems by identifying and remediating the advanced threats.” A new feature intended to increase visibility of “possible novel attack techniques that could abuse certain Windows mechanisms” was added to the sensor in February 2024.
The firm noted that this feature was tested through its “standard software development processes,” using a pre-defined set of fields for Rapid Response Content to gather data. But this testing wasn’t thorough enough to catch the issue.
“On March 5, 2024, following a successful stress test, the first Rapid Response Content for Channel File 291 was released to production as part of a content configuration update, with three additional Rapid Response updates deployed between April 8, 2024 and April 24, 2024,” CrowdStrike said. These “performed as expected” in production.
Then, on July 19, an update intended to “evolve the new capability first released in February 2024” was delivered to certain Windows hosts. The sensor expected 20 input fields, but the update provided 21. This mismatch led to an out-of-bounds memory read, which caused the system crash.
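To illustrate the kind of failure described, here is a minimal Python sketch of a field-count check. It is not CrowdStrike's code: the names `SENSOR_INPUT_COUNT`, `FieldCountMismatch`, `check_content` and `read_field` are hypothetical, and only the figures 20 and 21 come from the report as summarized above. The idea is that a count mismatch is rejected up front, and any read past the available inputs raises an error instead of touching memory it should not.

```python
# Illustrative sketch only; all names here are hypothetical, not CrowdStrike's.
SENSOR_INPUT_COUNT = 20  # input fields the sensor expected, per the article


class FieldCountMismatch(Exception):
    """Raised when content and sensor disagree on the number of input fields."""


def check_content(provided_fields, expected_fields=SENSOR_INPUT_COUNT):
    """Reject content whose field count differs from what the sensor expects."""
    if provided_fields != expected_fields:
        raise FieldCountMismatch(
            f"content provides {provided_fields} fields, "
            f"sensor expects {expected_fields}"
        )


def read_field(inputs, index):
    """Return inputs[index], refusing any index outside the available inputs."""
    if not 0 <= index < len(inputs):
        raise FieldCountMismatch(
            f"field index {index} is outside the {len(inputs)} available inputs"
        )
    return inputs[index]


if __name__ == "__main__":
    try:
        check_content(21)  # the July 19 update provided 21 fields
    except FieldCountMismatch as err:
        print("rejected:", err)
```

In a memory-safe language the bad access raises an exception. In kernel-level native code, the same unchecked read can bring down the whole system, which is why a check of this kind belongs before the content is ever used.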
The firm has said that, based on this investigation, it will update its content configuration system test procedures and upgrade tests for template type development with “automated tests for all existing template types.”
It will also be adding deployment layers and acceptance checks for the content configuration system.
Internally, there have been earlier complaints about the automated updates. To address this, CrowdStrike will now offer customers more control over the deployment of Rapid Response Content updates.
The company will also be working with “two independent third-party software security vendors” to further review the Falcon sensor code as well as its quality control and update release processes.
Cybersecurity experts have praised the company’s transparency but underlined the need for it to have robust processes, given the critical industries it caters to.
But the company could still be sued by Delta Air Lines, which is looking to recover its losses from the outage. Last week, CrowdStrike shareholders filed a lawsuit against the company for making “false and misleading” statements about its software testing.