Awareness - Virtual Machines
Resolved
Minor
July 18, 2024 - Lasted 2 days
Outage Details
We are aware of an issue that started on 19 July 2024 at 04:09 UTC, which resulted in customers experiencing unresponsiveness and startup failures on Windows machines running the CrowdStrike Falcon agent, affecting both on-premises environments and various cloud platforms (Azure, AWS, and Google Cloud). CrowdStrike has released a public statement, "Statement on Windows Sensor Update" (crowdstrike.com), addressing the matter, including recommended workaround steps. Further instructions specific to Azure environments are provided below.

Updated: We estimate that impact started as early as 19 July 2024 at 04:09 UTC, when this update began rolling out.

Update as of 07:30 UTC on 20 July 2024:

We have received reports from some customers of successful recovery after attempting multiple restart operations on affected Virtual Machines. Customers can attempt this as follows:
- Using the Azure Portal: attempt 'Restart' on affected VMs
- Using the Azure CLI or Azure Cloud Shell (https://shell.azure.com): https://learn.microsoft.com/en-us/cli/azure/vm?view=azure-cli-latest#az-vm-restart

Customers have reported that several reboots may be required, but overall feedback is that reboots are an effective troubleshooting step at this stage.

Additional options for recovery:

Option 1
We recommend that customers who are able to do so restore from a backup, preferably one taken before 19 July 2024 at 04:09 UTC, when the faulty update began rolling out. Customers leveraging Azure Backup can follow these instructions: How to restore Azure VM data in the Azure portal.

Option 2
Customers can attempt to remove the C-00000291*.sys file on the disk directly, potentially avoiding the need to detach and reattach the disk. Open the Azure CLI and run the following steps:

1. Create a rescue VM:

// Creates a rescue VM the same size as the original VM, in the same region; prompts for a username and password.
// Makes a copy of the OS disk of the problem VM.
// Attaches the OS disk copy as a data disk to the rescue VM.
az vm repair create -g RGNAME -n BROKENVMNAME --verbose

NOTE: For an encrypted VM, run the following command instead:
az vm repair create -g RGNAME -n BROKENVMNAME --unlock-encrypted-vm --verbose

2. Then run:

// Runs the mitigation script on the rescue VM, which fixes the problem on the OS-disk copy attached as a data disk.
az vm repair run -g RGNAME -n BROKENVMNAME --run-id win-crowdstrike-fix-bootloop --run-on-repair --verbose

3. The final step is to run:

// Removes the fixed OS-disk copy from the rescue VM.
// Stops the problem VM (it is not deallocated).
// Attaches the fixed OS disk to the original VM.
// Starts the original VM.
// Prompts to delete the repair VM.
az vm repair restore -g RGNAME -n BROKENVMNAME --verbose

Note: These steps work for both managed and unmanaged disks. If you run into capacity issues, please retry after some time.

Option 3
Customers can attempt repairs on the OS disk by following these instructions: Troubleshoot a Windows VM by attaching the OS disk to a repair VM through the Azure portal. Once the disk is attached, customers can attempt to delete the following file:

Windows/System32/Drivers/CrowdStrike/C-00000291*.sys

The disk can then be detached and re-attached to the original VM.

We can confirm the affected update has been pulled by CrowdStrike. Customers continuing to experience issues should reach out to CrowdStrike for additional assistance. Additionally, we are continuing to investigate further mitigation options for customers and will share more information as it becomes known.
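Since several reboots may be needed before a VM recovers, the restart step can be scripted. The following is a minimal sketch only: the retry count, delay, resource-group name, and VM name are placeholder choices for illustration, not values from the guidance above.

```shell
#!/usr/bin/env bash
# Sketch: retry a command (e.g. 'az vm restart') a few times, since
# multiple reboots have reportedly been needed on affected VMs.

retry() {
  local attempts=$1; shift
  local i
  for ((i = 1; i <= attempts; i++)); do
    if "$@"; then
      return 0  # command succeeded
    fi
    echo "Attempt $i failed" >&2
    if ((i < attempts)); then
      sleep "${RETRY_DELAY:-30}"  # wait before the next attempt
    fi
  done
  return 1  # all attempts failed
}

# Example (placeholder resource group and VM name):
# retry 3 az vm restart --resource-group MyResourceGroup --name MyVm
```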
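The three-step 'az vm repair' flow from Option 2 can also be chained in one script. This is a hedged sketch: the resource-group and VM names are placeholders, and it assumes the 'vm-repair' Azure CLI extension is installed (az extension add --name vm-repair).

```shell
#!/usr/bin/env bash
# Sketch: wrap the create -> run -> restore repair flow in one function.
# Names passed in are placeholders; this mirrors the commands in Option 2.

repair_vm() {
  local rg=$1 vm=$2
  # 1. Create a rescue VM with a copy of the broken OS disk attached as a data disk.
  az vm repair create -g "$rg" -n "$vm" --verbose
  # 2. Run the published mitigation script against the attached disk copy.
  az vm repair run -g "$rg" -n "$vm" --run-id win-crowdstrike-fix-bootloop --run-on-repair --verbose
  # 3. Swap the fixed disk back onto the original VM and start it.
  az vm repair restore -g "$rg" -n "$vm" --verbose
}

# Example (placeholder names):
# repair_vm RGNAME BROKENVMNAME
```

For an encrypted VM, the create step would need the --unlock-encrypted-vm flag, as noted in Option 2.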