Outage in Azure

Multiple services recovering after power/cooling issue - Australia East

Resolved · Minor
August 30, 2023 - Lasted 1 day


Outage Details

Impact Statement: Starting at approximately 08:30 UTC on 30 August 2023, a utility power surge in the Australia East region tripped a subset of the cooling units offline in one datacenter, within one of the Availability Zones. While we worked to restore cooling, temperatures in the datacenter increased, so we proactively powered down a small subset of selected compute and storage scale units to avoid damage to hardware. Multiple downstream services were impacted, with targeted communications being distributed via Azure Service Health.

Current Status: Storage infrastructure has recovered. A subset of services still experiencing residual impact are on the path to mitigation.

Mitigation: We worked on recovering the failed cooling units and reducing the overall temperature within the impacted area. Once temperature levels were within operational thresholds, we began to restore power to the affected infrastructure and started a phased process to bring this infrastructure back online. Once storage infrastructure was fully restored, dependent compute scale units were also restored to operation. As the underlying compute and storage scale units became healthy, compute and other dependent Azure services recovered. While we have broadly recovered, a small subset of services are still working on post-recovery checks, and we are closely monitoring the datacenter metrics for storage and compute resources to ensure they continue to show as healthy. For any residual customers with services still in the recovery process, we will communicate directly with them through Service Health in the Azure portal, which also triggers Service Health alerts.
