Outage in LMS365

LMS365 is unavailable in the Australia East region

Resolved Major
August 30, 2023 - Lasted about 19 hours

Outage Details

We are currently aware of an issue with accessing LMS365 in the Australia East region. The root cause is a problem with the cooling units in the datacenter.

As per Microsoft:
Storage and Compute - Australia East - Applying Mitigation

Impact Statement: Starting at approximately 08:30 UTC on 30 August 2023, a utility power surge in the Australia East region tripped a subset of the cooling units offline in one of the Availability Zones. While working to restore the cooling units, temperatures in the datacenter increased so we have proactively powered down a small subset of selected compute and storage scale units to avoid damage to hardware and reduce cooling system load. All impacted storage and compute scale units are in the same datacenter, within one of the region’s three Availability Zones (AZs). Multiple downstream services have been identified as impacted.

Current Status: We do not have an exact ETA at this time, but temperatures in the impacted datacenter have been stabilized. The Azure service recovery process has commenced, and services are expected to return progressively over a number of hours. Due to the nature of this issue, our storage scale units are expected to require additional recovery efforts to ensure all resources return in a consistent state. Note that any new allocations for resources will automatically avoid the impacted scale units. If your workloads are protected by Azure Site Recovery or Azure Backup, we recommend either initiating a failover to the recovery region or recovering using Cross Region Restore. Further updates will be provided in an hour or as events warrant.

Components affected
LMS365 Australia
Latest Updates (most recent first)
RESOLVED - 08/31/2023 07:52AM

We can confirm that the issue is now resolved. If you notice any problems going forward, please reach out to us.

MONITORING - 08/30/2023 11:11PM

We’re happy to report that the issues in the Australia East region are now almost fully resolved.

As per Microsoft:

Current Status: With 99% of storage services and 99% of impacted Virtual Machines back online and healthy, we are actively investigating remaining issues with individual downstream services to confirm their recovery status. Our Storage team are making progress on one specific storage scale unit that is still experiencing isolated issues. Our SQL team are investigating a potential issue with an underlying Service Fabric dependency. Our Cosmos DB team are investigating why some services have not fully recovered. Despite these remaining investigations, the majority of customers and services should already be recovered. Further updates will be provided in 60 minutes, or as events warrant.

MONITORING - 08/30/2023 07:12PM

As per Microsoft:

Current Status: Mitigation efforts are continuing. We have made significant progress in restoring core services, and we expect that the vast majority of remaining services should be back online in the next 2-3 hours. After restoring power and stabilizing temperatures, all network infrastructure and 95% of storage services are back online. All premium disk storage has fully recovered, and we continue to work towards mitigating the final remaining storage devices. The majority of underlying compute services are back online, with more than 85% of Virtual Machines that were impacted now back online and healthy. As a result, many customers of these services have already recovered, but we continue to work with downstream impacted services to ensure that they are coming back online in the next 2-3 hours as expected. Further updates will be provided in 60 minutes, or as events warrant.

MONITORING - 08/30/2023 05:40PM

Based on our logs, tenants in the Australia East region can now be accessed. However, Microsoft still states that service recovery is in progress.

We will continue monitoring and will provide updates as soon as we have further information.

Do not hesitate to contact us if you have any issues with loading the system.

IDENTIFIED - 08/30/2023 02:32PM

We are currently aware of an issue with accessing LMS365 in the Australia East region. The root cause is a problem with the cooling units in the datacenter.

As per Microsoft:
Storage and Compute - Australia East - Applying Mitigation

Impact Statement: Starting at approximately 08:30 UTC on 30 August 2023, a utility power surge in the Australia East region tripped a subset of the cooling units offline in one of the Availability Zones. While working to restore the cooling units, temperatures in the datacenter increased so we have proactively powered down a small subset of selected compute and storage scale units to avoid damage to hardware and reduce cooling system load. All impacted storage and compute scale units are in the same datacenter, within one of the region’s three Availability Zones (AZs). Multiple downstream services have been identified as impacted.

Current Status: We do not have an exact ETA at this time, but temperatures in the impacted datacenter have been stabilized. The Azure service recovery process has commenced, and services are expected to return progressively over a number of hours. Due to the nature of this issue, our storage scale units are expected to require additional recovery efforts to ensure all resources return in a consistent state. Note that any new allocations for resources will automatically avoid the impacted scale units. If your workloads are protected by Azure Site Recovery or Azure Backup, we recommend either initiating a failover to the recovery region or recovering using Cross Region Restore. Further updates will be provided in an hour or as events warrant.
