Outage in Azure Databricks

ES-1540594

Resolved Major
August 01, 2025 - Started 3 days ago - Lasted about 14 hours

Outage Details

We are actively investigating an issue with the Databricks service.

Further updates will be provided within the next hour, or as events warrant.
Latest Updates (sorted oldest to newest)
3 days ago - at 08/01/2025 03:42PM

We are actively investigating an issue with the Databricks service.

Further updates will be provided within the next hour, or as events warrant.

3 days ago - at 08/01/2025 04:31PM

Starting at approximately 03:00 UTC August 01, 2025, customers may experience failures when attempting to launch all-purpose and jobs compute resources which connect to the Metastores.

The issue has been identified. We are continuing to work with the cloud provider on potential mitigation paths. Further updates will be provided in 1 hour, or as events warrant.

2 days ago - at 08/01/2025 06:02PM

Starting at approximately 03:00 UTC on August 01, 2025, customers may be experiencing ‘METASTORE_DOWN’ errors, resulting in service unavailability for Databricks resources.

The issue has been identified. We are actively investigating a potential networking issue and exploring possible mitigation options. An update will be provided within 60 minutes, or as events warrant.

2 days ago - at 08/01/2025 07:16PM

Starting at approximately 03:00 UTC on August 1, 2025, customers may be experiencing ‘METASTORE_DOWN’ errors, resulting in service unavailability for Databricks resources.

The issue has been identified. We are actively investigating a potential networking issue and exploring possible mitigation options. Preliminary findings indicate this may be related to a recent infrastructure change affecting a subset of workspaces.
An update will be provided within 60 minutes, or as events warrant.

2 days ago - at 08/01/2025 08:36PM

Starting at 03:00 UTC on 1 August 2025, a service issue began affecting the Azure Databricks service in the US Gov Virginia region. Customers may encounter failures when attempting to launch all-purpose and jobs compute resources that connect to the Metastores. These failures may manifest as ‘METASTORE_DOWN’ errors, resulting in service unavailability for Databricks resources in this region.

We have determined that the current issue stems from a recent platform update, which created a networking configuration gap and is currently preventing Databricks services from reaching essential backend components. Although all underlying systems remain operational, this connectivity problem is affecting service availability in certain customer environments.
We have validated that applying the configuration settings updates across a limited subset of affected environments restores service. We are now testing the fix across a larger set of workspaces. After reviewing the test results, we will determine the safest method of pushing the fix to all necessary environments.
We will provide another update within 2 hours, or as events warrant.

2 days ago - at 08/01/2025 10:58PM

Starting at 03:00 UTC on 1 August 2025, a service issue began affecting the Azure Databricks service in the US Gov Virginia region. Customers may encounter failures when attempting to launch all-purpose and jobs compute resources that connect to the Metastores. These failures may manifest as 'METASTORE_DOWN' errors, resulting in service unavailability for Databricks resources in this region.

We validated that retrying the configuration settings updates across a limited subset of affected environments has restored service availability. Now, we are retrying the update across all necessary environments in the region. There is no ETA for this workstream at this time, but we are closely monitoring the retry actions and will provide another update within 2 hours, or sooner.

We’ve determined that the issue is related to a recent platform update that introduced a networking configuration gap, preventing Databricks services from reaching essential backend components. While the underlying systems remain healthy, this connectivity issue is impacting service availability in some customer environments.

2 days ago - at 08/02/2025 12:16AM

Starting at 03:00 UTC on 1 August 2025, a service issue began affecting the Azure Databricks service in the US Gov Virginia region. Customers may encounter failures when attempting to launch all-purpose and jobs compute resources that connect to the Metastores. These failures may manifest as 'METASTORE_DOWN' errors, resulting in service unavailability for Databricks resources in this region. Customers may be seeing signs of recovery at this time.

We completed retrying the configuration updates across a broader set of customer environments. We are validating the success of retrying the update. While system logs indicate that some updates may have been skipped, spot checking has confirmed that the required network settings appear to be in place for many workspaces. We are continuing to verify connectivity and functionality across previously impacted environments. This includes assessing current workspace connectivity and comparing connectivity before and after the platform change to confirm progress. We are continuing to validate and monitor the situation, and will provide another update within 2 hours, or sooner if new information becomes available.

We’ve determined that the issue is related to a recent platform update that introduced a networking configuration gap, preventing Databricks services from reaching essential backend components. While the underlying systems remain healthy, this connectivity issue is impacting service availability in some customer environments.

2 days ago - at 08/02/2025 02:24AM

Starting at 03:00 UTC on 1 August 2025, a service issue began affecting the Azure Databricks service in the US Gov Virginia region. Customers may encounter failures when attempting to launch all-purpose and jobs compute resources that connect to the Metastores. These failures may manifest as 'METASTORE_DOWN' errors, resulting in service unavailability for Databricks resources in this region. Customers may be seeing signs of recovery at this time.

We completed retrying the configuration updates across a broader set of customer environments.

We are validating the success of retrying the update. While system logs indicate that some updates may have been skipped, spot checking has confirmed that the required network settings appear to be in place for many workspaces.

We are continuing to verify connectivity and functionality across previously impacted environments. This includes assessing current workspace connectivity and comparing connectivity before and after the platform change to confirm progress.

We are actively validating and monitoring the situation, and will provide another update within 4 hours, or sooner if new information becomes available.

We’ve determined that the issue is related to a recent platform update that introduced a networking configuration gap, preventing Databricks services from reaching essential backend components. While the underlying systems remain healthy, this connectivity issue is impacting service availability in some customer environments.

2 days ago - at 08/02/2025 04:35AM

Starting at 03:00 UTC on 1 August 2025, a service issue began affecting the Azure Databricks service in the US Gov regions. Customers may encounter failures when attempting to launch all-purpose and jobs compute resources that connect to the Metastores. These failures may manifest as 'METASTORE_DOWN' errors, resulting in service unavailability for Databricks resources in these regions. Customers may be seeing signs of recovery at this time.

We completed mitigation actions across the impacted resources.
We are continuing to monitor connectivity and functionality across previously impacted resources. This includes assessing current workspace connectivity and comparing connectivity before and after the platform change to confirm progress.

We’ve determined that the issue is related to a recent platform update that introduced a networking configuration gap, preventing Databricks services from reaching essential backend components. While the underlying systems remain healthy, this connectivity issue is impacting service availability in some customer environments.
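The updates above mention comparing workspace connectivity before and after the platform change to confirm progress. As a rough illustration only, that kind of comparison can be sketched as below. This is not Databricks or Azure tooling: the function name, the workspace IDs, and the idea of representing each snapshot as a mapping from workspace ID to a reachability flag are all assumptions made for the example.

```python
def still_impacted(baseline: dict[str, bool], current: dict[str, bool]) -> list[str]:
    """Return workspaces that were reachable in the baseline snapshot
    (taken before the platform change) but are not reachable now.

    Both arguments are hypothetical snapshots mapping a workspace ID to
    True (backend reachable) or False (unreachable). A workspace missing
    from the current snapshot is treated as unreachable.
    """
    return sorted(
        workspace_id
        for workspace_id, was_reachable in baseline.items()
        if was_reachable and not current.get(workspace_id, False)
    )


# Hypothetical example data, not taken from the incident.
baseline = {"ws-a": True, "ws-b": True, "ws-c": False}
current = {"ws-a": True, "ws-b": False, "ws-c": False}

remaining = still_impacted(baseline, current)
print(remaining)
```

Workspaces that were already unreachable in the baseline (`ws-c` here) are left out of the result, since the platform change cannot be blamed for them. An empty list would suggest that connectivity matches the pre-change state for every workspace in the baseline.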
