AWS Databricks Outage History

Every past AWS Databricks outage tracked by IsDown, with detection times, duration, and resolution details.

There were 96 AWS Databricks outages since January 2023. The 39 outages from the last 12 months are summarized below, with incident details, duration, and resolution information.

Major April 22, 2026

April 2026: ES-1873238

Detected Apr 22, 2026 10:17 PM WEST · Resolved Apr 22, 2026 11:25 PM WEST · Duration about 1 hour

AWS Databricks experienced a major service incident affecting Unity Catalog in multiple regions (us-west-2, us-west-1, and us-east-1) starting at 20:47 UTC on April 22, 2026. Customers encountered failures when accessing Unity Catalog resources and errors when querying or managing data governed by Unity Catalog, and workloads that depend on Unity Catalog failed. The incident lasted about 1.1 hours. Databricks was actively investigating the degradation, and the available updates did not include resolution details.
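The detection and resolution times on this page are shown in WEST (UTC+1), while the incident summaries quote UTC. The following is a minimal sketch of how the two relate for this entry, assuming the timestamps are copied by hand from the lines above; the helper name `parse_west` is illustrative and not part of any IsDown or Databricks API.

```python
from datetime import datetime, timedelta, timezone

# Western European Summer Time is UTC+1.
WEST = timezone(timedelta(hours=1), "WEST")


def parse_west(text: str) -> datetime:
    """Parse a timestamp such as 'Apr 22, 2026 10:17 PM' as WEST (hypothetical helper)."""
    return datetime.strptime(text, "%b %d, %Y %I:%M %p").replace(tzinfo=WEST)


detected = parse_west("Apr 22, 2026 10:17 PM")
resolved = parse_west("Apr 22, 2026 11:25 PM")
# Start time quoted in the incident summary, already in UTC.
reported_start = datetime(2026, 4, 22, 20, 47, tzinfo=timezone.utc)

duration_hours = (resolved - detected).total_seconds() / 3600
detection_lag_minutes = (detected - reported_start).total_seconds() / 60

print(detected.astimezone(timezone.utc).strftime("%H:%M UTC"))  # 21:17 UTC
print(round(duration_hours, 1))                                  # 1.1
print(detection_lag_minutes)                                     # 30.0
```

Under these assumptions, the listed duration is measured from detection to resolution, and detection came 30 minutes after the start time reported in the summary.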

Major April 21, 2026

April 2026: ES-1868544

Detected Apr 21, 2026 9:50 PM WEST · Resolved Apr 21, 2026 10:18 PM WEST · Duration 28 minutes

AWS Databricks experienced a major service incident affecting Unity Catalog in the CA-Central-1 region, where requests failed to process starting at 20:10 UTC on April 21, 2026. Users encountered errors and timeouts when accessing Unity Catalog and failures when managing or querying catalog objects, tables, schemas, or permissions. The engineering team identified the root cause and worked on restoration. The incident lasted 28 minutes.

Minor April 20, 2026

April 2026: ES-1864534

Detected Apr 20, 2026 4:02 PM WEST · Resolved Apr 20, 2026 6:38 PM WEST · Duration about 3 hours

AWS Databricks experienced an issue affecting Lakeflow Spark Declarative Pipelines on Serverless Compute in the EU-West-1 region, starting at 12:13 UTC on April 20, 2026, with error rates significantly increasing at 14:40 UTC. Customers experienced pipeline launch failures, timeouts, and increased launch times for new pipeline runs. The engineering team was actively investigating and implementing remediation steps, with the incident lasting approximately 2.6 hours.

Minor April 19, 2026

April 2026: ES-1862841

Detected Apr 19, 2026 11:19 AM WEST · Resolved Apr 19, 2026 4:12 PM WEST · Duration about 5 hours

AWS Databricks experienced a 4.9-hour incident affecting Declarative Automation Bundles for CI/CD deployments, caused by an expired HashiCorp GPG key used for Terraform binary verification. Customers experienced failures in CI/CD pipelines and were unable to perform automated deployments due to GPG checksum verification errors. The issue was resolved by releasing a fix in Databricks CLI version 0.297.2 that incorporates the new public key, with patch versions for older CLI versions being actively developed.
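As a rough illustration, a CI job could compare its CLI version against the release named above before attempting a deployment. This is a sketch only: the version string is assumed to be supplied by the caller in `MAJOR.MINOR.PATCH` form (optionally prefixed with `v`), the function names are hypothetical, and the check does not account for the patched older versions mentioned in the summary.

```python
# Release named in the incident summary as containing the new public key.
FIXED_VERSION = (0, 297, 2)


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '0.297.2' or 'v0.297.2' into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def at_least_fixed_release(cli_version: str) -> bool:
    """Return True if the version is 0.297.2 or newer (hypothetical helper).

    Patched builds of older release lines would return False here even
    though they may carry the fix.
    """
    return parse_version(cli_version) >= FIXED_VERSION


print(at_least_fixed_release("0.297.2"))   # True
print(at_least_fixed_release("v0.296.0"))  # False
```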

Minor April 17, 2026

April 2026: ES-1856982

Detected Apr 17, 2026 7:56 AM WEST · Resolved Apr 17, 2026 8:48 AM WEST · Duration about 1 hour

AWS Databricks experienced a 52-minute service incident in which customers across multiple regions encountered delays or failures when launching clusters and running jobs that used init scripts dependent on Ubuntu package repositories (archive.ubuntu.com and security.ubuntu.com). The issue affected the Compute Service, causing cluster launch failures and job execution problems for workloads relying on these package dependencies. The incident appeared to stabilize as the team monitored for full recovery, and customers were advised to temporarily use alternative mirror repositories as a workaround.
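The mirror workaround amounts to pointing apt source entries at a different host. The sketch below shows the idea on a sample string only; the mirror hostname `mirror.example.internal` is a placeholder, the function name is hypothetical, and a real mirror may use a different path layout than the official repositories.

```python
import re

# Hosts named in the incident summary.
AFFECTED_HOSTS = ("archive.ubuntu.com", "security.ubuntu.com")


def use_mirror(sources_text: str, mirror_host: str) -> str:
    """Replace the affected hosts in apt source entries with mirror_host.

    Only exact host names are replaced, so regional hosts such as
    'us.archive.ubuntu.com' are left untouched.
    """
    for host in AFFECTED_HOSTS:
        pattern = r"(?<![\w.-])" + re.escape(host) + r"(?![\w.-])"
        sources_text = re.sub(pattern, mirror_host, sources_text)
    return sources_text


sample = (
    "deb http://archive.ubuntu.com/ubuntu jammy main\n"
    "deb http://security.ubuntu.com/ubuntu jammy-security main\n"
)
rewritten = use_mirror(sample, "mirror.example.internal")
print(rewritten)
```

An init script would apply the same substitution to the cluster's apt source configuration before installing packages, and the change would be reverted once the official repositories recover.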

Minor April 16, 2026

April 2026: ES-1856982

Detected Apr 16, 2026 8:49 AM WEST · Resolved Apr 16, 2026 10:17 AM WEST · Duration about 1 hour

AWS Databricks experienced a compute service issue starting at 05:54 UTC on April 16, 2026, causing cluster launch delays and failures across multiple regions for 1.5 hours. Jobs dependent on affected clusters also failed or experienced delays during this period. The issue appeared to be stabilizing by the end of the incident with teams monitoring for full recovery.

Minor April 14, 2026

April 2026: ES-1852558

Detected Apr 14, 2026 7:52 PM WEST · Resolved Apr 14, 2026 9:32 PM WEST · Duration about 2 hours

AWS Databricks experienced compute service failures in the us-gov-west-1 region, affecting Classic Compute and Jobs Compute from 17:50-19:21 UTC and Serverless Compute starting at 19:06 UTC. Customers encountered cluster startup failures and job execution issues due to clusters being unable to start. Classic Compute was fully recovered by 19:21 UTC, while Serverless Compute issues were still being mitigated with an identified root cause.

Major March 30, 2026

March 2026: ES-1814874

Detected Mar 30, 2026 10:16 PM WEST · Resolved Mar 31, 2026 12:12 AM WEST · Duration about 2 hours

AWS Databricks experienced a major service incident affecting the Compute Service and Jobs Service, preventing customers from starting clusters or running workloads. Users encountered failures with classic clusters, jobs, pipelines, and notebook clusters not becoming available. The engineering team identified the root cause and deployed fixes globally, and service was restored in each region as the fixes were implemented there.

Minor March 24, 2026

March 2026: ES-1798419

Detected Mar 24, 2026 11:30 AM WET · Resolved Mar 24, 2026 2:39 PM WET · Duration about 3 hours

AWS Databricks experienced degraded availability with Serverless Compute starting at 11:00 UTC on March 24, 2026, lasting 3.2 hours. Customers encountered cluster launch failures with driver unreachable errors, job runs terminating unexpectedly with DRIVER_UNRESPONSIVE errors, clusters stuck in restart loops, and failures during new cluster creation. The incident was classified as minor and was under active investigation with no customer action required.

Minor March 20, 2026

March 2026: ES-1792497

Detected Mar 20, 2026 6:48 PM WET · Resolved Mar 20, 2026 8:45 PM WET · Duration about 2 hours

AWS Databricks experienced an issue with Serverless compute in the us-east-2 region starting at 18:24 UTC on March 20, 2026, lasting 1.9 hours. The incident affected Community Edition and Free Edition users who encountered failures when launching serverless compute resources, starting clusters, or running jobs, with some requests timing out. The service team actively investigated the root cause and worked to restore normal operations.