Outage in AWS Databricks

ES-692009

Resolved Major
May 09, 2023 - Lasted about 1 hour

Incident Report

We are investigating an issue with one of the Databricks services.

Incident Details:
- Workspace authentication requests may fail or time out.
- Cluster start/resize/termination requests may time out.
- Jobs relying on cluster start/resize/termination may not execute.
- Jobs submitted through APIs/Schedulers may not execute.
- Running jobs may not complete on time.

Incident Start Time: 23:20 UTC May 08 2023

We will provide an update in the next hour, or as soon as the issue has been identified.

Latest Updates (sorted oldest to newest)
05/08/2023 11:45PM

We are investigating an issue with one of the Databricks services.

Incident Details:
- Workspace authentication requests may fail or time out.
- Cluster start/resize/termination requests may time out.
- Jobs relying on cluster start/resize/termination may not execute.
- Jobs submitted through APIs/Schedulers may not execute.
- Running jobs may not complete on time.

Incident Start Time: 23:20 UTC May 08 2023

We will provide an update in the next hour, or as soon as the issue has been identified.

05/08/2023 11:53PM

We are investigating an issue with one of the Databricks services.

Incident Details:
- Workspace authentication requests may fail or time out.
- Cluster start/resize/termination requests may time out.
- Jobs relying on cluster start/resize/termination may not execute.
- Jobs submitted through APIs/Schedulers may not execute.
- Running jobs may not complete on time.

Incident Start Time: 23:20 UTC May 08 2023

We will provide an update in the next hour, or as soon as the issue has been identified.

05/09/2023 12:09AM

We have identified the problem with the Databricks service. Our team is continuing to work on a mitigation.

Incident Details:
- Workspace authentication requests may fail or time out.
- Cluster start/resize/termination requests may time out.
- Jobs relying on cluster start/resize/termination may not execute.
- Jobs submitted through APIs/Schedulers may not execute.
- Running jobs may not complete on time.

Incident Start Time: 23:20 UTC May 08 2023

We will provide an update in the next hour, or as soon as the issue has been mitigated.

05/09/2023 12:23AM

A mitigation has been applied and the services are operational.

Incident Details:
- Workspace authentication requests may fail or time out.
- Cluster start/resize/termination requests may time out.
- Jobs relying on cluster start/resize/termination may not execute.
- Jobs submitted through APIs/Schedulers may not execute.
- Running jobs may not complete on time.
- Serverless SQL Warehouse requests may time out.

Incident Start Time: 23:20 UTC May 08 2023
Incident End Time: 00:10 UTC May 09 2023

We will continue to monitor the services for stability and provide a final update in the next hour.

