Outage in AWS Databricks

ES-1059099

Resolved Minor
February 28, 2024 - Started 4 months ago - Lasted about 4 hours

Latest Updates (sorted oldest to newest)
4 months ago - at 02/28/2024 06:54PM

We are investigating an issue with one of the Databricks services.

Incident Details:
- Workspace authentication requests may fail or time out.
- Cluster start/resize/termination requests may fail or time out.
- Jobs relying on cluster start/resize/termination may not execute.
- Jobs submitted through APIs/Schedulers may not execute.
- UI and Databricks SQL queries may time out.
- Users may experience failures launching Databricks Serverless SQL Warehouses.
- Users may not be able to access UC APIs.

Incident Start Time: 18:36 UTC February 28 2024

We will provide an update in the next hour, or as soon as the issue has been identified.

4 months ago - at 02/28/2024 07:07PM

We have identified the problem with the Databricks service. Our team is working on a mitigation.

Incident Details:
- Workspace authentication requests may fail or time out.
- Cluster start/resize/termination requests may fail or time out.
- Jobs relying on cluster start/resize/termination may not execute.
- Jobs submitted through APIs/Schedulers may not execute.
- UI and Databricks SQL queries may time out.
- Users may experience failures launching Databricks Serverless SQL Warehouses.
- Users may not be able to access UC APIs.

Incident Start Time: 18:36 UTC February 28 2024

We will provide an update in the next hour, or as soon as the issue has been mitigated.

4 months ago - at 02/28/2024 07:42PM

We are seeing signs of recovery. Our team is actively monitoring the system to ensure full mitigation. Rest assured, we are diligently working to maintain this positive trajectory. Thank you for your patience.

Incident Details:
- Workspace authentication requests may fail or time out.
- Cluster start/resize/termination requests may fail or time out.
- Jobs relying on cluster start/resize/termination may not execute.
- Jobs submitted through APIs/Schedulers may not execute.
- UI and Databricks SQL queries may time out.
- Users may experience failures launching Databricks Serverless SQL Warehouses.
- Users may not be able to access UC APIs.

Incident Start Time: 18:36 UTC February 28 2024

We will provide an update in the next hour, or as soon as the issue has been mitigated.

4 months ago - at 02/28/2024 09:25PM

Mitigation has been applied and the issue has been successfully mitigated, although you may notice some latency. It’s important to note that this latency does not impact production services. Our team continues to monitor the situation closely to ensure optimal performance.

Incident Details:
- Workspace authentication requests may fail or time out.
- Cluster start/resize/termination requests may fail or time out.
- Jobs relying on cluster start/resize/termination may not execute.
- Jobs submitted through APIs/Schedulers may not execute.
- UI and Databricks SQL queries may time out.
- Users may experience failures launching Databricks Serverless SQL Warehouses.
- Users may not be able to access UC APIs.

Incident Start Time: 18:36 UTC February 28 2024
Incident End Time: 19:08 UTC February 28 2024

We will continue to monitor for stability and provide a final update in the next two hours.
