Outage in Coalesce

Intermittent Job Timeouts

Major
February 25, 2026 - Started 8 days ago

Incident Report

We are investigating reports of jobs intermittently timing out. Our team is actively working to identify the root cause and will provide updates as soon as more information becomes available.
Components affected
Coalesce Scheduler

Latest Updates (most recent first)
MONITORING about 18 hours ago - at 03/04/2026 09:47PM

Starting at approximately 10 PM Pacific time on Tuesday, March 3rd, a single resource in our Scheduler API resource pool degraded in performance and did not recover. Because the Scheduler is our core API service, this degraded performance across the product. At 9 AM Pacific time on Wednesday, March 4th, this resource was restarted and operations returned to normal.

We are still investigating why we experienced a similar scenario on the morning of March 3rd. We have added monitoring and alerting for this specific behavior to reduce the impact if it recurs before we deliver a full resolution.
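
As an illustration of the kind of check that can catch a single degraded member of a resource pool, here is a minimal sketch. It is not Coalesce's monitoring; the function name `find_degraded`, the member names, the latency samples, and the 3x threshold are all hypothetical. It flags any pool member whose median latency is far above the pool-wide median.

```python
from statistics import median


def find_degraded(latencies_ms, factor=3.0):
    """Return pool members whose median latency exceeds factor x the pool median.

    latencies_ms maps a (hypothetical) member name to its recent latency samples.
    """
    # Median latency per member, skipping members with no samples.
    per_member = {
        name: median(samples)
        for name, samples in latencies_ms.items()
        if samples
    }
    if not per_member:
        return []
    # Compare each member against the median of the per-member medians.
    pool_median = median(per_member.values())
    return sorted(
        name for name, value in per_member.items() if value > factor * pool_median
    )


# Illustrative data only: api-3 stands in for the one degraded resource.
samples = {
    "api-1": [20, 22, 21],
    "api-2": [19, 23, 20],
    "api-3": [400, 650, 500],
}
degraded = find_degraded(samples)
```

Comparing against the pool median rather than a fixed threshold means one slow member stands out even when overall load shifts, which fits the case of a single resource that degrades and does not recover.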

Updates to follow with a full timeline and root cause of this entire incident.

MONITORING 2 days ago - at 03/03/2026 03:07PM

We identified a resource that needed to be restarted, and the DEADLINE_EXCEEDED errors clients were seeing should cease.

This was unrelated to the overall incident we are tracking here, and work continues toward a full resolution.

MONITORING 2 days ago - at 03/03/2026 11:54AM

Clients are reporting errors running refresh and deploy operations. We are investigating and will provide updates.

MONITORING 6 days ago - at 02/28/2026 12:09AM

We are no longer seeing general or widespread issues with our operations and will continue to monitor this incident over the weekend as we work towards a final solution and root cause analysis.

MONITORING 6 days ago - at 02/27/2026 05:48PM

The update we released yesterday has resolved the ongoing Deploy and Refresh failures our clients were experiencing. We are continuing to monitor and will provide details on the root cause of the issue as soon as they are available.

A small number of our clients are seeing delays running Deploy operations that are unrelated to the ongoing issue this week. We have identified a likely root cause; once it is confirmed and resolved, we will transition this issue to Operational.

MONITORING 7 days ago - at 02/27/2026 01:06AM

Clients may still see issues with Deploy and Refresh operations, which we are continuing to monitor. We have released an additional update, version 7.29.5, that includes follow-on improvements to the reliability of our Refresh and Deploy operations. This update is included in version 7.29.5 of our coa CLI and is a recommended upgrade for all customers.

We are continuing to monitor and will provide updates as this incident progresses.

MONITORING 7 days ago - at 02/26/2026 11:39PM

Clients can expect to continue to experience delays and timeouts. We have developed an additional reliability optimization that is targeted for release within the next 2 hours. Expect our next update here when it is available.

MONITORING 7 days ago - at 02/26/2026 08:53PM

We have released a new version, 7.29.4, that mitigates the performance issues our customers are experiencing. We expect it to reduce customer impact from this issue. We recommend that any customers using the coa CLI upgrade.

We will continue to update this issue as we make progress towards the root cause.

MONITORING 7 days ago - at 02/26/2026 04:43PM

Between 2 and 5 AM US Pacific Time, we discovered that jobs were again backing up and not processing as expected.

Engineering has developed a patch to resolve this issue, which we are targeting for release by 12 PM Pacific. We will continue to monitor and resolve any issues with job processing and will update this incident when the patch has shipped.

MONITORING 8 days ago - at 02/26/2026 12:06AM

Jobs are no longer timing out and the Scheduler is operational again. We will continue to monitor over the next 12-24 hours and provide an update tomorrow with more details on the cause of the service interruption.

MONITORING 8 days ago - at 02/25/2026 05:37PM

We are no longer observing job timeouts at this time. Our team continues to monitor the system closely to ensure stability. We will provide further updates if anything changes.

IDENTIFIED 8 days ago - at 02/25/2026 05:15PM

We are continuing to investigate. We will post updates at the top of each hour, or sooner if new information becomes available.

IDENTIFIED 8 days ago - at 02/25/2026 04:10PM

Job scheduling issues in the US region have recurred. We have identified that certain jobs are becoming unresponsive and blocking the rest of the queue. We are intervening to clear the backlog and are investigating the underlying cause of these stalled processes. Customers should expect intermittent delays or timeouts in the interim.
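
The failure mode described here, where an unresponsive job blocks everything queued behind it, is often called head-of-line blocking. The sketch below is a generic illustration under stated assumptions, not Coalesce's scheduler or fix: the names `run_with_timeout`, `drain_queue`, `quick_job`, and `stalled_job` are hypothetical. It shows how a per-job timeout lets a sequential queue skip a stalled job instead of waiting on it indefinitely.

```python
import asyncio


async def run_with_timeout(name, job, timeout_s):
    """Run one job, cancelling it after timeout_s so it cannot stall the queue."""
    try:
        await asyncio.wait_for(job(), timeout=timeout_s)
        return (name, "ok")
    except asyncio.TimeoutError:
        return (name, "timed_out")


async def drain_queue(jobs, timeout_s=0.05):
    """Process jobs in order; a stalled job is abandoned rather than blocking the rest."""
    results = []
    for name, job in jobs:
        results.append(await run_with_timeout(name, job, timeout_s))
    return results


async def quick_job():
    # Stand-in for a job that completes promptly.
    await asyncio.sleep(0)


async def stalled_job():
    # Stand-in for an unresponsive job.
    await asyncio.sleep(3600)


def demo():
    jobs = [("a", quick_job), ("b", stalled_job), ("c", quick_job)]
    return asyncio.run(drain_queue(jobs))
```

Without the timeout, job "c" would wait behind "b" for the full hour; with it, "b" is marked as timed out and "c" still runs.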

MONITORING 8 days ago - at 02/25/2026 02:07PM

A fix has been implemented and we are monitoring the results.

IDENTIFIED 8 days ago - at 02/25/2026 01:13PM

We have identified a bottleneck in our job scheduling queue in the US region. We are currently recovering our infrastructure and monitoring the backlog of stale jobs. Customers may continue to see some older jobs fail, but new jobs are starting to process. We will provide further updates as the queue returns to normal levels.

INVESTIGATING 8 days ago - at 02/25/2026 11:21AM

We are investigating reports of jobs intermittently timing out. Our team is actively working to identify the root cause and will provide updates as soon as more information becomes available.
