Outage in Datadog US1

Increased delay processing events

Resolved Minor
January 17, 2025 - Started 10 months ago - Lasted about 3 hours

Incident Report

We are investigating increased latency processing Events. As a result of this issue, some users may see delays in the event stream or for event queries on dashboards, and event alert evaluation is delayed. This issue also caused a delay in the processing of alerts across other products. We've implemented a fix for this and are monitoring the recovery of the alert evaluation pipeline. As a result, a subset of alerts may be delayed while the system recovers.

Latest Updates (most recent first)
RESOLVED 10 months ago - at 01/17/2025 04:50PM

This incident has been resolved.

MONITORING 10 months ago - at 01/17/2025 03:34PM

We are continuing to monitor the progress of processing the backlog in Events. The majority of the backlog has been processed. Event Monitor evaluation remains delayed while we finish processing the backlog.

IDENTIFIED 10 months ago - at 01/17/2025 02:31PM

We've implemented a fix and are currently working through the backlog of delayed Events. Event Monitor evaluation remains delayed while we work through the backlog. All other monitor types have recovered and are currently evaluating.

IDENTIFIED 10 months ago - at 01/17/2025 01:55PM

We have identified the issue causing delayed ingestion of Events. Alerting evaluation continues to be delayed for Event Monitors, Process Monitors, and Cloud Network monitors. All other monitor types have recovered and are currently evaluating.

INVESTIGATING 10 months ago - at 01/17/2025 01:46PM

We are continuing to investigate this issue.

INVESTIGATING 10 months ago - at 01/17/2025 01:42PM

We are investigating increased latency processing Events.

As a result of this issue, some users may see delays in the event stream or for event queries on dashboards, and event alert evaluation is delayed.

This issue also caused a delay in the processing of alerts across other products. We've implemented a fix for this and are monitoring the recovery of the alert evaluation pipeline. As a result, a subset of alerts may be delayed while the system recovers.
