Outage in Daily

Issues connecting to calls

Status: Resolved (minor incident)
February 07, 2023 - Started about 1 year ago - Lasted 2 days


Outage Details

We are investigating elevated platform error rates. Users may get websocket connection errors when trying to join calls.
Latest Updates (most recent first)
RESOLVED about 1 year ago - at 02/09/2023 11:37PM

We've identified the issue that caused the incident on Tuesday morning. While we've already deployed fixes that helped prevent the problem from reoccurring, we still need to perform one more database update that will require a short scheduled maintenance. That will likely happen this weekend.

We will post a full retro after completing the final database maintenance operation.

MONITORING about 1 year ago - at 02/09/2023 04:13AM

We've deployed a platform update with a few improvements designed to mitigate the impact of the current database performance issue. The only thing you may notice is that you'll no longer see 429 rate limit responses in your Dashboard API logs.

Our database metrics have remained normal today, but we'll continue to monitor the platform to verify these fixes and watch for further issues.

MONITORING about 1 year ago - at 02/08/2023 06:30AM

While we were able to restore platform functionality earlier today, we've continued to troubleshoot the underlying issue that caused the problem.

As a precautionary measure, we've temporarily enabled rate limiting on the REST API endpoint used to create rooms. The limit for POST /rooms is now the same as the DELETE /rooms/:name endpoint. You can expect about 2 requests per second, or 50 over a 30-second window.
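
For clients that create rooms in bursts, one way to stay under the stated limit of 50 requests per 30-second window is a client-side sliding-window limiter. The following is a minimal sketch, not Daily's implementation: the class name `SlidingWindowLimiter` and its methods are hypothetical, and it assumes the limit is enforced as a rolling window, which the update does not specify.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side limiter: at most `max_requests` per `window_seconds`."""

    def __init__(self, max_requests=50, window_seconds=30.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._sent = deque()  # timestamps of requests still inside the window

    def _evict(self, now):
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window_seconds:
            self._sent.popleft()

    def try_acquire(self):
        """Record a request and return True if one is allowed right now."""
        now = self.clock()
        self._evict(now)
        if len(self._sent) < self.max_requests:
            self._sent.append(now)
            return True
        return False

    def wait_time(self):
        """Seconds until the next request would be allowed (0.0 if allowed now)."""
        now = self.clock()
        self._evict(now)
        if len(self._sent) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._sent[0])
```

A caller would check `try_acquire()` before each POST /rooms request and, when it returns False, sleep for `wait_time()` seconds before trying again.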

MONITORING about 1 year ago - at 02/07/2023 04:07PM

We’ve addressed the issue with the database, and platform operations have returned to normal. We are monitoring alerts and metrics for any further issues.

IDENTIFIED about 1 year ago - at 02/07/2023 03:39PM

We've identified an issue with one of our databases that coordinates activity between call servers. This is causing elevated rates of "meeting moves", which is when an ongoing call session has to move from one call server to a different one. If you're in a call when this happens, you'll notice everyone's video and audio drop out and come back within a few seconds. You may also need to restart recording or live streaming when this happens.

You may also experience timeouts when making REST API requests.
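
Clients affected by REST API timeouts during an incident like this commonly retry with capped exponential backoff and jitter, so that retries do not add load to an already struggling service. The sketch below is illustrative only: `call_with_retries` and its parameters are hypothetical names, and it assumes the caller's HTTP client surfaces timeouts as Python's built-in `TimeoutError` (many clients raise their own exception type, which would need to be mapped or caught instead).

```python
import random
import time


def call_with_retries(request, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, rng=random.random):
    """Call `request()` and retry on TimeoutError with capped exponential backoff.

    `request` is any zero-argument callable that performs one API call.
    `sleep` and `rng` are injectable so the behavior can be tested quickly.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the timeout to the caller
            # Full jitter: sleep a random fraction of the capped exponential delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * rng())
```

Only idempotent requests, or requests the caller can safely deduplicate, should be retried this way; a timed-out request may still have been processed by the server.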

We'll post more information as soon as it's available.

INVESTIGATING about 1 year ago - at 02/07/2023 02:54PM

We are investigating elevated platform error rates. Users may get websocket connection errors when trying to join calls.
