Outage in Umbraco Cloud

Investigating issue with slow responses from Content Delivery APIs

Resolved Minor
September 13, 2022 - Started about 3 years ago - Lasted 1 day


Outage Details

We are currently seeing a surge in 5xx errors returned for requests to the Content Delivery Platform (REST API and GraphQL API). The issue is being investigated - we will report back when we have an overview of the root cause.
Latest Updates (most recent first)
MONITORING about 3 years ago - at 09/14/2022 06:52PM

Content ingestion is still running and working through the queue from today's pause - we expect the queue to be emptied in the coming hours. New content changes take priority now.
As such, the APIs and content ingestion are both functional again.
We will keep monitoring until the queue is emptied, and get back with an update tomorrow (Thursday) morning CEST.

MONITORING about 3 years ago - at 09/14/2022 04:10PM

Content ingestion is running and content updates are rolling out as the queue empties. We are continuing to monitor the situation (performance and the database server), and will be back with an update at 21.00 CEST.

MONITORING about 3 years ago - at 09/14/2022 01:32PM

Our tests have shown improvements, and we are starting a controlled deployment of the fixes to the production environment. We don’t expect any downtime to the public APIs due to this deploy.
Once the fix is out, we will restart content ingestion slowly and gradually speed it up to process the queue of updates created during this incident.
We will be back with an update no later than 18.00 CEST with a status on content ingestion.

MONITORING about 3 years ago - at 09/14/2022 12:12PM

We are currently testing our updates in our development environment and getting ready to deploy the latest changes. As we are still in a testing phase, we will post back no later than 15.30 CEST with an update on deployments and the restart of content update ingestion to the platform.

MONITORING about 3 years ago - at 09/14/2022 09:20AM

We are currently working on 3 different initiatives to further improve the situation. The first initiative will ensure that we can resume ingestion of content updates into the platform in a stable manner. This means that content updates will remain paused for at least the next 3 hours, until we have had time to update, test and deploy the changes.

We will post back with an update on our progress at 14:00 CEST.

We apologize for the inconvenience that this is causing - we know that many are eagerly awaiting the ability to push content updates. This is also our number one goal currently, and we are working towards enabling it once again while maintaining stability, so the APIs continue to run and serve requests.

MONITORING about 3 years ago - at 09/13/2022 10:49PM

The REST API is now also online again. We continue to monitor both APIs as well as the database powering them.
The underlying issue is still not fixed, so we will keep this incident in the "Monitoring" state.

MONITORING about 3 years ago - at 09/13/2022 10:32PM

The GraphQL API is back online, and we will monitor its health before bringing the REST API back online. We want to avoid either API putting too much strain on the database server until we have resolved the underlying issue.

MONITORING about 3 years ago - at 09/13/2022 09:41PM

We have isolated the issue to be database related and the effect of an excessive number of connections made at once. We are still monitoring and assessing why it is happening in order to find the underlying root cause. We are continuing our investigation, as the underlying issue is not yet solved.
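
For readers who want a picture of what "an excessive number of connections made at once" means in practice, here is a minimal Python sketch of one common mitigation: capping how many database connections may be open at the same time. It is an illustration only and does not describe the fix actually deployed for this incident. The names `MAX_CONNECTIONS` and `run_with_connection_cap` are hypothetical, and the database round trip is simulated with a short sleep.

```python
import asyncio

# Hypothetical cap; a real value depends on what the database server tolerates.
MAX_CONNECTIONS = 5


async def run_with_connection_cap(num_requests, max_connections=MAX_CONNECTIONS):
    """Run simulated queries with at most max_connections open at once.

    Returns the peak number of simultaneously open (simulated) connections.
    """
    semaphore = asyncio.Semaphore(max_connections)
    open_now = 0
    peak = 0

    async def query():
        nonlocal open_now, peak
        # Requests beyond the cap wait here instead of opening a connection.
        async with semaphore:
            open_now += 1
            peak = max(peak, open_now)
            await asyncio.sleep(0.01)  # stand-in for a database round trip
            open_now -= 1

    await asyncio.gather(*(query() for _ in range(num_requests)))
    return peak


if __name__ == "__main__":
    # 50 requests arrive at once, but the cap bounds the open connections.
    print(asyncio.run(run_with_connection_cap(50)))
```

With a cap like this, a burst of API requests queues in the application rather than arriving at the database server all at once.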

MONITORING about 3 years ago - at 09/13/2022 08:24PM

We are continuing to monitor for any further issues.

MONITORING about 3 years ago - at 09/13/2022 07:11PM

We are seeing connection issues returning and as a result lots of requests to the GraphQL and REST APIs are failing.
We are engaged with Microsoft to help diagnose the connection issue.

MONITORING about 3 years ago - at 09/13/2022 04:44PM

All systems are operational again, so we are moving back to monitoring. Ingestion of content updates will be started and will continue to run through the evening.

We will continue the root cause analysis with Microsoft tomorrow, so we expect to be monitoring until at least tomorrow morning (Central European Time).

INVESTIGATING about 3 years ago - at 09/13/2022 03:05PM

We are moving this issue back to investigating, as the APIs now have issues connecting to their database, meaning that the API servers are unable to serve requests. This affects both the GraphQL and REST APIs in the Content Delivery Platform at this point in time.
We are also involving Microsoft to help diagnose the issue from their end.

MONITORING about 3 years ago - at 09/13/2022 02:09PM

We have paused content updates to the Content Delivery Platform while we investigate the issue. This means a delay in updating content served by the GraphQL and REST APIs.

MONITORING about 3 years ago - at 09/13/2022 12:58PM

The issue has been resolved and the APIs are back to normal, with normal response times.
We will continue to monitor the Platform as we investigate further.

INVESTIGATING about 3 years ago - at 09/13/2022 12:37PM

We are currently seeing a surge in 5xx errors returned for requests to the Content Delivery Platform (REST API and GraphQL API).
The issue is being investigated - we will report back when we have an overview of the root cause.
