Outage in amazee.io

Scaling activities

Resolved Major
January 11, 2024 - Started 12 months ago - Lasted 1 day

Outage Details

We observed an increase in resource usage on the shared MySQL cluster on UK3. To account for the increase, we are scaling the cluster, which will lead to one failover.
Components affected
amazee.io uk3.lagoon
Latest Updates (most recent first)
RESOLVED 11 months ago - at 01/12/2024 04:10PM

The situation is stable. We will resolve this incident here and follow up with a post incident review in the coming days.

MONITORING 12 months ago - at 01/11/2024 10:27PM

The original database cluster can only be started in read mode.

In accordance with our backup and recovery processes, we promoted the new database cluster with the state of 2024-01-11 03:05 UTC as the new production cluster. We updated all workloads to use this new database cluster. Please note that this does not contain data between 2024-01-11 03:05 UTC and the moment the database cluster went offline (~ 2024-01-11 07:22 UTC).

Dumps of the original database with the latest data can be exported and shared on request.

A summary of the incident will be shared in the upcoming days.

We are sorry for the inconvenience this caused you and your clients. If you have any questions regarding this, please reach out to us.
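
As a rough illustration of the gap described in this update, the following sketch computes the length of the window that is missing from the promoted cluster and checks whether a given write falls inside it. The two timestamps are taken from the update above (the offline time is only approximate); the function name `in_missing_window` and the boundary handling are assumptions made for this example, not part of amazee.io's tooling.

```python
from datetime import datetime, timezone

# Timestamps quoted in the update; the offline time is approximate (~07:22 UTC).
SNAPSHOT_AT = datetime(2024, 1, 11, 3, 5, tzinfo=timezone.utc)
OFFLINE_AT = datetime(2024, 1, 11, 7, 22, tzinfo=timezone.utc)


def in_missing_window(written_at: datetime) -> bool:
    """Return True if a write at `written_at` (timezone-aware) would be
    absent from the promoted cluster. Treating the snapshot instant as
    included in the backup is an assumption of this sketch."""
    return SNAPSHOT_AT < written_at <= OFFLINE_AT


# Length of the window not covered by the promoted cluster.
window = OFFLINE_AT - SNAPSHOT_AT
print(window)  # 4:17:00
```

Because the offline time is given only approximately, writes close to 07:22 UTC are best checked against a dump of the original database rather than classified by timestamp alone.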

IDENTIFIED 12 months ago - at 01/11/2024 06:27PM

Recovering the database was interrupted due to an unforeseen issue. We are working with the AWS RDS team to bring the database back online.

As an alternative recovery option, we can point individual environments to a new database cluster containing data up to 2024-01-11 03:05 UTC. Please be aware that this option would lead to data loss. If you would like to pursue this route, please contact us through our support channels.

IDENTIFIED 12 months ago - at 01/11/2024 04:35PM

We're making good progress on recovering the database cluster. We're expecting the database cluster to be back online within the next 2 hours.

IDENTIFIED 12 months ago - at 01/11/2024 02:31PM

Recovery is still underway. We're evaluating additional ways to recover more quickly from the current situation and restore services.

IDENTIFIED 12 months ago - at 01/11/2024 12:17PM

We're making progress on the recovery, but we can't give a firm ETA because the recovery speed hasn't fully settled yet. We are still in discussions with the AWS RDS team on timings and additional recovery options.

IDENTIFIED 12 months ago - at 01/11/2024 10:42AM

We have identified the issue and are working on recovering from the outage. We can't give an ETA for now and are evaluating several options.

INVESTIGATING 12 months ago - at 01/11/2024 09:39AM

We're still working with the AWS RDS team to investigate the issue and what is causing the connectivity problems.

INVESTIGATING 12 months ago - at 01/11/2024 09:04AM

We're seeing connectivity issues to the database cluster after the scaling operation.

We'll involve our upstream provider to look into this issue as well.

INVESTIGATING 12 months ago - at 01/11/2024 08:41AM

We're seeing issues with the database cluster and are investigating.

MONITORING 12 months ago - at 01/11/2024 07:18AM

We observed an increase in resource usage on the shared MySQL cluster on UK3. To account for the increase, we are scaling the cluster, which will lead to one failover.

Latest amazee.io outages

Failing image builds - 24 days ago
JSM Assist sync issues - 26 days ago
Delayed logs - about 2 months ago
MySQL 8 upgrade - 2 months ago
