Outage in Aiven

Services with new nodes stuck in rebuilding phase

Resolved Minor
December 10, 2024 - Started 12 days ago - Lasted about 17 hours

Outage Details

We are investigating delayed service start for new services. We are currently investigating impact and cause. We apologise for the inconvenience caused by this issue. We will provide a further update in approximately 30 minutes.
Latest Updates (sorted most recent first)
IDENTIFIED 11 days ago - at 12/11/2024 08:52AM

We have identified the root cause of the issue and are currently working on a permanent solution. In the meantime, we have implemented a workaround to mitigate the problem, and new nodes should now be able to launch successfully.

INVESTIGATING 11 days ago - at 12/11/2024 05:14AM

Unfortunately, we are still seeing DNS update failures and are investigating further.

Our engineers are currently working on a fix.

We apologise for the inconvenience caused by this issue.

MONITORING 11 days ago - at 12/11/2024 04:24AM

We have identified and fixed the DNS resolution issues that were affecting new nodes.

We are continuing to monitor this closely.

We apologise for the inconvenience caused by this issue. 

IDENTIFIED 11 days ago - at 12/11/2024 03:08AM

We are still mitigating the DNS resolution issues, so nodes are stuck in rebuilding longer than usual. You should start to see some improvements soon.

Therefore, we still recommend against performing any unnecessary actions that could trigger a node replacement until this incident is fully resolved.

IDENTIFIED 11 days ago - at 12/10/2024 11:41PM

We have been able to mitigate the DNS resolution issues so nodes stuck in syncing should begin to come online now.

We still recommend against performing any unnecessary actions that could trigger a node replacement until this incident is fully resolved.

INVESTIGATING 11 days ago - at 12/10/2024 10:15PM

Our incident team has initiated mitigation steps. DNS resolution is beginning to restore progressively across services. You may start seeing improvements, though full restoration will occur gradually across all affected services.

The guidance remains unchanged - please refrain from any actions that could trigger node replacement. The next update will be provided in 30 minutes.

INVESTIGATING 11 days ago - at 12/10/2024 09:21PM

We continue to make progress in identifying the root cause. Our investigation remains focused on DNS-related issues. The guidance remains unchanged - please refrain from any actions that could trigger node replacement. Next update will be provided in 30 minutes.

Thank you for your continued patience.

INVESTIGATING 11 days ago - at 12/10/2024 08:46PM

Our investigation continues to point to DNS, and we are making progress on determining the cause. Our ask remains the same: do not take any actions which may cause a node to be replaced. Please expect an update in 30 minutes.

INVESTIGATING 11 days ago - at 12/10/2024 08:15PM

We confirmed that this relates to DNS and are diving deeper into identifying the cause. Our ask remains the same: do not take any actions which may cause a node to be replaced. Please expect another update from us in 30 minutes.

INVESTIGATING 11 days ago - at 12/10/2024 07:38PM

Our investigation so far indicates that this is related to DNS and that the impact is intermittent. We continue to ask that no action is taken which may cause a node to be replaced. Please expect another update in 30 minutes.

INVESTIGATING 12 days ago - at 12/10/2024 07:10PM

As our investigation continues, we are finding that any new nodes are failing to start. Please do not issue plan upgrades at this time. Any currently running service will continue to work as expected so long as nodes are not replaced.

INVESTIGATING 12 days ago - at 12/10/2024 06:57PM

We are investigating delayed service start for new services. We are currently investigating impact and cause.

We apologise for the inconvenience caused by this issue. We will provide a further update in approximately 30 minutes.
