Outage in Aiven

Aiven incident: Some Aiven Kafka clusters stopped reporting Prometheus metrics

Resolved Minor
December 04, 2024 - Started 18 days ago - Lasted 1 day


Outage Details

We are currently investigating a partial outage affecting Prometheus metric reporting of some Aiven Kafka clusters. As far as we can tell, Datadog users are not impacted. We apologise for the inconvenience caused by this issue. We will provide regular updates about our progress in resolving it.
Latest Updates (sorted most recent first)
MONITORING 17 days ago - at 12/05/2024 05:44AM

A fix has been deployed, and preventative maintenance will be scheduled for impacted and potentially impacted services.

MONITORING 18 days ago - at 12/04/2024 03:41PM

We are still closely monitoring the Aiven fleet of Kafka clusters. So far, no new services have been affected. We will continue to monitor and to work on a permanent fix.

MONITORING 18 days ago - at 12/04/2024 02:47PM

We have applied a patch to all Aiven Kafka services that had stopped emitting metrics to their Prometheus endpoints.

We are continuing to closely monitor affected services, but we are confident that the situation should now gradually get back to normal. In case of doubt, feel free to reach out to Aiven Support.

IDENTIFIED 18 days ago - at 12/04/2024 01:53PM

We have identified the root cause of the partial outage affecting Prometheus metric reporting. We also identified the 7 impacted services.

We have prepared an emergency procedure to patch those services. As we apply this patch, we will also proactively reach out to impacted customers.

We will provide further regular updates about this issue.

INVESTIGATING 18 days ago - at 12/04/2024 01:27PM

We are currently investigating a partial outage affecting Prometheus metric reporting of some Aiven Kafka clusters. As far as we can tell, Datadog users are not impacted.

We apologise for the inconvenience caused by this issue. We will provide regular updates about our progress in resolving it.
