Outage in Grafana

Rare intermittent write failures in Tempo

Resolved Minor
January 17, 2025 - Started 19 days ago - Lasted 6 days

Outage Details

Users may receive intermittent write failures on collectors stating:
"Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "traces", "name": "otlp/cloud", "error": "rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 502 (Bad Gateway)"

The agent/collector will automatically retry the write attempt and succeed, so there is no loss in data collection.

Latest Updates (most recent first)
RESOLVED 13 days ago - at 01/23/2025 11:04PM

This incident has been resolved.

MONITORING 14 days ago - at 01/22/2025 10:11PM

Multiple issues were identified with our rollout process that caused instability on the write path.
We have pushed out three fixes that drastically reduce the number of dropped writes and make all failed writes retryable. We will continue to closely monitor this issue and make improvements where necessary.

IDENTIFIED 14 days ago - at 01/22/2025 07:52PM

We are continuing to diagnose and improve this situation. Fixes were rolled out on Monday, January 20th, and today, Wednesday, January 22nd, to reduce dropped writes during rollouts.

IDENTIFIED 18 days ago - at 01/18/2025 12:16AM

The issue has been identified and a fix is being implemented.

INVESTIGATING 19 days ago - at 01/17/2025 07:56PM

We are continuing to investigate this issue.

INVESTIGATING 19 days ago - at 01/17/2025 07:56PM

Users may receive intermittent write failures on collectors stating:
"Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "traces", "name": "otlp/cloud", "error": "rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 502 (Bad Gateway)"

The agent/collector will automatically retry the write attempt and succeed, so there is no loss in data collection.
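
To illustrate why a retried write loses no data, here is a minimal Python sketch of retry with exponential backoff. It is not the collector's actual implementation: `export_with_retry`, `TransientError`, and the parameter names and default values are hypothetical, chosen only for this example. The idea is that the same batch is held and resent until the server accepts it, so a transient 502 (Bad Gateway) delays the write rather than dropping it.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as a 502 (Bad Gateway)."""


def export_with_retry(send, batch, max_attempts=5, initial_interval=1.0,
                      multiplier=2.0, sleep=time.sleep):
    """Send `batch` with `send`, retrying transient failures with backoff.

    The batch is kept until `send` succeeds, so a failed attempt does not
    lose data. If every attempt fails, the last error is re-raised.
    """
    interval = initial_interval
    for attempt in range(1, max_attempts + 1):
        try:
            return send(batch)
        except TransientError:
            if attempt == max_attempts:
                raise  # retries exhausted; the caller decides what to do
            sleep(interval)          # wait before the next attempt
            interval *= multiplier   # grow the wait after each failure
```

Here `send` is any callable that raises `TransientError` on a retryable failure. The `sleep` parameter is injectable so the backoff schedule can be checked without actually waiting.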
