Outage in Grafana

Rare intermittent write failures in Tempo

Resolved Minor
January 17, 2025 - Started 4 months ago - Lasted 6 days

Outage Details

Users may see intermittent write failures on collectors, with a log message such as:
"Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "traces", "name": "otlp/cloud", "error": "rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 502 (Bad Gateway)"

The agent/collector automatically retries the write attempt and succeeds, so no collected data is lost.

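The retry behavior described above can be illustrated with a minimal sketch of retry with exponential backoff. This is not the collector's actual implementation: `TransientExportError`, `export_with_retry`, and all parameter names and defaults are hypothetical, chosen only to show why a transient 502 / Unavailable response need not lose data when the failed write is retryable.

```python
import random
import time


class TransientExportError(Exception):
    """Hypothetical stand-in for a retryable failure (e.g. 502 / Unavailable)."""


def export_with_retry(send, batch, max_attempts=5, initial_interval=0.5,
                      max_interval=30.0, sleep=time.sleep):
    """Call send(batch), retrying transient failures with exponential backoff.

    `send` is any callable that raises TransientExportError on a retryable
    failure. The batch is kept until a send succeeds, so nothing is dropped
    unless every attempt fails.
    """
    interval = initial_interval
    for attempt in range(1, max_attempts + 1):
        try:
            return send(batch)
        except TransientExportError:
            if attempt == max_attempts:
                # Out of attempts: surface the error to the caller.
                raise
            # Wait with jitter, then double the interval up to the cap.
            sleep(interval * random.uniform(0.5, 1.5))
            interval = min(interval * 2, max_interval)
```

In this sketch, a send that fails a couple of times and then succeeds returns normally, which matches the behavior reported in the incident: the failure is logged, the write is retried, and the data arrives.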
Latest Updates (most recent first)
RESOLVED 4 months ago - at 01/23/2025 11:04PM

This incident has been resolved.

MONITORING 4 months ago - at 01/22/2025 10:11PM

Multiple issues were identified with our rollout process that caused instability on the write path.
We have pushed out three fixes that drastically reduce the number of dropped writes and make all failed writes retryable. We will continue to closely monitor this issue and make improvements where necessary.

IDENTIFIED 4 months ago - at 01/22/2025 07:52PM

We are continuing to diagnose and improve this situation. Fixes were rolled out on Monday, January 20th, and today, Wednesday, January 22nd, to reduce dropped writes during rollouts.

IDENTIFIED 4 months ago - at 01/18/2025 12:16AM

The issue has been identified and a fix is being implemented.

INVESTIGATING 4 months ago - at 01/17/2025 07:56PM

We are continuing to investigate this issue.

INVESTIGATING 4 months ago - at 01/17/2025 07:56PM

Users may see intermittent write failures on collectors, with a log message such as:
"Exporting failed. Will retry the request after interval. {"kind": "exporter", "data_type": "traces", "name": "otlp/cloud", "error": "rpc error: code = Unavailable desc = unexpected HTTP status code received from server: 502 (Bad Gateway)"

The agent/collector will automatically retry the write attempt and succeed, so there is no loss in data collection.
