Outage in Netdata

Nightly static builds overwrite node and metric data upon install

Resolved Major
April 01, 2025 - Started about 1 month ago - Lasted 1 day

Outage Details

We have found that a recent change in the nightly static builds of Netdata Agent causes metadata on the Agent to be overwritten. Specifically, the sqlite3 database that records which timeseries stored in dbengine correspond to which metrics, and the Agent's "machine GUID", are both overwritten with the copy included in the build package.

Not affected are:

- All stable releases
- Native packages (.deb and .rpm)

Affected are all nightly static builds with the following version numbers:

- 2.3.0-50-nightly
- 2.3.0-60-nightly
- 2.3.0-72-nightly
- 2.3.0-78-nightly
- 2.3.0-87-nightly

The initial impact is that all affected Agent installs, even though they still have the timeseries data stored on disk, have lost all metadata associated with it, so these timeseries become inaccessible. This is unrecoverable. Additionally, the Agent's main form of identification, the machine GUID, is overwritten too.

We are assessing the impact on users of Netdata Cloud, and will update this incident with more information when the investigation is completed.

The bug itself has been fixed and merged. We will issue a new nightly build shortly.
Components affected
Netdata Agent Services
Latest Updates (most recent first)
RESOLVED 30 days ago - at 04/02/2025 09:25AM

Affected Agents can cause the creation of multiple duplicate nodes in Netdata Cloud. All but the last one will appear as offline, and the last one will look as if it had been created from scratch, with no data. Unfortunately, the previously stored metrics for the affected nodes cannot be recovered.

The duplicate offline nodes can be safely deleted from Space Settings -> Nodes. Note that you may have to add the newest copy of these nodes to the appropriate rooms.

The fixed nightly static build is v2.3.0-102.

INVESTIGATING about 1 month ago - at 04/01/2025 09:00AM

We have found that a recent change in the nightly static builds of Netdata Agent causes metadata on the Agent to be overwritten. Specifically, the sqlite3 database that records which timeseries stored in dbengine correspond to which metrics, and the Agent's "machine GUID", are both overwritten with the copy included in the build package.

Not affected are:

- All stable releases
- Native packages (.deb and .rpm)

Affected are all nightly static builds with the following version numbers:

- 2.3.0-50-nightly
- 2.3.0-60-nightly
- 2.3.0-72-nightly
- 2.3.0-78-nightly
- 2.3.0-87-nightly
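
To check a fleet against the list above, a small helper along the following lines may be useful. This is only a sketch: the function name `is_affected` is hypothetical, and it assumes the version string is in the form shown in the list (optionally prefixed with `v`). It cannot tell a static build from a native package, so a match only matters for static installs.

```python
import re

# Build numbers of the affected 2.3.0 nightly static builds,
# copied from the list in this incident report.
AFFECTED_BUILDS = {50, 60, 72, 78, 87}


def is_affected(version: str) -> bool:
    """Return True if `version` names one of the affected nightly builds.

    Assumes strings shaped like "2.3.0-50-nightly" or "v2.3.0-50-nightly";
    anything else (stable releases, other formats) returns False.
    """
    match = re.fullmatch(r"v?2\.3\.0-(\d+)-nightly", version.strip())
    return bool(match) and int(match.group(1)) in AFFECTED_BUILDS
```

For example, `is_affected("2.3.0-72-nightly")` is true, while `is_affected("2.3.0-102-nightly")` is false.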

The initial impact is that all affected Agent installs, even though they still have the timeseries data stored on disk, have lost all metadata associated with it, so these timeseries become inaccessible. This is unrecoverable. Additionally, the Agent's main form of identification, the machine GUID, is overwritten too.

We are assessing the impact on users of Netdata Cloud, and will update this incident with more information when the investigation is completed.

The bug itself has been fixed and merged. We will issue a new nightly build shortly.
