Minor · 6 days ago · lasted about 5 hours
We are currently experiencing a higher-than-usual load. As a consequence, we have accumulated some backlog on our event storage cluster. Alerts are still raised in a timely manner, but events may take a few minutes to appear on the events page. Our team is currently scaling up our infrastructure in order to process the backlog and return to normal.
Major · 9 days ago · lasted about 8 hours
A restart of a misbehaving server caused a cascading failure of our caching mechanism. These servers take a very long time to start and synchronise with each other. Ingestion has been stopped since 22:50 UTC.
Minor · 12 days ago · lasted about 13 hours
Due to an ongoing operation on an internal database, some counters are currently not working. These include:
- Event counters on the intake page
- Observable heatmaps in the Intelligence Center
- Asset statistics
- ...
Counters are still being updated in the background, but they cannot be queried right now.
Minor · 14 days ago · lasted about 11 hours
One of our internal database clusters is down due to excessive load. This database is responsible for some counters. Because of this problem, the heatmaps are likely not being updated, and the event counters (intakes, entities and assets) have likely stopped working properly as well. The alert counter is not impacted by this incident.
Minor · 25 days ago · lasted about 10 hours
An unusual load on the process responsible for raising alerts resulted in some delay accumulating over the course of a few hours. The main issue has been identified, and our team is working on a temporary fix.
Minor · about 1 month ago · lasted about 22 hours
We would like to inform you of an incident that occurred between 14:52 and 14:58 CEST today. During this timeframe, a significant number of events were not parsed as expected because an essential deployment was in progress. To ensure that all events are properly parsed and recorded, we have initiated a replay process. While this action will rectify the parsing issues, it may result in some users seeing certain events in duplicate. Our team is actively monitoring the situation and the replay process to guarantee that all data is accurately processed. We will keep you updated on the progress and notify you upon the successful conclusion of the replay process.