This incident has been resolved.
The root cause was a dataset desynchronising between the two backend environments after the Hadoop master failed. The Hadoop master contains a single point of failure (the job scheduler). The on-premise environment was still processing data, while the other environment was 24 to 36 hours behind.
We have rolled back to only using the on-premise environment and cleared the cached data. This should have mitigated the issue. We are monitoring the results.
This quarter, we are migrating the RIS/RIPEstat data to rented bare metal. Because we are in the final steps of this migration, some requests are routed to the new backend environment to evaluate the performance impact.
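As an illustration of routing a share of requests to a new backend, here is a minimal sketch of deterministic percentage-based routing. It is not taken from the RIPEstat codebase: the function name `pick_backend`, the labels `"new"` and `"on_premise"`, and the hashing scheme are all assumptions made for the example.

```python
import hashlib


def pick_backend(request_key: str, new_backend_percent: int) -> str:
    """Hypothetical helper: choose a backend for a request.

    Hashing the key keeps the choice stable, so the same key always
    lands on the same backend for a given percentage.
    """
    if not 0 <= new_backend_percent <= 100:
        raise ValueError("new_backend_percent must be between 0 and 100")
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    # Map the first four bytes of the hash onto a bucket in 0..99.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return "new" if bucket < new_backend_percent else "on_premise"
```

With a scheme like this, setting the percentage to 0 sends every request to the on-premise environment, which is one way a rollback of the kind described in this incident could be expressed.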
A key component of the new environment's Hadoop setup, the active Hadoop master node, failed today. This caused part of our backend cluster to become unresponsive and to return empty data for requests served by the new cluster.
Unfortunately, these empty results poisoned a cache that is shared by the whole application, causing the system to be fully unavailable for these datasets. We are working on a mitigation.
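To illustrate how a shared cache can be protected against this kind of poisoning, the sketch below refuses to store empty backend responses. It is a generic example, not the mitigation actually applied here: the `GuardedCache` class, the `is_usable` check, and the assumed response shape (a dict with a non-empty `"data"` field) are hypothetical.

```python
from typing import Callable, Dict, Optional


def is_usable(result: Optional[dict]) -> bool:
    """Assumed validity rule: a response must carry non-empty "data"."""
    return bool(result) and bool(result.get("data"))


class GuardedCache:
    """Hypothetical in-memory cache that never stores empty responses."""

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}

    def get_or_fetch(
        self, key: str, fetch: Callable[[], Optional[dict]]
    ) -> Optional[dict]:
        if key in self._store:
            return self._store[key]
        result = fetch()
        # Only usable responses are cached, so one failing backend
        # cannot poison later requests for the same key.
        if is_usable(result):
            self._store[key] = result
        return result

    def clear(self) -> None:
        """Drop all cached entries, for example after an incident."""
        self._store.clear()
```

The empty response is still returned to the caller in this sketch, so a single request can fail, but the next request for the same key goes back to the backend instead of being served the bad entry.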
We are investigating an issue with some RIPEstat datasets. As a user you may see "There was a problem handling this request. [...]" error messages.