We increased the replica count in production PODs, but the issue persists despite our mitigation efforts. The team is attempting to replicate the issue on a test POD and will look to restart the impacted PODs.
We’ll provide an update in 30 minutes or sooner if additional information becomes available.
Restarting the PODs did not reduce the impact on customers. However, the team succeeded in replicating the issue in test PODs and is now extracting debug logs to understand the cause of these intermittent timeouts.
We’ll provide an update in 60 minutes or sooner if additional information becomes available.
The team successfully replicated the issue on the test POD; after the test POD was restarted, the issue could no longer be replicated. The team is actively analyzing the captured logs for anomalies or patterns to determine the cause of the intermittent timeouts.
We’ll provide an update in 60 minutes or sooner if additional information becomes available.
The team is analyzing the captured logs to understand the intermittent timeouts. Restarts of the test PODs have been attempted; however, they provide only temporary relief. The team is continuing to investigate.
We’ll provide an update in 60 minutes or sooner if additional information becomes available.
The team continues to investigate and is restarting all production PODs while also conducting an in-depth debugging analysis on one of the production PODs and developing a hotfix. The hotfix will take time to develop and test.
These activities are expected to take a few hours, and the next update will be shared once meaningful progress is made.
We have finished developing the hotfix and are proceeding to validate its suitability. Additionally, we have identified a recent change as a potential trigger and are preparing to roll this back on a subset of instances.
Once both options are validated, we will apply the most suitable one to all instances intermittently experiencing CDP query timeouts. We’ll provide an update when either validation has completed.
Rolling back the recent change is no longer considered a viable remediation strategy, as it introduced an additional issue. Instead, the fix-forward solution has been validated and deployed to all impacted PODs. We will monitor the environments for a period to ensure its effectiveness.
We’ll provide another update as additional information becomes available.