PostHog experienced analytics query timeouts and failures that affected dashboards, insights, and other query-driven views, making them slow or unavailable. Data ingestion was also impacted, causing delays in query results. The incident lasted 2.2 hours before being resolved.
We're continuing to monitor the occasional spikes in query timeouts and errors. Recovery from the spikes has been happening more quickly.
We're still seeing occasional spikes in query timeouts and errors, but recovering much more quickly. We're continuing to monitor and investigate.
We've reopened this incident after seeing the spikes in query timeouts/failures return. Apologies for the ongoing troubles. We are investigating again.
We've resolved the query timeouts/failures incident. Cluster stability has improved and recent configuration changes to table profiles have been applied. Queries and the app are operating normally, with no backlog observed in distribution queues after the updates.
We've seen some smaller spikes in failed queries, so we'll keep monitoring closely for a while before we call this one resolved.
Load has been looking much better, but we are still monitoring for a while before we mark this one as resolved. Thanks very much for your patience.
We are still monitoring the cluster to ensure stability. The failure rate for insight queries has dropped, and customer-facing query errors have decreased, but we are continuing to watch for further issues.
We have resumed event ingestion and are monitoring load on the cluster.
We fixed the root cause and are monitoring the load on our ClickHouse cluster.
We have identified the issue and are working on the fix.
Unfortunately, the issue has recurred and we have begun a second investigation.
We’ve identified an issue affecting analytics queries (timeouts and/or failures). Dashboards, insights, and other query-driven views may be slow or unavailable.
Ingestion is also impacted, so data delays are expected when issuing queries.
The load spike has been resolved and systems are operating normally.
We’ve identified an issue affecting analytics queries (timeouts and/or failures). Dashboards, insights, and other query-driven views may be slow or unavailable.
Ingestion is also impacted, so data delays are expected when issuing queries.