We are investigating internal error failures
Incident Report — Infrastructure Upgrade
We began upgrading our infrastructure to introduce a new Kubernetes-based worker architecture.
During deployment, the system entered an unexpected intermediate state where parts of the platform were running under the new architecture while others were still relying on the previous worker model. This caused instability in flow execution.
Because the system was partially migrated, there was no safe recovery path from this state. To restore stability, we reverted to the previous worker setup and redeployed the workers using our earlier deployment system (Kamal).
After the rollback and redeployment, workers resumed normal processing and the system returned to stable operation.
Impact: During this period, some flows experienced failures, delays, or retries, and workflow execution was temporarily unstable.
Our workers are catching up with the queued jobs.
We are monitoring errors that are showing up again.
We are up and running again.
We are currently experiencing issues with our workers after a recent refactor to the worker architecture.