This incident has been resolved.
Azure confirmed that the fix has been successfully deployed to East US 2 region. We are monitoring the results.
We have identified the root cause of the autoscaling issue affecting certain deployments on the Shared Azure Cluster in the US-East2 region. The issue is related to underlying infrastructure behaviour within the cloud provider environment.
Azure has acknowledged the issue and is currently rolling out a hotfix across regions. We are actively monitoring the rollout and will provide further updates as they become available.
We are currently investigating an issue affecting Airflow worker autoscaling for some deployments hosted on the Shared Azure Cluster in the US-East2 region.
As a result of this issue, worker pods may not scale up as expected in response to workload demand. This can lead to tasks remaining in a queued state longer than usual and, in some cases, failing due to queued timeouts.
Our engineering team is actively working to identify the root cause and restore normal autoscaling behaviour. We will provide further updates as more information becomes available.
Impact:
• Affected deployments may experience delayed task execution.
• Worker pods may not scale up from zero or may not scale as expected under load.
We will continue to share updates on this page as we make progress toward resolution.
With IsDown, you can monitor all your critical services' official status pages from one centralized dashboard and receive instant alerts the moment an outage is detected. Say goodbye to constantly checking multiple sites for updates and stay ahead of outages with IsDown.
Start free trialNo credit card required · Cancel anytime · 6020 services available
Integrations with