Outage in AWS

Increased Error Rates and Latencies - N. Virginia

Resolved Minor

October 28, 2025 - Started 9 months ago - Lasted about 7 hours

Incident Report

Earlier today some EC2 launches within the use1-az2 Availability Zone (AZ) experienced increased latencies for EC2 instance launches. We communicated with affected customers via the AWS Personal Health Dashboard shortly after the issue began. This issue has been resolved and EC2 instance launches are operating normally, however some request throttles are currently in place for the use1-az2 Availability Zone (AZ), which are gradually being removed. Customers may experience “request limit exceeded” in this AZ while these throttles are in place; retries should resolve the issue.

Currently we are investigating task launch failure rates for ECS tasks for both EC2 and Fargate for a subset of customers in the US-EAST-1 Region. Customers may also see their container instances disconnect from ECS which can cause tasks to stop in some circumstances. ECS operates cells in the Region and a small number of these cells are currently experiencing elevated error rates launching new tasks and existing tasks may stop unexpectedly. When creating an ECS cluster to run tasks, the cluster is assigned to a specific cell. Customers with a cluster in impacted cells are seeing impact across all Availability Zones in the Region. At this time, we recommend customers who can, create new clusters to ensure that the cluster is assigned to a healthy cell. Existing clusters in the remaining healthy cells are not affected. We have identified actions to restore the impacted cells to full service but do not have an estimated time of recovery. Customers who use EMR Serverless are also affected by this issue. We will provide an update by 4:15 PM PDT or as soon as more information becomes available.

Components affected

AWS App Runner (us-east-1)
Amazon Managed Workflows for Apache Airflow (us-east-1)
AWS Glue (us-east-1)
Amazon EC2 (us-east-1)
Amazon ECS (us-east-1)
Amazon EMR Serverless (us-east-1)
AWS Batch (us-east-1)
AWS CodeBuild (us-east-1)
AWS DataSync (us-east-1)
AWS Fargate (us-east-1)
Multiple services (us-east-1)
Amazon EKS (us-east-1)

Latest Updates (most recent first)

UPDATE 9 months ago - at 10/29/2025 04:52AM

We are observing significant recovery for services impacted by ECS. For ECS itself, we have recovered two of the three impacted cells and continue to work towards recovering the remaining cell. We have lifted throttles for the two recovered cells but throttling remains in effect for the third cell. The vast majority of customer applications should be recovered. We will continue to provide updates as we have additional information available, or by 11:00 PM.

UPDATE 9 months ago - at 10/29/2025 03:54AM

We are seeing significant signs of recovery and continue to work toward full resolution.

UPDATE 9 months ago - at 10/29/2025 03:08AM

We have made additional progress. For EMR Serverless, we have completed refreshing the warm pool with healthy clusters. We recommend customers restart their existing applications. We are also observing ECS task launches are beginning to succeed. While we work toward full resolution of the underlying issue, some requests will be throttled. Our current best estimate of an ETA to full recovery is an additional 1-2 hours away. As we make additional progress, success rates for affected operations are continuing to improve. We will continue to provide updates as we have additional information available, or by 9:05 PM.

UPDATE 9 months ago - at 10/29/2025 01:50AM

We continue to make additional progress towards ongoing mitigation efforts. While we have not fully recovered, we can confirm we are seeing positive signs of improvement for ECS cluster task launches on the impacted cells in the US-EAST-1 Region. For customers who need immediate recovery, we recommend recreating impacted ECS clusters using a different identifier for clusterName. We continue to work toward full recovery. For EKS, impact is limited to Fargate launches only.
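The clusterName recommendation above can be sketched as follows. This is a minimal illustration, not AWS's procedure: `replacement_cluster_name` and `recreate_cluster` are hypothetical helper names, the timestamp suffix is an arbitrary choice, and `ecs_client` is assumed to be a boto3 ECS client (`boto3.client("ecs")`), whose `create_cluster(clusterName=...)` call is the only AWS API used. Which cell the new cluster lands on is not something the caller controls.

```python
import time


def replacement_cluster_name(old_name, suffix=None):
    """Derive a different clusterName for the replacement cluster (hypothetical helper)."""
    suffix = suffix or time.strftime("%Y%m%d%H%M%S", time.gmtime())
    # ECS cluster names are limited to 255 characters, so trim the old name
    # rather than the suffix that makes the new name distinct.
    return f"{old_name[:255 - len(suffix) - 1]}-{suffix}"


def recreate_cluster(ecs_client, old_name, suffix=None):
    """Create a new, empty cluster under a different name and return that name.

    Assumption: ecs_client is a boto3 ECS client. Services, task definitions
    in use, and capacity are NOT migrated here; that remains a separate step.
    """
    new_name = replacement_cluster_name(old_name, suffix)
    response = ecs_client.create_cluster(clusterName=new_name)
    return response["cluster"]["clusterName"]
```

Passing the client in as a parameter keeps the sketch testable without AWS credentials. Anything that refers to the old cluster by name (deploy pipelines, service definitions, alarms) would also need to point at the new name.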

For MWAA environments that are stuck in an unhealthy state, or that are impaired, we recommend customers perform an update to the environment without changing the current configuration.
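One way to script the MWAA recommendation above is sketched below, under stated assumptions: `refresh_mwaa_environment` is a hypothetical helper, `mwaa_client` is assumed to be a boto3 MWAA client (`boto3.client("mwaa")`), and the decision to skip environments that are already mid-transition is an assumption of this sketch, not part of the AWS guidance. The update passes only `Name`, so no configuration value is changed.

```python
# Assumption: an update is not attempted while the environment is already
# changing state, since it would likely be rejected.
TRANSITIONAL_STATUSES = {"CREATING", "UPDATING", "DELETING"}


def refresh_mwaa_environment(mwaa_client, name):
    """Submit a no-change update for one environment (hypothetical helper).

    Returns (status_before, update_submitted).
    """
    status = mwaa_client.get_environment(Name=name)["Environment"]["Status"]
    if status in TRANSITIONAL_STATUSES:
        return status, False
    # Only the environment name is sent, leaving the current configuration as-is.
    mwaa_client.update_environment(Name=name)
    return status, True
```

The returned status is the one observed before the update, which makes it easy to log which environments were actually touched.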

For EMR Serverless, we have made substantial progress to refresh the warm pool with healthy clusters and continue to work toward full recovery. Once we have fully refreshed these warm pools we will provide additional guidance for required action to mitigate impact.

Our current best estimate of an ETA to full recovery is an additional 2-4 hours away. As we make additional progress, success rates for affected operations will improve. We will continue to provide updates as we have additional information available, or by 7:45 PM.

UPDATE 9 months ago - at 10/29/2025 12:31AM

We want to provide an update on EMR Serverless. EMR Serverless maintains a warm pool of ECS clusters to support customer requests, and some of these clusters are operating in the impacted ECS cells. In order to reduce EMR Serverless error rates, we are actively working on refreshing these warm pools with healthy clusters. For ECS, we continue to make progress on recovering impacted ECS cells, but progress is not visible externally. ECS has stopped new launches and tasks on the affected clusters. Some services (such as Glue) are observing recovery for error rates, but may still be experiencing increased latency. Our current best estimate of an ETA is 2-3 hours away. As we make additional progress, success rates for affected operations will improve. We will continue to provide updates as we have additional information available, or by 6:30 PM.

UPDATE 9 months ago - at 10/28/2025 11:31PM

For EMR Serverless, some jobs continue to experience increased execution delays or failures. For EC2, we continue to throttle some requests (new instance launches and other networking related mutating API calls) in a single Availability Zone (use1-az2) in the US-EAST-1 Region. These throttles will remain until we have fully mitigated all issues, and have a high degree of confidence that the issue will not reoccur. Existing instances are unaffected by this issue. For ECS, we are continuing to make progress toward recovering the impacted ECS cells, but this has not yet resulted in customer visible improvements. Customers that are experiencing task launch errors and latencies in the impacted ECS cells are not yet observing improvement. While we do not have a firm ETA, we expect full recovery is 2-3 hours away. As we make additional progress, success rates for affected operations will improve. We will continue to provide updates as we have additional information available, or by 5:30 PM.

UPDATE 9 months ago - at 10/28/2025 10:36PM

Earlier today some EC2 launches within the use1-az2 Availability Zone (AZ) experienced increased latencies for EC2 instance launches. We communicated with affected customers via the AWS Personal Health Dashboard shortly after the issue began. This issue has been resolved and EC2 instance launches are operating normally, however some request throttles are currently in place for the use1-az2 Availability Zone (AZ), which are gradually being removed. Customers may experience “request limit exceeded” in this AZ while these throttles are in place; retries should resolve the issue.
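The advice that retries should resolve "request limit exceeded" errors can be illustrated with a small backoff wrapper. This is a sketch under assumptions rather than an AWS-provided recipe: `call_with_backoff` and `is_request_limit_exceeded` are hypothetical helpers, the delay values are arbitrary, and the error check assumes botocore-style exceptions that carry a `response["Error"]["Code"]` field with the EC2 code `RequestLimitExceeded`.

```python
import random
import time


def call_with_backoff(operation, is_throttled, max_attempts=5,
                      base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Call operation(), retrying throttling errors with jittered exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Re-raise anything that is not a throttle, and the last failed attempt.
            if not is_throttled(exc) or attempt == max_attempts:
                raise
            # Full jitter: wait a random time up to the capped exponential delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))


def is_request_limit_exceeded(exc):
    """True for botocore-style errors whose code is RequestLimitExceeded (assumed shape)."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "RequestLimitExceeded"
```

With boto3, the wrapped operation might be something like `lambda: ec2.run_instances(...)` with the caller's own arguments. The AWS SDKs also ship built-in retry modes that may already cover this case, so a hand-rolled wrapper is mainly useful when finer control over the delays is needed.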

Currently we are investigating task launch failure rates for ECS tasks for both EC2 and Fargate for a subset of customers in the US-EAST-1 Region. Customers may also see their container instances disconnect from ECS which can cause tasks to stop in some circumstances. ECS operates cells in the Region and a small number of these cells are currently experiencing elevated error rates launching new tasks and existing tasks may stop unexpectedly. When creating an ECS cluster to run tasks, the cluster is assigned to a specific cell. Customers with a cluster in impacted cells are seeing impact across all Availability Zones in the Region. At this time, we recommend customers who can, create new clusters to ensure that the cluster is assigned to a healthy cell. Existing clusters in the remaining healthy cells are not affected. We have identified actions to restore the impacted cells to full service but do not have an estimated time of recovery. Customers who use EMR Serverless are also affected by this issue. We will provide an update by 4:15 PM PDT or as soon as more information becomes available.
