Outage in AWS
Increased error rates
Resolved
Minor
August 07, 2021 - Started over 3 years ago
- Lasted 8 months
Outage Details
6:08 PM PDT We are investigating increased error rates and latencies for DynamoDB in the US-EAST-1 Region.
6:32 PM PDT We can confirm increased error rates for DynamoDB in the US-EAST-1 Region. We have identified the root cause of the issue and are working towards resolution.
7:03 PM PDT We have seen some improvement to the error rates for DynamoDB in the US-EAST-1 Region and continue to work towards full resolution. For customers that are experiencing 503 errors, retries may resolve the issue in some cases. In other cases, recreating the connection to DynamoDB may address the error rates. We continue to take steps towards full resolution for all affected tables.
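The retry-and-reconnect advice in the 7:03 PM update can be sketched roughly as follows. This is a minimal illustration, not AWS's recommended implementation: `ServiceUnavailableError`, `call_with_retries`, and the `reconnect` callback are hypothetical names standing in for whatever exception and client-recreation step your own code uses, and the backoff parameters are arbitrary. AWS SDKs generally ship with their own retry behavior, so a wrapper like this is only relevant where that is not enough.

```python
import random
import time


class ServiceUnavailableError(Exception):
    """Hypothetical stand-in for a 503 response from the service."""


def call_with_retries(operation, reconnect=None, max_attempts=5,
                      base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call operation(); on ServiceUnavailableError, back off and retry.

    reconnect, if given, is invoked before each retry so the caller can
    recreate its connection or client (assumed to be cheap and safe).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ServiceUnavailableError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            if reconnect is not None:
                reconnect()  # rebuild the connection before trying again
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, delay))
```

The jitter spreads retries from many clients over time, which helps avoid adding synchronized load to a service that is already degraded.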
8:11 PM PDT We continue to make progress in addressing the increased error rates for DynamoDB in the US-EAST-1 Region. The root cause of the issue is a problem with the metadata subsystem used by DynamoDB, where several nodes are in an unhealthy state. We continue to work towards restoring the health of these nodes. The issue affects a subset of DynamoDB tables that are associated with the unhealthy nodes in the metadata subsystem. For these tables, customers will experience increased error rates until we have resolved the issue. DynamoDB tables that are not associated with the affected metadata nodes are not affected by this issue. We continue to work towards full resolution.
9:00 PM PDT We continue to see an improvement in the error rates for affected DynamoDB tables in the US-EAST-1 Region. Since the start of the event, we have seen a 75% reduction in error rates and are now working on resolving the errors for the remaining DynamoDB tables.
10:20 PM PDT We have resolved the error rates for the majority of the affected DynamoDB tables. A small number of DynamoDB tables are still experiencing elevated error rates, and a small number of global secondary indexes are experiencing propagation delays. While all the nodes in the metadata store are now healthy, some are not yet able to process incoming requests, which we are working to resolve.
11:11 PM PDT We have now resolved the error rates affecting DynamoDB tables in the US-EAST-1 Region. A small number of DynamoDB tables continue to experience delayed propagation for global secondary indexes, but these are moving towards full recovery as well. We’re continuing to monitor the service, but customers should be seeing recovery for their DynamoDB tables at this stage.
Aug 7, 1:50 AM PDT We have now resolved the global secondary index propagation delays for most customers in the US-EAST-1 Region. Our mitigation efforts are working as expected and we continue to work towards full recovery.
Aug 7, 2:53 AM PDT Between August 6 5:23 PM and August 7 2:48 AM PDT, DynamoDB customers experienced API errors and delayed propagation for global secondary indexes in the US-EAST-1 Region. The issue has been resolved and the service is operating normally.