Outage in Aqua Cloud

CyberCenter Service Disruption

Resolved Major
Started August 21, 2024


Latest Updates (most recent first)
RESOLVED - 08/27/2024 06:28 PM

On August 21, 2024, at approximately 8:30 PM UTC, our Container Image Scanning service suffered a major disruption when a Lambda function exceeded its ephemeral storage limit. The function, which downloads and extracts a critical database used for image scanning, was configured with 3 GB of ephemeral storage. The extracted database (2.5 GB) and the zip archive it was extracted from (500 MB) together exhausted that storage, causing the function to panic.
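The storage arithmetic behind this failure can be sketched in a few lines. This is only an illustration: the function names are hypothetical, and it assumes binary units (1 GB = 1024 MB) and that the zip archive stays on disk while the database is extracted, neither of which the report states explicitly.

```python
MB_PER_GB = 1024  # assumption: binary units; the report does not say which it uses


def peak_ephemeral_usage_mb(archive_mb: float, extracted_mb: float) -> float:
    """Peak scratch-space usage if the archive is kept on disk during extraction."""
    return archive_mb + extracted_mb


def headroom_mb(limit_mb: float, archive_mb: float, extracted_mb: float) -> float:
    """Free space left at peak usage; zero or negative means the limit is hit."""
    return limit_mb - peak_ephemeral_usage_mb(archive_mb, extracted_mb)


# Figures taken from the incident report.
limit_mb = 3 * MB_PER_GB        # 3 GB configured ephemeral storage
archive_mb = 500                # zip archive
extracted_mb = 2.5 * MB_PER_GB  # extracted database

print(headroom_mb(limit_mb, archive_mb, extracted_mb))  # 12.0
```

Under these assumptions only about 12 MB remains at peak, so any temporary file or modest growth in the database is enough to exhaust the limit. With decimal units the headroom would be exactly zero.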

The result was a service outage that affected container image scanning. Monitoring was in place for various components, but no alert was based specifically on Lambda panics, which delayed detection and remediation.
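The kind of check that was missing could look roughly like the sketch below, which scans log lines for panic markers and decides whether to alert. The pattern, the threshold, and the sample log lines are all assumptions made for illustration; the report does not describe the function's actual log format or the alerting system in use.

```python
import re
from typing import Iterable

# Assumed markers: a panic message or the usual out-of-disk error text.
PANIC_PATTERN = re.compile(r"\bpanic\b|no space left on device", re.IGNORECASE)


def count_panics(log_lines: Iterable[str]) -> int:
    """Count log lines that contain at least one panic marker."""
    return sum(1 for line in log_lines if PANIC_PATTERN.search(line))


def should_alert(log_lines: Iterable[str], threshold: int = 1) -> bool:
    """Alert once the number of matching lines reaches the threshold."""
    return count_panics(log_lines) >= threshold


# Hypothetical log lines, not taken from the incident.
sample_logs = [
    "INFO downloading scan database archive",
    "INFO extracting archive to /tmp",
    "panic: write /tmp/db: no space left on device",
]

print(should_alert(sample_logs))  # True
```

In practice the same pattern would more likely be expressed as a log-based metric and alarm in the monitoring platform than as application code.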

The Aqua Fields team promptly identified the issue and engaged the on-call channel. However, the US team was unavailable and it was late at night in India, which slowed the response. The India team resolved the incident at 10:52 PM UTC on August 21 by increasing the function's ephemeral storage.
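As a rough sketch of how such a fix might be sized, the helper below adds a safety margin to the peak usage and builds the arguments for boto3's `update_function_configuration` call, which accepts an `EphemeralStorage` size in MB between 512 and 10240. The 25% margin, the helper names, and the function name `scan-db-loader` are hypothetical; the report does not say what new size was chosen or how the change was applied.

```python
import math

LAMBDA_MIN_STORAGE_MB = 512     # AWS Lambda ephemeral storage lower bound
LAMBDA_MAX_STORAGE_MB = 10240   # AWS Lambda ephemeral storage upper bound


def recommended_storage_mb(archive_mb: float, extracted_mb: float,
                           margin: float = 0.25) -> int:
    """Peak usage (archive plus extracted data) with a safety margin, rounded up."""
    return math.ceil((archive_mb + extracted_mb) * (1 + margin))


def storage_update_kwargs(function_name: str, size_mb: int) -> dict:
    """Build keyword arguments for boto3's update_function_configuration."""
    if not LAMBDA_MIN_STORAGE_MB <= size_mb <= LAMBDA_MAX_STORAGE_MB:
        raise ValueError(f"size {size_mb} MB is outside Lambda's allowed range")
    return {"FunctionName": function_name, "EphemeralStorage": {"Size": size_mb}}


size = recommended_storage_mb(archive_mb=500, extracted_mb=2560)
kwargs = storage_update_kwargs("scan-db-loader", size)  # hypothetical function name
print(kwargs)
# To apply: boto3.client("lambda").update_function_configuration(**kwargs)
```

Keeping the sizing logic separate from the AWS call makes it easy to run the same calculation in a deployment check, so the limit is revisited whenever the database grows.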

We apologize for any inconvenience this disruption caused. To prevent similar incidents, we are improving our monitoring and alerting and implementing automated remediation where possible.
