Outage in Google Cloud

Vertex AI custom training jobs failing if using more than 2GB ephemeral storage

Resolved Minor
August 16, 2024 - Started 4 months ago - Lasted about 4 hours

Need to monitor Google Cloud outages?
Stay on top of outages with IsDown. Monitor the official status pages of all your vendors, SaaS, and tools, including Google Cloud, and never miss an outage again.
Start Free Trial

Outage Details

Summary: Vertex AI custom training jobs failing if using more than 2GB ephemeral storage Description: Mitigation work is currently underway by our engineering team. We do not have an ETA for mitigation at this point. We will provide more information by Friday, 2024-08-16 17:00 US/Pacific. Diagnosis: Custom Vertex AI training jobs running on GKE and using more than 2GB of ephemeral storage may fail with the error ""Pod ephemeral local storage usage exceeds the total limit of containers 2Gi." Workaround: None at this time.

Be the first to know when Google Cloud and other third-party services go down

With IsDown, you can monitor all your critical services' official status pages from one centralized dashboard and receive instant alerts the moment an outage is detected. Say goodbye to constantly checking multiple sites for updates and stay ahead of outages with IsDown.

Start free trial

No credit card required · Cancel anytime · 3278 services available

Integrations with Slack Microsoft Teams Google Chat Datadog PagerDuty Zapier Discord Webhook

Setup in 5 minutes or less

How much time you'll save your team, by having the outages information close to them?

14-day free trial · No credit card required · Cancel anytime