
Outage in LiftIgniter

Issues with services in US East due to capacity issues with cloud provider

Resolved Minor
Started April 28, 2022

Incident Report

Due to capacity issues at our cloud provider (Google Cloud) in US East, some of our services experienced issues.

Our query endpoint (query.petametrics.com), which serves recommendations, saw its error rate (503 status) rise to about 1%. Error rates were nonzero between 18:00 and 18:04 UTC. We had already started provisioning alternate capacity before the error rate increased, but we still saw some errors because provisioning took a few minutes. Successful requests also saw increased latency from 17:51 to 18:11 UTC.

We also provisioned alternate capacity for a few other affected services; these services had a few minutes of downtime while the alternate capacity came online.

We benefited significantly from the preparation we did after the previous incident: http://status.liftigniter.com/incidents/1522vrjxbmcp
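To make the timeline easier to read, the two windows quoted in the report can be turned into durations. This is only a sketch of the arithmetic, not anything LiftIgniter published: the helper names (`utc`, `minutes`) are hypothetical, and the date is assumed to be April 28, 2022 for both windows, as stated in the page header.

```python
from datetime import datetime, timezone


def utc(hhmm: str) -> datetime:
    """Build a UTC timestamp on the incident date (assumed: April 28, 2022)."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2022, 4, 28, hour, minute, tzinfo=timezone.utc)


def minutes(start: datetime, end: datetime) -> int:
    """Whole minutes between two timestamps."""
    return int((end - start).total_seconds() // 60)


# Windows exactly as quoted in the incident report.
error_start, error_end = utc("18:00"), utc("18:04")
latency_start, latency_end = utc("17:51"), utc("18:11")

# Length of the window with nonzero 503 error rates.
error_minutes = minutes(error_start, error_end)

# Length of the window with elevated latency on successful requests.
latency_minutes = minutes(latency_start, latency_end)

# How long latency was elevated before the first errors appeared.
lead_minutes = minutes(latency_start, error_start)

print(error_minutes, latency_minutes, lead_minutes)
```

Read this way, the report describes a 4-minute error window inside a 20-minute latency window, with latency rising 9 minutes before errors began.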

Latest Updates (most recent first)
RESOLVED - 04/28/2022 09:33PM

Capacity is back to normal and all configurations have been returned to their defaults.

MONITORING - 04/28/2022 06:21PM

Due to capacity issues at our cloud provider (Google Cloud) in US East, some of our services experienced issues.

Our query endpoint (query.petametrics.com), which serves recommendations, saw its error rate (503 status) rise to about 1%. Error rates were nonzero between 18:00 and 18:04 UTC. We had already started provisioning alternate capacity before the error rate increased, but we still saw some errors because provisioning took a few minutes. Successful requests also saw increased latency from 17:51 to 18:11 UTC.

We also provisioned alternate capacity for a few other affected services; these services had a few minutes of downtime while the alternate capacity came online.

We benefited significantly from the preparation we did after the previous incident: http://status.liftigniter.com/incidents/1522vrjxbmcp
