Outage in Ravelin

Elevated BigTable Error Rate

Resolved Minor
April 11, 2021 - Started about 3 years ago - Lasted 12 months

Outage Details

There was an elevated error rate on the API, causing a spike in 500s between 19:34 and 19:38 BST, and another spike is beginning now. This appears to correlate with a spike in BigTable CPU usage, which we are investigating.
Latest Updates (most recent first)
MONITORING - 04/11/2021 07:12PM

Normal operation has resumed.

Additional BigTable load was generated by the queue retry behaviour of a data replay that has been running today. Ravelin uses data replays to patch and upload data, and they are a part of normal operation. The data requests are batched per customer for ordering purposes. This evening we encountered a batch large enough that the queue handler timed out, because we do not ping the message queue mid-batch. A timeout automatically puts the entire batch back on the queue to be processed again. We believe the BigTable CPU spike and the subsequent API response errors were a result of this retry loop.

We will resume the replay once there is enough capacity for it to succeed. Tomorrow we will investigate pinging the message queue to avoid timing out mid-batch, and only re-queuing the parts of a batch yet to be processed.
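The two proposed changes (pinging the queue mid-batch and re-queuing only the unprocessed part of a batch) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not Ravelin's implementation: `Lease`, `process_batch`, the simulated clock, and the timeout and cost values are all hypothetical names and numbers chosen for illustration, and no real message-queue or BigTable API is used.

```python
class Lease:
    """Hypothetical stand-in for a queue message lease with a deadline."""

    def __init__(self, timeout, now=0.0):
        self.timeout = timeout
        self.now = now                  # simulated clock, in seconds
        self.deadline = now + timeout

    def advance(self, seconds):
        # Pretend that `seconds` of processing time has just elapsed.
        self.now += seconds

    def ping(self):
        # Heartbeat: extend the deadline from the current time.
        self.deadline = self.now + self.timeout

    def expired(self):
        return self.now > self.deadline


def process_batch(items, lease, cost_per_item, ping_mid_batch):
    """Process items in order and return (done, remaining).

    If the lease expires part-way through, stop and return the
    unprocessed tail, so only that tail needs to be re-queued
    instead of the whole batch.
    """
    done = []
    for index, item in enumerate(items):
        if lease.expired():
            return done, items[index:]
        lease.advance(cost_per_item)    # simulate the work for one item
        done.append(item)
        if ping_mid_batch:
            lease.ping()
    return done, []


batch = list(range(10))

# Without heartbeats, the lease runs out before the batch finishes.
done, remaining = process_batch(batch, Lease(timeout=5), 1, ping_mid_batch=False)
print(len(done), "processed,", len(remaining), "to re-queue")

# With a heartbeat after every item, the same batch completes in one pass.
done, remaining = process_batch(batch, Lease(timeout=5), 1, ping_mid_batch=True)
print(len(done), "processed,", len(remaining), "to re-queue")
```

In this sketch, returning only the unprocessed tail preserves per-customer ordering while avoiding repeated work on items that already succeeded. Re-queuing the whole batch, as described in the incident, would instead repeat the same writes on every retry.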

INVESTIGATING - 04/11/2021 06:47PM

Following another small spike between 19:41 and 19:43 BST, we are investigating the cause of the BigTable CPU spikes that correlate with the times these errors occurred. Scoring times remain elevated. We have paused background data cleaning operations to reduce load.

MONITORING - 04/11/2021 06:41PM

There was an elevated error rate on the API, causing a spike in 500s between 19:34 and 19:38 BST, and another spike is beginning now. This appears to correlate with a spike in BigTable CPU usage, which we are investigating.
