
Outage in Crusoe Energy

VM Creation Failure for A100 Infiniband Type VMs in us-east1-a Region

Resolved Minor
October 02, 2025 - Started 7 months ago - Lasted about 17 hours

Incident Report

We have identified an issue that is preventing new or restarted Virtual Machines from booting successfully on our A100 Infiniband hardware fleet.

Any new VM provisioning request for this hardware type will also fail. Additionally, any existing VM on an A100 Infiniband node that is stopped and started (or rebooted) will also fail to come back online.

Existing, currently running VMs are not affected and will continue to operate normally. We advise customers to avoid rebooting critical workloads on this hardware until a resolution is in place.

Our engineering teams are actively investigating the root cause and are working to restore normal provisioning operations as quickly as possible.

Latest Updates (most recent first)
RESOLVED 7 months ago - at 10/02/2025 08:50PM

This incident is now resolved.

MONITORING 7 months ago - at 10/02/2025 08:20PM

A fix has been implemented, and we are monitoring the environment.

IDENTIFIED 7 months ago - at 10/02/2025 04:00AM

