Outage in Crusoe Energy

VM creation and networking failure for A100 Infiniband type VMs in us-east region

Status: Resolved · Impact: Major
July 31, 2025 - Started 9 months ago - Lasted about 3 hours

Incident Report

We have identified an issue that is preventing new or restarted Virtual Machines from booting successfully on our A100 Infiniband hardware fleet. Any new VM provisioning request for this hardware type will also fail. Additionally, any existing VM on an A100 Infiniband node that is stopped and started (or rebooted) will also fail to come back online. Existing, currently running VMs are not affected and will continue to operate normally. We advise customers to avoid rebooting critical workloads on this hardware until a resolution is in place. Our engineering teams are actively investigating the root cause and are working to restore normal provisioning operations as quickly as possible.

Latest Updates (most recent first)
RESOLVED 9 months ago - at 08/01/2025 06:21AM

This incident is now resolved.

MONITORING 9 months ago - at 08/01/2025 05:06AM

A fix has been implemented, and we are monitoring the environment.

IDENTIFIED 9 months ago - at 08/01/2025 03:44AM

The issue has been identified, and we have tested a fix internally. We are working on rolling out the fix to our A100 Infiniband type servers now.

INVESTIGATING 9 months ago - at 08/01/2025 03:18AM

We have identified an issue that is preventing new or restarted Virtual Machines from booting successfully on our A100 Infiniband hardware fleet.

Any new VM provisioning request for this hardware type will also fail. Additionally, any existing VM on an A100 Infiniband node that is stopped and started (or rebooted) will also fail to come back online.

Existing, currently running VMs are not affected and will continue to operate normally. We advise customers to avoid rebooting critical workloads on this hardware until a resolution is in place.

Our engineering teams are actively investigating the root cause and are working to restore normal provisioning operations as quickly as possible.
