Outage in AgResearch eRI

gpu-0 compute node drained

Resolved Major
September 14, 2025 - Started 8 months ago - Lasted 4 days

Incident Report

During last week's network issues, we discovered that gpu-0 had its own set of unrelated problems, so it has been drained. The node is suffering from frequent MLAG failovers on its bonded interface. We will be reseating and testing cables early this week.

Latest Updates (most recent first)
RESOLVED 7 months ago - at 09/19/2025 01:25AM

This incident has been resolved.

MONITORING 7 months ago - at 09/17/2025 04:21AM

We have made some network configuration changes on gpu-0, which have reduced the MLAG failover frequency. We will monitor for any regression. gpu-0 is now available again in Slurm.

INVESTIGATING 8 months ago - at 09/14/2025 10:10PM

During last week's network issues, we discovered that gpu-0 had its own set of unrelated problems, so it has been drained. The node is suffering from frequent MLAG failovers on its bonded interface. We will be reseating and testing cables early this week.
