
Outage in AgResearch eRI

Slow response on the login nodes and GPFS compute clients

Resolved Minor
October 19, 2025 - Started 6 months ago - Lasted 9 days

Incident Report

We are currently investigating this issue.

Latest Updates (most recent first)
RESOLVED 6 months ago - at 10/29/2025 02:32AM

All bare metal compute nodes have now had the network config change applied, and packet loss is no longer occurring. In addition, we have since identified a workload that was causing slow I/O response from the filesystem. This workload has been removed whilst we work to improve it.

MONITORING 6 months ago - at 10/22/2025 09:21PM

The network config change has now been applied to compute-2 and -4 successfully. Nix is now running better on these two nodes. We will continue to apply the same change to all the compute nodes as they become available.

MONITORING 6 months ago - at 10/22/2025 09:19PM

We are continuing to monitor for any further issues.

MONITORING 6 months ago - at 10/21/2025 11:30PM

The cluster and login nodes appear to be stable and performant now, although Nix may be slow on some compute nodes (2 and 4). We will be implementing a network config change on each compute node, in a rolling fashion to minimise the impact. This requires draining each node in the Slurm cluster, one at a time.
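As an illustration of the rolling approach described above, here is a minimal Python sketch of draining, changing, and resuming one node at a time. It is not the eRI team's actual tooling. It assumes the standard Slurm `scontrol` and `sinfo` commands are on the PATH; the function names, the `apply_change` callback, the reason string, and the polling interval are all hypothetical.

```python
import subprocess
import time
from typing import Callable, Iterable, List


def drain_cmd(node: str, reason: str) -> List[str]:
    # scontrol expects a Reason when a node is set to DRAIN.
    return ["scontrol", "update", f"NodeName={node}", "State=DRAIN", f"Reason={reason}"]


def resume_cmd(node: str) -> List[str]:
    return ["scontrol", "update", f"NodeName={node}", "State=RESUME"]


def node_state(node: str) -> str:
    # -h: no header, -n: restrict to this node, -o %T: node state in extended form.
    out = subprocess.run(
        ["sinfo", "-h", "-n", node, "-o", "%T"],
        check=True, capture_output=True, text=True,
    )
    lines = out.stdout.split()
    # A node in several partitions is listed once per partition; use the first entry.
    return lines[0] if lines else ""


def rolling_change(
    nodes: Iterable[str],
    apply_change: Callable[[str], None],  # hypothetical: applies the config change to one node
    run: Callable[[List[str]], object] = lambda argv: subprocess.run(argv, check=True),
    state: Callable[[str], str] = node_state,
    reason: str = "network config change",  # assumed reason text
    poll_seconds: float = 60.0,             # assumed polling interval
) -> None:
    for node in nodes:
        run(drain_cmd(node, reason))
        # "draining" means jobs are still running; wait until the node reports "drained".
        while not state(node).startswith("drained"):
            time.sleep(poll_seconds)
        apply_change(node)
        run(resume_cmd(node))
```

Because each node is resumed before the next is drained, at most one node is out of service at a time, which is what keeps the impact on running jobs small. The `run` and `state` parameters are injectable so the sequence can be dry-run without touching a real cluster.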

INVESTIGATING 6 months ago - at 10/20/2025 09:07PM

We are continuing to see slow response issues with Slurm and Nix, but they seem to be intermittent. Investigation continues.

MONITORING 6 months ago - at 10/20/2025 07:45PM

We made a network configuration change to a single node last night. The cluster has been stable overnight, with some load on it. We'll continue monitoring today as the load increases. We will make the same change to the other bare metal compute nodes, in a rolling fashion, as they become available.

INVESTIGATING 6 months ago - at 10/20/2025 03:23AM

We have found evidence of network packet loss again and are continuing to investigate.

INVESTIGATING 6 months ago - at 10/20/2025 02:22AM

We are currently investigating this issue.
