Outage in Georgia Tech IT

Degraded Performance on Phoenix Project storage

Resolved - Minor
March 31, 2025 - Started 17 days ago - Lasted about 7 hours

Outage Details

Summary: Performance of Phoenix project storage is currently degraded.

Details: Two of our MDS (MetaData Servers) rebooted early Monday morning, March 31, and load averages are unusually high on one of them.

Impact: Researchers may experience significant slowness in read and write performance on Phoenix project storage until we are able to mitigate the issue. Conda environments located in project storage may be very slow to load or may fail to activate, even if the Python script being run is located elsewhere. Attempts to view project storage files via the OnDemand web portal may time out.
Components affected
Georgia Tech IT Academic Services
Latest Updates (sorted recent to last)
17 days ago - at 03/31/2025 01:13PM

Summary: Performance of Phoenix project storage is currently degraded.

Details: Two of our MDS (MetaData Servers) rebooted early Monday morning, March 31, and load averages are unusually high on one of them.

Impact: Researchers may experience significant slowness in read and write performance on Phoenix project storage until we are able to mitigate the issue. Conda environments located in project storage may be very slow to load or may fail to activate, even if the Python script being run is located elsewhere. Attempts to view project storage files via the OnDemand web portal may time out.

17 days ago - at 03/31/2025 04:38PM

To expedite troubleshooting of the current Phoenix project filesystem issues and limit their impact on currently running jobs, we have implemented the following changes:

New Logins to Phoenix Login Nodes are Paused
We have prevented new login attempts to the Phoenix login nodes. Users that are currently logged in will be able to stay logged onto the system.

Phoenix Jobs Prevented from Starting
Queued jobs that have not yet started have been paused so that they do not start. These submitted jobs will remain in the queue.
Jobs that are currently running may experience decreased performance if using project storage. We are doing our best to prioritize the successful completion of these jobs.

Open OnDemand (OOD)
Users of Phoenix OOD can log in and interact with only their home directory. Project and scratch space are not available.
Some users of Open OnDemand may be unable to reach this service and may see "Proxy Error" messages. We are investigating the root cause of this issue.

Globus File Transfer Paused for Project Space
File transfers to/from project storage on Globus have been paused. Other Globus transfers (Box, Dropbox, and OneDrive cloud connectors; scratch; home; and CEDAR) will continue.

The PACE team is working as quickly as possible to diagnose the current issues, with support from our filesystem vendor, and we aim to resume normal operation of the Phoenix cluster as soon as we can. We will continue to share updates as we have them, and we apologize for this unexpected service outage.
