Outage in Adeptcore

Storage Performance for PHX Cluster 100

Resolved Minor

October 01, 2024 - Started over 1 year ago - Lasted 1 day

Incident Report

We received reports this morning of some virtual machines experiencing degraded performance. During our investigation we found that one of our Nimble SAN clusters was experiencing higher than expected latency, and we have contacted datacenter storage engineers to investigate.

Latency remains higher than expected but is trending downwards. Adeptcloud support is monitoring the status of virtual machines and migrating clusters when necessary.

Latency has dropped by about 50% in the last hour and is trending back to normal with occasional spikes. We are continuing to monitor the situation and will post an update once the issue is fully resolved or as soon as we learn anything new.

Components affected

Adeptcore ACP - Storage


Latest Updates (most recent first)

RESOLVED over 1 year ago - at 10/02/2024 09:30PM

We have continued to monitor the performance and latency of all storage arrays throughout the day today and have not noticed any other issues.

We have also received confirmation from the datacenter storage team that a component failure caused the increased latency and degraded performance of the storage array yesterday. The storage array rebalanced its cache, and some of the workload was offloaded from the array to ensure proper operation until the component could be replaced.

We consider this issue resolved and do not anticipate any further performance-related issues. We are therefore closing this incident, and we encourage anyone who is still experiencing issues to contact us directly.

Thank you for your patience and understanding.

IDENTIFIED over 1 year ago - at 10/02/2024 01:56PM

We continued monitoring the latency and performance of the affected storage array throughout the evening yesterday. The latency levels returned to normal at 6:48 PM yesterday and have remained normal throughout the night and into the morning.

We are awaiting additional information from the datacenter storage team about what caused the latency issues yesterday and will update this incident once we receive it.

The affected storage array is not currently experiencing any latency or performance issues. We will continue to monitor these statistics and the array's overall performance until we are confident the issue has been completely resolved and the datacenter storage team has confirmed this.

Additional updates to this incident will be provided as soon as we have any additional information.

IDENTIFIED over 1 year ago - at 10/01/2024 09:23PM

The cache is continuing to rebuild on the affected storage array. We are also continuing to work with the datacenter storage engineers and HP's enterprise team to ensure the rebuild completes successfully.

We will continue to monitor the affected storage array throughout the evening to ensure smooth operation and decreased latency.

We are also in the process of moving IOPS off the affected storage array to minimize the disruption some virtual machines have been experiencing.

We apologize for any inconvenience caused by this disruption.

This incident will be updated as soon as we have any additional information to share.

IDENTIFIED over 1 year ago - at 10/01/2024 06:50PM

The datacenter storage team has confirmed that the SSD caching on this storage array experienced an unexpected failure and that the cache is actively being rebuilt on the array.

We are monitoring the array and continue to see overall latency trending downward.

We are awaiting confirmation from the datacenter team once the cache has been successfully rebuilt.

As it currently stands, latency and overall performance on the cluster have improved but have not yet returned to expected levels.

While the cache rebuild is ongoing, we will likely continue to see occasional latency spikes, which may affect performance on a small percentage of virtual machines, but the overall average latency is slowly returning to normal.

We will continue to work with the datacenter storage team on this ongoing issue and will update this incident once we receive additional information.

IDENTIFIED over 1 year ago - at 10/01/2024 02:43PM

We received reports this morning of some virtual machines experiencing degraded performance. During our investigation we found that one of our Nimble SAN clusters was experiencing higher than expected latency, and we have contacted datacenter storage engineers to investigate.

Latency remains higher than expected but is trending downwards. Adeptcloud support is monitoring the status of virtual machines and migrating clusters when necessary.

Latency has dropped by about 50% in the last hour and is trending back to normal with occasional spikes. We are continuing to monitor the situation and will post an update once the issue is fully resolved or as soon as we learn anything new.

