Outage in VMware Workspace ONE

Horizon Cloud Service First-Gen: Increased Request Error Rate (East US 2)

Resolved Minor
February 28, 2024 - Started 3 months ago - Lasted about 17 hours

Outage Details

Horizon Cloud Service First-Gen is investigating increased error rates for requests made to the service. There may be limited or no service impact at this time.

Service reliability is a top priority at Horizon Cloud Service, and we are making continuous improvements to better our systems.

Additional Information and Support:
Please visit the Service Support website at Customer Connect (https://customerconnect.vmware.com/login) or call the Service Support Line at 1-877-4VMWARE.
For information on contacting VMware support, see How to file a Support Request in Customer Connect (https://kb.vmware.com/s/article/2006985).

Thanks for your patience.

Additional Information: VMware will provide additional updates on our Unified Status page (http://status.workspaceone.com) until this issue has been resolved.

Thank You,
VMware Horizon Communications Team

Latest Updates (sorted most recent first)
RESOLVED 3 months ago - at 02/29/2024 07:59AM

The issue is resolved. We apologize for the inconvenience and thank you for your patience and continued support.

The root cause was unexpected downtime of a Microsoft Azure control plane component in a single zone, followed by problems in the interactions between the Azure control plane components responsible for Service Management.

This impacted GET and PUT call operations within the zone. After the component recovered on its own, a retry storm developed between the Azure control plane services. The resulting call volume further increased load and operation latencies, which in turn lowered success rates and lengthened recovery.
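
A retry storm of the kind described above is commonly dampened on the client side with capped exponential backoff and random jitter. The sketch below illustrates that general technique only; it is not how Azure or Horizon Cloud Service handles retries, and the function names and default values (`base`, `cap`, `max_attempts`) are assumptions made for illustration.

```python
import random
import time


def backoff_ceiling(attempt, base=0.5, cap=30.0):
    """Upper bound in seconds for the wait before retry number `attempt` (0-based)."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(operation, max_attempts=5, base=0.5, cap=30.0,
                      sleep=time.sleep, rng=random):
    """Call `operation` until it succeeds or `max_attempts` is reached.

    Between attempts, wait a random time in [0, ceiling] ("full jitter") so that
    many clients failing together do not all retry at the same instant.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(rng.uniform(0.0, backoff_ceiling(attempt, base, cap)))
```

Because each wait is drawn at random below a doubling ceiling, retries from many callers spread out over time instead of arriving in synchronized waves, which is what keeps a recovering service from being pushed back into overload.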

Allocation failures also occurred in the other two Azure zones, because capacity demand rose as traffic and deployments migrated to them. Microsoft Azure resolved the service management issues in several steps that reduced call volumes and restored QoS on Service Management, and then ramped traffic back up in the impacted zone.

Microsoft Azure tested the system by gradually removing the throttles imposed while working on the root cause, which allowed the zone to be reopened for allocations. With the system healthy again, customers can retry their existing allocations and create new VMs.

MONITORING 3 months ago - at 02/29/2024 06:35AM

Microsoft Engineering has re-opened the zone, and it is ready to take operations.

Horizon Cloud Service Operations teams are continuing to monitor the availability of the service to ensure the recovery is complete.

The next update will be provided in 3 hours or as events warrant.

MONITORING 3 months ago - at 02/29/2024 05:25AM

Microsoft Engineering has fully resolved the allocation issues. The team is currently monitoring the health of the zone. The next steps are to ensure the affected zone is healthy and then reopen it, restoring full capacity to the region. The zone is expected to be fully restored within the next hour. The next update will be provided in 60 minutes or as events warrant.

Horizon Cloud Service Operations teams are continuing to monitor the availability of the service to ensure the recovery is complete.

IDENTIFIED 3 months ago - at 02/29/2024 03:07AM

Horizon Cloud Service First-Gen is experiencing intermittent issues with VM provisioning and Power operations in the “East US 2” region. Existing VMs that are currently running are not impacted.

The Microsoft engineering team is continuing to work toward full resolution of the allocation issues. In parallel, the team has been working to free up capacity in the other zones and to lift all throttles imposed while addressing the root issue.

The next steps are to ensure a healthy state for the zone and then re-open the zone, thus restoring all capacity to the region. Total restoration for the zone is expected to happen in the next 2-3 hours. The next update will be provided in 2 hours, or as events warrant.

IDENTIFIED 3 months ago - at 02/29/2024 01:01AM

Horizon Cloud Service First-Gen continues to experience issues intermittently for VM provisioning and Power operations in the "East US 2" region. Existing VMs that are currently running are not impacted.

Microsoft has mitigated the issue related to the Object Store failure but is continuing to work to clear the operations backlog (queue). This is progressing as expected; however, Microsoft expects it to take about 4-6 hours. The next update will be provided in 2 hours, or as events warrant.

IDENTIFIED 3 months ago - at 02/28/2024 11:58PM

Horizon Cloud Service First-Gen continues to experience issues intermittently for VM provisioning and Power operations in the "East US 2" region. Existing VMs that are currently running are not impacted.

Microsoft has identified that a low-level component called Object Store failed, which caused a surge in API GET call operations and subsequent queuing of those calls. The component is responsible for processing calls from other components to create linked resources associated with several compute services. Impacted components were restarted to mitigate the issue, but the accumulated backlog did not drain as expected. Microsoft has since stopped new allocations in the affected zone and is continuing to work to clear the operations backlog (queue), which is progressing as expected. The next update will be provided in 60 minutes, or as events warrant.

IDENTIFIED 3 months ago - at 02/28/2024 10:15PM

Horizon Cloud Service First-Gen continues to experience issues intermittently for VM provisioning and Power operations in the "East US 2" region. Existing VMs that are currently running are not impacted.

Microsoft has identified an issue with a low-level control plane component responsible for persistence, which impacted GET and PUT call operations. The component is responsible for processing calls from other components to create linked resources associated with several compute services. Impacted components were restarted to mitigate the issue, but the accumulated backlog did not drain as expected. To help alleviate this condition, Microsoft engineers have stopped new allocations in the affected zone and are working to clear the operations backlog (queue). The next update will be provided in 60 minutes or as events warrant.
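
Why stopping new allocations helps a backlog drain can be shown with simple queue arithmetic: a queue only shrinks when items are processed faster than they arrive. The sketch below is a simplified model under assumed constant rates; the function name and all numbers are hypothetical and are not taken from the incident.

```python
def drain_time(backlog, arrival_rate, service_rate):
    """Seconds until a queue of `backlog` items empties, or None if it never does.

    `arrival_rate` and `service_rate` are items per second and are assumed constant.
    """
    net_rate = service_rate - arrival_rate
    if net_rate <= 0:
        return None  # work arrives at least as fast as it is processed
    return backlog / net_rate


# Hypothetical figures: 10,000 queued calls, 400 calls/s processed.
print(drain_time(10_000, arrival_rate=500, service_rate=400))  # never drains
print(drain_time(10_000, arrival_rate=0, service_rate=400))    # drains once arrivals stop
```

In this model, cutting the arrival rate (for example by pausing new allocations) is what turns a backlog that never clears into one that clears in bounded time.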

IDENTIFIED 3 months ago - at 02/28/2024 08:58PM

Horizon Cloud Service First-Gen continues to experience issues intermittently for VM provisioning and Power operations in the "East US 2" region. Existing VMs that are currently running are not impacted.

Microsoft is attempting to mitigate a potential capacity load issue in the "East US 2" region, which is resulting in performance failures, by reducing the call volume coming into the overloaded system. The strategy is split into 3 phases; the first phase, throttling traffic by 75%, has been completed. The throttling will then be gradually increased to further reduce the volume. The next update will be provided in 60 minutes, or as events warrant.
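
The throttling described above can be pictured as probabilistic load shedding. The update does not describe Microsoft's actual mechanism, so the sketch below is only an illustration: it assumes that "throttling traffic by 75%" means rejecting roughly 75% of incoming calls, and the function names are hypothetical.

```python
import random


def admit(throttle_fraction, rng=random):
    """Return True if a request is let through when `throttle_fraction` of traffic is shed."""
    return rng.random() >= throttle_fraction


def admitted_count(total_requests, throttle_fraction, seed=0):
    """Count how many of `total_requests` simulated requests pass the throttle."""
    rng = random.Random(seed)  # seeded so the simulation is repeatable
    return sum(1 for _ in range(total_requests) if admit(throttle_fraction, rng))


# With an assumed 75% throttle, about a quarter of calls reach the overloaded system.
print(admitted_count(1000, 0.75))
```

Raising `throttle_fraction` step by step corresponds to the later phases, in which the call volume is reduced further.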

IDENTIFIED 3 months ago - at 02/28/2024 07:57PM

Horizon Cloud Service First-Gen continues to experience issues intermittently for VM provisioning and Power operations.

Microsoft is investigating a potential capacity load issue in the "East US 2" region resulting in performance failures. Existing VMs that are currently running are not impacted.

Microsoft is restarting the impacted instances in the region to help alleviate the impact and is continuing to investigate other options to resolve the issue. The next update will be provided in 60 minutes, or as events warrant.

IDENTIFIED 3 months ago - at 02/28/2024 06:22PM

Horizon Cloud Service First-Gen continues to experience issues.

Microsoft Azure is working on resolving the VM provisioning and Power operation issues in the “East US 2” region.

We are closely monitoring this issue and will provide an update when more information is available.

IDENTIFIED 3 months ago - at 02/28/2024 05:36PM

Horizon Cloud Service First-Gen continues to experience issues.

Microsoft Azure is currently experiencing an issue in the “East US 2” region, and the Azure team is working on resolving it. As a result, VM provisioning and Power operations are being intermittently impacted.

We are closely monitoring this issue and will provide an update when more information is available.

IDENTIFIED 3 months ago - at 02/28/2024 05:04PM

Microsoft Azure is working on mitigating the issues. Further updates will be provided in 60 minutes, or as soon as we have more information.

IDENTIFIED 3 months ago - at 02/28/2024 04:00PM

Horizon Cloud Service First-Gen continues to experience issues.

Microsoft Azure and Horizon Cloud Service Operations teams have identified an issue with virtual machines in East US 2. These machines may experience errors during Create, Read, Update, or Delete (CRUD) operations. The team is actively working to resolve the issue.

INVESTIGATING 3 months ago - at 02/28/2024 03:12PM

Horizon Cloud Service First-Gen is investigating increased error rates for requests made to the service. There may be limited or no service impact at this time.
