Incident in Azure
3 months ago

Service Management Operation Errors Across Azure Services in East US 2

Resolved Minor
Impact Statement: Starting at approximately 12:25 UTC on 08 Apr 2022, customers running services in the East US 2 region may be experiencing service management errors, delays, and/or timeouts. We are investigating an underlying issue causing GET and PUT errors impacting the Azure portal itself, as well as services including Azure Virtual Machines (VMs), Virtual Machine Scale Sets (VMSS), and additional downstream services. Customers may see errors including “The network connectivity issue encountered for Microsoft.Compute cannot fulfill the request.” Finally, for some downstream services that have auto-scale enabled, this service management issue may cause data plane impact.

Current Status: The series of mitigation efforts described in earlier incident updates is still making progress in improving error rates. Internal services continue to report significant improvements in the proportion of requests that are succeeding. While mitigation is still being applied, the investigation into what is causing this incident has determined that the Compute Resource Provider (CRP) gateways in East US 2 are being overwhelmed with requests for compute resources. Mitigation workstreams continue to focus on how to prevent CRP gateways from becoming unhealthy. While the combination of restarts, scaling out, and traffic reduction initially helped some gateway nodes to return to a healthy state, and stay healthy, other gateway nodes are routinely getting into a condition of being overloaded by request volume.

To resolve this, there are two mitigation workstreams being run in parallel. In the short term, we are investigating automation to restart gateway nodes on a regular basis to avoid getting into an unhealthy state. In the long term, we are investigating a CRP gateway hotfix that will obviate the need for restarts and prevent each node from becoming unhealthy. Both of these workstreams are making good progress.

At this stage, we believe that we have eliminated impact to most of the downstream services and are working with each team to confirm mitigation. We are also working with the last couple of services to mitigate them.

Although we believe that external customers and partners are continuing to see improvements, as mentioned, we are not declaring mitigation until error rates return to pre-incident levels. While mitigation efforts continue, we will continue to provide hourly updates to ensure that all impacted customers and partners are informed of progress. The next update will be provided by 03:00 UTC, April 9th, or as soon as we have an update to share.
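
The stated criterion for declaring mitigation (error rates returning to pre-incident levels) can be illustrated with a small sketch. This is not Microsoft's tooling: the `Window` type, the `is_mitigated` function, the `tolerance` parameter, and all request counts below are hypothetical and exist only to show the comparison against a pre-incident baseline.

```python
from dataclasses import dataclass


@dataclass
class Window:
    """Request counts for one monitoring window (hypothetical shape)."""
    total: int
    failed: int

    @property
    def error_rate(self) -> float:
        # An empty window has no failures, so report a 0.0 error rate.
        return self.failed / self.total if self.total else 0.0


def is_mitigated(baseline: Window, recent: list[Window], tolerance: float = 0.0) -> bool:
    """Return True only if every recent window is at or below the
    pre-incident baseline error rate (plus an optional tolerance)."""
    if not recent:
        return False  # No data is not evidence of recovery.
    limit = baseline.error_rate + tolerance
    return all(w.error_rate <= limit for w in recent)


# Illustrative numbers only; none of these come from the incident report.
baseline = Window(total=10_000, failed=20)
during = [Window(10_000, 900), Window(10_000, 150)]
after = [Window(10_000, 18), Window(10_000, 20)]

print(is_mitigated(baseline, during))  # False
print(is_mitigated(baseline, after))   # True
```

Requiring every recent window to meet the baseline, rather than only the latest one, mirrors the cautious stance in the update: improving error rates alone are not treated as mitigation.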