Cloud Provider Status Report - January 2026

Published on Feb 6, 2026.

Executive Summary

This report analyzes cloud provider status data for January 2026, covering 11 major cloud platforms: AWS, Azure DevOps, DigitalOcean, Fly.io, Heroku, Linode, Microsoft Azure, Netlify, Railway, Render, and Vercel. The data combines official incident reports from each provider's status page with early detections from IsDown's monitoring system.

Important: Each provider has its own framework for reporting incidents. A provider showing more incidents is not necessarily less reliable - it may simply report more transparently. These providers also vary significantly in size and market share, so direct comparisons are inappropriate.

Summary Table

Provider	Official Incidents	Total Incident Time	Peak Incident Day	Typical Resolution	Early Detections by IsDown	Unconfirmed Incidents
AWS	1 (0 major, 1 minor)	~2.2 hours	Wednesday	2-4 hours	1 (20 min avg)	1
Azure DevOps	5 (0 major, 5 minor)	~10.5 hours	Tuesday	1-2 hours	-	-
DigitalOcean	5 (1 major, 4 minor)	~14.0 hours	Wednesday	2-4 hours	-	-
Fly.io	14 (2 major, 12 minor)	~175.7 hours	Thursday	4+ hours	-	-
Heroku	1 (0 major, 1 minor)	~1.2 hours	Thursday	1-2 hours	-	-
Linode	2 (0 major, 2 minor)	~103.7 hours	Tuesday & Friday	4+ hours	-	-
Microsoft Azure	4 (2 major, 2 minor)	~12.6 hours	Saturday & Tuesday	15-30 min / 30-60 min / 2-4 hours / 4+ hours	-	-
Netlify	10 (0 major, 10 minor)	~17.4 hours	Wednesday	2-4 hours	-	-
Railway	11 (1 major, 10 minor)	~18.0 hours	Wednesday	30-60 min	-	-
Render	4 (1 major, 3 minor)	~10.2 hours	Friday	1-2 hours	-	-
Vercel	5 (1 major, 4 minor)	~38.0 hours	Tuesday & Friday & Monday & Wednesday & Thursday	2-4 hours / 4+ hours	-	-
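
The per-provider "Average Resolution Time" figures in the provider details can be approximated from this table by dividing total incident time by the number of official incidents. The following is a minimal sketch of that arithmetic, not IsDown's actual calculation; the `SUMMARY` dictionary and the `avg_minutes_per_incident` helper are hypothetical names introduced for illustration. Because the totals in the table are rounded to 0.1 hours, the results can differ from the reported averages by a minute or two.

```python
# (official incidents, total incident hours), copied from the summary table.
SUMMARY = {
    "AWS": (1, 2.2),
    "Azure DevOps": (5, 10.5),
    "DigitalOcean": (5, 14.0),
    "Fly.io": (14, 175.7),
    "Heroku": (1, 1.2),
    "Linode": (2, 103.7),
    "Microsoft Azure": (4, 12.6),
    "Netlify": (10, 17.4),
    "Railway": (11, 18.0),
    "Render": (4, 10.2),
    "Vercel": (5, 38.0),
}


def avg_minutes_per_incident(incidents: int, total_hours: float) -> float:
    """Mean incident duration in minutes (hypothetical helper)."""
    return total_hours * 60 / incidents


# Approximate average per provider, derived from the rounded table totals.
averages = {
    name: avg_minutes_per_incident(incidents, hours)
    for name, (incidents, hours) in SUMMARY.items()
}

for name, minutes in averages.items():
    print(f"{name}: ~{minutes:.0f} min per incident")
```

A mean like this is sensitive to a single long-running entry: Linode's figure, for example, is dominated by one multi-day notification rather than by typical incident length.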

Provider Details

AWS

Official Incidents: 1 (0 major, 1 minor)
Total Incident Time: ~2.2 hours
Average Resolution Time: 131 minutes per incident

AWS experienced a single minor incident in January 2026 involving elevated latencies for network change propagation in the Ireland (EU-WEST-1) region. The issue affected network operations and caused timeouts when pulling images, lasting approximately 2 hours and 11 minutes on January 28th.

Early Detection by IsDown: IsDown detected the network latency issue at 16:53 UTC, roughly 20 minutes before AWS officially acknowledged it on its status page at 17:14 UTC. This early detection gave IsDown users advance notice of the problem.

Unconfirmed Incidents: IsDown detected one potential issue on January 22nd based on user reports that was never officially acknowledged.

Note: Unconfirmed incidents are potential issues detected by IsDown but never officially acknowledged by the vendor. These serve as indicators only and do not confirm an actual outage occurred.

Azure DevOps

Official Incidents: 5 (0 major, 5 minor)
Total Incident Time: ~10.5 hours
Average Resolution Time: 126 minutes per incident

Azure DevOps reported five minor incidents throughout January 2026, all characterized as availability degradations. The longest incident occurred on January 7th in Central US, lasting over 6 hours. Other incidents were shorter in duration, with the briefest lasting only 16 minutes on January 23rd. One incident specifically affected test plans in the West US region for internal customers.

DigitalOcean

Official Incidents: 5 (1 major, 4 minor)
Total Incident Time: ~14.0 hours
Average Resolution Time: 167 minutes per incident

DigitalOcean experienced five incidents in January 2026, including one major outage affecting the Cloud Control Panel and API on January 26th. The FRA1 region faced particular challenges on January 28th with two separate incidents affecting Droplet-based events and Kubernetes clusters, lasting several hours each. Additional issues included account access and payment problems, as well as App Platform deployment difficulties.

Fly.io

Official Incidents: 14 (2 major, 12 minor)
Total Incident Time: ~175.7 hours
Average Resolution Time: 752 minutes per incident

Fly.io reported the highest number of incidents in January 2026, with 14 in total. Issues ranged from network connectivity problems between US and EU regions to a major management plane outage lasting over 15 hours on January 6th. Several regions were affected: metrics collection issues in Mumbai persisted for over 63 hours, MPG was unstable in the LAX and SIN regions, and delayed metric reporting in NRT and SIN lasted nearly 49 hours. The platform also faced elevated API latency, authentication token issues, and certificate issuance delays.

Heroku

Official Incidents: 1 (0 major, 1 minor)
Total Incident Time: ~1.2 hours
Average Resolution Time: 70 minutes per incident

Heroku experienced a single minor incident on January 15th involving an expired certificate that caused API access issues. The incident resulted in 502 errors for subsequent API requests and lasted approximately 70 minutes.

Linode

Official Incidents: 2 (0 major, 2 minor)
Total Incident Time: ~103.7 hours
Average Resolution Time: 3112 minutes per incident

Linode reported two incidents in January 2026. The first was a connectivity issue in the London (EU-West) data center on January 20th that lasted over 8 hours, causing intermittent connection timeouts and errors. The second was a proactive notification about an upcoming winter storm affecting the continental United States, posted on January 23rd and lasting for several days as monitoring continued.

Microsoft Azure

Official Incidents: 4 (2 major, 2 minor)
Total Incident Time: ~12.6 hours
Average Resolution Time: 189 minutes per incident

Microsoft Azure reported four incidents in January 2026, including two major incidents. The first major incident on January 10th was a datacenter power event in West US 2 that caused service disruptions and connectivity issues. The second major incident on January 27th affected Azure OpenAI Service in Sweden Central, lasting over 8 hours with intermittent availability issues. Two additional minor incidents were related to the same events.

Netlify

Official Incidents: 10 (0 major, 10 minor)
Total Incident Time: ~17.4 hours
Average Resolution Time: 104 minutes per incident

Netlify experienced 10 minor incidents throughout January 2026. Issues included CDN errors, observability data ingestion problems, Standard Edge Network errors and latencies, Log Drains export failures to New Relic, Image CDN errors, AI Gateway errors from upstream dependencies, domain renewal issues, build failures related to GitHub issues, UI errors and latencies, and rendering problems with app.netlify.com.

Railway

Official Incidents: 11 (1 major, 10 minor)
Total Incident Time: ~18.0 hours
Average Resolution Time: 98 minutes per incident

Railway reported 11 incidents in January 2026. The single major incident involved temporary deployment slowdowns on January 2nd. Other issues included storage bucket creation/deletion problems, elevated build times for Metal Build Environment services, log delivery delays, Node Yarn PKG mirror unavailability, staged changes application delays, and multiple instances of GitHub rate limiting affecting logins and deployments. NPM performance degradation also impacted package installations.

Render

Official Incidents: 4 (1 major, 3 minor)
Total Incident Time: ~10.2 hours
Average Resolution Time: 152 minutes per incident

Render experienced four incidents in January 2026. Three minor incidents involved deploy delays in Oregon, missing application and build logs on the dashboard, and metrics display issues for Oregon services. The major incident on January 30th affected external connectivity for Postgres databases hosted in Singapore.

Vercel

Official Incidents: 5 (1 major, 4 minor)
Total Incident Time: ~38.0 hours
Average Resolution Time: 456 minutes per incident

Vercel reported five incidents in January 2026. The major incident on January 13th involved elevated build failure rates lasting over 12 hours. Other issues included domain purchase failures, delayed dashboard data across multiple services (Observability, Speed Insights, Web Analytics, Usage, and Audit Logs), elevated connection latency in the Dublin Edge region, and significantly elevated git clone durations affecting build times that persisted for over 19 hours on January 29th.

IsDown Early Detection Highlights

In January 2026, IsDown's monitoring system detected one incident before it was officially acknowledged on the vendor's status page. This early detection gave IsDown users about 20 minutes of advance warning, allowing them to respond before official confirmation. The total time saved through early detection was 0.3 hours, which illustrates the value of independent monitoring alongside official status pages.
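
The average warning and total time saved quoted above follow from simple aggregation over the month's early detections. The sketch below shows that aggregation under stated assumptions: `lead_minutes` and `summarize` are hypothetical names, and the single 20-minute entry is the AWS detection from this report. It is an illustration, not IsDown's internal method.

```python
# Lead time in minutes for each early detection in the month.
# January 2026 has a single entry: the AWS detection described in this report.
lead_minutes = [20]


def summarize(leads):
    """Return (count, average lead in minutes, total hours rounded to 0.1)."""
    count = len(leads)
    average = sum(leads) / count if count else 0.0
    total_hours = round(sum(leads) / 60, 1)
    return count, average, total_hours


count, average_minutes, total_hours_saved = summarize(lead_minutes)
print(f"{count} early detection(s), {average_minutes:.0f} min average, "
      f"{total_hours_saved} hours saved in total")
```

With only one detection, the average and the single lead time are the same number, so the monthly average says little about typical lead times on its own.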

Conclusion

January 2026 saw varying incident patterns across cloud providers. Fly.io reported the highest number of incidents with 14 total, while AWS and Heroku each reported only 1 incident. Total incident durations ranged from 1.2 hours (Heroku) to 175.7 hours (Fly.io). Resolution times varied significantly, with Railway showing the fastest typical resolution at 30-60 minutes, while Fly.io and Linode showed typical resolutions exceeding 4 hours. Wednesday was the most common peak incident day across providers. IsDown's early detection capability was demonstrated with AWS, providing users with 20 minutes of advance warning for the network latency issue.
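
The statement that Wednesday was the most common peak incident day can be checked by tallying the "Peak Incident Day" column of the summary table. The sketch below assumes that when a provider lists several tied peak days, each day counts once for that provider; `PEAK_DAYS` is a hypothetical structure transcribed from the table.

```python
from collections import Counter

# Peak incident day(s) per provider, transcribed from the summary table.
# Assumption: tied days each count once for the provider that lists them.
PEAK_DAYS = {
    "AWS": ["Wednesday"],
    "Azure DevOps": ["Tuesday"],
    "DigitalOcean": ["Wednesday"],
    "Fly.io": ["Thursday"],
    "Heroku": ["Thursday"],
    "Linode": ["Tuesday", "Friday"],
    "Microsoft Azure": ["Saturday", "Tuesday"],
    "Netlify": ["Wednesday"],
    "Railway": ["Wednesday"],
    "Render": ["Friday"],
    "Vercel": ["Tuesday", "Friday", "Monday", "Wednesday", "Thursday"],
}

# Count how many providers list each weekday as a peak day.
tally = Counter(day for days in PEAK_DAYS.values() for day in days)
most_common_day, provider_count = tally.most_common(1)[0]

print(f"Most common peak day: {most_common_day} ({provider_count} providers)")
```

Under this counting rule Wednesday appears for five providers and Tuesday for four, so the ranking depends partly on how ties are treated.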

Nuno Tomas, Founder of IsDown

