Outage in The Graph

Error "api.thegraph.com | 502: Bad gateway" while deploying subgraphs

Resolved Minor
May 06, 2025 - Started about 1 month ago - Lasted 16 days

Outage Details

We are currently investigating the issue. Deployments should succeed after multiple retries.
Components affected
The Graph IPFS
Latest Updates (sorted most recent first)
RESOLVED 28 days ago - at 05/22/2025 07:19AM

This incident has been resolved.

MONITORING about 1 month ago - at 05/14/2025 06:23PM

We are continuing to monitor the changes made for the recent IPFS deployment issues, and the system is now much more reliable. Users have reported successful deployments with fewer retries (2-3 compared to 10-20 previously), and there have been no widespread complaints in the last few hours.

Our SRE team has implemented the following recent enhancements:

- Restarted IPFS with optimized connection settings.
- Modified IPFS endpoints to better manage traffic.
- Created new dashboards to monitor errors and connection timeouts in real-time.
- Reviewed and tweaked rules to ensure community node traffic is handled efficiently.

These changes have led to a noticeable improvement in deployment success rates. However, some users may still experience occasional connection timeouts, which we are actively addressing. We’re continuing to monitor the system closely and will make additional adjustments as needed. If you encounter any issues, please let us know.

Thank you for your patience and support!

MONITORING about 1 month ago - at 05/13/2025 10:42PM

We’ve made substantial progress with the recent IPFS deployment issues, and the system is now demonstrating significantly improved reliability.

Our Site Reliability Engineering team has implemented several key enhancements, including:

- Applied targeted rules to block suspicious traffic and reduce system load.
- Upgraded IPFS Kubo on both testnet and mainnet to include critical stability improvements.
- Adjusted nginx connection limits to eliminate "Cannot assign requested address" errors, improving proxy stability.
- Resolved a misconfigured nginx caching rule that was returning incorrect IPFS hashes for different files.

These improvements have resulted in more consistent and successful IPFS deployments. We continue to actively monitor system performance and are working on further optimizations to maintain long-term stability.

Thank you for your continued patience and support.

IDENTIFIED about 1 month ago - at 05/12/2025 07:26PM

We've upgraded our IPFS to address deployment issues caused by memory limits being exceeded. This update includes fixes for resource leaks that were contributing to the problem. We've also blocked several suspicious IP addresses that may have been overloading the system. While IPFS stability has improved, the root issue is not yet fully resolved. We appreciate your continued patience as we work toward a complete fix.

IDENTIFIED about 1 month ago - at 05/07/2025 02:29PM

We've identified an issue in which our internal IPFS proxy's in-memory cache and aggressive retry logic are causing elevated load and intermittent timeouts. Our engineering team is working to implement improved exponential back-off in our fetch workflows and is evaluating more durable, decoupled caching solutions to ensure continued stability. We appreciate your continued patience as we work to resolve this.
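For readers unfamiliar with the technique, exponential back-off with jitter can be sketched as follows. This is an illustrative example only, not The Graph's implementation: the function name `retry_with_backoff`, its parameters, and the default delays are all assumptions chosen for the sketch.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep, rng=random.random):
    """Call operation() until it succeeds or max_attempts is exhausted.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # The delay ceiling doubles on each failure, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            # "Full jitter": wait a random fraction of the ceiling so that
            # many clients retrying at once do not hit the server in lockstep.
            sleep(ceiling * rng())
```

A caller would wrap the failing request in a function and pass it as `operation`. Compared with immediate retries, spacing attempts out this way reduces the load that the retries themselves place on an already struggling service.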

INVESTIGATING about 1 month ago - at 05/06/2025 01:20PM

We are currently investigating the issue. Deployments should succeed after multiple retries.
