
Outage in Zero Hash

Elevated 503 Errors on Multiple Endpoints

Resolved Major
January 21, 2026 - Started 21 days ago - Lasted about 11 hours

Incident Report

We are actively investigating an issue causing elevated error rates and degraded service availability across multiple endpoints. Our team is aware of the impact and is currently all-hands on this incident. We will provide updates as soon as we can.

Latest Updates (most recent first)
RESOLVED 21 days ago - at 01/22/2026 10:12AM

The incident has been resolved. All systems are healthy and back online again following the reboot and rollout of final fixes in all environments.

MONITORING 21 days ago - at 01/22/2026 09:30AM

Status Update: The reboot was successful and we are seeing positive results as our applications come back online. We are still actively monitoring the situation as we work to ensure everything is back to a stable and healthy state. We have also started to process a large backlog of transactions.

As of now, we estimate resolution in ~1 hour, but we will continue to provide updates should anything change.

IDENTIFIED 21 days ago - at 01/22/2026 09:06AM

Current status: In-Progress

The backup has been completed and we are now working on rebooting the messaging system and restoring all of the data. Following that, we will work on getting all of the impacted services back online as quickly as possible. In tandem, a fix for the underlying root cause of the outage has been prepared and will be rolled out once the reboot of the messaging system has finished.

Thank you for your patience. Our team is all hands on deck and we're optimistic that the incident will be resolved soon. We will provide another update in 30 minutes.

IDENTIFIED 21 days ago - at 01/22/2026 08:29AM

We are still actively in the process of backing up all the data in the cluster. Once the backup is done, we will reset the internal messaging system and bring all the services back online.

For full transparency and to keep everyone in the loop, here's a recap of the incident:

Multiple services are degraded due to an outage of our internal messaging system. Trading, Participants, Transact, Deposits and Withdrawals are impacted.

Root cause of messaging system outage:
The root cause was traced to an inefficiency in how our internal messaging system queries data. A specific function was creating and destroying connections much more rapidly than intended, which placed excessive stress on the infrastructure and ultimately led to the overload.
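As an illustration of the pattern described above, here is a minimal Python sketch that contrasts opening and closing a connection for every query with reusing a single connection across a batch. It is not Zero Hash's code: `FakeConnection`, `fetch_with_churn`, `fetch_with_reuse`, and `count_opens` are hypothetical names, and the "connection" is an in-memory stand-in that only counts how often it is opened and closed.

```python
class FakeConnection:
    """Hypothetical stand-in for a connection to a messaging system.

    It performs no I/O; it only counts opens and closes so the
    difference between the two access patterns is visible.
    """
    opened = 0
    closed = 0

    def __init__(self):
        FakeConnection.opened += 1

    def query(self, key):
        # Pretend to look something up over the connection.
        return f"value-for-{key}"

    def close(self):
        FakeConnection.closed += 1


def fetch_with_churn(keys):
    """Anti-pattern: create and destroy one connection per query."""
    results = []
    for key in keys:
        conn = FakeConnection()
        try:
            results.append(conn.query(key))
        finally:
            conn.close()
    return results


def fetch_with_reuse(keys):
    """Open one connection and reuse it for the whole batch."""
    conn = FakeConnection()
    try:
        return [conn.query(key) for key in keys]
    finally:
        conn.close()


def count_opens(fetch, keys):
    """Return how many connections `fetch` opens while handling `keys`."""
    before = FakeConnection.opened
    fetch(keys)
    return FakeConnection.opened - before


if __name__ == "__main__":
    keys = list(range(100))
    print("churn opens:", count_opens(fetch_with_churn, keys))
    print("reuse opens:", count_opens(fetch_with_reuse, keys))
```

Both functions return the same results; only the number of connection setups and teardowns differs. In a real system each setup typically involves handshakes and server-side bookkeeping, which is why rapid churn can put disproportionate load on the infrastructure.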

Root cause of delay in messaging system recovery:
Because this is the core communication system for our services, all hands have been working on backing up the cluster. Due to the amount of data we need to recover, this is taking a long time, and we are assessing all of our options to accelerate the recovery process.

Current status:
We’re almost at the point where all the data is backed up. We can then reset the messaging system and restore the data to bring all services back online.

IDENTIFIED 21 days ago - at 01/22/2026 07:27AM

We are continuing to work on the fix and sincerely apologise for the delay in resolving the incident. To keep you updated on where we stand: we’re almost at the point where all the data is backed up, after which we can reset the messaging system and restore the data. We will provide a detailed RCA after this is resolved.

IDENTIFIED 21 days ago - at 01/22/2026 06:00AM

We are continuing to work on the fix. Due to the huge amount of data to recover in our internal messaging system, it is taking longer than expected. We are now switching strategy to back up the data and restart the system. An ETA is still not available, but our entire Engineering team and leaders are continuing to work with an external vendor to address this as soon as possible.

IDENTIFIED 21 days ago - at 01/22/2026 03:25AM

We are continuing to work on the fix. The incident was caused by an outage of our internal messaging system, and we are now in the final stage of restoring it. We will keep you posted on progress.

IDENTIFIED 21 days ago - at 01/22/2026 12:11AM

We have identified the root cause of the issue and are working on a fix. We are still all-hands on the issue, but do not yet have an ETA.

INVESTIGATING 21 days ago - at 01/21/2026 10:54PM

We are actively investigating an issue causing elevated error rates and degraded service availability across multiple endpoints. Our team is aware of the impact and is currently all-hands on this incident. We will provide updates as soon as we can.
