Outage in SafetyCulture

Some sensors are not operational due to global cellular data network outage

Resolved Minor
August 28, 2023 - Started about 2 years ago - Lasted 3 days

Outage Details

We are currently investigating this issue.
Components affected
SafetyCulture Sensors
Latest Updates (sorted most recent first)
RESOLVED about 2 years ago - at 09/01/2023 07:36AM

Our upstream partner has addressed the outage. We will continue to monitor the situation.

MONITORING about 2 years ago - at 08/31/2023 06:52AM

The majority of sensors are now back online and operating normally. We are continuing to monitor and are working alongside our partners to ensure ongoing stability.

MONITORING about 2 years ago - at 08/30/2023 06:04PM

Most sensors are now back online and operating normally. We are working to restore connectivity to the remaining sensors that are offline.

We will continue to monitor the situation and work with our upstream partners to ensure services remain stable.

IDENTIFIED about 2 years ago - at 08/30/2023 07:53AM

Our upstream partner is continuing to implement changes to bring devices back online, and is currently working to mitigate congestion. We are continuing to monitor the situation.

IDENTIFIED about 2 years ago - at 08/30/2023 05:45AM

Our upstream partner has shifted traffic to a new node and is beginning to see improvements in IoT device connectivity.

We are continuing to monitor the situation. We will continue to post updates here as we learn more.

IDENTIFIED about 2 years ago - at 08/30/2023 02:26AM

The previously identified fix for the failures in our upstream partner's network interfaces was not ultimately viable and was not implemented.

Our partner believes they have now identified a common problem with the interface failures and are working on a fix.

IDENTIFIED about 2 years ago - at 08/30/2023 01:24AM

Our upstream partner has begun implementing a fix for the regression. Once the fix is in place, traffic will be increased slowly to ensure a stable recovery.

The expected recovery time remains estimated at 0700 to 0800 UTC. We will provide further updates if this estimate is revised.

IDENTIFIED about 2 years ago - at 08/30/2023 01:08AM

The likely root cause of the regression has been identified on a node in our upstream partner's network interface. They engaged with their vendors and a solution has been identified, which will be executed in the next 30 minutes. The updated node will come back online and traffic will be gradually increased to it to ensure a stable recovery.

The expected recovery time depends on the depth of the backlog of connection requests. It will be a slow release to avoid overloading the signal, and they are estimating recovery by 0700 to 0800 UTC.

IDENTIFIED about 2 years ago - at 08/29/2023 11:25PM

Our upstream partner has identified the root cause of the regression. One of the provider's network interfaces was unable to support the amount of traffic being released and was compromised.

The solution currently being assessed is to migrate the signaling services to a different node that is operating normally and has the bandwidth to support the incremental traffic.

Latest estimate for full resolution is 0700 to 0800 UTC. This estimate is based on the assumption that the applied solutions work as intended.

IDENTIFIED about 2 years ago - at 08/29/2023 09:24PM

Additional sensors in all regions are now offline. Our upstream partners have confirmed there has been a major regression.

We're monitoring the impact and will provide further updates when available.

IDENTIFIED about 2 years ago - at 08/29/2023 01:01PM

There are still connectivity issues on our upstream partner's network.

The cause of this incident has now been rectified and stability has been confirmed.

They're now facing a signaling storm due to congestion. They're restricting traffic and slowly increasing throughput to resolve this.

We're still unable to provide an ETA, but we have started seeing improvements.

IDENTIFIED about 2 years ago - at 08/29/2023 08:14AM

Our upstream partner continues to work on the remaining network instability issues adversely affecting subscriber attachments. There is no ETA yet on resolution. We will continue to post updates here as we learn more.

IDENTIFIED about 2 years ago - at 08/29/2023 07:31AM

Our upstream partners have stabilized the replacement hardware and continue to work on the remaining network instability issues that are adversely affecting subscriber attachments. There is no ETA yet on resolution. We will continue to post updates here as we learn more.

IDENTIFIED about 2 years ago - at 08/29/2023 06:15AM

Our upstream partners have replaced faulty hardware, have begun bringing interconnect links back online, and are continuing to work to resolve the issues that remain. Unfortunately, bringing back the interconnect links has not yet had the expected effect on connectivity. Subscriber attachments are still adversely affected. There is no ETA yet on resolution. We will continue to post updates here as we learn more.

IDENTIFIED about 2 years ago - at 08/29/2023 04:32AM

Our upstream partners have replaced faulty hardware, have begun bringing interconnect links back online, and are continuing to work to resolve the issues that remain. Subscriber attachments are still adversely affected.

We will continue to post updates here as we learn more.

IDENTIFIED about 2 years ago - at 08/29/2023 03:33AM

Our upstream partners have replaced faulty hardware and are continuing to work to resolve the issues that remain.

We will continue to post updates here as we learn more.

IDENTIFIED about 2 years ago - at 08/29/2023 02:49AM

Unfortunately, the fix attempted by our upstream partners did not resolve the issue. They are continuing to investigate.

We will post updates here as we learn more.

IDENTIFIED about 2 years ago - at 08/29/2023 02:47AM

Upstream partners have identified the issue and are implementing a fix.

IDENTIFIED about 2 years ago - at 08/29/2023 12:10AM

The issue has been identified. We will post updates here as we learn more.

INVESTIGATING about 2 years ago - at 08/28/2023 08:21PM

Our third-party data provider has notified us of a data outage. This is currently affecting all customers using a cellular gateway on mobile data.

We are monitoring the situation closely and will provide updates as soon as possible.

INVESTIGATING about 2 years ago - at 08/28/2023 07:56PM

We are currently investigating this issue.
