Trusted by hundreds of companies · 14-day free trial · No credit card required
We continuously monitor the official Skylight status page for updates on any ongoing outages. Check the stats for the latest 30 days and a list of the last Skylight outages.
Minor Resolved · 5 months ago · lasted about 4 hours
One of our processors is encountering errors while processing traces, so we have taken the worker offline to work on a fix. This is affecting a portion of our customers. If your app is affected, you may see missing data on the Skylight dashboard starting from around 20:50 UTC. We expect the missing data to be filled in once we resume processing.
Minor Resolved · 7 months ago · lasted 6 minutes
Problem with Upstream Provider (Heroku)
Problems with an upstream provider (Heroku) have taken down our website as well as agent authentication. It does not appear to affect all customers uniformly. We are still able to resolve and connect to the site from some locations as of now. The Skylight collector is still operational and processing data from agents that were already authenticated. We are monitoring the situation. Upstream incident: https://status.heroku.com/incidents/2454
Major Resolved · 7 months ago · lasted about 3 hours
Problem with Upstream Provider (Heroku)
Problems with an upstream provider (Heroku) have taken down our website as well as agent authentication. The Skylight collector is still operational and processing data from agents that were already authenticated. We are monitoring the situation. Upstream incident: https://status.heroku.com/incidents/2453
Minor Resolved · 8 months ago · lasted about 1 hour
We are investigating a "clog" in the data processing pipeline that is causing one of the worker servers to be "stuck". This only affects a portion of our customers. If you are affected, you will see missing data on the Skylight dashboard for the last few hours (which is slowly populating). At this stage, we believe the data has been safely received in the queue for processing, and once we resolve the issue that causes the "clog", we will be able to backfill the missing data.
Major Resolved · 9 months ago · lasted 38 minutes
Our automation failed to renew/replace the SSL certificates before they expired. We are currently replacing the certificates manually. In the meantime, both the Skylight dashboard and the data collecting servers (used by the agents to submit traces) are inaccessible. We are sorry for the inconvenience.
Minor Resolved · 9 months ago · lasted 35 minutes
Data Processing Partial Backlog
A subset of our customers are experiencing a delay in data processing. We've identified the cause and are working on a fix.
Minor Resolved · about 1 year ago · lasted 3 months
Heroku SSL Service Degradation
The Skylight dashboard is inaccessible currently due to a configuration issue. This outage also impacted agent authentication – new authentications from agents will not succeed at the moment. Agents that are already authenticated can continue to report data until the authentication session expires. The data processing pipeline is technically unaffected by this outage as it is hosted on a different provider. However, given that agents are failing to authenticate (and therefore failing to submit traces), we expect this to cause lapses in Skylight data during the outage period.
User-reported problems for Skylight in the last 12 hours. It's a collection of user reports from different sources.
Check the current status of the components
With IsDown, you can monitor all your critical services' official status pages from one centralized dashboard and receive instant alerts the moment an outage is detected. Say goodbye to constantly checking multiple sites for updates and stay ahead of outages with IsDown.
No credit card required · Cancel anytime · 2512 services available
Quickly identify external outages that impact your business. We are monitoring more than 2500 services in real time.
Your team on top of problems
IsDown aggregates the information from the status pages of all your services, making it easy to monitor the health of all your services in one place. Say goodbye to managing each status page individually - our service simplifies the process.
No more wasting time. Uptime monitoring in real time
Say goodbye to wasting time trying to diagnose issues with your services - our 24/7 monitoring service does the work for you. We'll notify you if there is an incident, so you can focus on other tasks.
Receive alerts in your preferred channels
Our outage monitoring keeps you informed, no matter where you are. Get instant notifications in your email, Slack, Teams, or Discord when an outage is detected, so you can take action quickly.
Easily integrate with your current tools and workflows
Enhance your processes with more information using our integrations with Zapier, Webhooks, PagerDuty, and Datadog. Stay notified and in control. Upgrade your operations today.
Avoid notification clutter
Maximize your control with customizable notifications from each service. Filter by components and severity to only receive the most important updates. Streamline your processes and stay informed with our advanced notification features.
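As a rough illustration of what filtering by component and severity can look like, here is a minimal sketch. It is not IsDown's implementation or API: the `Incident` and `NotificationRule` types, the `should_notify` function, and the component names "Dashboard" and "Data Processing" are all hypothetical, and the severity ranking simply assumes the "Minor" and "Major" labels shown in the incident list on this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical ranking: assumes only the "Minor" and "Major" labels seen on this page.
SEVERITY_RANK = {"minor": 1, "major": 2}


@dataclass
class Incident:
    service: str
    component: str  # hypothetical component name, e.g. "Dashboard"
    severity: str   # "minor" or "major"


@dataclass
class NotificationRule:
    service: str
    # An empty set means "notify for any component of this service".
    components: set[str] = field(default_factory=set)
    min_severity: str = "minor"


def should_notify(incident: Incident, rule: NotificationRule) -> bool:
    """Return True when the incident passes the rule's service, component, and severity filters."""
    if incident.service != rule.service:
        return False
    if rule.components and incident.component not in rule.components:
        return False
    return SEVERITY_RANK[incident.severity] >= SEVERITY_RANK[rule.min_severity]


# Example: only alert on major incidents affecting the (hypothetical) Dashboard component.
rule = NotificationRule(service="Skylight", components={"Dashboard"}, min_severity="major")
incidents = [
    Incident("Skylight", "Dashboard", "major"),
    Incident("Skylight", "Dashboard", "minor"),
    Incident("Skylight", "Data Processing", "major"),
]
alerts = [incident for incident in incidents if should_notify(incident, rule)]
```

With this rule, only the first incident would produce an alert; the minor Dashboard incident and the major Data Processing incident are both filtered out.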
Multiple dashboards, shareable with the world
Create one dashboard for each of your teams/clients/projects and monitor only the services that each uses. Have a dedicated dashboard with custom notification settings. Easily make your dashboard public and share it with the world.
Prepare for scheduled maintenance
Never again be caught off guard by unexpected maintenance from your services. A feed of upcoming scheduled maintenance is available.
Weekly Digest of the services' outages
Every Monday, you'll receive a weekly summary of what happened the previous week as well as the maintenance schedule for the following week.
The data and notifications you need, in the tools you already use.
DevOps & On-Call Teams
You already monitor your internal systems. What about the external services? Monitor the services your business depends on. Don't waste time looking elsewhere when external outages are the cause of issues.
IT Support Teams
Detect external outages before your clients tell you. Anticipate possible issues and make the necessary arrangements. Proactive communication builds trust with clients and prevents a flood of support tickets.
5-minute setup, instant value for your team
Start with a trial account that will allow you to try and monitor up to 40 services for 14 days.
There are 2512 services to choose from, and we're adding more every week.
You can get notifications by email, Slack, and Discord. You can also use Zapier or Webhooks to build your workflows.
You'll start getting alerts when we detect outages in your external dependencies! No more wasting time looking in the wrong place!
Best practices when managing an outage
There’s never a good time for a service outage. And, from the moment it hits, it starts affecting your stakeholders. Suddenly, essential daily tasks are curtailed while your team enters emergency response mode. However, the surest way to mitigate damage and recover quickly is to follow a set of best practices.
Try it out! How much time will you save your team by keeping outage information close at hand?