14-day free trial · No credit card required
Check the stats and details of the latest Skylight outages and issues
Minor Resolved · about 1 month ago · lasted about 4 hours
One of our processors is encountering errors while processing traces, so we have taken the worker offline to work on a fix. This is affecting a portion of our customers. If your app is affected, you may see missing data on the Skylight dashboard starting from around 20:50 UTC. We expect the missing data to be filled in once we resume processing.
Minor Resolved · 3 months ago · lasted 6 minutes
Problems with an upstream provider (Heroku) have taken down our website as well as agent authentication. It does not appear to affect all customers uniformly. We are still able to resolve and connect to the site from some locations as of now. The Skylight collector is still operational and processing data from agents that were already authenticated. We are monitoring the situation. Upstream incident: https://status.heroku.com/incidents/2454
Major Resolved · 4 months ago · lasted about 3 hours
Problems with an upstream provider (Heroku) have taken down our website as well as agent authentication. The Skylight collector is still operational and processing data from agents that were already authenticated. We are monitoring the situation. Upstream incident: https://status.heroku.com/incidents/2453
Minor Resolved · 5 months ago · lasted about 1 hour
We are investigating a "clog" in the data processing pipeline that is causing one of the worker servers to be "stuck". This only affects a portion of our customers. If you are affected, you will see data missing from the Skylight dashboard for the last few hours (it is slowly populating). At this stage, we believe the data has been safely received in the queue for processing, and once we resolve the issue causing the "clog" we will be able to backfill the missing data.
Major Resolved · 5 months ago · lasted 38 minutes
Our automation failed to renew or replace the SSL certificates before they expired. We are currently replacing the certificates manually. In the meantime, both the Skylight dashboard and the data collection servers (used by the agents to submit traces) are inaccessible. We are sorry for the inconvenience.
Minor Resolved · 6 months ago · lasted 35 minutes
Data Processing Partial Backlog
A subset of our customers are experiencing a delay in data processing. We've identified the cause and are working on a fix.
Minor Resolved · 11 months ago · lasted 3 months
Heroku SSL Service Degradation
The Skylight dashboard is currently inaccessible due to a configuration issue. This outage also affects agent authentication: new authentications from agents will not succeed at the moment. Agents that are already authenticated can continue to report data until their authentication session expires. The data processing pipeline is technically unaffected by this outage, as it is hosted on a different provider. However, because agents are failing to authenticate (and therefore failing to submit traces), we expect gaps in Skylight data during the outage period.
Check the current status of the components
Have you ever missed an important outage from a third-party service? We've built IsDown so you never miss an outage again. It's the easiest way to monitor all your SaaS and cloud providers and get alerted when an outage impacts your business.
No credit card required · Cancel anytime · 2024 services available
Quickly identify external outages that impact your business. We are monitoring more than 2000 services in real time.
Bird's-eye view of all your services' statuses
Check an aggregated status page for all your services in one place. No more visiting each status page and managing them individually.
Outage monitoring in real time
We monitor 24 hours a day, 7 days a week and will notify you if there is an incident. No more wasting time trying to figure out why something isn't working.
Alerts in your favorite channels
Get instant notifications in your email, Slack, Teams, or Discord when we detect a service outage. Outage monitoring where you are already doing your work.
Easily integrate with your current tools and workflows
Using Zapier or Webhooks, you can easily integrate notifications into your processes. PagerDuty integration is also available.
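As a rough illustration of what a webhook integration could look like on the receiving side, the sketch below turns an incoming JSON body into a one-line alert message. The payload fields (service, status, title) and the function name format_alert are assumptions made for this example only; they are not IsDown's documented webhook format, so check the actual payload before relying on any field name.

```python
import json


def format_alert(body: str) -> str:
    """Build a short alert line from a webhook body.

    The field names below are hypothetical placeholders, not a documented schema.
    """
    event = json.loads(body)
    service = event.get("service", "unknown service")
    status = event.get("status", "unknown status")
    title = event.get("title", "")
    line = f"[{status.upper()}] {service}"
    # Append the incident title only when one was provided.
    return f"{line}: {title}" if title else line
```

A handler like this would typically sit behind whatever HTTP endpoint receives the webhook and pass the resulting line on to a chat or paging tool.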
Avoid notification clutter
Configure which notifications you want to receive from each service. Filter notifications by service components. You can opt to receive notifications only when a specific component is affected. You can also choose to receive notifications with a certain severity.
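The filtering described above can be pictured as a small rule check per service. The following is a minimal sketch under stated assumptions: the Notification and ServiceRule types, the should_notify function, and the two-level severity ranking (borrowed from the "Minor" and "Major" labels on this page) are all hypothetical and do not reflect how IsDown implements its settings.

```python
from dataclasses import dataclass, field

# Assumed severity ordering, modeled on the "Minor"/"Major" labels above.
SEVERITY_RANK = {"minor": 1, "major": 2}


@dataclass
class Notification:
    service: str
    component: str
    severity: str  # "minor" or "major"


@dataclass
class ServiceRule:
    # An empty set means "any component of this service".
    components: set = field(default_factory=set)
    min_severity: str = "minor"


def should_notify(n: Notification, rules: dict) -> bool:
    """Return True if the notification passes the rule for its service."""
    rule = rules.get(n.service)
    if rule is None:
        return False  # no rule configured for this service
    if rule.components and n.component not in rule.components:
        return False  # component is not one the user asked about
    return SEVERITY_RANK[n.severity] >= SEVERITY_RANK[rule.min_severity]
```

For example, a rule of ServiceRule(components={"Data Processing"}, min_severity="major") for Skylight would let a major Data Processing incident through while dropping minor ones and incidents on other components.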
Have multiple dashboards. Easily shareable with the world.
Create one dashboard for each of your teams/clients/projects. Monitor only the services that each uses. Dedicated dashboard with custom notification settings. Easily make your dashboard public and share it with the world.
Prepare for scheduled maintenance
Never again be caught off guard by unexpected maintenance from your services. A feed of upcoming scheduled maintenance is available.
Weekly digest of your services' outages
Every Monday, you'll receive a weekly summary of what happened the previous week as well as the maintenance schedule for the following week.
DevOps & On-Call Teams
You already monitor your internal systems. What about the external services? Monitor the services your business depends on. Don't waste time looking elsewhere when external outages are the cause of issues.
IT Support Teams
Detect external outages before your clients tell you. Anticipate possible issues and make the necessary arrangements. Proactive communication builds trust with clients and prevents a flood of support tickets.
5-minute setup, instant value for your team
Start with a trial account that will allow you to try and monitor up to 40 services for 14 days.
There are 2024 services to choose from, and we're adding more every week.
You can get notifications by email, Slack, and Discord. You can also use Zapier or Webhooks to build your workflows.
You'll start getting alerts when we detect outages in your external dependencies! No more wasting time looking in the wrong place!
Try it out! How much time will you save your team by keeping outage information close at hand?