Know when your cloud services are down

IsDown monitors the status of Buttondown and more than 1000 other services.
Monitor the services that impact your business. Get an alert when they go down.

Buttondown

Buttondown Status

Buttondown not working for you? Report it!

Stats

0 incidents in the last 7 days

0 incidents in the last 30 days

Automatic Checks

Last check: 2 minutes ago

Last known issue: 5 months ago

Latest Incidents

Last 30 days

[Chart: incidents per day, 26/06 to 26/07]

Resolved DNS is not resolving in US-EAST

This is a DNSimple issue, unfortunately: https://dnsimple.statuspage.io/incidents/bcnzxs0mgz21=

Resolved General availability outage

For around an hour this morning, Buttondown had significantly degraded availability.

## What happened?

New hosts refused to spin up and were "correctly" throwing 500s for around 30% of requests (this was only impacting hosts that were automatically cycling in and out, which is why it wasn't all requests.)

## Why did this happen?

I'm using an undocumented Notion API to power documentation search, and the token that I was using to power that API expired in a way that I was not defensively programming against. This meant that each time the server tried to restart it would hit the API, fall over, and then pass that failure onto the client. As soon as this happened widespread enough, I got an alert for it... but I was out on a run. As soon as I got back, I hit the circuit breaker for that codepath and things got back to normal.

## Why won't this happen again?

That circuit breaker is gonna stay off for a little, but I plan on moving all of that compilation to a build-time step anyway, removing the Notion codepath from the critical path of the application!

## Any questions?

Email me: [email protected]
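The failure mode in this report (a call to an optional third-party API taking the whole server down) is the kind of thing a circuit breaker with a fallback guards against. The following is a minimal sketch of that general pattern, not Buttondown's actual code: `CircuitBreaker`, `load_search_index`, and the thresholds are all hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Stops calling a failing dependency after repeated errors."""

    def __init__(self, max_failures: int = 3, reset_after: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before retrying
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    @property
    def is_open(self) -> bool:
        """True while calls are being short-circuited to the fallback."""
        if self._opened_at is None:
            return False
        # Once the cool-down has elapsed, let one trial call through.
        return self._clock() - self._opened_at < self.reset_after

    def call(self, func: Callable[[], T], fallback: T) -> T:
        """Run func, returning fallback instead of raising on failure."""
        if self.is_open:
            return fallback
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            return fallback
        # Success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


def load_search_index() -> list:
    # Hypothetical stand-in for the documentation-search API call.
    raise PermissionError("token expired")


breaker = CircuitBreaker(max_failures=2)
# The server starts with an empty index instead of failing to boot.
index = breaker.call(load_search_index, fallback=[])
```

The design choice here is that a broken optional feature degrades to an empty result rather than failing the request or the restart.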

Resolved General availability outage

For around five hours (meaning the early morning of 11/13, Pacific time) Buttondown's availability was heavily degraded.

## What happened?

Around 50-70% of requests timed out. It wasn't _quite_ a complete DDOS, but essentially so.

## Why did this happen?

This is... actually fairly silly, as far as these things go. An old third-party log handler that Buttondown was using shut off access at the logdrain I was using. (This is a totally reasonable thing to do!) _Unfortunately_, that clobbered a huge amount of the requests being served, to the point where all the active dynos on my infrastructure were busy complaining and throwing errors because they couldn't emit logs. The irony of this does not escape me.

## Well, why did it take so long to fix?

I was asleep. No, really! That's the reason. I've got two thresholds for Buttondown outages:

1. The server is down for a little, which texts me.
2. The server is hard down for all requests, which calls and pages me.

This was an exceptionally long bout of the former, which meant I woke up to like seventy outage texts but no outright pages.

## Why won't this happen again?

First: I've upped (or lowered, depending on how you look at it) the threshold for what constitutes an outage. I made a lot of these alerts two years ago when Buttondown was a fraction of a fraction of its current size; thankfully, things are generally stable, but it's still time to be more alert. Any non-trivial breakage of traffic pages little old me.

Second: to fix the _actual_ issue, I'm spending some time this weekend messing around with the logging & error infrastructure Buttondown uses to more gracefully degrade.

Have any questions? Email me: [email protected]
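One way to make logging degrade gracefully, as the report describes wanting, is to treat the remote log sink as best-effort so that its failure never reaches request handling. The sketch below uses Python's standard `logging` module; `BestEffortHandler`, the `send` callable, and the failure limit are hypothetical and not taken from Buttondown's code.

```python
import logging
from typing import Callable


class BestEffortHandler(logging.Handler):
    """Forwards records to a sink, swallowing any error the sink raises."""

    def __init__(self, send: Callable[[str], None], max_failures: int = 5) -> None:
        super().__init__()
        self._send = send
        self._max_failures = max_failures
        self.failures = 0      # consecutive failed sends
        self.tripped = False   # True once the sink has been given up on

    def emit(self, record: logging.LogRecord) -> None:
        if self.tripped:
            return  # Sink is considered dead: drop the record cheaply.
        try:
            self._send(self.format(record))
            self.failures = 0
        except Exception:
            # Never propagate a logging failure to the caller.
            self.failures += 1
            if self.failures >= self._max_failures:
                self.tripped = True


def drain_is_gone(line: str) -> None:
    # Hypothetical sink that simulates a log drain that was shut off.
    raise ConnectionError("log drain refused the connection")


logger = logging.getLogger("app")
logger.propagate = False
logger.addHandler(BestEffortHandler(drain_is_gone))
logger.warning("request served")  # does not raise even though the sink is down
```

After `max_failures` consecutive errors the handler stops trying altogether, so a dead sink costs each request almost nothing.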

Resolved Broken tracking links

I tried to switch our tracking links over to SSL and it looks like something went awry. I need to follow up with CloudFlare as to what the issue was (it was likely a misconfiguration on my end, in all honesty, but hard to say). For around two hours, all tracked links were broken because they pointed to https and not http. I've reached out to the folks who sent outbound emails during that time, and I'm in the process of backfilling the transactional emails (such as subscriber confirmations) that were sent during that window!

about 1 year ago Official incident report

Resolved 500s on Frontend Application

GitHub is running into [issues](https://www.githubstatus.com/incidents/80d0cs6kpsps) and Buttondown's CI pipeline wasn't running properly, leading to a feature branch getting deployed to production which caused breakages for a few minutes.

over 1 year ago Official incident report

How it works

  1. Create an account

    Start with a free account that lets you monitor up to five services. Sign up via Google, Slack, or email.

  2. Select your tools

    There are 1100 services to choose from, and we're adding more every week.

  3. Set up notifications

    Get notifications by email or Slack, or use Zapier or Webhooks to build your own workflows.

  4. You're ready!

    You'll never be caught by surprise again. Every time one of your tools has a problem, you'll get a notification ASAP.

Unified tools status

Check the status of all your tools in one place

IsDown integrates with hundreds of services and takes away the hassle of visiting each status page and tracking them one by one. We also help you control how you receive notifications.
We monitor all problems and outages and keep you posted on their current status in near real time.

Notifications in realtime

Get notifications in your favorite communication channel

You can easily get notifications by email or Slack, or use Webhooks and Zapier to bring service status into your workflows.

Email
Slack
Zapier
Webhooks
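As a rough illustration of what a webhook-driven workflow could look like, the sketch below turns an incoming notification body into a one-line alert. The payload shape is an assumption: the field names `service`, `status`, and `title` are hypothetical and are not taken from IsDown's documentation, so check the real payload format before relying on them.

```python
import json


def format_alert(raw_body: bytes) -> str:
    """Builds a short alert line from a JSON webhook body.

    The keys read here (service, status, title) are hypothetical
    placeholders, not a documented IsDown payload schema.
    """
    event = json.loads(raw_body)
    service = event.get("service", "unknown service")
    status = event.get("status", "unknown")
    title = event.get("title", "")
    line = f"[{status.upper()}] {service}"
    return f"{line}: {title}" if title else line
```

A function like this can sit behind any HTTP endpoint and pass its result on to a chat channel or ticketing system.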

React faster

Help your teams with more data

Engineering

You already monitor internal systems. Add external systems as another dimension of your monitoring data, so you can see when the cause lies outside your own stack.

Customer Support

Know before your clients tell you. Anticipate possible issues and make the necessary arrangements.

Marketing

One of your competitors is down? Maybe a good time to spread the word about your service.

Trusted by teams from all over the world

Services Available: 1100
Incidents: 63630
Ongoing Incidents: 106

Start monitoring your tools today

Start today for FREE