Outage in Buttondown

General availability outage

Resolved Minor
November 13, 2020 - about 2 years ago - Lasted around five hours

Details

For around five hours (the early morning of 11/13, Pacific time), Buttondown's availability was heavily degraded.

## What happened?

Around 50-70% of requests timed out. It wasn't _quite_ a complete DDOS, but essentially so.

## Why did this happen?

This is... actually fairly silly, as far as these things go. An old third-party log handler that Buttondown was using shut off access at the logdrain I was using. (This is a totally reasonable thing to do!) _Unfortunately_, that clobbered a huge number of the requests being served, to the point where all the active dynos on my infrastructure were busy complaining and throwing errors because they couldn't emit logs. The irony of this does not escape me.

## Well, why did it take so long to fix?

I was asleep. No, really! That's the reason. I've got two thresholds for Buttondown outages:

1. The server is down for a little, which texts me.
2. The server is hard down for all requests, which calls and pages me.

This was an exceptionally long bout of the former, which meant I woke up to like seventy outage texts but no outright pages.

## Why won't this happen again?

First: I've upped (or lowered, depending on how you look at it) the threshold for what constitutes an outage. I made a lot of these alerts two years ago when Buttondown was a fraction of a fraction of its current size; thankfully, things are generally stable, but it's still time to be more alert. Any non-trivial breakage of traffic pages little old me.

Second: to fix the _actual_ issue, I'm spending some time this weekend messing around with the logging & error infrastructure Buttondown uses so that it degrades more gracefully.

Have any questions? Email me: [email protected]
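As an illustration of what "gracefully degrade" can mean for logging, here is a minimal Python sketch. It is not Buttondown's actual code, and the postmortem does not say which language or logging library is involved. `BestEffortHandler`, `BrokenDrain`, and `handle_request` are hypothetical names invented for this example. The idea is to wrap the handler that ships logs to a remote drain so that a failing drain costs a few dropped log lines rather than failed requests.

```python
import logging


class BestEffortHandler(logging.Handler):
    """Wrap another handler so its failures never reach the caller.

    After `max_failures` consecutive errors the wrapper stops calling the
    inner handler at all. A fuller version would retry after a cooldown.
    """

    def __init__(self, inner: logging.Handler, max_failures: int = 3):
        super().__init__()
        self.inner = inner
        self.max_failures = max_failures
        self.failures = 0  # consecutive failures seen so far

    def emit(self, record: logging.LogRecord) -> None:
        if self.failures >= self.max_failures:
            return  # circuit open: drop the record, keep serving traffic
        try:
            self.inner.emit(record)
            self.failures = 0
        except Exception:
            self.failures += 1  # swallow the error; logging is best effort


class BrokenDrain(logging.Handler):
    """Hypothetical stand-in for a log drain that has revoked access."""

    def __init__(self):
        super().__init__()
        self.attempts = 0

    def emit(self, record: logging.LogRecord) -> None:
        self.attempts += 1
        raise ConnectionError("log drain refused the connection")


def handle_request(logger: logging.Logger) -> str:
    """Hypothetical request handler that logs before responding."""
    logger.info("serving request")
    return "200 OK"


drain = BrokenDrain()
logger = logging.getLogger("sketch.best_effort")
logger.propagate = False  # keep the example isolated from the root logger
logger.setLevel(logging.INFO)
logger.addHandler(BestEffortHandler(drain, max_failures=3))

# Every request still succeeds even though the drain is failing.
responses = [handle_request(logger) for _ in range(10)]
```

This sketch only covers a drain that raises errors. A drain that hangs instead would still block the request, so a real setup would probably also move log shipping off the request path, for example with the standard library's `logging.handlers.QueueHandler` and `QueueListener`.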
Updates (most recent first)
RESOLVED at 11/13/2020 05:02PM

