The “beeminder.com” domain is registered with Namecheap, with DNS pointing to Linode (ns*.linode.com as nameservers) where we have our zonefiles.
We’ve been having rare but ongoing DNS failures, which we now think are Linode’s fault. Last night, it became hair-on-fire acute for a while, with emails to @beeminder.com addresses mostly bouncing:
It seems that that was due to the following Linode outage:
We’re pretty angry about how consistently flaky Linode has been with DNS so probably @adamwolf is going to move us to something else. (Opinions solicited on what!)
In parallel, I’m super freaked out about the email problem. Our ESP is Mailgun and I’ve not been exactly thrilled with them for a while. What do you think of the following idea for another layer of protection:
Over the last 4 months, our DNS has failed to resolve about 0.05% of the time. This is from monitoring where each check runs from a single server, randomly chosen from a wide variety of data centers (diverse both geographically and in terms of AS), configured for DNS testing.
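To put that 0.05% in perspective, here is a rough back-of-the-envelope calculation. It assumes four months is about 120 days and that the failed checks are representative of wall-clock time, neither of which comes from the monitoring itself.

```python
FAILURE_RATE = 0.0005  # 0.05%, the figure quoted above
DAYS = 120             # assumption: "4 months" is roughly 120 days


def downtime_minutes(failure_rate, days):
    """Convert a failure fraction over a period into equivalent minutes."""
    return failure_rate * days * 24 * 60


print(round(downtime_minutes(FAILURE_RATE, DAYS), 1), "minutes")
```

Under those assumptions that works out to roughly 86 minutes of unresolvable DNS spread over the four months.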
I do not have the stats handy for response time. Typically, it’s pretty quick, but it is not unheard of for us to have more than 2 seconds response time! I don’t like this.
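For anyone who wants to collect response-time numbers themselves, a minimal sketch of a single check might look like the following. This is not the monitoring described above: `check_dns` and `SLOW_THRESHOLD_SECONDS` are made-up names, and `socket.gethostbyname` goes through the local system resolver (with whatever caching it does), so it measures something looser than a direct query to the authoritative nameservers.

```python
import socket
import time

SLOW_THRESHOLD_SECONDS = 2.0  # the "more than 2 seconds" figure mentioned above


def check_dns(hostname, resolver=socket.gethostbyname, clock=time.monotonic):
    """Resolve hostname once and return (status, elapsed_seconds).

    status is "ok", "slow" (resolved but over the threshold), or "failed".
    resolver and clock are injectable so the function can be tested offline.
    """
    start = clock()
    try:
        resolver(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return "failed", clock() - start
    elapsed = clock() - start
    status = "slow" if elapsed > SLOW_THRESHOLD_SECONDS else "ok"
    return status, elapsed
```

Calling `check_dns("beeminder.com")` on a schedule and logging the result would give a crude failure rate and response-time distribution from one vantage point.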
I have a good amount of experience with Route53, and a fair amount of experience with Cloudflare. Last night while investigating this, I ran into Namecheap’s PremiumDNS, which seems pretty great.
We don’t do anything fancy with Beeminder’s DNS, even with our load balancing.
Since we don’t do anything fancy or dynamic with AWS, I’m not especially inclined to pick Route53 over Cloudflare, and for the same reason I’m not against trying Namecheap’s PremiumDNS or something similar.
I think everything I know of for email deliverability is for outgoing deliverability. I asked around to see if folks know things for incoming deliverability… you’d think there’d be something!
Mhmm, same. I’ve had it in my backlog to switch away from them for a while. Was considering rolling something with AWS, but not sure that’s the right choice, either.
Very eager to see what Mailgun alternatives you find, @narthur! Now I’m wondering what @malcolm uses for Complice, too.
And if anyone wanted to be Beeminder’s upbuddy, including implementing the daemon, that would be amazing and we could pay generously in premium credit for that…
Essentially it sends an email containing a JSON object to all of the receivers at a set interval. Each receiver listens for incoming emails and replies with a similar JSON object. The Upbuddy server keeps track of everything and does, uh, something, when a reply takes too long (or at least that’s the idea, it doesn’t work yet).
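The bookkeeping half of that idea can be sketched roughly as below. This is not the actual Upbuddy code: the `PingTracker` class, the `type`/`id`/`sent_at` JSON fields, and the 300-second default timeout are all assumptions for illustration, and the actual sending and receiving of email is left out entirely.

```python
import json
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class PingTracker:
    """Tracks outstanding pings and reports receivers whose reply is overdue."""

    timeout_seconds: float = 300.0  # assumed threshold, not from the post
    pending: dict = field(default_factory=dict)  # ping id -> (receiver, sent_at)

    def make_ping(self, receiver, now=None):
        """Build the JSON body for an outgoing ping and record when it was sent."""
        now = time.time() if now is None else now
        ping_id = uuid.uuid4().hex
        self.pending[ping_id] = (receiver, now)
        return json.dumps({"type": "ping", "id": ping_id, "sent_at": now})

    def handle_reply(self, body, now=None):
        """Return round-trip seconds for a valid reply, or None for anything else."""
        now = time.time() if now is None else now
        try:
            msg = json.loads(body)
        except json.JSONDecodeError:
            return None  # not JSON at all, e.g. spam
        if not isinstance(msg, dict) or msg.get("type") != "pong":
            return None
        ping_id = msg.get("id")
        if not isinstance(ping_id, str) or ping_id not in self.pending:
            return None
        _, sent_at = self.pending.pop(ping_id)
        return now - sent_at

    def overdue(self, now=None):
        """Receivers with at least one ping older than the timeout."""
        now = time.time() if now is None else now
        return sorted({receiver for receiver, sent_at in self.pending.values()
                       if now - sent_at > self.timeout_seconds})


def make_reply(ping_body):
    """What a receiver would send back: a similar JSON object echoing the id."""
    return json.dumps({"type": "pong", "id": json.loads(ping_body)["id"]})
```

Matching replies on a random id means unrelated mail is simply ignored, and whatever the "something" ends up being (an alert, a log entry) can hang off `overdue()`.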
edit: somehow spammers have already found my testing mail server and started spamming it (I don’t have any anti-spam features enabled), wow
You can see that there was a brief outage at 2020-10-20, 11:41:18 a.m., and a longer outage from 1:21 to 2:01. It still has a lot of important stuff missing, but it’s getting there!
FWIW, my company’s “stack” (not sure if the term applies here) is:
Servers: Linode
ESP (for transactional messages): Mailgun
DNS: Hexonet
Third-party uptime checks: UptimeRobot
I don’t sign onto the beeminder forums often apparently lol but I use Mailgun for the fairly small amount of mail that Complice sends & receives, and I have no particular issues with it. They overhauled their API a couple years ago which was nice as then I could remove an old messy library from my code.