Beeminder Forum

Bug Buckets aka DEFCON for bugs, also fairness

I improved our bug classification system and this kind of thing is so fun for me that I’m sharing with you all:

  1. BITTY BUGS are barely bothersome.
  2. BANEFUL BUGS make Beeminder blatantly wrong, but not in any breach-of-contract way, unlike…
  3. BUM-STEER BUGS which may make you derail by leading you astray about the state of your graph, or, worse:
  4. BAMBOOZLE BUGS making our marketing mendacious or fallacious. And finally, the truly unconscionable,
  5. BURGLE BUGS which would charge you money you didn’t explicitly agree to pay!

(Yes, I’ve basically read every b-word in the dictionary to come up with that. We all need hobbies, ok?)

Recent examples of bitty bugs include #3539 and #3480; recent examples of baneful bugs include #3552 and #3541. I’d say most of our bug fixes – grep “#bugfix” at beeminder.com/changelog – are somewhere in between bitty and baneful. Just today we added one (#3567) that was in between baneful and bum-steer.

We designated #3547 & #3548 bamboozle bugs because there were paid premium features that couldn’t be used simultaneously (huge thanks to @mary on that!).

Burgle bugs are rare enough that I’ve failed to find one in the changelog, though I suspect there are a few in there. Relatedly…

Burgle Bug Fairness Principle: If we have a bug that incorrectly charges someone and we catch it, we can just refund it. If the user catches it – before we say anything to them – then we have to refund twice the amount we wrongly charged. “Wrongly” means a bug that amounted to stealing from them, i.e., a burgle bug. This doesn’t apply to things like non-legit derailments, even those caused by bum-steer bugs. Because even for an egregious bum-steer bug, the derailment did happen and we do explicitly ask “was this legit?” so there’s not much burgliness to it or possibility for it to go unnoticed. (And to be clear, we do refund those!)
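The rule above can be sketched in a few lines. This is only an illustration, assuming amounts are kept in integer cents; the function and parameter names are hypothetical and not Beeminder's actual code.

```python
def burgle_refund_cents(wrongly_charged_cents: int, user_caught_it_first: bool) -> int:
    """Refund owed for a burgle bug under the fairness principle (illustrative only)."""
    if wrongly_charged_cents < 0:
        raise ValueError("wrongly charged amount cannot be negative")
    # Double refund if the user notices before we say anything; plain refund otherwise.
    multiplier = 2 if user_caught_it_first else 1
    return multiplier * wrongly_charged_cents
```

Note that this applies only to burgle bugs. Non-legit derailments, including those caused by bum-steer bugs, get an ordinary refund and would not go through this function.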

(It actually really irks me that the double-refund thing is not standard for all businesses. If someone makes an error in their own favor it’s not remotely enough to be like “Oh, haha, sorry, here’s the money back that we would’ve happily stolen if you hadn’t noticed!” We need to teach game theory in high school, people!)

Wow, never thought about that, but I love it. This definitely needs to be how I handle things for TaskRatchet, too. Thankfully, I’m not aware of any TaskRatchet burglaries yet! :sweat_smile:

Burgle bugs are rare enough that I’ve failed to find one in the changelog, though I suspect there are a few in there.

UVI #1001, #1084, #2027, #2035, #2466, #2595 (kinda, the price is displayed wrong so I’m counting it as one). And if you count charging the wrong credit card (you agreed to pay, but not via that credit card): #700, #1290. ~0.2% of UVIs fix burgle bugs.
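A quick check of that percentage, as a rough sketch. The total UVI count is an assumption here: it uses #3567, the highest UVI number mentioned in this thread, as a stand-in for the total.

```python
burgle_uvis = [1001, 1084, 2027, 2035, 2466, 2595]
wrong_card_uvis = [700, 1290]

# Assumption: roughly 3567 UVIs in total (the highest number cited in this thread).
total_uvis = 3567

share = (len(burgle_uvis) + len(wrong_card_uvis)) / total_uvis
print(f"{share:.2%}")
```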

Ha, nice! Should’ve occurred to me to grep for “refunds” (or “oy”)! :grinning:

I don’t know about #2035 since that sounds like net zero overcharging in expectation… (I guess the next one, #2466, sounds similar, and #2595 is definitely not burgle. Could even be as low as bitty, being a subtle-ish typographical thing, though probably I’d say baneful for that).

The first 3 definitely though! Thanks for finding those examples!

PS: Those last 2 – #700 and #1290 – are the exact opposite: what we call moneyburning bugs (oh crap, what’s a synonym for “moneyburning” that starts with b?) where we fail to charge you when we should.

(oh crap, what’s a synonym for “moneyburning” that starts with b?)

Bust bugs?

Ha, not bad. More brainstorming: bungle bug, bezzle bug, bloodbath bug, beneficent bug?, bankbreaker, …

I wasn’t too serious about adding a bug level for those though. The bitty→baneful→bum-steer→bamboozle→burgle hierarchy is all focused on severity from a user perspective. If we have a :money_with_wings: bug, that’s probably just level 2 – baneful – if we’re failing to charge for premium and level 4 – bamboozle – if we’re failing to charge for derailments (“taking your money” being the fundamental service Beeminder offers!) and we can further prioritize based on business reasons as needed.

Fun game theory discussion with @adba, reproduced here with permission, and maybe mostly as notes for myself for a footnote in the eventual blog post about the Burgle Bug Fairness Principle:

ADBA: Game theory doesn’t imply strictly 2x, as you know. Good email, good classification system.

DREEV: ah, thank you! important thing to point out! the refund should be multiplied by the reciprocal of the probability of the customer noticing – maybe calculated as the fraction of customers that notice, if we want to be frequentist about it?

but of course we’re bayesians and by the principle of indifference we could take 1/2 as the prior and thus double-refunds as a starting point.

or just say double refunds because of focality aka schelling-pointiness and simplicity and not instantly glazing over the eyes of everyone we’re trying to convince about the burgle bug fairness principle :slight_smile:

not to mention pareto-dominating the status quo.

but all this will make a great footnote at the very least!

(did i pack in enough game theory concepts there to restore my credibility?)

ADBA: You didn’t lose any credibility in the first place, and I agree with your reasoning for a half being a perfectly sensible default. :slight_smile:

I don’t think it’s fundamentally a bayesian vs frequentist question either, and I don’t think “reciprocal of the probability” is the right algorithm. I’d be tempted to cast it as “a refund, plus a reasonable-seeming penalty we impose on ourselves for having let it get that far and as part of restoring warm fuzzies all around”; I’m yet to see an analysis that seems like more than an attempt to use formalisms to slightly sidestep the fuzziness without actually dispelling it. :slight_smile:

DREEV: :slight_smile: i think you’re right that bayesianism vs frequentism isn’t actually at issue.

quick rationale for “reciprocal of the probability”: suppose we want to fairly redress the burglary in the sense of giving back all the money we burgled, in expectation. if 1 out of n users notice and we refund each of those users n times what we stole then overall, on average, we’ll have refunded 100% of the ill-gotten booty.
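The arithmetic behind that rationale can be sketched as follows. It assumes every affected user was overcharged the same amount and that only users who notice get refunded; the function names are hypothetical.

```python
from fractions import Fraction


def reciprocal_multiplier(users_affected: int, users_who_notice: int) -> Fraction:
    """Reciprocal of the fraction of affected users who notice."""
    return Fraction(users_affected, users_who_notice)


def refunded_share(users_affected: int, users_who_notice: int, multiplier) -> Fraction:
    """Share of the total overcharge paid back when only noticers are refunded."""
    return Fraction(users_who_notice) * multiplier / users_affected


# If 25 of 100 affected users notice, the reciprocal multiplier is 4 ...
m = reciprocal_multiplier(100, 25)
# ... which pays back the full overcharge in aggregate,
full = refunded_share(100, 25, m)
# whereas a flat double refund would pay back only half of it.
half = refunded_share(100, 25, 2)
```

With exactly half of users noticing, the reciprocal multiplier comes out to 2, which is the double-refund case.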

of course our intention would be to more-than-refund users who notice as well as exactly-refund everyone who didn’t notice. so that’s supererogatory in one sense but feels important, incentives-wise, i guess because those who notice shouldn’t just believe us that we’re also refunding everyone who didn’t notice.

(this is turning out to be quite valuable to think through with you! thank you again!)

ADBA: I’ll buy that “reciprocal of the probability” has roughly the right shape in some ways (ie, there’s a reasonable case to make that the right ratio isn’t entirely inelastic to the percentage of people who notice), but it falls down badly in several ways:

a) The limits. If people had a 100% chance of noticing, you’d probably still want to give them more than what they lost back. If people had a 0% chance of noticing, you’re not going to give them infinite money (though if you had a way of getting infinite money, crafting a userbase with a 0% chance of noticing to introduce infinite money into the world would be an interesting problem!)

b) Near the limits. Giving 102% or 100.01% back could easily be taken as annoying / insulting / worse than 100% by a fairly high fraction of people, and wouldn’t seem “fair”/reasonable. There’s a huge body of literature about how people will do things like decline to get $1 if the person offering gets $9 of $10 and chose the split and the alternative is that they both get nothing, etc. It’s not an identical scenario, but the notion of “too small being unfair and literally worse than nothing” seems to show up pretty regularly.
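To make points (a) and (b) concrete, here is a small sketch of how the pure reciprocal rule behaves at the extremes, next to one possible clamped variant. The clamp, including its floor of 2 and cap of 10, is purely a hypothetical illustration and not something anyone in this thread proposed.

```python
def reciprocal_multiplier(p_notice: float) -> float:
    """Pure reciprocal rule; undefined at p = 0 and only 1x at p = 1."""
    if not 0.0 < p_notice <= 1.0:
        raise ValueError("p_notice must be in (0, 1]")
    return 1.0 / p_notice


def clamped_multiplier(p_notice: float, floor: float = 2.0, cap: float = 10.0) -> float:
    """Hypothetical variant: never below `floor`, never above `cap` (placeholder values)."""
    return max(floor, min(cap, reciprocal_multiplier(p_notice)))
```

At a 100% chance of noticing, the pure rule gives no penalty at all (1x), and as the chance of noticing shrinks it grows without bound. The clamp is one way to avoid both problems, at the cost of picking two arbitrary numbers.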

I agree that a > 100% incentive on yourselves to get this right is a useful thing to have – both for the same reasons that beeminder is useful to people in the first place, and because it’s the kind of nice-and-rational thing that’s part of why you guys are so awesome.

(Gladly!)

DREEV: ha, fair, though we don’t have to worry about the 0%/infinity case. if a user does complain then that rules out 0% as the probability! (recall that the burgle bug fairness principle only requires double-refunds if the user notices before we’ve said anything to them. if we catch it before they do, a normal 100% refund is ok.)

i’m on board with the other points though, like not seeming stingy (sting-y yes, stingy no). perception-of-fairness is a big part of all this.

Have you read David Friedman’s excellent book, Legal Systems Very Different from Ours? It’s an amazing book, I can’t recommend it highly enough. See also slatestarcodex’s book review, and this video of the SSC meetup a few weeks ago where the author, David Friedman, gave an excellent talk summarizing a few of the book’s highlights.

(I’ll also recommend the online SSC meetups we’re having nowadays. They’re predictably great. The upcoming one is on Sunday, November 8th, with Sam Altman.)

Anyway: the book has some interesting things to say on the topic of probability multipliers for damages. Most of the book’s chapters describe specific interesting legal systems (be it Imperial Chinese law, Jewish law, pirate law, or law in saga-period Iceland). Interspersed with these, there are chapters describing interesting commonalities; concepts that reoccur in different contexts. These include everything from “When God is the Legislator”, to “The Problem of Error”, to (relevant here), “Enforcing Rules”.

There are, of course, many different ways to enforce rules. One of them is a system of torts: if you harm me (say, if you steal from me), I can bring a claim against you for damages, and then (if I am judged to be in the right), you have to pay me compensation for the damage you did to me. This is a good system! But the incentives are less than perfect. If the worst thing that can happen to a thief is that he has to give back what he stole, there is no reason not to steal all you like: if you get away with it, great, and if not, no harm done, you just need to return the stolen goods!

One solution is indeed a probability multiplier, to fix exactly this problem. But that’s got its downsides too. For one, the tortfeasor may be limited in what he or she can pay: the probability multiplier can’t effectively be high enough to make the damages more than everything the tortfeasor owns.
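A toy model of the incentive argument, as a sketch only. It assumes the thief keeps the full value unless caught, and if caught pays damages of the multiplier times the value, limited by what the thief owns. The names and the model itself are my simplification, not taken from the book.

```python
def expected_gain_from_theft(value, p_caught, multiplier, thief_wealth=float("inf")):
    """Thief's expected net gain under the simplified model described above."""
    # Damages cannot exceed everything the tortfeasor owns.
    damages = min(multiplier * value, thief_wealth)
    return value - p_caught * damages


# Plain restitution (multiplier 1): stealing pays in expectation.
plain = expected_gain_from_theft(100, 0.25, 1)
# Probability multiplier 1 / p_caught = 4: expected gain drops to zero.
multiplied = expected_gain_from_theft(100, 0.25, 4)
# Same multiplier, but the thief owns only 200: stealing pays again.
capped = expected_gain_from_theft(100, 0.25, 4, thief_wealth=200)
```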

For another, it can lead to synthetic torts. To quote from the book:

Late at night, as your car comes around the bend, I shove mine into the road and hastily stand back. When the dust has cleared and you have gotten free of the wreckage—I considerately staged my fake accident on a slow road so that you would survive to be sued—I commence a legal action, claiming five times the value of my car on the grounds that it was only by great good luck that you did not succeed in escaping after smashing my car and that four friends of mine just happened to be lurking in the underbrush to witness the accident and testify against you. In order to frame you I had to create a real accident that really destroyed my car, so it was made profitable only by the existence of a probability multiplier.

That’s quite a serious issue. Nevertheless, there are ways to try to get around these issues. Legal Systems Very Different from Ours has quite a few things to say about this and other such mechanisms, especially in the context of the way many historical (and present) legal systems have tried to resolve them. If this at all interests you, it’s a book I strongly recommend. Or start with the slatestarcodex book review, which is as excellent as slatestarcodex always is.
