Derailing Is Not Failing; or, Beeminder Revenue Proportional To User Awesomeness

matti · March 2, 2019, 1:44pm

Maybe the key to seeing Beeminder’s incentives as wholly non-perverse is to remember that derailing does not equal failing. Derailing is a lot of things — a kick in the pants, paying for an immediate break because the goal has been really hard, valuable information about how realistic the goal is — none of which have anything to do with failing. The only ways to fail are to quit the goal, to set the yellow brick road to something stupidly easy, or to cheat.

That’s also the key of the article: it presupposes a certain definition of “failing” and then goes on to say that derailing is no such thing. In my opinion, what counts as failing is a very subjective question that can’t really be dictated by Beeminder, because it’s a matter of opinion (and emotion) and might change not only from individual to individual but also from goal to goal, day to day, etc.

I know Beeminder is a company and needs or at least wants to have a set of applicable principles (a (predictive) theory of beeminder) to gauge stuff like revenue and user engagement, but I think this is also a prime case of removing unknowables by presupposing certain definitions and then working under the assumption that the unknowables now have become knowable. I would even suggest that many users couldn’t even articulate why something feels like a failure, which makes following @dreev’s definition even harder.

To say it in an image: The article claims to be a map of a territory but it’s actually a blueprint for a terraforming project, since it doesn’t describe the landscape which is unexplored. Instead it draws up a class of use cases and tries to rework the discursive landscape in which it acts (ergo: convince other people that what they see is indeed what @dreev describes if we only put a fresh coat of paint on it. And remove this detail. And add this extra thing over here. And so on.).

But also that’s basically how you get to a predictive theory of anything: You control the discourse by standardization (of definitions) and so on… So I don’t even know if it’s a bad thing in that sense, but we also lose complexity (and in this case a lot of it) by clearing things up in that way…

zedmango · March 2, 2019, 6:37pm

This is confusing short-term and long-term. If they’re derailing a lot, that’s high revenue short term, which might or might not mean high awesomeness. They might be derailing a lot but not deriving much value, in which case, yes, they would leave after a while, but in the meantime you have (short-term) high revenue and low awesomeness.

There are different kinds of awesomeness we can talk about.
First is “gross awesomeness.” Using Bee’s ukulele example, gross awesomeness is how much you practice ukulele.

But that doesn’t take into account the downsides, of which there are two:

First, the opportunity cost of all the other things you might prefer doing. As Bee points out, doing it too much will actually decrease awesomeness because it adds stress and takes away time from other things in your life.

And second, the pledge money you have to pay. If you end up practicing a lot, but paying so much in pledges that it’s not worth it to you, beeminder may get a lot of revenue, but the net awesomeness to the user is little or negative.

The user wants to choose a rate and pledge that maximizes net awesomeness, given their level of akrasia.

dreev · March 3, 2019, 5:46am

@matti: Dang, there’s a lot of insight packed into that. Thank you! I’m chewing particularly hard on this “not a map of the territory but a terraforming project” concept handle you just gave me.

I think I’ve added a bunch of confusion talking about long-term vs short-term revenue, the joke about running away to Mexico, etc, but in the blog post and all my arguments here when I talk about “Beeminder revenue” I always mean overall, long-term revenue.

The case of high short-term revenue and low awesomeness is simply a case of low revenue in the important sense.

Now we’re just quibbling about how to divvy the consumer surplus. That question is always secondary to the question of social/economic efficiency.

zedmango · March 3, 2019, 6:27am

Well if we’re only talking about long-term:

In my view the optimal amount of derailing for the user is none or low, whereas the amount of derailing that maximizes revenue for beeminder is somewhat higher.

Too much derailing means your pledge is too low or your rate is too high. But more derailing means more revenue for beeminder as long as it’s sustainable.

There’s an exception: If the pledge is very low, more frequent derailing might actually mean less revenue than less frequent derailing at a higher pledge, but this isn’t guaranteed - it depends on how the user’s motivation increases with money. If there’s a sharp cutoff at the motivation point (say $100), user awesomeness is maximized when the pledge is just above the motivation point ($101), whereas revenue for beeminder is maximized when the pledge is either insanely high ($10,000) or sustainably below the motivation point ($45). $75 might not be sustainable, and $25 might bring in too little revenue.

In my view, as derailing increases, once you’re past the point when you’re choosing to derail once in a while (beenice), awesomeness for the user starts decreasing, yet (long-term) beeminder revenue continues to increase - up until we get to derailing levels so high that they are not sustainable because the user quits out of frustration.

No/little derailing - could mean either high awesomeness because you’re at your motivation point, or medium awesomeness because your rate is below optimal.
Corresponds to medium income for beeminder long-term.

Medium derailing - could mean derailing is by choice, but then your rate is probably too high, so low awesomeness. Could mean your pledge is below optimal, so low awesomeness. Users will probably not get too frustrated, though, so this is probably sustainable. Could mean it’s not the right goal for you.
Corresponds to high income for beeminder long-term, if sustainable.

Lots of derailing - probably means beeminder is not working for you. Pledge may be below optimal, rate may be too high, or not the right goal for you. Likely to not be sustainable and lead to frustration.
Corresponds to low income for beeminder long-term because it isn’t sustainable.

Each user has a level and type of akrasia corresponding to a function of two variables - frequency of keeping your pledge is a function of the pledge and rate.

f = f(p,r)

Beeminder revenue is pledge times frequency of not keeping your pledge.

BR = p(1-f) = p-pf

For each user and goal there is an ideal rate i. The rate the user actually performs will be the rate they set times the proportion of times they don’t derail, or rf. The user wants this to be as close to i as possible, that is, to minimize |rf - i|. The user also wants to minimize payment to beeminder, that is, minimize p-pf.

So we can model awesomeness to the user as a(p-pf) + b(|rf - i|). The maximum awesomeness is 0. a is a negative constant corresponding to how much it sucks to pay each dollar to beeminder, and b is a negative constant corresponding to how much it sucks to do the task more or less than the ideal rate.
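
As a minimal sketch of the model above, the two quantities can be written as plain Python functions. The function and parameter names are illustrative assumptions, as are the default values a = b = -1; the thread only says that both constants are negative.

```python
def revenue(pledge: float, keep_freq: float) -> float:
    """Beeminder revenue per period: BR = p(1 - f) = p - pf."""
    if not 0.0 <= keep_freq <= 1.0:
        raise ValueError("keep_freq is a frequency and must lie in [0, 1]")
    return pledge * (1.0 - keep_freq)


def awesomeness(pledge: float, rate: float, keep_freq: float,
                ideal_rate: float, a: float = -1.0, b: float = -1.0) -> float:
    """User awesomeness: a(p - pf) + b|rf - i|, which is at most 0.

    a and b are the negative constants from the text; -1.0 is only a
    placeholder default, not a value taken from the discussion.
    """
    if a > 0 or b > 0:
        raise ValueError("a and b are assumed to be negative constants")
    payment_term = a * revenue(pledge, keep_freq)
    distance_term = b * abs(rate * keep_freq - ideal_rate)
    return payment_term + distance_term
```

With a pledge of $10, a rate of 7/week, an ideal rate of 7/week and a keep frequency of 0.5, this gives a revenue of 5 and an awesomeness of -8.5. Keeping the pledge every time (f = 1) at the ideal rate gives the maximum awesomeness of 0 and zero revenue.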

@dreev’s theory that revenue is proportional to awesomeness can be expressed as c(p-pf) = a(p-pf) + b(|rf-i|).

This can be rewritten as:

(1)         n(p-pf) = |rf-i|

where n is the new constant (c-a)/b.

(1) essentially is a claim about akrasia - that akrasia works in a way such that users’ akrasia functions generally satisfy, or come close to satisfying, (1).

zedmango · March 3, 2019, 8:13am

To make the math a little easier we can instead assume the user wants to minimize (rf - i)^2. We then have:

(2)     n(p-pf) = (rf-i)^2

We can rewrite this as np-npf = r^2 f^2 - 2rfi + i^2

We can write that as (r^2) f^2 + (np-2ri) f + (i^2 - np) = 0

Which by the quadratic formula means f(p,r) satisfies

(3)     f(p,r) = [ (2ri - np) ± sqrt( (np - 2ri)^2 - 4 r^2 (i^2 - np) ) ] / (2 r^2)

What would be really nice here is a Mathematica 3D graph of f(p,r) with sliders to change n and i so we can see what this surface looks like!

narthur · March 3, 2019, 5:03pm

Really enjoying reading this conversation.

Just jumped in to say that I just hope, @dreev, that you don’t underestimate the flexibility of the product you’ve created.

I use Beeminder to solve all sorts of problems; and the value I derive from Beeminder for each is often slightly (or extremely) different. I feel like you’re stuck on the hard-core “do-as-much-as-you-can” type-goals, when there are many, many other ways to derive value from Beeminder, and to maximize real-life awesomeness.

Here are the use cases I’ve had for it that I was able to remember off the top of my head (examples aren’t necessarily my own):

I want something that will poke me on a daily basis, a reminder I can’t (safely) ignore (hygiene).
I want to be motivated to do more on a goal than I normally would (exercise).
I want to track data (health metrics, meta goals).
I never want to have even the option of derailing (addictions, hard-core commitments).
I want to have the option of derailing in unusual circumstances using a pledge bright line (work or study hours).
I want to maintain a low commitment on something that I just don’t want to die (side projects).
I want to make sure I don’t forget to do something that I only need to do every so often (check oil, check fire alarm batteries, etc).
I want to have the freedom of doing an activity without risking a self-destructive binge; just need a guardrail (gaming, YouTube, social media).
I want to actively eliminate an activity from my life (addictions, negative health behaviors).
I just want something that will encourage me to put in the activation energy of starting, but will leave me free to do as much or as little as I want beyond that (side projects, exercise).
I want to start my goal out very conservatively and not feel forced into ratcheting it up until I feel ready (scary goals, exercise, side projects, anything).

I think the thing to remember here is that your definition of awesomeness (the user pushes themselves to the point where it’s almost not sustainable), while perhaps sometimes correct, is a very isolated definition. Imposing it disregards your users’ real-life context.

Your users have specific problems, specific real-life goals, specific needs. Beeminder is a great tool because I as a user can configure it to meet many of those needs.

I’m afraid if you took your philosophy to its logical end, you might be tempted to remove a lot of that flexibility.

zedmango · March 3, 2019, 5:28pm

What a great post! I agree completely, and that’s a great list of use cases. I want to go through that and make goals for each of those.

This in particular is a great one. I have a few goals in this category, and they’re super awesome for me because they’re so easy to meet. All I have to do is get started, and so I virtually never derail.

One example is my goal to do stretches for physical therapy. Yesterday I had to hurry to a presentation in the morning and I was exhausted in the evening, so I probably stretched for about 10 seconds each, which was sufficient to keep me from derailing.

If I had to do more than that, I probably wouldn’t have done anything and derailed. More revenue for beeminder, but less awesomeness.

And this brings up an important point - it’s the user who chooses the pledge and the rate (and the goal requirement), and the user should choose those to maximize awesomeness. Beeminder should help the user choose those settings to maximize awesomeness, rather than to maximize revenue.

(Example: this suggestion to automatically lower pledges after not derailing for a while - forcing this or making it a default would in my view lower pledges below motivation point, thus increasing revenue and decreasing awesomeness.)

So thinking more about this, my algebra was kind of misleading, because the only values that matter are the user’s choices for pledge and rate. There will always be values of pledge and rate for which awesomeness is low. If the user chooses poorly, then beeminder might end up getting more revenue even with low awesomeness.

This exactly. My suggestion for defining awesomeness is how close are you to doing the task at the rate that’s best for your life, minus the amount you pay beeminder.

I see derailing more than once in a while as a sign that the user’s settings are not optimized for this awesomeness.

mary · March 3, 2019, 7:29pm

My point was more that maybe peak awesomeness includes, perhaps by definition, the meta-skill of having the rate for that goal set just right, so that it takes into account the unknowables and so has just enough buffer that those unusual, infrequent things don’t derail you. So I was suggesting that there might be some perfect pledge amount for each goal that would be just scary enough that I wouldn’t let the buffer get too small to be derailed by something unexpected, but also some perfect rate for that goal such that I’d be doing as much as I could on it given the better forward planning that includes the unknowns. Seems like that would be peak awesomeness, but would also decrease revenue.

Though, like @bee pointed out, this presumes that every goal wants to be maximized to the highest (or lowest, depending on the direction) rate that is possible, which I agree just isn’t the case.

Right. And all of this is the logic behind the pledge schedule: That there’s some point at which you won’t be derailing anymore because of the size of the pledge and so, in the end, derailments (and revenue) will drop and yet awesomeness will increase. The principles behind the pledge schedule and those behind “increased revenue means increased user awesomeness” might be in tension with one another.

I agree. Part of the beauty of Beeminder is in its flexibility in the ways it’s inflexible. I get to decide exactly what my goals are and enforce that, and for some goals that’s, “Let’s see how much I can push this”, and for others it’s, “I just want to never have the option to not make contact with this activity over a given interval”.

This.

And everything @bee said

There’s so much worth talking about in this post that I kind of want to have a whole separate DM conversation about it!

Weight loss. Dialing a road above 2-3 lbs per week can have potentially serious side effects. So getting to a healthy rate and doing well at it will not maximize revenue, but will maximize awesomeness. Also, having just a weight loss goal (an output goal) and no pledges connected to actions means that you might still not be getting motivation about what clearly needs to be done at a time of action (assuming you’re not in the group of users intending to just not eat until on the road). And that can perhaps be generalized to say that it might be an indication, not that you’re on the verge of awesomeness, but that you haven’t realized that the goal isn’t set up in a way that you’re actually going to get what you want out of it.

adamwolf · March 3, 2019, 9:43pm

I find it helpful to realize that a Beeminder goal is not my goal. (I wonder if calling it a goal is a bad idea!) I use Beeminder goals to achieve my goals. Usually, derailing on Beeminder goals is not failing at my goals–it’s a safety net that makes me look at the gap between my expectations and reality.

zedmango · March 3, 2019, 10:44pm

So I was thinking peak awesomeness probably includes some derailment. Say you have a daily goal where every 4 weeks on average, something comes up that is genuinely worth forgoing the goal for.

The ideal performance rate would then be 6.75/week. But the things that come up don’t come up like clockwork. So you’re not always going to have exactly enough buffer at exactly the right times so that you don’t derail, and if you lower the rate too much, you might end up with an extra day buffer when you don’t need it. Then, assuming you’re an edge-skater, you wouldn’t do the task on those days, which would decrease your awesomeness.

So you should set the goal rate to 7/week and set the pledge to $10 or so. This way you choose to derail and pay when it’s worth it to you, and the rest of the time the pressure stays on.
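
The 6.75/week figure and the average cost of the $10 pledge can be checked with a few lines of arithmetic. This sketch assumes exactly one skipped day per 28 days and a pledge that stays flat at $10 (no escalation after a derail); both function names are made up for illustration.

```python
from fractions import Fraction


def ideal_weekly_rate(days_between_skips: int, days_per_week: int = 7) -> Fraction:
    """Weekly rate when one day out of every `days_between_skips` days is skipped."""
    if days_between_skips < 1:
        raise ValueError("days_between_skips must be at least 1")
    days_done = days_between_skips - 1
    return Fraction(days_done, days_between_skips) * days_per_week


def average_weekly_cost(pledge: int, weeks_between_derails: int) -> Fraction:
    """Average dollars per week if you pay `pledge` once per `weeks_between_derails` weeks."""
    if weeks_between_derails < 1:
        raise ValueError("weeks_between_derails must be at least 1")
    return Fraction(pledge, weeks_between_derails)


rate = ideal_weekly_rate(28)        # 27 of 28 days done -> 27/4 per week
cost = average_weekly_cost(10, 4)   # $10 every 4 weeks -> 5/2 per week
```

Under those assumptions the ideal rate is 27/4 = 6.75 per week, and choosing to derail once every 4 weeks at a $10 pledge averages out to $2.50 per week.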

So while peak awesomeness doesn’t equal peak revenue, I think it still includes some derailments.

I’m not sure I understand - why does it presume that? Seems like the same logic applies to ukulele playing where the ideal performance rate is 6.75 minutes a week or something.

kenoubi · March 11, 2019, 2:13am

It sounds ridiculously bad because it is ridiculously bad. It’s an approximation to the truth, but only if Beeminder itself politely pretends that it isn’t. If a user doesn’t realize on their own that the derailments are worth the price, then they aren’t.

dreev · March 11, 2019, 4:43am

I’m not totally sure but I think what you’re dubbing polite pretense in fact goes completely without saying. The user realizing on their own that the derailments are worth it is in fact the key premise for the thesis in my blog post.

So I’m tentatively latching on to the “it’s an approximation to the truth” and saying this is an agreement on the fundamentals! I mean, the only thing that matters for Beeminder’s long-term survival is actually making people awesomer. The exact relationship of that with revenue is ultimately a quibble. Unless revenue and user awesomeness are actually at odds, and I think we agree that they’re not, especially not in the long term.

kenoubi · March 11, 2019, 9:43am

It actually is. I’m just really concerned about how new users who don’t have the context that I and other long-time Beeminder users do will take statements from the founders about their goal explicitly being to maximize revenue. Which isn’t even really true, I think – if you realized in a particular case that they were at odds, 99 times out of 100 I think you’d choose to maximize user awesomeness over maximizing revenue (and the 100th time was likely a mistake). But once again, imagining myself not having the context, I imagine that sounding so ridiculous that I’d believe something shady was going on instead.

zedmango · March 11, 2019, 10:47am

Further musings:

Derailments only contribute to awesomeness when they end up helping the user derail less. That decreases beeminder revenue in the long term for that goal - but helps the user with that goal and encourages the user to set up other goals.

The “per goal” vs “per user” distinction is important, I think.

If derailments don’t help the user derail less, they don’t correlate with awesomeness.

Derailments only correlate with awesomeness under very limited circumstances - rather than think of “exceptions,” it might be better to try to pin down those circumstances, like:

the goal and rate are appropriate. (otherwise derailments probably indicate problems with the goal or rate)
the goal is not a “beenice” type goal - that is, it’s one where you actually never want to derail, not one where you sometimes want to choose to derail and pay the penalty (otherwise more derailing just means choosing to derail more often, which doesn’t mean more awesomeness)
and
user hasn’t reached motivation point, or motivation hasn’t fully “sunk in” yet (otherwise derailing isn’t helping you derail less)

kenoubi · March 11, 2019, 11:53am

That’s a little too strong. You can have a goal on which you sometimes are okay with derailing and still have a particular derailment be due to akrasia. “Correlated” is a very weak requirement.

zedmango · March 11, 2019, 1:01pm

yes, you can, but how does such a derailment increase awesomeness? I’m not seeing the correlation here.

kenoubi · March 11, 2019, 3:32pm

In exactly the same way as derailments on goals where you’re never okay with derailing do – by stinging so that you try to avoid them. If you didn’t care about doing the thing at all, why didn’t you have a flat road, or just not have a goal in the first place?

zedmango · March 11, 2019, 7:37pm

It could help you derail less, but it seems to me it’s more likely to be lumped into the general category of “times I paid to not do that thing.” But you’re right, I should amend that.

Should read:

If the goal is a “beenice” type goal - that is, it’s one where you sometimes want to choose to derail and pay the penalty - then both of these must hold:

a) in retrospect, the derailment was akratic, not by conscious choice (otherwise more derailing just means choosing to derail more often, which doesn’t relate to awesomeness), and
b) the derailment actually changed your beehavior and made you less likely to derail akratically (otherwise it doesn’t help with awesomeness).

zedmango · March 11, 2019, 8:45pm

So say that before you set up beeminder for a goal you akratically derail with probability A. (the “beenice” derails where you choose to derail non-akratically don’t count.)

After you set up beeminder, you derail akratically at probability B. (see what I did there?)

Presumably, due to the threat of losing money, B <= A.

Now each time you derail, the probability of akratic derailing may change, due to two factors:

If you set up increasing pledges, you now have more at stake.
It may have “sunk in” that derailing costs you and you don’t want to do that.

Call the probability of derailing B1 after the first derail, B2 after the second, etc.

Then most likely:

A >= B >= B1 >= B2 >= B3 >= ...

I think this is how revenue is proportional to awesomeness. Each derailment hopefully makes you better by making you akratically derail less often, as Beeminder points out in the FAQ.

https://www.beeminder.com/faq#qcoi

But once you’re at your motivation point, and once it’s “sunk in” as much as it’s gonna sink in, future akratic derailments don’t help you - they just mean you’re losing money for no purpose and you’re still derailing when you wish you hadn’t.

So at that point, revenue is no longer proportional to awesomeness.
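
A toy model can make the chain A >= B >= B1 >= B2 >= ... and its plateau concrete. Everything numeric here is an assumption for illustration, not something from the thread or from Beeminder: each derailment is assumed to multiply the chance of an akratic derail by a fixed `decay`, down to a `floor` that stands for the lesson having “sunk in” as far as it will go, and the pledge is held fixed.

```python
from typing import List


def derail_probabilities(b: float, decay: float, floor: float, steps: int) -> List[float]:
    """Return [B, B1, B2, ...]: chance of an akratic derail after 0, 1, 2, ... derailments.

    Assumed shape (hypothetical): geometric decay per derailment, never
    dropping below `floor`.
    """
    if not (0.0 <= floor <= b <= 1.0 and 0.0 < decay <= 1.0):
        raise ValueError("need 0 <= floor <= b <= 1 and 0 < decay <= 1")
    return [max(floor, b * decay ** k) for k in range(steps + 1)]


def expected_payment(probabilities: List[float], pledge: float) -> List[float]:
    """Expected payment per period at each stage, for a fixed pledge."""
    return [pledge * q for q in probabilities]


probs = derail_probabilities(b=0.4, decay=0.5, floor=0.05, steps=5)
payments = expected_payment(probs, pledge=10.0)
```

In this toy run the probabilities fall from 0.4 to 0.2 to 0.1 to 0.05 and then stay at 0.05. While they are falling, each payment buys a lower chance of derailing next time; once they are flat, the expected payment per period is still positive but buys no further improvement, which is the point where revenue stops tracking awesomeness.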

dreev · March 11, 2019, 9:04pm

As I said in an update to the blog post, I’m finding this whole discussion pretty eye-opening. A lot of it seems to contradict my thesis here. I’m not sure it does as much as it seems to but it has convinced me that this is just a part of the picture and Beeminder’s power and flexibility continue to amaze me. People improve their lives in a lot of different ways with Beeminder!

I also added this (now edited a bit) in a beemail as a characterization of the debate so far:

My original thesis was that revenue was a good metric for user success. People have pointed out various ways in which it won’t be and I concede those. But I’m as convinced as ever that obvious metrics like “reached supposed goal date without derailing” are worse. Except when they’re not. So really what Beeminder needs is to learn to distinguish. Like by making open-ended goals the default so that hitting a goal end date means that the user actually meant for the goal to end. And providing guidance on dialing in the rate of the Yellow Brick Road to eliminate the common case of setting a laughably easy slope initially – often for lack of data, which Beeminder is about to provide in spades – and never getting around to changing it.
