New blog post responding to Scott Alexander’s Bayesian theory of willpower:
I’ve just yolo-ordered 30 Mealsquares. Haha. You should get affiliated with them dreev
I’ve gotten some highly inciteful [UPDATE: wow, I only noticed days later that I Freudian-typo’d “insightful” as “inciteful” somehow; I didn’t mean it that way, I swear!] replies to this by email that I’m hoping to twist arms to repeat here or in the blog comments (or I may quote them anonymously if not). In the meantime, I turned the whole post into a Twitter thread:
I don’t see how the Bayesian theory predicts anything as weird as hyperbolic discounting, with its preference reversals.
But calling it “preference reversal” makes it sound weirder than it is! Preference reversal is a logical consequence of “ignore your full utility function and do what you most want to do in the moment”. I.e., apply an enormous discount rate now and then switch to a normal one for anything outside your akrasia horizon.
So I guess in the Bayesian theory, if you over-weight the “do what’s most immediately rewarding” system then that’s hyperbolic discounting, preference reversals and all.
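To make the "enormous discount now, normal discount past the akrasia horizon" idea concrete, here is a minimal sketch. It uses the quasi-hyperbolic (beta-delta) form, which is my stand-in and not something from Scott's post; the function names, the one-day horizon, and all the numbers are made up for illustration.

```python
def present_biased_value(amount, delay_days, horizon_days=1, beta=0.3, delta=0.999):
    """Value today of `amount` arriving after `delay_days`.

    Hypothetical model: the ordinary per-day discount `delta` always applies,
    and anything at or past the akrasia horizon also takes a one-off penalty `beta`.
    """
    weight = delta ** delay_days
    if delay_days >= horizon_days:
        weight *= beta
    return amount * weight


def choose(options):
    """Return the label of the option with the highest present value.

    `options` maps a label to an (amount, delay_days) pair.
    """
    return max(options, key=lambda label: present_biased_value(*options[label]))


# Planning a week ahead: both payoffs are past the horizon, so the bigger one wins.
print(choose({"games": (5, 7), "exercise": (10, 8)}))

# Same choice when the day arrives: games pay off inside the horizon and win.
print(choose({"games": (5, 0), "exercise": (10, 1)}))
```

Nothing about the options changes between the two calls except how far away they are, which is all a preference reversal is.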
Repeating from daily beemail:
- Adam Wolf is not impressed.
- Alice Harris (aka alys) has an interesting anecdote possibly in support (now in the blog comments).
- Jacob Falkovich is… I think mostly on my side, and maybe also likes Kaj Sotala’s multiple sub-agents model.
- I still can’t decide how much if any merit I think the Bayesian theory has but am feeling less optimistic about it after all these conversations.
Thanks everyone!
Also I hope everyone appreciated the title image on the blog post. By “everyone” I mean the decision theorists and statisticians and data scientists and rationalists. That pretty much covers everyone, right? Our usual blog illustrator (our daughter) was slaving away on it when I realized that LaTeX could probably just do that, and, sure enough. In retrospect, Mathematica could’ve also rendered it just fine.
I think Beeminder is of great help in getting more brain parts suggesting the same course of action. For example:
Situation A (akrasia):
Intellectual brain: you should exercise, because of your long-term goals.
Short-term reinforcement learner: you should play video games.
Situation B (beeminder):
Intellectual brain: you should exercise, because of your long-term goals.
Short-term reinforcement learner: you should exercise because otherwise, you have to pay.
Situation C (long-term exerciser):
Intellectual brain: you should exercise, because of your long-term goals.
Short-term reinforcement learner: you should exercise because it is rewarding.
Beeminder helps to transition from A to B to C, by tricking your reinforcement learner with a short-term consequence of not exercising. In the process, as real rewards for exercising start coming in, you train your brain to trust the suggestions from your intellectual brain more. In addition, I suspect that next to a prior on motionlessness there is also a strong prior on habit. So I think what Beeminder does is not so much provide stronger evidence from your intellectual brain (that would be more like reading another book on the importance of exercise), but instead incentivize other brain parts (the money-saving brain, the habit-brain, etc.) to suggest the same course of action as the intellectual brain would.
I agree that this Bayesian framework does not immediately suggest new practical insights, but it’s a nice over-arching theoretical framework. I like your comparison to relativity and Newtonian physics.
Bee-autifully said! Although some people might wonder if it really helps you transition or if it’s just a crutch and hinders the transition. I like your subsequent argument that that’s not the case.
I do not think that preference reversal is a logical consequence of discounting over time. Exponential discounting doesn’t result in preference reversal, right?
Hyperbolic discounting is more specific than just preferring that payoffs happen sooner. The time inconsistency, that hyperbolic shape with the curve getting less steep as time passes, is intrinsic to it.
That’s right, exponential discounting (like how banks charge interest), no matter how extreme, never yields preference reversals. It’s also correct that hyperbolic discounting is more specific than just putting high weight on immediate consequences. It’s the combination of that plus switching to less extreme discounting once the window on “immediate” has passed.
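A quick way to see why, using the standard textbook forms (the symbols here are mine: a smaller reward $A$ at time $t$, a larger reward $B$ at time $t+h$, discount factor $\delta$, hyperbolic parameter $k$). With exponential discounting the ratio of the two discounted values is

$$\frac{A\,\delta^{t}}{B\,\delta^{t+h}} = \frac{A}{B\,\delta^{h}},$$

which doesn't depend on $t$, so whichever option wins now also wins at every other time. With hyperbolic discounting the ratio is

$$\frac{A/(1+kt)}{B/\bigl(1+k(t+h)\bigr)} = \frac{A}{B}\left(1+\frac{kh}{1+kt}\right),$$

which shrinks toward $A/B$ as $t$ grows. So if $A < B < A(1+kh)$, the smaller-sooner reward wins at $t=0$ and loses when both are far enough away.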
So you really don’t want to wait an hour to eat this cake now, but how much do you care about having the cake at 3pm vs 4pm next week? If you were consistent, and used exponential discounting, then 3pm vs 4pm next week would be just as big a deal as now vs an hour from now.
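Here's the cake example as a small sketch. The discount functions are the usual textbook forms; the rate, `k`, and the "one slice at the earlier time vs 1.5 slices an hour later" framing are assumptions I picked to make the numbers come out cleanly.

```python
def exponential(delay_hours, rate=0.5):
    """Weight that shrinks by the same fraction every hour."""
    return (1 - rate) ** delay_hours


def hyperbolic(delay_hours, k=1.0):
    """Weight that falls steeply at first and then flattens out."""
    return 1.0 / (1.0 + k * delay_hours)


def prefers_sooner(discount, delay_hours, small=1.0, large=1.5, wait=1):
    """True if `small` at `delay_hours` beats `large` arriving `wait` hours later."""
    return small * discount(delay_hours) > large * discount(delay_hours + wait)


WEEK = 7 * 24  # hours

for name, discount in [("exponential", exponential), ("hyperbolic", hyperbolic)]:
    now = prefers_sooner(discount, 0)
    next_week = prefers_sooner(discount, WEEK)
    print(f"{name}: sooner wins now={now}, sooner wins next week={next_week}")
```

With these numbers the exponential discounter gives the same answer for now and for next week, while the hyperbolic one grabs the cake now but is happy to wait the extra hour next week.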
Right, preference reversal is a predictable consequence of hyperbolic discounting. What I don’t get is how you infer hyperbolic discounting from Scott’s Bayesian model.
Ah, yeah, that’s a good question! (Sorry to focus on the wrong part of that question, though I think that turned out to be valuable as well to clarify hyperbolic vs exponential discounting.)
I continue to feel less and less excited about the Bayesian theory of willpower (though not writing it off entirely).
But to defend it I would say that it gives a more fundamental explanation of hyperbolic discounting: When you’re considering what to do next week, the “what’s most immediately rewarding” system is silent. When considering what to do now, that system drowns out your rational brain.