Self-Control is now an Engineering Problem (We will have Personal AI Overlords)

Disclaimer: I’m the founder of Forfeit (very similar to Beeminder) and am now building a company called Overlord: an AI agent designed to monitor you 24/7, encourage you to stick to good habits, and steer you away from bad ones. This is an essay about how AI could pretty much fix self-control very soon (through everyone having an AI accountability partner), but of course I’m biased, as I’m building a company in this space.

For the majority of people, self-control is their #1 issue. Most people are drastically undermining at least one area of their life because they can’t control themselves. Thought experiment: if you had a friend following you around 24/7, would you be able to kick all your bad habits? Probably.

Let’s take the issue of obesity (around 40% of American adults right now). If an obese person had a friend following them around each day, they’d likely be able to eat 230 fewer calories each day - at roughly 3,500 calories per pound of fat, that’s about 6,900 calories a month, or 2lbs/month lost. This would be almost comically easy: your friend just says “do you need large fries, or is a medium OK?”, and everyone would be thin.

This can, of course, apply to everything. The following issues (think about these yourself) would be immediately fixed if you had a friend following you around encouraging you: porn addiction, not exercising, doomscrolling, drinking too much…

Now, imagine instead of a friend, it’s an AI. Of course, you feel no shame towards an AI the way you would towards a friend - yet. This means you’d need to jury-rig pain into it: it can charge you money (LessWrong - Losing money or completing habits), call you to persuade you, text your friends, call your mum. But if done intelligently - striking a balance between being too nice and too mean - it should be pretty close. And pretty close to a fix for self-control is the Holy Grail for most people.

How far off are these AIs? They will come when AI personal assistants come, and, in my opinion, will be more valuable to a lot of people. A 24/7 accountability partner isn’t an existing job that AI will replicate (paying a human to watch you around the clock is too expensive), so they’re not discussed as much as personal assistants, but they will be incredibly valuable. Would you rather have a chatbot that can book hotels and flights for you, or one that gets you to quit smoking?

Real-world example:

Now let’s talk about how easy it is to overcome most bad habits (of course - I’m not talking about genuine addictions). I have a bad habit of smoking cigarettes - I don’t care when I’m drunk, but I also can’t say no if a friend offers me one when I’m sober. Sober cigarettes are incredibly easy to resist - the urge is like a 1/10 in the moment - yet for some reason I don’t resist. Let’s compare that urge with some deterrence mechanisms:

Texting my mum telling her I smoked: 5/10 (I wouldn’t smoke)

Losing $5: 3/10 (I wouldn’t smoke)

Having to spend 5 minutes sitting in silence: 4/10 (I wouldn’t smoke).

As long as the deterrence mechanism scores higher than the urge, I won’t do it. And it takes a shockingly simple deterrence mechanism to outweigh a negative action, and therefore break a lifelong bad habit.
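As a toy illustration, here’s that comparison using the scores above - this is just the post’s arithmetic written out, not any real system:

```python
# Toy model: the bad action is resisted whenever some attached deterrent
# scores higher than the in-the-moment urge. Scores taken from the post.
urge = 1  # sober cigarette: "a 1/10 in the moment"
deterrents = {"text my mum": 5, "lose $5": 3, "5 min sitting in silence": 4}

strongest = max(deterrents, key=deterrents.get)
if deterrents[strongest] > urge:
    print(f"Deterrent wins ({strongest}): don't smoke.")
```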

Now that the deterrence mechanism is settled, how can we monitor people? We ingest all the personal data (i.e., an AI Overlord) and piece things together from there. We spend about 10 of our ~16 waking hours a day on screens, so that leaves ~6hrs where we don’t know what a user is doing. The obvious signals work: location, credit card transactions, simply asking the person what they’re doing.

Crucially, from this, just as a human would, an intelligent AI can piece things together. If I’m trying to quit drinking, and the AI sees that it’s 9pm on a Friday, I’m at a bar, and I spend $9 on my card, it can easily suss out that I’m probably drinking. It can recognise patterns and weak points very easily. It doesn’t need to prevent you from doing your bad habits every minute of the day, just when you’re weak - each minute will have a different “Chance of relapse score” depending on the time of day, how much sleep you had last night, and in general how suspicious this AI is that you’re close to relapsing.
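Here’s a minimal sketch of what such a score could look like. To be clear: the signals, thresholds, and weights below are invented for illustration - this is not Overlord’s actual model:

```python
from datetime import datetime

RISKY_PLACES = {"bar", "pub", "liquor_store"}

def relapse_risk(now: datetime, place: str,
                 hours_slept: float, spend_at_bar: float) -> float:
    """Toy 'chance of relapse' score in [0, 1] for a quit-drinking goal."""
    score = 0.0
    if now.weekday() >= 4 and now.hour >= 20:  # Fri/Sat/Sun night
        score += 0.3
    if place in RISKY_PLACES:                  # location says bar
        score += 0.4
    if hours_slept < 6:                        # tired -> weaker willpower
        score += 0.1
    if spend_at_bar > 0:                       # card transaction at the bar
        score += 0.2
    return min(score, 1.0)

# 9pm on a Friday, at a bar, short on sleep, $9 just spent:
print(relapse_risk(datetime(2025, 6, 6, 21, 0), "bar", 5.5, 9.0))  # -> 1.0
```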

AI will create the perfect conditions for addiction

Here’s what’s likely to happen, and why we will need this more than ever.

  1. Loss of purpose: People will no longer have purpose through their jobs, as AI does them better. Purpose fends off addiction/bad habits very well.
  2. Economic displacement: Until we figure out UBI, people will lose their jobs and have no money. Poorer people are more likely to be addicts.
  3. Abundance of time: These jobless, purposeless, broke people will now have 16 hours a day to fill. Abundance of time with nothing to do is a recipe for addiction.
  4. AI Superstimuli: AIs will very soon be the most charismatic, caring, intelligent people you’ve ever spoken to. Most people’s friends would be a 5/10 in the following: attractiveness, intelligence, interestingness, empathy. If you could facetime a person who’s a 10/10 in all these qualities, would you still find it fun doing anything else?
  5. Social withdrawal: People’s friends will start to drop off the map due to this. With fewer and fewer friends, people will socialise less.

In short: we will very soon be purposeless, broke, bored, and lonely, and a superintelligent, superattractive AI will step in. I argue that most of society simply won’t be able to resist - we will need an “AI Iron Dome” to protect us from these new superstimuli.

We already have guardrails imposed on us by society: I wouldn’t stand up and shout profanities in a coffee shop, as it’s socially embarrassing. I wouldn’t ignore my boss, as it’s financially painful. There are thousands of things you could do at any moment in time, but due to these invisible guardrails around us, we only do two or three. Most people wouldn’t even leave the line at the coffee shop, as it’s a little bit weird. These defence AIs would do the same thing: they would let you set your own guardrails on your life, so you never do something future you would regret.

So, we will soon have two very powerful competing AIs: the AI “Superfriends” and the AI “Iron Domes”. The latter won’t be taking away your agency, just aligning your actions with what the you of 24 hours from now would want you to do. Right now we have 100% control over our actions moment-to-moment, which is disastrous. We should have 90-95% control of what we are doing, and the other 5-10% should be controlled by an AI aligned with our future, rational self. This seems dystopian now, but we will likely have no choice in the matter.

Practical examples

These are essentially the same as what you might tell a friend to do if they had a certain element of control over you. Here are some examples (a sketch of how one of these might be encoded follows the list):

  • “In my own voice, call me at 7am and give me a pep talk. Every minute I’m not awake past 7am, charge me $0.10”
  • “For each rep of a posture exercise I do, give me one minute on Instagram”
  • “I tend to smoke weed when I go to Jake’s, and want to stop. Make me send a photo of my eyes every time I leave.”
  • “Only allow me on my phone when I have no events on my calendar”
  • “Make sure I take max three Zyns a day (must send photo of canister each morning)”
  • “When I go out, gently push me to get home. When I wake up hungover, intelligently motivate me with financial penalties to leave the house and get to the gym right as I wake up.”
  • “I’m getting home now - make sure I lock in on my Mac with 45 min pomodoros with 15 min spacing (must send video of myself just lying down, decompressing) until 6pm”
  • More examples that users have shared here (Overlord Community Goals)
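To make the shape of these rules concrete, here’s a minimal sketch of how the first example might be represented. Every field name here is hypothetical - this is not Overlord’s actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Commitment:
    trigger: str                   # when the agent acts or checks in
    required_proof: str            # what counts as compliance
    penalty_usd_per_min: float     # charged while out of compliance
    escalations: list[str] = field(default_factory=list)

wake_up = Commitment(
    trigger="daily at 07:00",
    required_proof="answer the wake-up call and confirm I'm out of bed",
    penalty_usd_per_min=0.10,      # "$0.10 per minute I'm not awake past 7am"
    escalations=["pep-talk call in my own voice",
                 "text an accountability buddy"],
)
print(wake_up)
```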
5 Likes

You’ve got everything reversed

I don’t agree, and I’m going to address every point one by one…

Work being the #1 purpose of humans is an artifact of the way society is structured. Humans find purposes and stick to them because we’re like that. It just happened to be very convenient for your citizens’ purpose to be flipping hamburgers and crafting Louis Vuitton bags.

Most people are suffering through their jobs and have totally different purposes. I’d guess you assume people have their job as a big part of their identity because that’s true of you and of your friends.

→ you’re right that AGI would shake up people’s sense of purpose
→ you’re missing that people will find other purposes by themselves

You don’t automatically become an addict because you are poor. But you do often end up poor when you’re an addict. Think about yourself: what if you had $10/day? Would you spend it on drugs? No, you don’t have money for that. What if you started heavily addictive drugs? Don’t you think you’d have a chance of hitting rock bottom really quickly?

I’m not even going to elaborate on the rest since you didn’t take time to think this through…

That’s a non-problem as well. People find ways to fill their time easily: travel, cook, read a good book… Humans know how to keep themselves busy. Yes, some people will have been too invested in their job and will feel like their life is empty. I would be one of the first. But it’s a mistake to think that people don’t know how to spend their time on their own. Poor people, too, know how to keep themselves busy without self-destructing…

I’d rather hang out with my 5/10 friends than talk to an app. Who do you call for emotional support? ChatGPT or your mom/a friend?

What do you do for fun? Go to a music festival with your 5/10 friends, or facetime an AI?

Pure nonsense

See point above. Do you have any friend who would rather spend time on ChatGPT than go out and do something in the real world, because ChatGPT is just so much better? If so, I’m sorry for them.

Power of love

This exists, and humans have been doing it since the dawn of time.

It’s something that nags you for everything and nothing until you comply, very often, for your own good. It’s often a money sink.

That’s called a girlfriend.

… Power of love, man.

I’d rather lose $5 than get into an argument with my girlfriend.

If you really wanted to solve everybody’s self-control problem, you would start a matchmaking agency.

Data privacy

… do you really feel comfortable having all your data ingested by an American company?


I know that for many people Overlord is the holy grail. But it’s just really off-putting when you understand the reasoning behind it, plus the fact that your personal data, thoughts, flaws, etc. go to the USA to be stored there forever.

1 Like

I think I disagree with both sides of this debate on possible AI futures. My predictions gravitate to extremes. Well, there’s a lot of probability mass on AI progress hitting diminishing returns and life ending up relatively normal (for a few decades? eventually AGI is coming though). But assuming this rocketship ride continues to AGI this decade, which does seem possible if not probable, I think it will either go so well or so horribly that addictive chatbots are the least of our concerns. And I think there’s a real enough chance it goes horribly that I think it’s correct to be extremely freaked out.

Btw, to defend @joshmit’s point, it’s not about preferring ChatGPT to friends but about a potential future where bots are superhumanly engaging. Conceivably it will take discipline to keep from getting sucked in.

In any case, I think we should mostly focus on what Overlord can do today.

(And @sheik, let’s also set aside the question of data privacy. Those kinds of risks are clear enough, we don’t need to rehash them here. Also please be nicer to @joshmit; I have massive respect for him, building Forfeit and now Overlord and contributing to the Beeminder community as well. I should probably also mention that I encouraged him to make this post introducing Overlord. It won’t be for everyone and that’s fine. But it should be of huge interest to plenty of Beeminder users.)

Anyway, let me repeat some of my thoughts from the weekly beemail (which in turn are based on discussions in the Discord), in case it’s a better jumping-off point for discussion:

First, a review of how Beeminder especially, and to some extent Overlord, combats cheating. I think our old post on this – Combatting Cheating | Beeminder Blog – holds up well. Some autodata, some social accountability (like getting a +1 by posting something publicly or to friends/family), caring about the graph, caring about the commitment device retaining its motivational power.

Maybe for Overlord in particular it can remove friction in creating those characteristics even for one-off commitments?

(At this point Anita points out that Overlord doesn’t actually know that you smoked that cigarette.)

True, doing something you committed not to and just never speaking of it – that’s gonna be hard to monitor without an all-too-literal overlord. But maybe Overlord makes it very easy to be like “make sure I do XYZ for project P when I get home” and it’s like, “Great, I messaged one of the other project P people. Send a picture of yourself doing XYZ and I’ll pass it along to them.”

Et voilà, actually doing XYZ is probably easier than whatever you’d have to do to weasel out. Maybe the real key there is not being willing to deceive your colleague. So maybe Overlord’s killer feature could be orchestrating things like that? Setting up such commitments with friends directly is notoriously hard: So You Want to Make a DIY Beeminder | Beeminder Blog
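To make that orchestration concrete, here’s a rough sketch. All the names and messages are invented, and print calls stand in for real messaging - this is nothing Overlord actually does today:

```python
def notify(person: str, message: str) -> None:
    print(f"[to {person}] {message}")            # stand-in for real messaging

def await_photo(person: str, prompt: str) -> str | None:
    notify(person, prompt)
    return "photo_of_xyz.jpg"                    # stand-in for a received photo

def orchestrate(user: str, task: str, witness: str) -> None:
    notify(witness, f"{user} committed to '{task}' today; proof to follow.")
    proof = await_photo(user, f"Send a picture of yourself doing: {task}")
    if proof:
        notify(witness, f"Proof received from {user}: {proof}")
    else:
        notify(witness, f"{user} never sent proof.")  # weaseling now costs face

orchestrate("josh", "XYZ for project P", "a project P colleague")
```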

4 Likes

I can see this comes from a genuine desire to help people, and while I applaud that intention, there’s a fundamental flaw that could cause serious harm if the system actually works as intended.

The core issue is that ‘self control’ and ‘discipline’ are executive functions. Despite centuries of Protestant work ethic telling us otherwise, executive function doesn’t respond well to top-down punishment. It’s simply not how our brains are wired.

Without true real-time operant conditioning (which I doubt Overlord can deliver), this approach will likely increase long-term suffering for users. Negative reinforcement can work for failure to initiate. That’s what Beeminder already does well. But almost everything else requires genuine, timely positive reinforcement to create lasting change.

Consider addiction: you don’t break those cycles through extreme restriction and self-punishment. You break them by finding enjoyment in healthier alternatives, developing new skills, and building real human connections. These remain the gold standard for behavioral change because they work WITH our psychology, not against it.

4 Likes

Yes true, I apologize @joshmit, I was hungry and should’ve been nicer about it. For every person like me, I bet you’ll find 10 that think the opposite (and I respect that position as well, because this is a cool piece of technology for sure).

What do you mean by this? That negative reinforcement can help fix the initial inertia of doing something (e.g., I really need to call my mom, I’m okay with doing it, but I just don’t think about it)?

2 Likes

In simple terms, positive reinforcement is about experiencing a pleasurable state after an action, leading the nervous system to build connections, making the same action more efficient and more likely to happen the next time in similar conditions. The more timely this happens, the better it works.

Negative reinforcement is the opposite: experiencing a painful state after an action, leading the nervous system to remove connections, making the same action less efficient and less likely to happen the next time in similar conditions. With negative reinforcement, timing is less important than with positive reinforcement.

This is how Beeminder fundamentally works. It’s a way to apply negative reinforcement towards any measurable state. And crucially, this also explains why the most successful Beeminder goals are those that negatively reinforce failure to start a task rather than failure to finish a task.

If I feel pain because the phone call with my mom didn’t happen (she wasn’t answering), I create conditions that are too complex for the nervous wiring to correctly adapt to.

If I feel pain because of not pressing the call button on my phone, regardless of outcome, I am less inclined to maintain inaction the next time.

Conversely, the button press itself should ideally be immediately rewarded if I want to make that specific action more likely next time.

It’s really almost always that simple; still, people use the wrong one all the time, usually because of internal ego narratives.

2 Likes

I really like the chat functionality of Overlord, but I think it’s not an Overlord unless it can see whether or not I do my pushups. I’ve been using my new Home Assistant integration daily to speak voice commands to Beeminder (Integration with Home Assistant - #4 by adamwolf; see bottom comments) and I’ve found that it lowers the barrier to entering data by a lot, so I can see how this chat interface would be nice!

3 Likes

Thanks for the replies, guys - of course, this essay is half-essay, half-promo material, so I have exaggerated a few things.

@sheik I think we disagree on a lot of these, but I appreciate your thoughts. Most of my points are pretty arguable, but the most solid one is that AI will enable superaddictive technology - it’s the most powerful point and the most likely to be true. Teens already spend about 8hrs/day on their phones - there’s only another 8hrs to go until it’s 100% of their waking time. Superaddictive AI avatars will be able to (under the hood) assess thousands of candidate responses, tonality, etc., and steer the conversation towards whatever is most likely to retain you (with an enormous dataset to train on). Also, they don’t care about getting themselves heard, as humans do. I do think it’ll be mayhem. No worries about the criticism - I named it Overlord and write about it pretty dystopianly for a reason; I expect the negative pushback.

@honj I agree with this - Overlord isn’t as negatively focused as Beeminder/Forfeit. It’s mostly aimed at optimising the least negative, best-timed “hit” to the user to keep them on the right track. The positive side (i.e., taking up new hobbies) is less what we’re aiming at, but it can help with that too.

@wstewarttennes I’m not sure what you mean by this - you can send videos into Overlord (e.g., videos of you doing pushups)! Then set goals such as “30 pushups a day or lose $5”.

@dreev Completely agree on your points, and I appreciate the welcome on these forums. Especially the point about everything either going so well or so horribly that chatbots are the least of our concerns (i.e., deepfakes vs ASI annihilation). It’s a tricky time to be projecting the future for a company, lol.

Thanks again for the thoughts, guys, I appreciate it 🙂

1 Like

I know this is a pedantic point, but it’s important: negative reinforcement in psychology is the removal of an aversive stimulus in response to an action, to reinforce the action. If you’re a rat in a box, and there’s an unpleasant loud noise that you can stop by pushing a lever, that will negatively reinforce the lever press.

It’s not the same as positive punishment, which is adding an aversive stimulus to suppress an action - if you press the lever, there’s an unpleasant loud noise. Negative punishment is removing a pleasant stimulus in response to an action.

Nice chart illustrating this, stolen from here: What Is Negative Reinforcement? Examples & Definition
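In words, the four quadrants from that chart:

  • Positive reinforcement: add a pleasant stimulus → behavior increases
  • Negative reinforcement: remove an aversive stimulus → behavior increases
  • Positive punishment: add an aversive stimulus → behavior decreases
  • Negative punishment: remove a pleasant stimulus → behavior decreases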

To an extent, you’re right: Beeminder is providing negative reinforcement, in that you feel unpleasant pressure from Beeminder (aversive stimulus), and actually doing the task removes that pressure (giving you the negative reinforcement). But the mechanism is important, and that’s why I’m making this (somewhat pedantic) point.

5 Likes

Pedantry super welcome here! See also this old Beeminder blog post, featuring a Homer-Simpson-ified version of that handy diagram:

I do think the video requirements are a cool feature now that I understand them. @dreev, any desire to add these to Beeminder as well? So datapoints could be videos of me doing my pushups?

Maybe a baby step in that direction would be to make URLs in datapoint comments be clickable? Or even support markdown? Then you could include links to screenshots or videos in whatever way made sense and it’d be less ugly than it currently is.

1 Like

I like that! If it were available in the API, I could add my own evals as well, which would be nice - I’d trust those more than some online AI model.

Nice. I’m building something like this, though not as a product to sell to anyone. I’m just building a “proof system” that leverages AI vision capabilities to analyze proofs and check that I did what I had to do, built so that there are no loopholes or ways for me to cheat.

I got a cheap bodycam, like those that cops wear, and I’m just swapping power banks, lol. I agree with your premise; it works like a miracle, but I don’t trust the AI to handle the application of punishment yet. I need to escalate things to a human if it says I failed so there can be some manual review of the failure.

The problem I see in this system is that there are a lot of quirks in AI right now, and you need to iterate on your approach when using video as proof before you can build a solid system. I think this could be an issue when it comes to selling this as a solution right now; it requires so much customization of prompts to get it right for specific use cases. Maybe AI will get better at interpreting events in the future, and this won’t be required. I’m using Gemini 1.5 Pro to do it, by the way.
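For the curious, the core of such a check can be quite small. Here’s a rough sketch using the google-generativeai SDK - the clip name, the prompt, and the manual-review step are just my own arrangement, and you’d need your own API key:

```python
import time
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Hypothetical proof clip; video files need server-side processing first.
video = genai.upload_file("bodycam_clip.mp4")
while video.state.name == "PROCESSING":
    time.sleep(5)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content([
    "Does this clip show the wearer washing the dishes? "
    "Answer YES or NO, then give one sentence of reasoning.",
    video,
])
print(response.text)

# Per the caveat above: never auto-punish on a NO; queue it for human review.
if response.text.strip().upper().startswith("NO"):
    print("Flagged for manual review before any penalty applies.")
```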

I’m a chronic procrastinator with ADHD, so I’m happy to have the overlord taking over. And although I think there are many like me, for most people this will be a tough sell. My brother was horrified when I showed him, lol. He said I was turning my life into a Black Mirror episode, hehe. I disagree entirely; I think this is what true freedom looks like. I’m just tying myself to the mast, like Ulysses. Pre-commitment is a great strategy and, as authors like Parfit and Elster argue, can be the morally correct thing to do.

I hope you nail it. My system sucks; I’m not a developer, I’m just “vibecoding,” as they say, so it’s full of issues. If there were a ready-to-go solution, I would be more than happy to switch over.

1 Like