Let's see if Deep Research can tell us about the science of Beeminder

I gave OpenAI’s Deep Research tool, made available to premium users today, the following prompt:

What does the science say about Beeminder’s efficacy?

It asked a couple of follow-up questions (Beeminder specifically? No. Any specific outcome measures in mind? Also no.) and then spat out the following 6 minutes later.

I’m quoting it in its entirety (for Science!) with my reactions interspersed.

:drum:

UPDATES:

  • I’m collapsing long quotes to make it less excruciating to read my interspersed comments. Also replacing repeat citations with “[ibid.]”.
  • And I’m moving it to a separate follow-on post below because I managed to slightly exceed the forum’s length limit for a single post.
4 Likes

Efficacy of Commitment Devices (Beeminder and Similar Tools) in Goal Achievement

Commitment Devices and Self-Binding Strategies

“blah blah blah Odysseus had himself tied to the mast and Cortés burned his ships”

Commitment devices are voluntary arrangements that people use to bind themselves to a future course of action, typically by introducing immediate consequences for failure. Beeminder, StickK, and similar services fall into this category – they let individuals set goals and attach stakes (often monetary or social) to staying on track. This concept of self-binding has a long history: for example, the mythical Odysseus had himself tied to the mast to resist the Sirens, and historical figures like Cortés burned his ships

My first nitpick! I thought he scuttled them rather than burned them, and Claude is well-actuallying harder and saying he dismantled them. Ok, not important!

“commitment contracts and distracting website blockers yada yada”

to eliminate the option of retreat (Commitment contracts as a way to health). In modern life, commitment devices can take many forms – from apps that block distracting websites at work, to “Ulysses contracts” for personal behavior change, to formal commitment contracts where you pledge money that you forfeit if you don’t meet your target.

I’m vaguely crinkling my nose at this without having any concrete objection. Maybe the writing is just too LLM-y.

“blah blah locking in today's good intentions, curbing vices”

These tools create external constraints or incentives that make it harder for our future selves to give in to temptation, essentially locking in today’s good intentions. Notably, many people do choose to use such self-imposed contracts to help achieve goals: evidence shows individuals have used commitment contracts to save money, complete assignments on time, and curb vices like smoking [ibid.]. The premise is that by pre-committing and putting something on the line, we can overcome procrastination and impulsiveness that might derail our goals.

Citing the same 2012 paper twice, I see. So far I’m gratified by how much better the Beeminder blog is than this stuff.

“blah blah present bias ... (sometimes termed 'akrasia' in psychology)”

Behavioral Economics Basis: Present Bias and Loss Aversion

From a behavioral economics perspective, commitment devices target well-known biases in decision-making. One key issue is present bias (or hyperbolic discounting): we tend to favor immediate gratification over long-term benefits, which leads to procrastination and inconsistency in pursuing goals (sometimes termed “akrasia” in psychology).

It’s philosophers (and rationalists) who use that term. I guess we’re still in nitpick territory.
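For concreteness, here's a minimal sketch of the hyperbolic-discounting math being gestured at. The value function is the standard V = A / (1 + kD); the discount rate k and the dollar amounts here are made up purely for illustration:

```python
def hyperbolic_value(amount, delay_days, k=0.05):
    """Perceived value of a reward after a delay: V = A / (1 + k*D)."""
    return amount / (1 + k * delay_days)

# $100 now beats $120 in a month (present bias)...
assert hyperbolic_value(100, 0) > hyperbolic_value(120, 30)
# ...but push both options a year out and the preference flips back.
# That "preference reversal" is exactly the inconsistency that
# commitment devices are meant to lock in against.
assert hyperbolic_value(100, 365) < hyperbolic_value(120, 395)
```

The reversal in the second assertion is the akrasia story in miniature: from far away you prefer the bigger-later reward, and only up close does the smaller-sooner one win.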

A commitment contract tries to counteract present bias by making the long-term goal salient in the present – often by attaching an immediate cost to slacking off.

Getting a little repetitive here.

For instance, Beeminder will charge you a preset amount if you fall off your “goal road,”

Severe nose-crinkle.

thus bringing a future penalty forward to the present moment each day. This leverages loss aversion: people are generally more motivated to avoid a loss than to achieve an equivalent gain [ibid.].

No no no. Or, well, it’s complicated. I mean, we got this wrong for years ourselves so I don’t know how stridently to complain. But at least a little stridently because we did put the right answer on the internet for anybotty to see:

Research underscores this principle – putting one’s own money at risk (a potential loss) tends to motivate better than offering a reward of the same size for success (Association Between Monetary Deposits and Weight Loss in Online Commitment Contracts - PMC). In other words, the fear of losing $100 can drive behavior more strongly than the hope of winning $100, in line with prospect theory.

This much is true. Anyway, see our Loss Aversion Aversion post and the post right before it for the rest of the story.
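Since prospect theory keeps coming up, here's the textbook Kahneman–Tversky value function for reference. The parameters (alpha = 0.88, lambda = 2.25) are their commonly cited median estimates, nothing Beeminder-specific:

```python
def prospect_value(x, alpha=0.88, lam=2.25):
    """Subjective value of a gain (x > 0) or loss (x < 0)
    relative to a reference point, per prospect theory."""
    if x >= 0:
        return x ** alpha
    return -lam * (-x) ** alpha

# Losing $100 looms larger than winning $100:
gain = prospect_value(100)   # subjective value of the gain
loss = prospect_value(-100)  # subjective (negative) value of the loss
assert abs(loss) > gain
```

With these parameters the loss weighs in at more than twice the gain, which is the asymmetry the report keeps invoking (whether or not it's actually load-bearing for Beeminder).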

Commitment platforms harness this by using financial stakes or penalties as motivators. For example, StickK allows users to stake money which they forfeit to a charity (or an “anti-charity” they hate, for extra motivation)

I just have to mention again for the record how much we hate that.

if they fail – a design aimed at exploiting loss aversion and even moral stakes to push adherence [ibid.]. By imposing a cost for failure, these tools change the cost–benefit

Just to turn the nitpickery up to infinity for a second, because I’m honestly surprised to see an LLM do this, why on earth did it hyphenate “cost-benefit” with an en dash? (Claude, and ChatGPT itself, agree that this is wrong when I ask about it in isolation.)

“blah blah tools rely on what is essentially negative reinforcement or punishment”

calculation: skipping a workout or study session no longer just means slower progress; it means an immediate hit to your wallet or reputation. This helps align our short-term actions with our long-term goals by increasing the immediate consequences of procrastination.

Reinforcement and Accountability Mechanisms

Commitment devices operate via reinforcement mechanisms that alter the incentives around behavior. Many of these tools rely on what is essentially negative reinforcement or punishment:

Technically not the same thing, as we helpfully explain in a classic guest post by an actual psychologist (@tierrabluebird):

“blah blah avoid aversive outcome, Beeminder's sting, immediate feedback, social accountability”

if you do the desired behavior (e.g. stick to your daily goal), you avoid an aversive outcome (losing money or facing an embarrassing consequence). Conversely, failing to do the task triggers a penalty, which is akin to a punishment intended to reduce future lapses. For instance, with Beeminder you “pay the price” when you go off track – this sting is supposed to reinforce better habits going forward. From a psychology standpoint, immediate consequences are crucial for behavior change, and commitment contracts ensure immediate feedback (often aversive) for not following the plan. In addition to monetary penalties, these services often incorporate accountability and social reinforcement. Some allow or encourage goals to be made public or involve a referee to verify outcomes. Research suggests public commitments can enhance success – making a commitment in front of others or to a partner adds social pressure and positive accountability, which has been linked to higher adherence (The effect of commitment-making on weight loss and behaviour change in adults with obesity/overweight; a systematic review - PMC).

I’m falling asleep but nothing wrong exactly here.

“blah blah core idea of tools like Beeminder and StickK is to harness our stronger sensitivity to negative outcomes”

In practice, a user might appoint a friend as a referee who must confirm they actually went to the gym, or they might share their Beeminder graph publicly for encouragement and a bit of healthy shame if they falter. This adds a layer of social reinforcement (praise or encouragement when you succeed, concern or disappointment if you don’t). Some commitment platforms and productivity apps also use positive reinforcement strategies – for example, Habitica (a gamified habit app) rewards good behavior with points and treats failure as a mild in-game loss. However, the core idea of tools like Beeminder and StickK is to harness our stronger sensitivity to negative outcomes.

Ok when it’s put like that I really have to object. The gains-vs-losses asymmetry in the Kahneman and Tversky sense is emphatically not fundamental to how Beeminder and StickK work.

Notably, even simple tech interventions that remind and reward behavior can have an effect: for instance, apps that send reminders or provide social feedback for good habits have shown some effectiveness in improving health behaviors (Association Between Monetary Deposits and Weight Loss in Online Commitment Contracts - PMC).

So repetitive with these citations.

so much blah blah blah with specific claims about weight loss RCTs and then, “yet the magnitude of weight loss achieved is moderate, and maintaining the weight loss after the commitment period remains challenging”

By combining immediate penalties for failure with real-time progress tracking (e.g. Beeminder’s graphs) and sometimes rewards or positive feedback for success, commitment devices create a reinforcing loop that keeps one’s focus on the goal. The behaviorist notion at work is that consistently enforcing consequences (good or bad) for behavior will gradually build habit strength and self-control.

Evidence of Effectiveness from Research

A growing body of research in behavioral economics, health psychology, and productivity science has investigated how well commitment devices actually work. Here are some key findings from studies on their efficacy:

  • Weight Loss and Health: Several randomized controlled trials (RCTs) have tested commitment contracts for weight loss. In one RCT, roughly 50% of obese participants assigned to a 16-week commitment-contract program (with financial stakes for meeting weekly weight targets) succeeded in achieving their weekly weight loss goals, compared to only 10% of those in a usual-care control group (Commitment contracts as a way to health - PMC). A follow-up 32-week trial found similar initial advantages for the commitment group [ibid.]. However, in both cases the difference waned after the contracts ended – the commitment device spurred more weight loss during the intervention, but the benefits largely disappeared a few months after the program, as participants often regained weight once the financial stakes were removed [ibid.]. This suggests commitment contracts can powerfully jump-start behavior change in the short term, though maintaining long-term change may require either extending the commitment or transitioning to other motivators. A 2019 systematic review of weight-loss interventions using commitment strategies (mostly non-monetary behavioral contracts) likewise found modest short-term benefits. On average, adding a commitment contract to a diet/exercise program increased weight loss by about 1.5 kg more than the program alone in the short term (few months) (The effect of commitment-making on weight loss and behaviour change in adults with obesity/overweight; a systematic review - PMC). Some effects persisted at 12 months (about 1.7 kg greater loss than controls) [ibid.], but evidence was mixed and quality varied. Interestingly, that review noted commitments were most effective when they were made publicly and focused on specific diet changes, rather than private or solely exercise commitments [ibid.]. 
Overall, the research indicates that for weight management, commitment contracts do improve adherence and outcomes in many cases – people stick to diets or exercise regimens better when they’ve put skin in the game. Yet the magnitude of weight loss achieved is moderate, and maintaining the weight loss after the commitment period remains challenging.

Yes, obviously you can’t let the commitment contract end! See also our review of Katy Milkman’s research.

(Also I’m not taking the time to verify all the specific claims above about commitment devices for weight loss. We did that once for smoking cessation.)

“contracts with [...] stakes led to [...] more weight loss”
  • Online Commitment Platforms (Naturalistic Data): Real-world data from commitment websites also support their efficacy, while highlighting some caveats. One study analyzed over 10,000 weight-loss contracts on StickK.com, where users set a weight goal with optional monetary stakes. The findings showed that users who put down a monetary deposit (choosing contracts that penalized them for not losing weight) achieved greater weight loss than those who made no financial commitment (Association Between Monetary Deposits and Weight Loss in Online Commitment Contracts - PMC). Specifically, contracts with an anti-charity clause (money goes to a hated cause if you fail) or other stakes led to about 0.25–0.33% more weight loss per week compared to contracts with no money at stake [ibid.].

:rotating_light: This is wrong awful bad science. Hopelessly confounded, not at all establishing causation. PS: Sorry, I freaked out too soon, ChatGPT acknowledges this a little further on. So this is merely bad writing.

Over a few months, that adds up to a meaningful extra loss. This aligns with the idea that adding a loss incentive increases motivation. The same analysis found that participants without any verification (no third-party or photo weigh-in) reported larger weight losses than those with verification [ibid.] – suggesting a few may have been cheating the system when no one was watching.

Ooh, I’m not sure I knew about that result. That’s important.

Importantly, because this was not a randomized experiment, there’s a potential selection effect: it could be that more motivated individuals are the ones choosing to put down money, which might partly explain their better outcomes. The authors indeed cautioned that while voluntary commitment contracts were associated with greater weight loss, it’s unclear how much is due to the incentive itself versus the user’s pre-existing motivation [ibid.]. In other words, these tools attract people who are already determined, so success rates from user data may look high. Nonetheless, the fact that thousands willingly use these platforms and that those who raise the stakes tend to do better is consistent with the experimental evidence that commitment devices can be effective.

Yes, thank you. (Also eternal shame on StickK for intentional deception about this in their marketing copy back in the day.)
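To put that "0.25–0.33% more weight loss per week" figure in perspective, here's the back-of-the-envelope version: a hypothetical 90 kg person over a 12-week contract, compounding weekly. The study doesn't pin down the exact model, so this is ballpark only:

```python
start_kg = 90.0
weekly_extra = 0.003  # 0.3% per week, mid-range of the reported figures

weight = start_kg
for _ in range(12):            # a 12-week contract
    weight *= 1 - weekly_extra
extra_loss_kg = start_kg - weight  # roughly 3.2 kg of additional loss
```

So the per-week numbers sound tiny but add up to a nontrivial difference over a typical contract length, selection-effect caveats notwithstanding.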

blah blah about how much commitment devices reduce smoking
  • Smoking Cessation: Commitment devices have also been applied to quitting smoking. A notable field experiment in the Philippines offered smokers a bank account (nicknamed CARES) where they could deposit money that would be forfeited if they failed a nicotine test after 6 months. Only about 11% of eligible smokers actually opted into this rigid contract (showing that many are wary of self-imposed penalties), but those who did showed significantly higher success in quitting (Put Your Money Where Your Butts Are). In fact, smokers randomly given the opportunity to use the commitment account were 3.4–5.7 percentage points more likely to pass a nicotine urine test at 12 months than those in a control group with no such option [ibid.]. This was a statistically significant increase in quit rates attributable to the commitment device. Although the overall uptake was low, this study provided direct causal evidence that a voluntary commitment contract can improve behavior change outcomes for those who choose to enroll.

It’s a bit more complicated than that but … UPDATE: never mind, our blog post was about a different smoking cessation study. I don’t have an opinion about the claims about this earlier study.

Similar results have been seen in other incentive-based smoking studies – people offered a refundable deposit or loss incentive to quit often achieve higher short-term cessation rates than those given no incentives (or even those given equivalent reward-only incentives). As with weight loss, a pattern emerges: while the commitment is in force, success rates improve, but maintaining the behavior long-term can require sustained efforts or repeat commitments.

  • Productivity and Other Behaviors: In domains like work or academics, formal research is a bit sparser, but existing studies and observations support the efficacy of self-imposed commitments. For example, a classic study by Ariely

:face_vomiting:

and Wertenbroch (2002) showed that when students were allowed to set their own deadlines for assignments (a form of commitment device), many students did set spaced deadlines rather than one last-minute deadline, and those who spaced out and self-imposed earlier deadlines performed better than those who did not (1333).

Despite Ariely’s involvement, this study still seems like solid research to me. Just has to be taken with a massive grain of salt now, obviously.

The catch was that self-imposed deadlines, while helpful, were not always fully optimal – students who had externally imposed evenly spaced deadlines did best of all [ibid.]. This suggests people understand their procrastination problems and will use commitments to improve performance, though they might not choose the perfect commitment structure without guidance. In the workplace, commitment techniques are often recommended by productivity experts: for instance, publicly committing to a goal or staking money on completing a project by a due date can combat procrastination. While rigorous studies in general productivity (like coding, writing, etc.) are fewer, the same principles apply. Some field research in personal finance found that offering a commitment savings account (where people couldn’t withdraw until reaching a savings goal or date) significantly boosted savings rates for those who took it up – again indicating that self-control tools can work when adopted. Additionally, anecdotal evidence from user communities (such as Beeminder’s forum

Oh hi.

and habit-tracking communities) often reports success in using these tools for things like studying for exams, writing daily, or even managing ADHD-related behaviors. In summary, across various behavioral domains – health, finance, work or study – commitment devices tend to improve goal attainment relative to having no such binding mechanism in place. The strongest evidence comes from health behavior trials, but theory and smaller studies support that the effects are transferable to other self-control challenges.

Limitations and Considerations of Commitment Tools

Despite their promise, commitment devices are not a one-size-fits-all solution, and research has identified several limitations and potential pitfalls:

  • Limited Uptake and Acceptability: Many people choose not to use commitment contracts, even when they have the option,

See the type bee personality.

because the idea of self-imposing penalties can be aversive. Studies find that the higher the stakes required, the fewer people are willing to sign up (Commitment contracts as a way to health - PMC). For example, if a program asks participants to put down a large deposit as a commitment, enrollment rates drop off markedly. This means these tools often attract a self-selected group who are particularly motivated or confident. Low uptake can limit the population-level impact of commitment interventions – they help those who use them, but many individuals opt out.

  • Short-Term Efficacy vs. Long-Term Behavior: A consistent theme is that the benefits of commitment devices tend to be temporary.

Again, not if you don’t stop using them! Katy Milkman’s analogy is it’s like insulin for diabetes. You have to keep taking it indefinitely. (Though plenty of Beeminder users aver that they’ve successfully habituated certain behaviors and no longer needed Beeminder for them.)

“blah blah these tools might be best used as a kickstarter or a bridge to longer-term habit change, rather than a permanent crutch”

While the contract is active, people stick to their goals more (e.g. losing weight, not smoking, meeting deadlines), but once it ends, old habits can resurface [ibid.]. Unless the commitment is renewed or transitioned into intrinsic motivation, the behavior change may not fully “stick.” Researchers have noted that when the external enforcement is removed, there is often regression, much like stopping a medication leads symptoms to return [ibid.]. This raises the question of how to maintain progress after the commitment period – an area for further innovation (such as tapering stakes, building habit formation alongside the contract, etc.). It also implies that these tools might be best used as a kickstarter or a bridge to longer-term habit change, rather than a permanent crutch.

Crutch you say?

  • Selection Bias in Outcomes: As mentioned, there is a self-selection factor – those who voluntarily use apps like Beeminder or sign commitment contracts might be inherently more disciplined or motivated. So their high success rates aren’t wholly generalizable to everyone. For instance, data showed users who risk money lost more weight, but it’s uncertain whether the incentive caused the difference or if those users were already more driven (Association Between Monetary Deposits and Weight Loss in Online Commitment Contracts - PMC). This means that if someone is very averse to the idea of a commitment device, forcing them to use one might not magically change their behavior; these tools work partly because the user buys into the concept.

Yes, good to reiterate this, and well said.

  • Gaming and Compliance Issues: A commitment device is only effective if the user plays by the rules. There is potential for gaming the system or cheating to avoid penalties. An example is reporting false progress: in an online study, those without verification of outcomes claimed more success, suggesting some may have lied about their results to escape losing money [ibid.]. Similarly, someone could set an easy goal and achieve it with minimal change, or find loopholes (for instance, if you commit to “no desserts” but then redefine what counts as dessert). Robust commitment platforms try to mitigate this (Beeminder, for example, often requires automatic data tracking or makes it tedious to cheat), but the risk remains that a determined person might undermine the contract, limiting its effectiveness.

We have so much to say about this. Maybe mostly the following about ice cream truck loopholes and Goodhart’s law:

Amen. See also beating Beeminder burnout and beeminding easy things.

“Designing commitments that are stringent enough to matter but not so punitive that they induce despair is a delicate balance”

If someone binds themselves in numerous ways or sets the stakes too high, the pressure and cognitive load might become counterproductive. Likewise, the stress of facing constant penalties can potentially reduce enjoyment or intrinsic motivation for the activity. In qualitative follow-ups, some participants report feeling anxiety about the consequences, which in excess could harm engagement. The key is finding the “right dose of commitment” that motivates without causing burnout [ibid.]. Designing commitments that are stringent enough to matter but not so punitive that they induce despair is a delicate balance.

Fair. This is part of what’s behind our thinking in our series of posts about “derailing is not failing” / “derailing it is nailing it”.

“These tools are most effective when tailored to the individual’s preferences and when the person is genuinely committed to the goal itself (the contract is then just a nudge)”
  • Individual Differences: Not everyone responds equally well to commitment contracts. Some people are extremely loss-averse and will rigorously stick to avoid paying a penalty, while others might respond better to positive reinforcement or different strategies. The impact can vary across sub-groups, suggesting a need for careful targeting and personalization [ibid.]. For instance, an extrovert who cares about reputation might thrive under a public pledge with social accountability, whereas someone else might prefer a private, financial contract. If misaligned with personality or values, a commitment device might just frustrate the user. Therefore, these tools are most effective when tailored to the individual’s preferences and when the person is genuinely committed to the goal itself (the contract is then just a nudge).

Fair again.

“blah blah to potentially lose money or be punished can be emotionally challenging etc etc”
  • Ethical and Psychological Considerations: Finally, there are considerations about reliance on external motivators. Some critics note that if one’s goals are only achieved because of a financial threat, the behavior change could be shallow – what happens when the threat is gone? Ideally, the period under contract helps build skills and habits (e.g. learning you can run daily or write regularly) which then continue. But there’s a risk that people don’t internalize the motivation. Additionally, setting yourself up to potentially lose money or be punished can be emotionally challenging, and not everyone wants to introduce that kind of adversarial relationship with their future self.

See again the derailing-is-good-actually series.

Ensuring that commitment contracts are entered into freely, and that users understand they are a tool for their own benefit (not a punishment for being “weak”), is important for their ethical use. Encouraging a mindset that paying a penalty is not failure but part of the learning process (as Beeminder’s founders often stress (What is Beeminder?)) can help mitigate negative feelings.

Finally, an actual citation to our own writing on this. But that particular citation doesn’t seem to make sense.

Conclusion

In summary, commitment devices like Beeminder, StickK, and related tools leverage powerful behavioral principles – by utilizing commitment contracts, loss aversion,

Again, not exactly.

“blah blah quintupling the success rate in a weight-loss program compared to no commitment”

and immediate reinforcement, they help bridge the gap between our intentions and actions. Scientific research from behavioral economics and psychology generally finds that these tools can be effective: they have been shown to improve adherence to goals ranging from weight loss and smoking cessation to meeting deadlines and saving money. The use of loss-based incentives and self-imposed constraints often does improve the odds of success, sometimes dramatically so in the short run (for example, quintupling the success rate in a weight-loss program compared to no commitment (Commitment contracts as a way to health - PMC)).

That sounds wrong but I’m not taking the time to fact-check all the specific claims like this.

That said, the effectiveness comes with qualifications. Commitment devices are most helpful for those who are willing to use them properly, and their impact may diminish over time if not coupled with longer-term habit strategies. They are not magic bullets that guarantee success for everyone, but rather tools that give a willing individual an edge over their weaker impulses. In practice, many find them useful to break through procrastination or sticking points – the psychology of “knowing I’ll face a consequence if I fail” can indeed keep one on task. The evidence to date supports the idea that these devices help achieve goals, especially when carefully designed (and chosen by the person) to suit their motivational style. Future research is exploring how to optimize commitment contracts – for example, determining ideal incentive amounts, durations, and how to maintain behavior change after the contract – as well as how to increase their acceptability and use among people who could benefit. In the meantime, commitment devices remain a compelling option grounded in behavioral science, and when used judiciously, they are a proven strategy to reinforce goal tracking and behavior change (The effectiveness of commitment devices: field experiments on health behaviour change - UCL Discovery) (Association Between Monetary Deposits and Weight Loss in Online Commitment Contracts - PMC). They effectively help many individuals achieve their goals – turning intentions into action by making sure that tomorrow’s you faces the promises today’s you made.

Alright, ChatGPT’s Deep Research, thank you for the vote of confidence. I guess for my thoughts on what this means in terms of how long till the robot apocalypse you can see my new AGI Friday newsletter.

7 Likes

I admit that I only skimmed this (rather busy right now) but I guess nothing new – OpenAI and co.'s marketers keep claiming it’s “artificial intelligence”, but while it’s indeed artificial, it’s nothing near “intelligence” (of course) – more like a really big spreadsheet which is really good at word juggling. No Skynet in the near future. (Nor in the far future, I suppose.)

“Maybe the writing is just too LLM-y.”

It is indeed too LLM-y. We need a word to describe this feeling of when you know you’re reading something written by an AI but can’t quite put your finger on why. I know there are variants on “uncanny” for things that are visually offputting in this way, but I don’t know of an analogous word for writing.

2 Likes

I 3/4 agree, I guess. Wanna debate it in the comments of my latest AI post? My main point isn’t that you’re wrong but that we genuinely don’t know. I came up with this analogy that, the more I think about it, the apter it seems:

We’ve spotted an asteroid headed in the direction of Earth. But also the field of astronomy is so much in its infancy that we’ve barely reached consensus on heliocentrism. We’re not sure how far away the asteroid is, how fast it’s moving, how big it is, whether it will hit Earth or just pass near it. We can’t even agree whether asteroids necessarily burn up harmlessly in the atmosphere.

Even if we could agree that direct impact is unlikely, it feels weird to be worried about anything else until that question is settled. Or at least the probability pushed below, say, 1%.

I believe the uncertainties about what AI will be like at the end of this decade or so are bigger than that.

:thinking: I’m talking it over with Claude who thinks “synthetic” may be hard to beat if you want to keep it simple.

But further brainstorming with both Claude and ChatGPT is yielding:

GPTonal, promptstricken, botic, echoformal, predictitious, neuralgic, autoquilled, pseudocogent, machine-cadenced, autoregressive, token-soupy, parrotistic, markovian, predictomorphic

Dear Lord, the exquisite irony of how scintillatingly it’s skewering itself with these adjectives.

I might be especially enamored with “neuralgic” – it’s an existing word for “painful”, kind of (literally “nerve pain” in Greek), but it also evokes neural networks. Maybe it’ll stick if we start using it. Also makes sense because of how much it gets on your nerves?

1 Like

The thing I keep coming back to is “confabulation”. LLMs riff on a subject, and if their knowledge base is sketchy or ill-formed, or (as I suspect is somewhat the case here) they can’t tell that newer stuff is an update on older stuff and more to be relied upon, then the riff is plausible-sounding but error-prone.

We do the same, and some even go so far as to say that’s what our stream of consciousness is: an attempt to explain (immediately after the fact) the stuff that our bodies are just doing on their own. The split-brain experiments that produced confabulation showed this quite starkly.

So for me, it’s a bit like listening to a world-class bullshitter: you think that’s what this is, there’s probably a bunch of truth (or at the very least, truthy) stuff in there, and you can’t quite put your finger on what’s wrong.

2 Likes

Promptstricken gave me a chuckle, but neuralgic is more true to the feeling I’m getting.

And, to clivemeister’s point, yes, LLMs are truly bullshit artists. They will give you an answer. They do not care about whether it is true or helpful, just that they have responded. It’s the definition of bullshit. I think that’s also part of my unease. I’m thankful in a way that it’s usually discernible when something is written by AI so that I can put my guard up, but I’m also afraid that a lot of people don’t know to do that. (Although the people who don’t know to do that are likely the same people who believe chain emails, so maybe it doesn’t matter too much.)

2 Likes