AGI delayed? My recent experience with Forfeit.

Hey all!

Josh - co-founder of Forfeit here. Just following up on what Brice said:

I completely agree that the verification system has holes in it. Right now there’s a single human reviewer, so you can imagine the fatigue and false approvals that may happen. We remedy this by asking users to add a sentence to the end of their description, like: “Meditate, hands must be in frame for the whole timelapse”.

This decreases the rate of false approvals, but it isn’t perfect. We’re fixing this with AI image analysis (timelapse analysis will likely come a bit later) and verification instructions.

The image analysis is pretty weaselproof. This was in the app last year, but there were rate limits on GPT-4 Vision at the time, so we could only do 100/day. Of course, the AI won’t be denying anything on its own, and we’ll still use humans for the hardest ones.

The verification instructions are just a separate section where you can add certain rules that the image must pass, like:

“Must have hands in screen”
“Deny if laptop is in screen”
“Must stay in an upright sitting position the whole time, deny if I stand up, walk, or lie down”.

This should fix these issues, and will be in the app in a month or two!
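
As a rough illustration of how rules like the ones quoted above could be represented and checked, here is a minimal sketch. It is not Forfeit's actual implementation: the `Rule` class, the `review` function, and the detected labels are all hypothetical, and it assumes some image analysis step has already produced a list of things seen in the evidence. It follows the behaviour described in this post, where the automated check never denies on its own and anything it cannot approve goes to a human.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One verification instruction (hypothetical structure)."""
    label: str             # something a detector might report, e.g. "hands"
    must_be_present: bool  # True = "must have ...", False = "deny if ..."


def review(rules, detected_labels):
    """Return "approve" if every rule passes, otherwise "needs_human".

    The automated check never returns a denial; a failed rule only
    escalates the evidence to a human reviewer.
    """
    detected = set(detected_labels)
    for rule in rules:
        present = rule.label in detected
        if present != rule.must_be_present:
            return "needs_human"
    return "approve"


# Hypothetical rules modelled on the examples in this post.
rules = [
    Rule("hands", must_be_present=True),    # "Must have hands in screen"
    Rule("laptop", must_be_present=False),  # "Deny if laptop is in screen"
]

print(review(rules, ["hands", "cushion"]))
print(review(rules, ["hands", "laptop"]))
```

Keeping the rules as plain data means a user could edit them in a separate section of the app without touching the description text.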

Side note: I’ve just seen that Brice has clarified this already, oops!

Cheers,
Josh

4 Likes

Yup. My experience is the same. It’s still useful though because the illusion of accountability does help a bit.

But it’s not strict accountability. One workaround would be using Forfeit + BAAS in tandem: you submit the forfeit, then screen record the “view evidence” within the Forfeit app and send it to your BAAS.

1 Like

I appreciate your response and owning up to the weaknesses.

A few things I’m confused about:

  1. I’m very confused as to why there is only one human reviewer. This seems so integral to the functionality of the platform that you, as the founder, should step in if you can’t afford to hire a few VAs to work for 10 hours a day. If it is you doing all the reviews as founder, then I get it, that is a pretty rough circumstance…

I do think it’s frustrating that Forfeit implies accountability when you download it, and I’ve been willing to have my charges go through specifically DUE to the feature of accountability.

  2. I just glanced at Forfeit’s FAQs and it says “We have a team that verifies each forfeit manually. Unless the photo/timelapse is obviously wrong, we’ll ask for clarification”

I’m very confused as to why you say in the FAQs that you have a team of reviewers, and now in this thread you are saying it’s just 1 person…?

And, I don’t want to give unsolicited advice. At the same time you did enter this thread, so I assume you want feedback from users. The only thing I want to say right now is that compromising the core value proposition of the app is something you should fight against at all costs. Otherwise why use Forfeit when I can just use another habit tracking app?

2 Likes

Point 1: We currently only have need for one reviewer. The AI analysis we’re implementing now should catch false approvals (but most “false approvals” are due to the “erring on the side of the user” that we have to do, and can be remedied by being strict in the description).

For point 2, it’s also the fact that we have to err on the side of the user - if we deny forfeits a lot more, a lot more people will be upset. I.e., we make sure we clarify with users before denying their evidence. That said, if your forfeit is something like “30 minute meditation”, and you submit 20 mins, we’ll deny it without asking for clarification. We have one main reviewer, and the founders also review images from time to time.

As for compromising the core value of the app - I think this is easily fixed by being stricter in your descriptions. Obviously, some forfeits may slip through the cracks (this should be fixed by AI), but if you write a lengthy description of what you want the forfeit to be, our reviewer will read all of the points in the description, and deny the submission if one of those points is missed (possibly after asking for clarification; if there’s no good reason, we’ll deny it).

2 Likes

We understand that the issues with verification are frustrating, really. We’re continuously working to improve this aspect. Now, it’s natural to find flaws in any system, but I believe we offer significant value, especially since we’re currently providing this service at virtually no cost (unless you fail a Forfeit).

If the system works well most of the time and is consistently improving, that’s what matters. Personally, it has greatly improved my life, and I’ve even paid Forfeit on purpose because it’s been more effective than other, often expensive, options I’ve tried. Many users feel the same. We regularly receive messages of love and gratitude.

Looking ahead, we will integrate more AI to make the system even more reliable, faster, and cost-effective, so some of the concerns raised here may soon be less relevant. We’ve already touched on this in previous discussions.

Regarding our participation in this thread, I believe it’s important for companies to engage with feedback, especially when it’s critical/negative. Having an open conversation helps us grow and shows that we’re real people working to make a positive impact. We may also have different perspectives, and I think it’s important to share our side of the story.

If BAAS + Forfeit works better for you, that’s actually great. Our goal is to help you achieve more, whatever tools you choose to use. Of course, we would love you to rely on only one app for pure convenience. We will work hard to make sure this becomes a reality!

Then, if you have any feedback on how we can make the system even better for users like you, we will try to implement it. As said earlier, our chatbox is always open.

EDIT: Oh, I just saw you deleted your message. I will keep mine here though, because it addresses some points you might still have, and I don’t want you to leave this conversation frustrated.

2 Likes

Genuinely curious - what are the expensive options you’ve tried?

1 Like

Sorry about that, that’s not good. I’ve just asked the verifier about it and he doesn’t remember approving that. I’ll ensure he stays strict, and please reach out to me if this happens again. I just approved a 20 minute meditation forfeit timelapse you submitted today as it was valid.

If it helps, this has been a recurring issue that’s definitely hard to fix, but it’s never been too bad (I speak to about 30 users per day in the in-app chat, and call/Zoom ~1 user per day, and it’s probably #10 or so in terms of feature requests). I recognise that the “Just put the verification instructions in the description” approach is super hacky and not a great solution.

What I believe should fix this, and make it pretty bulletproof, is the “verification instructions” we’re adding in (along with appeal instructions), and the AI analysis that should double-check it all.

Another thing (and this goes for anyone!): If you’d like, until this is fixed (likely in 3 weeks or so), ask me in the in-app chat to verify your forfeits, and I’ll do them myself if you make a group forfeit with me, rather than paying for BaaS/having a friend do it. I can be as brutal/strict as you’d like.

1 Like

BAAS (actually OK but not granular enough), Beeminder (account since 2017), Stickk (terrible experience), Coach.me (joke coaching), GoalsWon, Focusmate (very early user there as well, at a time when not all slots were even full), some random accountability app I don’t even remember.

Personal coaching, 2 different ones (learned some things there but the price was definitely not worth it).

CBT therapy with an actually good psychiatrist for 3 years. (Not day-to-day enough, accountability easy to bypass.)

Paid accountability partners found on Reddit via a thread I made (it ended with people not even showing up to ensure my most important habit, waking up early, was done, even though I was paying them; talk about accountability…)

Almost every other task management app possible, including the gamified ones like Habitica, Focumon, Forest (and most of the smaller players as well).

As for distraction blockers, that’s another subject entirely. I’d recommend Pluckeye as well as Horaire PC (French, but this software is actually so great!)

None of the options I tried there gave me as much growth as Forfeit did. Maybe I am weird, but Forfeit just clicked with me. And I don’t want to trash anyone. Those just didn’t work for me.

On a personal note, I don’t truly think there is a good alternative to Forfeit as of now. Even in the case of highly priced coaching (even though it was supposedly some of the best in my country).

Yes, it is not perfect, but can you name one which does it perfectly? And for free as well? If you can, I would love to hear about it because I myself am searching for it! But if there isn’t one, well… let’s work together to make Forfeit the best it can be, no? There is a lot of room for growth and we have talked about it a lot already. But we really do have your best interests at heart.

2 Likes

I think everything has been said at this point, so I just want to summarize my perspective one last time because I started this thread in good faith and still think Forfeit is a promising product.

It comes down to whether the evidence should match the description (to a reasonable degree).

If you think it is merely about uploading any evidence, then that view doesn’t match what the FAQ says, as I have shown objectively. I just don’t understand what the value proposition would be. Your model of what the app is supposed to be is fundamentally different from mine, and I don’t know how to reconcile that. We can agree to disagree.

If you agree that the purpose of the verification is to confirm that the goal description and evidence match to a reasonable degree, then we can discuss to what degree Forfeit has followed through on that promise. I concede that my initial example with the left and right hand distinction was too ambiguous and easily fulfills the “to a reasonable degree” requirement. However, some of the other examples, like accepting a video of an empty bathroom as a meditation session, clearly don’t pass the threshold.

It seems like the founders agree with the perspective that the description and evidence should have some resemblance but still somewhat struggle with the execution. For some of the examples, it is unclear to me how they could ever have been approved, but there seems to be an acknowledgment that the process needs improvement. Maybe post here if there is a new version, new AI, or new human reviewers, and I would consider giving it another try. I would also consider paying a subscription fee or a one-time payment if that helps to bootstrap better verification.

That was unnecessary. Yes, I know that my current self cheats on my past and future self by acting in suboptimal ways, but that is exactly why I need Forfeit to keep me accountable. Taking some kind of moral high road while I am trying to use your app to improve myself is not helpful.

1 Like