Posting this here, rather than in a more private channel, since others may have a different take and I’d be interested to hear what they think.
After the recent release of the arbitrary deadlines feature I ran into several annoying bugs. They are being fixed quickly, and since I am already a beeminder super-fan they are, relatively speaking, just a small bump in the road; but they have nonetheless been non-trivially annoying, and if I were just starting out with beeminder I think they might have caused me to abandon it. Judging by this post I am not the only one to run into bugs, and quite a few other bugs were introduced that I have not personally run into.
Of course we all understand the value of putting yourself on the hook for things, but my take on this is that putting yourself on the hook for completing some big feature by a particular date is a Bad Idea. It just creates poor incentives to rush the implementation (thus introducing more bugs) and rush the feature out the door without doing a proper job of testing (thus leaving the bugs for users to find). Being really responsive and apologetic after the fact is much better than nothing, but this kind of thing is still going to burn through a lot of user goodwill regardless of how you deal with the aftermath. It might even end up costing you more than $810 in the long run due to lost sign-ups and the like.
At the very least, if you want to put yourself on the hook for releasing a big feature, you should think through all the things that ought to happen prior to the release, estimate how long they will take, and put yourself on the hook for those too. E.g. have the coding complete on such-and-such feature by this date; complete Android testing by some other date; and so on. Of course those are a bit harder to make into bright lines, and they are not as easily verified by the general public.
Don’t get me wrong, I am really glad to have the arbitrary deadlines feature. I just would have been willing to wait for a version with fewer bugs. If you really think that you will never finish big features like this without putting yourself on the hook, then I suggest thinking more carefully about ways to do that which create the right incentives.
I agree. I also prefer fewer bugs over new features. Beeminder already works extremely well for me, even without new shiny things. So I am in no hurry to see new features implemented if this risks having to deal with bugs.
Just an idea: Maybe it would make sense to have a separate beta version of beeminder in addition to a standard stable version. New features could then be tested with a subset of users who are willing to test them, and deployed to all users once most of the bugs introduced by those new features are squashed. You could then continue to impose challenging deadlines on yourselves for adding new features, but then initially only deploy these features to the beta version.
Maybe it would even make sense to have beta goals. That is, I as a user can say that I am willing to test some new features with some of my goals, but prefer to continue to use the standard stable version with my other goals.
Beautifully said! Not that we needed convincing. This was completely brutal and we’ve been brainstorming on how to not do that again. Here are ideas, including what @byorgey and @davitenio have suggested so far:
- Risk less money so it still pushes us forward but we can also decide to just cough it up.
- Commit to intermediate milestones.
- Commit to deploying to a subset of beta users.
- Beemind amount of time spent on the feature (maybe with the Beedroid timer instead of TagTime).
- Beemind commits with a certain hashtag (thanks @thunderbolt16).
- Commit to, say, 2 UVIs or Infras per week being tagged for the feature until it’s done.
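The hashtagged-commits idea above could be automated with a small script. This is only a sketch under made-up assumptions: the repo path, the `#arbdeadline` tag, and the `count_tagged_commits` helper are all invented for illustration, not anything Beeminder actually uses.

```python
import subprocess

def count_tagged_commits(repo: str, tag: str = "#arbdeadline") -> int:
    """Count today's commits whose message mentions `tag`.

    Assumes `git` is on the PATH and `repo` is a git working tree
    with at least one commit.
    """
    result = subprocess.run(
        ["git", "-C", repo, "log", "--since=midnight",
         f"--grep={tag}", "--oneline"],
        capture_output=True, text=True, check=True,
    )
    # One line of `--oneline` output per matching commit.
    return len(result.stdout.splitlines())

# Example (hypothetical repo path):
# count_tagged_commits("~/src/beeminder")
```

The daily count could then be entered as a datapoint on a do-more goal, by hand or via the API.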
For #1, in retrospect we should’ve actually coughed up the $810! Because the amount was so large, it would’ve felt humiliating not to ship what we committed to, and that was part of the problem.
For #4, I actually did beemind time spent and got the spec in decent shape months ago (I was then daunted by the back-end work requiring @bee’s big brain and flattened my road). But (a) it’s hard to beemind time with TagTime unless it’s something you’re spending hours a day on (hence my beeminding it so meekly for so long). And (b) @bee probably should’ve beeminded time spent on the actual implementation. A month ago, when we hard-committed to this, we would’ve been better off committing to spending some hours per day on it until it was done, however long that took. That risks going down rabbit holes, but it might still be better than what we did in October. We’ll try it!
Our thought process, by the way, was that we wanted this feature badly, yet, with the big changes needed in the back end to support it, months were going by with it not happening (even after it was seemingly mostly done). So at the beginning of October we decided it would just be too ridiculous not to have it as part of our 3rd anniversary post.
Reinforcing what @davitenio and @byorgey said: bugs are a lot more “costly” to me than waiting a bit longer for features, at this point in my use of beeminder. Beeminder is my nailgun. I have to spend a lot of time, effort, and experimentation figuring out where to put the nail (setting up my goals and metrics), but beeminder “just working” as I expect is a wonderful feeling. It allows me to have firm confidence that once I know the goal structure I want, I can put it together simply and effectively.
Now mind you, the fact that beeminder has AWESOME customer support mitigates this a lot, and so for me, this isn’t something that would cause me to stop using beeminder, but it is sometimes annoying, just as @byorgey said.
Personally, I’d have a lot of fun being able to use beta features in some sort of sandbox. My career started out in software and system testing, so that’s a lot of fun for me, if I can have a separation between that and the goals that I’m actually struggling to complete and need the “nailgun” type tool for.
Another thing I just realized about the $810 commitments is that you’re acting more like StickK than beeminder: a large payoff on a binary commitment. I think that’s a bad example to set for users, especially when you could show off the power of beeminder by tracking code checkins for the feature, or time spent, or the number of new tests passing.
Finally, for me, having something like a fever chart (and critical chain schedule) to measure milestones would be how I’d track output, rather than committing to it directly with a beeminder goal. But if you have the other stuff in place, this may not be needed. It depends on how worried you are about rabbit holes and how much transparency you need/want to give into your development processes.
So true! I added code checkins to the list above. Number of new tests passing sounds harder as a metric for ensuring eventually shipping. (Time spent also has that problem to some degree…) Maybe if we were more hardcore TDD that would work though?
Depends on the type of test. If you track acceptance tests (like Gherkin or FitNesse) it’s a tight tie to user value. You still have to be careful, because just because it’s user facing doesn’t mean it’s user-valuable, but it’s at the level of weaselness that UVIs are.
Also, getting those acceptance tests complete requires some discipline: since they’re only examples, you have to ensure your code handles all cases, not just the examples. They also don’t protect against any but the simplest of bugs.
One nice thing about scenario-based testing is that you don’t need to automate the scenarios, yet they give a relatively clear picture of partial value to users as you progress and mark them off as working. It seems to be a good fit, but you know your context better than I do, and there are no testing best practices, only good practices in context.
I agree that beeminding programmer level tests (dots from the test runner) would be about like beeminding time or other “input” activities.
At Skritter we once had a few weeks that we thought were horrifically buggy–just unusably buggy, to our minds, and way buggier than Beeminder has ever been–with some discussions of bugs on the forum and some support emails about them. We then ran a poll (using our poll widget on the homepage) asking about how buggy it was. Wasn’t that bad:
The thing you might miss when trying to figure out how much of a problem new bugs are is that you (the developers) are very, very close to them. Whichever users hit the bugs are close to them too, and some of the more vocal of those tell you about it. Meanwhile, the majority of users never notice any of the bugs, and a vanishingly small fraction are at risk of leaving if you never fix them (not if you fix them soon).
The end result is that you are stressed out about fires. The default method of sampling is to notice there’s smoke in your kitchen and assume the rest of the world is also burning. But it’s actually fine.
Thanks so much for saying this, @nick! Funny timing because last night in the weekly beemail I asked in the PS:
If you could give a quick reply with the worst frustration you’ve noticed (or “none!” if none!) since we deployed this last week that would be super helpful in prioritizing the remaining issues.
And, sure enough, we woke up to about 70% "none!"s.
:) Quite similar to that Skritter poll, I’d say.
In any case, some of these bugs are not exactly tolerable so we will keep hacking away, perhaps with less self-berating (and thanks again for that!).
I very much agree. I’ve been sort of begging for a couple of years to have fewer upgrades if it meant fixing long-standing bugs. There are very significant bugs that have existed for well over a year, and some that are new but massive.
To be honest, it’s affected my decision to use the site for certain goals in the recent past. And on days I’m really busy or really frustrated, I sometimes think I might want to just go cold turkey and see if I can do without it, because the friction is just getting too high. I come to my senses, of course, but I worry many don’t. (Actually, I know that many don’t. Of the people I’ve introduced to Beeminder, only one stayed (despite all being in love with the concept), and all of them (including the one who stayed) complained to me, since I had been the one to introduce them, about bugs and about the default settings… complained a lot.)
The friction for using the site is high for everyone. It’s high for new users, who have to learn to differentiate between confusing settings they “messed up” and bugs that have nothing to do with what they did, and it’s high for long-time users with many goals, or who want to use existing features, for whom the bugs are an irritant. And look at the #1 reason usually cited alongside the advice “Don’t bother using app X” for any given app on the Apple App Store: bugginess / not ready for prime time. Long-time users will wait, already knowing the value. New users bail. And for new users, bug (or not understanding the default rules) + credit card = fear, and bug-caused derail + credit card = anger.
I think a way to strike a balance is to have a bare-bones “works absolutely perfectly” interface and a beta “enter when you feel like tinkering” interface. Anyway, I posted the email I sent earlier this month, “Clean and Beta Versions/Options, Please”, as I thought it was related, but it might be more appropriate in a new thread.
I’m not sure I agree. Some leave before the poll comes out or are too detached to participate. Not everyone wants to be a part of the site-building discussion; some just want to use something and go on with their day. In the case of Beeminder, everyone I’ve introduced to it has complained either about bugs or about confusing settings. Every single one. Sure that’s not a statistically significant sample size, and the plural of anecdote is not data, but that tells me something. (And it bums me out!)
And this isn’t like a membership-based site where you pay X and get some service Y. A bug (or confusing setting) can directly cost you money if you don’t notice it! That makes them way more meaningful than on other sites. Beeminder can only survive if people trust it. Bugs that can cost money are not conducive to trust, even if the support team is as amazing as it is. People don’t know that up-front.
Although I see your point, I’m also somewhat amused that you lump bugs together with confusing settings in that criticism… in a post where you’re asking for another setting.
(“But surely having a setting for stable/beta isn’t confusing, right?” Most settings are confusing only because there are so many of them. Each setting you add, even if it makes sense to almost everyone, increases cognitive load for the user, because they have more to go through when finding the setting they need, and because there’s a low chance that they’ll misunderstand any given setting–this becomes bad when there are a lot of settings.)
Part of the reason that something like Beeminder is hard to keep bug-free is that it has so many different configurations, because over the years so many people have wanted something slightly different. But in each combination of settings lie 1) possible bugs, 2) ongoing developer maintenance and support, 3) user confusion and overwhelm. I wonder if the UVI thing has contributed to adding setting-sized features.
We did beta.skritter.com and www.skritter.com, and we would deploy the latest version of the code once a month to www.skritter.com while deploying constantly to beta, encouraging people to be on beta to get the latest stuff so they’d run into any bugs early. It was good for some things, but looking back, I think on net it made things more buggy, because some fixes would be on beta and not on www, and there were of course bugs with switching between the two and cookies, and having to keep the database formats in sync even as the application code changed, and versioning mistakes, etc. And I don’t even want to think about the extra problem for Beeminder of a user derailment being different across beta and www.
If the Beeminder team wants to reduce bugginess, then they should 1) do object-level work on fixing bugs, 2) take the meta-level step of saying no to new configurations, and 3) possibly think about removing unneeded features to reduce code and interface complexity. But I don’t think reducing bugginess is the best goal. I think what they should do instead is decide what is going to make Beeminder the most money and work on that. Mainly listening to feedback from vocal minorities like power users is going to lead to niche features which are hard to keep bug-free while making the site intimidating for newbees.
In theory you can have the best of both worlds, by not offering the bleeding edge beta setting to newbees, for example. But mostly I’m swayed by @nick’s response here, at least about clean-vs-beta. That can backfire and lead to more bugs. Nick’s experience with exactly that is a key datapoint. Another thing Nick didn’t mention explicitly: the time spent implementing that could be spent just fixing bugs! (:
To repeat my response to @mary from the other thread on clean-vs-beta, your points are also super persuasive and we jumped all over the bug you mentioned there and are ready to jump all over the next most frustrating one you’re experiencing. Thanks again for the help with this!
I lumped them together because, in the info I was quoting, they’re not separated; it seemed imprecise of me to leave out the settings side and thereby make it sound like bugs were contributing more to the quitting I’ve heard about than they actually might be.
I agree that additional beta settings would be more complicated. I recommended it only because I’ve been on the “pretty please, no more features until there are no more bugs” bandwagon for about a year and a half. But it seems to me that new features are something the founders want to keep working on for some time still, so I was trying to propose a way to give users the opportunity to opt out of beta bugginess, hoping to reduce friction for long-time users and avoid scaring away as many new users as I worry it does. I have no idea whether it’s technically viable, though, so you’re probably making very good points with respect to that. (I also agree that it’s a good idea to work on what would make Beeminder the most financially stable. Trust - friction = users = money.)
Your pleas have not fallen on deaf ears! In fact, your original plea induced a marked shift in focus for UVIs, from new goodies to bugfixes and simplifications. I remember taking it really seriously and deciding that the following month would be mostly bugfix and simplification UVIs. I would even say that, despite appearances, that shift has persisted. Looking at our latest UVIs, about half have been bugfixes and I think most of the rest have been things (like creating this forum) that don’t impact bugginess.
Oh, and one anecdote: It’s true that arbitrary deadlines caused a new swarm of bugs but it’s also the case that it fixed a longstanding source of confusion and user mistrust with entering data between midnight and 3am. There was a convoluted explanation but from a newbee’s perspective it was just totally buggy – but now it isn’t! Progress! (:
General request: try to pick at least one bug that’s plausibly the most frustrating for you and mention it as part of these meta discussions. Lamentations of pervasive bugginess can be demoralizing but specific bug reports are motivating! I mean, don’t hold back on general lamentations – we need to hear that too. Just try to mention a specific bug too. As I said in another thread, when you did that in your initial missive we jumped all over it and fixed it that day.
Sure. These are ones I’m pretty sure you guys already have on your lists, though.
Retroratchet/safe days when the user has entered Goal Date + Goal Value (and not Goal Rate) on the road dial: it overrides and changes the user-entered data. #allofthesadness This is one of my least favourite bugs because, being crazy, I like to pick very precise amounts and dates, and I like to be able to spread out any lead I get over the length of a goal, rather than take time off and fall out of the habit (which is what often happens now that I no longer use safe days).
Unfreezing goals (when auto-quit is disabled) no longer lets you select a rate, even when you uncheck “start flat”. (I know you guys know about this one already, because we’ve talked about it.)
Thanks @chipmanaged! @bee and I are discussing/reprioritizing. In general, definitely don’t assume that “already on our list” means not worth mentioning. Sadly there’s a continuous influx of new things to our list that greatly exceeds the rate at which we can hope to implement things. So re-reporting bugs like this that are still bugging you is very valuable.
Thoughts so far: #1 is possibly lower priority because we believe it will become moot with the new road editor aka generalized road dial. That’s possibly the case with #2 as well but there are definitely bad problems with restarting/unfreezing goals. We strongly advise against checking auto-quit for lots of reasons, not that that’s an excuse for bugs with a feature we provide, however discouraged.
Clarification on #1: Is it mainly the fact that retroratchet on do-more goals shifts all the dates forward? That’s often the right thing to do so fixing that before we deploy the new road editor won’t be easy.
And #2 I hope we can fix sooner but for now we basically have to advise never checking auto-quit and always picking end dates far in the future, at least when dialing the road or scheduling breaks.
PS: As I’ve said to you privately, you’ve been a huge help to us and driven big improvements to Beeminder since the early days. (Not that your fandom can be doubted after your other recent post.) In fact, much of the addendum to the above-mentioned blog post about automatic rerailing is thanks to you, who helped us figure out how to get the best of both worlds when we made that rather fundamental change to how Beeminder works.
Counterpoint: The people who bother to answer that poll are a completely skewed subset. Those who hit one too many bugs and said “screw this” are not included. So you’ve filtered out all but the sufficiently tolerant or sufficiently lucky. Also, people are way too nice. The 41% who said “no, it’s fine” are tacitly telling you they’re encountering bugs, since they didn’t choose “what bugs?”. So even among the subset who stuck around and like you enough to answer your poll, 88% were impacted by the bugs. So maybe your “horrifically/unusably buggy” wasn’t so far off. (:
Two related things:
The Cockroach Principle
If you spot one cockroach in your kitchen you can rest assured[1] that there are hordes of them sneaking around not making themselves noticed. Or maybe it was just that one passing through, but if you see another one you’re almost definitely fucked. And, three? Forget about it.[2]
Similarly, if 10%[3] of users will actually complain about something and the rest will walk away, then, in expectation, 9 users walked before this one complained.
In conclusion, mentally 10x user complaints.
[1] Highly ironic use of “rest assured”.
[2] Ie, burn your house to the ground.
[3] And, believe me, it can be less than 10%, even (perhaps especially[4]) for completely critical stuff like “it’s not letting me create a goal”.
[4] Like if you get the impression that maybe the whole site isn’t maintained anymore.
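The expected-value arithmetic behind the “mentally 10x” rule can be made explicit. A toy sketch (the function name and the 10% rate are just the example from above, not real data):

```python
# If only a fraction p of affected users ever complain, then each complaint
# you receive represents roughly 1/p affected users in expectation,
# i.e. (1/p - 1) users who hit the problem and walked away silently.

def silent_users_per_complaint(complaint_rate: float) -> float:
    """Expected number of silent sufferers per received complaint."""
    return 1.0 / complaint_rate - 1.0

print(silent_users_per_complaint(0.10))  # → 9.0, hence "mentally 10x complaints"
```

At a 5% complaint rate the multiplier would be 20x, which is the point of footnote [3]: the quieter the failure mode, the bigger the hidden iceberg.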