Yesterday I did more thinking about how to measure / systematize an experiment-based approach to product direction.
For a while I tried to think about how to use Bayesian reasoning to quantify the significance of an experiment so that I could track it with Beeminder. However, this seems tricky, since I’d prefer to quantify the input (the experimental design) rather than the output (the resulting observation). And from my basic understanding of Bayesian reasoning, it’s much more concerned with the actual observation than with the design of the experiment prior to the observation.
For example, say I asked people how excited they are about new feature X on a scale of 1 to 10. Once I have the observation (that is, the poll is finished), I think I could relatively easily use Bayesian reasoning to approximate the significance of the result.
But how would I define the value of the experiment before running it? There are three properties of the result I might consider: how many people responded, the average value of the response, and the distribution of the responses (e.g., normal, bimodal, …).
Every combination of these properties (that is, each distinct possible observation) would have a different significance compared against my starting hypothesis, yes? So how would I quantify the experiment’s potential significance to my hypothesis before making my observation?
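One possible way to frame this is expected information gain: for every observation the experiment could produce, work out how much it would shift my belief, then average those shifts weighted by how likely each observation is under my current beliefs. Here is a minimal Python sketch of that idea under strong simplifying assumptions. It reduces the 1-to-10 poll to a yes/no "excited" response, and it compares just two competing hypotheses. The function names, the 50/50 prior, and the 0.7 and 0.3 response rates are all made up for illustration, not taken from any real data.

```python
from math import comb, log2


def binom_pmf(k, n, p):
    """Probability of exactly k 'excited' answers out of n respondents."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def entropy(probs):
    """Shannon entropy in bits; measures how uncertain a belief is."""
    return -sum(p * log2(p) for p in probs if p > 0)


def expected_information_gain(n, prior, rates):
    """Expected reduction in uncertainty (bits) from polling n people.

    prior: prior probability of each hypothesis (assumed values).
    rates: assumed chance a respondent says 'excited' under each hypothesis.
    """
    expected_posterior_entropy = 0.0
    for k in range(n + 1):  # every possible poll outcome
        joint = [pr * binom_pmf(k, n, r) for pr, r in zip(prior, rates)]
        p_k = sum(joint)  # how likely this outcome is before running the poll
        if p_k == 0:
            continue
        posterior = [j / p_k for j in joint]  # Bayes' rule for this outcome
        expected_posterior_entropy += p_k * entropy(posterior)
    return entropy(prior) - expected_posterior_entropy


# Hypothetical setup: "users are excited" (70% say yes) vs. "they are not" (30%).
prior = [0.5, 0.5]
rates = [0.7, 0.3]
for n in (5, 20, 50):
    print(n, round(expected_information_gain(n, prior, rates), 3))
```

Under these assumptions the result depends only on design inputs (the number of respondents, my prior, and the response rates I expect under each hypothesis), so it is a number I could compute before running the poll and potentially track in Beeminder.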
If you’d like to help me think this through, I’d be very grateful. But be forewarned: my knowledge of this stuff is quite casual, so it may take some back-and-forth before I understand what you’re trying to say.
Somewhat relatedly, I was reminded of something that Dave Farley said in this episode of the podcast Ship It!. He drew the connection between the scientific method and continuous delivery. Basically the idea was that we want to assume we’re going to be wrong a significant percentage of the time and put systems in place that account for that, and then reduce the size of each deliverable as much as possible to reduce the amount of stuff we could be wrong about in each unit.
That reminded me that this is a big part of how science works–coming up with hypotheses and trying to falsify them. I think I had forgotten about the falsifying bit. Instead of thinking of hypotheses as questions, I should think of them as opinions or ideas I already tend to believe, and then look for ways to prove myself wrong.
So here is a list of things I already tend to believe in relation to my startups that might be good candidates for attempting to falsify:
TaskRatchet
- Splitting the tasks view into two tabs–Next and Archive–would both decrease the complexity of TaskRatchet’s code and make the app easier for users.
- A recurring tasks feature would provide value to users and increase the number of tasks users create over time.
- Integrating with additional third-party services would allow me to increase the usefulness of TaskRatchet and increase the number of tasks that users create.
- Creating official Make.com and Zapier integrations would significantly increase the number of tasks users create.
- Emailing users regularly, beemail-style, would be a significantly more effective way to gather feedback than my current method of posting to Twitter and the Beeminder forum.
- More TaskRatchet users would see my UVIs if I posted them to the forum instead of to Twitter.
- Continuing to develop and add features to TaskRatchet is likely to increase my monthly revenue to a degree that it is worth the effort.
- Making TR more ADHD-friendly would make it easier for me to dogfood it, and also increase user engagement generally.
- Switching to a tech stack other than full-stack TypeScript would not provide enough benefit to outweigh the associated costs (learning, development, new bugs, etc).
- There are no good alternatives to Stripe that support my business model and would allow me to simplify my tax situation at a reasonable cost given my low revenue.
- TaskRatchet can be a significant piece of how I support myself financially.
Narthbugz
- Narthbugz, as currently envisioned, would not provide significant value for a small-a agile development team.
- It would be too difficult for me to find initial customers for Narthbugz.
- Narthbugz would not currently provide me enough value to warrant the time and effort I would need to spend in development.
I’ve tried to rework any items that started with “I should” into statements of the underlying ideas that lead me to believe that I should do this or that.
Do you have ideas on how I could challenge these opinions? Don’t worry if you agree with me on some of these points–the ultimate goal here is not to reverse my opinion. Rather it’s to put my opinions on a firmer foundation (which may at times mean reversing the opinion, but not necessarily).