I wrote this app for myself while I was consulting and needed to track my hours. It’s heavily inspired by TagTime, but I needed some automated in-app aggregation for my invoicing.
I would love a working Windows version of TagTime. It was really useful when I used the Android version, but it sorely lacked visual feedback of any kind unless I plugged it into a Beeminder graph. I don’t want to beemind everything, but I’d love to be able to get some data visualisation without jumping through hoops…
I’m sampling at a 15-minute interval; that’s what I found was necessary to get accurate reports for my invoices. Also (and this may be heresy to the TagTime old-timers), I put lower and upper bounds on the sampling frequency: no pings closer than 3 minutes apart, and a maximum of 60 minutes between pings, just to make sure I was getting the data I needed to get paid.
I was nervous about this when I originally used TagTime for my work hours, so I actually ran a timer and TagTime both for a month. They came up with almost exactly the same total time by the end – they were actually already in agreement after a week, but over a month the match was almost exact. I used the default ping schedule!
But it was really nervous-making at the start when I’d work for two hours and get no pings…
I believe this backfires! You’ll subconsciously get more focused as the one-hour bound approaches (and/or less focused for the few minutes after a ping) and that introduces a bias in the sampling. If you conscientiously avoid doing that, you’re liable to over-compensate and end up with a bias in the opposite direction. I think the only way to feel confident in the sampling is to nix the bounds and do proper, ungameable Poisson sampling, per the spec.
PS: Awesome work on Cactus! Thank you for sharing it and discussing it here! I just added it to the list at tagtime.com as well.
Yes, I don’t follow the TagTime spec to a T (hence the change in name). The algorithm is essentially: nextPing = lastPing + max(3m, min(60m, gap())), where gap() samples from an exponential distribution whose mean is your configured mean ping gap.
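To make that concrete, here’s a minimal Python sketch of the clamped schedule (the 15-minute mean is an assumption based on the interval I mentioned above; the names are mine, not Cactus’s actual code):

```python
import random

MEAN_GAP = 15          # mean ping gap in minutes (assumed; set to your own)
LOWER, UPPER = 3, 60   # clamp bounds in minutes

def gap():
    # Sample a gap from an exponential distribution with mean MEAN_GAP.
    # random.expovariate takes a rate, i.e. 1/mean.
    return random.expovariate(1 / MEAN_GAP)

def next_ping(last_ping):
    # nextPing = lastPing + max(3m, min(60m, gap()))
    return last_ping + max(LOWER, min(UPPER, gap()))
```

The clamp happens on the sampled gap, not on the resulting timestamp, so consecutive pings are always between 3 and 60 minutes apart.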
If the subconscious effect you describe is present, it’s very subtle! The 3-minute lower bound was introduced because pings closer than that were just annoying. The upper bound comes from the nature of my consulting: I needed to give detailed reports on how I spent my time, and pings more than an hour apart wouldn’t give me that information. This is likely a special case of my working conditions, so perhaps others will prefer to remove that feature.
More generally, my goal is to get as accurate a view of how time is spent with as few samples as possible. I’m not necessarily married to the Poisson distribution. In the future I’d like to experiment with more black-box algorithms that try to predict your behavior and focus sampling during times of high variance / low confidence. Each sample would then be treated as valid for a window of time around it, with the window’s size inversely proportional to how variable your samples have been during that time period in the past.
That’s where my head’s at; for now, Poisson sampling has been working very well (even with the bounds).
Thanks for adding it to the list! I’m happy to contribute – y’all have built a nice place here.
I just checked and, assuming no subconscious bias (which I wouldn’t assume!), the average gap between pings with your algorithm will be 33 minutes instead of 45.
[I originally said that meant undercounting your time but I guess it depends on how you translate pings to amounts of time. The standard way to do that is to treat each ping as 45 minutes of work. If you treat each ping as 33 minutes of work, maybe it mostly works out?]
(I was lazy and monte-carlo’d it like so: Mean@Table[Clip[-45*Log[RandomReal[]], {3, 60}], 10^6])
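For anyone without Mathematica handy, here’s what I believe is a direct translation of that one-liner into Python (same assumed 45-minute mean and [3, 60] clamp):

```python
import random

def mean_clamped_gap(mean=45, lo=3, hi=60, n=10**5):
    # Average of exponential gaps clipped to [lo, hi] -- a translation of
    # Mean@Table[Clip[-45*Log[RandomReal[]], {3, 60}], 10^6] above.
    total = sum(max(lo, min(hi, random.expovariate(1 / mean))) for _ in range(n))
    return total / n

print(round(mean_clamped_gap()))  # ≈ 33
```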
Thanks for doing the math. I’ll see if I can expose that calculation in the settings panel so that you can actually see the estimated average time between pings with the clamps in place.
The way Cactus translates pings into time is to divide the timeline into spans at the midpoints between samples.
That is, if the samples are spaced like so:
S_{t-1} --------------------- S_t --------- S_{t+1}
Then we’d carve up the timeline at the midpoints between samples, assigning each interval to each sample respectively:
S_{t-1} ----------|---------- S_t ----|---- S_{t+1}
So the calculation doesn’t use the sampling frequency at all; it just looks at the sample times themselves.
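A sketch of that midpoint scheme in Python (my reading of the description above, not Cactus’s actual code – in particular, how the first and last samples are bounded isn’t specified, so this sketch simply cuts the timeline at the outermost samples):

```python
def credit_per_sample(times):
    # times: sorted ping timestamps in minutes.
    # Boundaries: the first sample, each midpoint between neighbors, the last sample.
    bounds = [times[0]] + [(a + b) / 2 for a, b in zip(times, times[1:])] + [times[-1]]
    # Each sample is credited with the span between its two boundaries.
    return [bounds[i + 1] - bounds[i] for i in range(len(times))]

# Pings at t = 0, 10, 30 -> midpoints at 5 and 20 -> credits of 5, 15, 10 minutes.
```

Note that the credits always sum to the total span between the first and last ping, so no time in between is double-counted or dropped.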
I notice that when you complete a report, it focuses the main Cactus window, even if a different window was focused previously. I wonder if it would be a bit smoother if it didn’t do that?
Is there a way to be notified when the Windows version is ready and/or there is Beeminder integration? I would love to revisit this when those are available!