Beeminder Forum

Philosophical setup instructions for TagTime

Ha! Yes! Silly brains! I relate to all of those.

It’s mathematically true that RIGHT NOW is the most likely of all possible future moments for the next ping to ping. So I find that focusing on that helps to mitigate some of those incorrect intuitions.

Like, “I’ve gotten away with goofing off this long without getting pung, I can probably go a liiiiitle longer…” (Logically true in the same way that “just one more piece of pie won’t matter in the long run” is true.) So I try to replace it with “OMG stay focused cuz the next ping is pinging ANY SECOND”. That works wonders on my brain.

But the right intuition to cultivate, I think, is just “start a tock and you’ll get 1 ping”. Exactly true, on average. It might be for naught or you might get double credit but just pretend it’s deterministic, start the 45-minute block of work, and expect your next ping.

PS: I also find that looking at the numbers helps a lot. For example, seeing that the next ping will very likely happen between 1 minute and 3 hours from now.

1 Like

This week I am going to train my intuition by tracking everything both using Toggl AND Tagtime, and see how they compare. So far, even though it feels like it’s wrong, Tagtime almost exactly matched Toggl after a few hours. It felt uneven at first, but I can see that the longer I track it on both, the more Tagtime will prove correct on average.


Just did some math on the question of how much time you have to measure with TagTime for the estimate to be probably approximately correct:

  • If TagTime says you spent 45 minutes there’s an 86% chance it was really between 0 & 2 hours
  • If TagTime says you spent 3 hours there’s a >90% chance it was really between 1 & 6 hours
  • If TagTime says you spent 6 hours there’s a 71% chance that’s within 37% of the truth
  • If TagTime says you spent 8 hours there’s a 67% chance that’s within 33% of the truth
  • If TagTime says you spent 9 hours there’s a 74% chance it was between 6 & 12 hours
  • If TagTime says you spent 35 hours there’s an 80% chance that’s within 20% of the truth
  • If TagTime says you spent 40 hours there’s a 90% chance it was between 32 & 50 hours
  • If TagTime says you spent 80 hours there’s an 86% chance that’s within 15% of the truth
  • If TagTime says you spent 220 hours there’s a 90% chance that’s within 10% of the truth
  • If TagTime says you spent 50k hours there’s a 99% chance that’s within 1% of the truth

(Of course increasing the frequency of pings from the default of once per 45 minutes gives you tighter estimates faster.)
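
Figures like these can be roughly sanity-checked in a few lines of code. Below is a sketch (my own model, not necessarily the exact one behind the numbers above): given n pings at 45-minute average gaps, treat the true time as gamma-distributed with shape n, i.e. an Erlang distribution, whose CDF has a closed form for integer n.

```python
import math

def erlang_cdf(t_hours, n_pings, gap_hours=0.75):
    """P(true time <= t_hours) when the time behind n_pings pings is
    Erlang-distributed with mean gap_hours per ping."""
    lam = t_hours / gap_hours
    return 1.0 - sum(math.exp(-lam) * lam**k / math.factorial(k)
                     for k in range(n_pings))

# "TagTime says 9 hours" means 12 pings at 45 minutes each.
# Probability the truth was between 6 and 12 hours:
p = erlang_cdf(12, 12) - erlang_cdf(6, 12)
print(round(p, 2))  # lands near the ~74% figure quoted above
```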

Hah, I like this. Much more motivating than the other mathematically true fact that you won’t get a ping until 45 minutes from right now (on average).

1 Like

Coming from Bitcoin where new blocks are found stochastically with an average interval of about 10 minutes, I found TagTime’s unpredictability over the short term but expected accuracy over the long term to be highly intuitive. Bitcoin even prepared me for weird things like the hitchhiker’s paradox which applies to TagTime as well. :slight_smile:

Still, on eep days, I’d really love it if it was possible to lower the interval time to reduce variance, get in my whatever number of minutes, and then dial it back up for everyday stuff.


One solution I’m playing with is that, on eep days, you could ask TagTime to guarantee a ping in the next 45 minutes. This extra ping would be limited to, say, once a week or so.

Would that work or not for you?

I might be philosophically opposed to that, unless there’s a way to make the math work [1], ie, to not lose the property that what TagTime reports is a wholly unbiased estimate of how you spent your time.

The only way I’ve thought of for that to work is to explicitly pause TagTime and switch to manual tracking for some period of time. Here’s @byorgey spelling that out:

My opinion, though, is that this is a rabbit hole that it’s a bad idea for a new TagTime implementation to go down, until the basic functionality is solid.

[1] Example of why it’s hard: Suppose you guarantee exactly 1 ping in the next 45 minutes. You then goof off for 44 minutes and luck out with no pings. That’s when you notice that, by process of elimination, a ping is guaranteed in the next 60 seconds! So you shove your nose in a textbook, and, sure enough, you get your ping. The system has no way to distinguish that roguery from having diligently worked all 45 minutes! Not that goofing off till the last minute is a good strategy, but the point is that the longer you go without getting pung the more certain you are that a ping is about to happen, and that distorts the estimate.

1 Like

Hmmm…yes that’s an excellent point. I’m still developing my intuition for the ping schedule but what you’re saying feels right. I like @byorgey’s idea of switching to manual tracking too.

Agreed - but once we’ve got the basic functionality down solid I think this may be the next thing to take up and solve.

1 Like

I’m with @dreev in not wanting that feature unless the math works. Also, I don’t like the constraint of only being able to use a certain feature once a week—on certain goals I consider high priority, I might set an aggressive schedule so that every single day of the week ends up being an eep day. (Or I might just be lazy on an easy goal and get the same effect.)

I think the correct solution to the problem is the one I described: allow the ping interval to be changed and record that as part of the ping data. E.g. instead of,

1530711663 foo bar [2018.07.04 09:41:03 Wed]
1530712647 foo baz [2018.07.04 09:57:27 Wed]
1530715414 qux bar [2018.07.04 10:43:34 Wed]

You’d have additional metadata indicating what average interval time in minutes was used to calculate that ping time, e.g.:

1530711663 foo bar [2018.07.04 09:41:03 Wed] @45
1530712647 foo baz [2018.07.04 09:57:27 Wed] @15
1530715414 qux bar [2018.07.04 10:43:34 Wed] @120

It’s safe to change the ping interval as often as you want and to any length of time you want[1] because the Poisson process is memoryless, i.e., progress-free. That is, no matter how much time you spend waiting for a ping on one interval (say 45 minutes), you haven’t made any progress towards receiving it. And since you haven’t made any progress, you also don’t lose any progress by changing the ping interval.
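
That memorylessness is easy to demonstrate numerically: among gaps that have already lasted 30 minutes, the remaining wait has the same distribution (and mean) as a brand-new gap. A quick sketch:

```python
import random

random.seed(1)
gap = 45.0  # average minutes between pings

# Draw many exponential gaps. Among gaps that have already lasted 30
# minutes, the *remaining* wait should look exactly like a fresh gap.
gaps = [random.expovariate(1 / gap) for _ in range(200_000)]
remaining = [g - 30 for g in gaps if g > 30]

print(round(sum(gaps) / len(gaps), 1))            # ~45.0
print(round(sum(remaining) / len(remaining), 1))  # ~45.0
```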

I think the challenge here is not the math but the tooling. It’s very easy to get data out of the current TagTime file if you assume all pings have the same average interval, e.g. you can just grep for the tags you want, count lines, and multiply by 45 to get the amount of time you worked. All the current tools are built around that assumption, and so changing it breaks everything.

On the other hand, extracting a minutes parameter from each line and using it in calculations doesn’t seem particularly hard to me. The new algorithm becomes: grep for the tags you want, extract the @minutes parameters, and sum them.
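
As a sketch of what that might look like (the field layout is just inferred from the example lines above, and the helper name is made up):

```python
import re

# Hypothetical log in the proposed format: timestamp, tags, bracketed
# date, then "@minutes" giving the average interval behind that ping.
LOG = """\
1530711663 foo bar [2018.07.04 09:41:03 Wed] @45
1530712647 foo baz [2018.07.04 09:57:27 Wed] @15
1530715414 qux bar [2018.07.04 10:43:34 Wed] @120
"""

def hours_for_tag(log, tag):
    total_minutes = 0
    for line in log.splitlines():
        m = re.search(r'@(\d+)\s*$', line)
        tags = line.split('[')[0].split()[1:]  # drop the timestamp
        if m and tag in tags:
            total_minutes += int(m.group(1))
    return total_minutes / 60

print(hours_for_tag(LOG, 'bar'))  # 45 + 120 minutes = 2.75 hours
```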

So I think it’s up to people whether or not they want more control over their variance. I think I’d probably use that feature in two ways: to reduce variance for certain tasks on their eep days, and to decrease the typical number of pings (increasing the variance) all the rest of the time when getting pinged is a bit bothersome (but I definitely still want the data, which is why I put up with answering pings now).

[1] But it does introduce the side effect that you’ll be spending more or less time answering pings depending on whether you’ve, respectively, increased or decreased the ping frequency. If your goal is very careful experimentation, it would probably be desirable not to mess with that parameter. OTOH, allowing users to mess with the ping frequency parameter would increase the amount of information contained in the log, which could maybe be used to correlate with something.


In other words, the 45-minute interval is a balance between how often we are willing to be disturbed/pinged and the accuracy of the results…? If you’re right, @harding, then increasing the frequency would mean higher accuracy (quicker accuracy?) simply at the expense of being pinged more frequently, which is a really good solution.

One issue with allowing the change of interval is it will throw off the “universal ping schedule”. @dreev what do you think?


Well, over the long-term, any reasonable ping setting is going to give you accurate results—it’s just that longer intervals will take longer to converge on the probabilistically correct result. Here’s a plot of the cumulative probability (CDF) of being pinged for the default setting of 45 minutes (0.75 hours), 15 minutes (1/3rd the default), and 2.25 hours (3 times the default):

Of particular note is the difference in the sharpness of the elbows (the arc of the curves) in these plots: this is the result of variance. The variance of the estimate scales inversely with the number of pings, so each time you double the sample rate (halve the ping interval), you halve the variance of the estimate; put another way, cutting the typical error in half requires four times the sample rate.

The obvious downside of increasing the sample rate in tagtime is that you spend more time entering data. I’m guessing that when @dreev and @bee were first creating tagtime, they either did the math or simply experimented to choose the default 45 minute interval, which is reasonably frequent to track things over the short term but infrequent enough that you don’t become a slave to time tracking.

Regarding accuracy, an important consideration is how many minutes you spend performing the thing you want to track. For example, if you track your sleeping hours with tagtime’s default 45 minute interval, the system is very likely to converge on an accurate result quickly for most people due to them spending 4-10 hours a night sleeping. But if you want to track, say, how much time you spend clipping your toenails (5 minutes a month?), it’ll probably take years to converge on reasonable accuracy (except by chance). However, if you can adjust the interval for different things, you could even out that imbalance (or aggravate it if you like gambling).

I haven’t looked at the code, but I assumed from the description I read somewhere of the universal ping schedule that tagtime just uses a seeded Pseudo Random Number Generator (PRNG). If that’s the case, then everyone using the same n-minute interval will receive pings on the same schedule. For example, if Alice and Bob are both on a 15-minute interval, they’ll both be pinged at the same times; if Alice then switches to 45 minutes (which Carl also uses), she and Carl will both receive their next ping at the same time.
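
Here’s a sketch of how that could work. The LCG constants, seed, and start time below are illustrative placeholders, NOT TagTime’s actual values; the point is only that a shared (seed, interval) pair reproduces the same schedule on every device.

```python
import math

# Illustrative constants only; not TagTime's actual seed, epoch, or RNG.
M, A = 2**31 - 1, 16807  # classic Lehmer ("minstd") LCG parameters

def pings(seed, start, gap_seconds, n):
    """Return n ping times: exponential gaps driven by a deterministic LCG."""
    times, t = [], start
    for _ in range(n):
        seed = (A * seed) % M
        u = seed / M                               # uniform in (0, 1)
        t += max(1, round(-gap_seconds * math.log(u)))
        times.append(t)
    return times

alice = pings(seed=12345, start=0, gap_seconds=45 * 60, n=5)
bob = pings(seed=12345, start=0, gap_seconds=45 * 60, n=5)
print(alice == bob)  # True: same seed and interval give the same schedule
```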


I endorse @harding’s math and that does sound like an elegant solution (also a great write-up advocating for it!). I had not thought of the idea of including the ping frequency with each TagTime log entry. Super smart.

Here are my remaining reservations:

  1. As @charles99 predicted I’d say, the universal ping schedule is really nice. Like how literally 7 or more devices in our house all ping in unison. So handy! Most users won’t care about that but it’s a nice feature for people who do eventually get that hardcore about it and it’s a shame to break that.

  2. This one sounds ridiculous but I’m constantly amazed how true it is: Giving users choices is usually bad. I’m not sure how to articulate this right now if you don’t already know about the UI principle often pithified as “don’t make me think”. Or chapter 3 of Spolsky’s treatise on UI design. Ping frequency may be a very reasonable thing for users to choose but you’d have to be sure they’re equipped to make the tradeoff between accuracy and the annoyance of getting pinged in quick succession.

  3. This one might be bogus but you can build intuition about ping frequency if you stick to 45 minutes. Maybe bogus because, as @byorgey was saying, your intuitions may persist in being wrong anyway.

Finally, a question for @harding: I agree that computing the expectation of the true amount of time spent is straightforward (just add up the frequencies for each ping) but does it get much messier to compute confidence intervals on those estimates?


Would it work if:
a) We kept the universal schedule of 45 minutes in general but
b) A user could choose to “switch to” a shorter schedule of, say, 12 minutes for a while because they want more granularity for a period

I liked @harding’s idea of per-ping frequency so much I’ve already incorporated it into the alpha :sunglasses:. I think it could be a very nice solution in general.


Great question! Unfortunately, I don’t have a good answer. I thought about it a bit and couldn’t think of a good way. Then I searched to see if anyone else had described a process for merging confidence intervals generated for Poisson point processes of different frequencies and didn’t find anything. (I’m only a little surprised: that doesn’t seem like something anyone would need, but the Internet is full of things I didn’t think anyone would need.)

I think maybe a Bayesian credible interval would be more tractable for the problem of different ping lengths, and it could possibly provide significantly improved estimates for historic data by allowing the more reliable estimates of events with many pings to affect the likelihoods for the range of times possible for events with few pings. E.g., the more credible my data is about me sleeping 6 hours a night, the less credible it should be about all my other times adding up to more than 18 hours per day. However, using it that way would almost certainly qualify as “much messier”.

OTOH, I don’t know how important either confidence intervals or possibly-improved credible intervals are. The only place I personally use confidence scores for TagTime is via TagTime Minder, which should be easy to update to give estimates for n pings that all share the same m-minute interval. I did read an older article somewhere on the site by @dreev where he gave confidence intervals for some of his data, but so far in my own usage of TagTime I’ve just accepted as reasonable whatever number I get by multiplying the number of pings by 45 minutes. Are confidence intervals a commonly used thing? What tool are people using to generate the intervals for their tags?


I think this is the answer:

Note that for a Poisson process like TagTime, the gaps between pings have an exponential distribution with a rate parameter equal to the reciprocal of the average gap length. If the rate parameter doesn’t change then the distribution of the sum of n gaps is a gamma distribution. That’s what TagTime Minder uses to give probabilities of getting n pings in t amount of time, and for confidence intervals. Generalizing this looks tricky but possible!

It’s also true that most of the time you’re just asking the question “if I set the ping frequency to x, what’s the confidence interval for the n pings I need?” And that’s a trivial generalization of the current TagTime Minder.
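
For what it’s worth, here’s a sketch of such a confidence interval for a fixed interval setting (this is my own illustration, not TagTime Minder’s actual code): the total time behind n pings is Erlang-distributed, and we can invert its CDF by bisection.

```python
import math

def erlang_cdf(t, n, gap=0.75):
    """P(time for n pings <= t hours); gaps exponential with mean `gap` hours."""
    lam = t / gap
    return 1 - sum(math.exp(-lam) * lam**k / math.factorial(k)
                   for k in range(n))

def quantile(p, n, gap=0.75):
    """Invert the Erlang CDF by bisection."""
    lo, hi = 0.0, gap * n * 20  # generous upper bracket
    while hi - lo > 1e-6:
        mid = (lo + hi) / 2
        if erlang_cdf(mid, n, gap) < p:
            lo = mid
        else:
            hi = mid
    return lo

# 90% interval for the true time behind 12 pings (point estimate: 9 hours):
low, high = quantile(0.05, 12), quantile(0.95, 12)
print(round(low, 1), round(high, 1))  # roughly 5.2 and 13.7 hours
```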


This looks like the maker / manager conflict.
Makers tend to work for long periods of time on similar activities.
Managers are juggling many tasks and tend to spend short amounts of time on a variety of tasks.

Just brainstorming here. There are potentially a variety of approaches.
First let me state an assumption I am making:
the information I actually require is “on average I am spending X hours a week/month on Activity A and Y hours a week/month on Activity B.”

Option 1 is a self-learning algorithm.
Sleep-time accuracy at a ping interval of 45 minutes converges quickly because many pings in a day carry the same tag several times in a row: sleeping, sleeping, sleeping.
If you are a maker and coding, you are frequently going to have “coding” or “design” in consecutive pings.

So if you have some (few? how many?) consecutive repeating tags within a day, then your ping interval is fine. If every consecutive ping has a different tag, your ping interval can probably be reduced. If you frequently have four or five identical tags in a row, your ping interval can be increased. It stays random to prevent gaming the system.

The self-learning algorithm can be smarter: different ping intervals at different times of the day or days of the week. Each ping carries a weight indicating what it is worth: 6 min / 15 min / 45 min / 2 hours.

Option 2
Have frequent pings, and make it very easy (one key combo) to say “same as the last ping”.
Give the user the option to:
auto-tag all pings for the next 5 days as “holiday” “france” “familytime”
auto-tag all pings for the next 7 hours as “Design Workshop”
When auto-tagging is active, pings show up as notifications so you can change the tag, but each notification disappears after 10 seconds and fills in the auto-tag.
There should also be a way to interrupt auto-tagging.

Option 3
Give the user a way to say “more pings” or “fewer pings”. This adjusts the average ping interval; again, this just gives each ping a different weight: 5 min / 15 min / 45 min / 2 hours.
This gives an alternate answer to Charles’s suggestion of guaranteeing a ping in the next 45 minutes.

1 Like

Not just how many minutes you spend, but also the length of time over which that task is performed. For instance, if you work for a large number of clients but only work on each individual client for a few days or weeks, TagTime is likely to miss some of those.

Ultimately it’s just total time spent doing a particular task. You’re correct that if you work only, say, 5 hours on each project, there will be some projects that never receive even a single ping (about 0.1% in this case[1]), but it doesn’t actually matter whether you do those 5 hours all in one day or spread the work out over an entire year—the probabilities of being pinged (or not) are purely dependent on the amount of time spent on the project.

[1] exp(-5/0.75) = 0.001272


Oh, very cool! That makes sense since each minute is equally likely to be picked.

So a 15 minute task would get pinged 1 - exp(-0.25/0.75) = 28.3% of the time, and a 45 minute task would get pinged 1 - 1/e = 63.2%. Those are much higher probabilities than I would have guessed!

I made a table:

hours prob
0.1 0.124826681
0.2 0.234071662
0.3 0.329679954
0.4 0.41335378
0.5 0.486582881
0.6 0.550671036
0.7 0.606759279
0.8 0.655846213
0.9 0.698805788
1 0.736402862
1.1 0.769306818
1.2 0.798103482
1.3 0.823305554
1.4 0.845361735
1.5 0.864664717
1.6 0.881558171
1.7 0.896342871
1.8 0.909282047
1.9 0.920606068
2 0.930516549
2.1 0.939189937
2.2 0.946780656
2.3 0.95342385
2.4 0.959237796
2.5 0.964326007
2.6 0.968779073
2.7 0.972676278
2.8 0.976087007
2.9 0.979071987
3 0.981684361
3.1 0.983970642
3.2 0.985971533
3.3 0.98772266
3.4 0.9892552
3.5 0.990596437
3.6 0.991770253
3.7 0.992797545
3.8 0.993696604
3.9 0.994483436
4 0.99517205
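
The table is just 1 − exp(−hours/0.75), the chance of at least one ping during a task of the given length at the default 45-minute (0.75-hour) setting, so a one-liner reproduces it:

```python
import math

# P(at least one ping) for tasks of various lengths at 45-minute gaps.
for hours in (0.1, 0.5, 1.0, 2.0, 4.0):
    print(hours, round(1 - math.exp(-hours / 0.75), 4))
```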

Now I’m curious if there’s a formula for the expected error for what TagTime says.
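
One natural candidate: if the true time is t and the average gap is g, the ping count N is Poisson with mean t/g, so the estimate g·N has standard deviation g·sqrt(t/g) = sqrt(g·t); for example, about ±2.6 hours on 9 true hours at the 45-minute setting. A quick Monte Carlo sketch (assuming a pure Poisson model, which is my assumption here) checks this:

```python
import math
import random

random.seed(7)
g, t = 0.75, 9.0  # 45-minute gaps, 9 true hours

def poisson(mean):
    # Knuth's multiplication method; fine for small means
    threshold, k, prod = math.exp(-mean), 0, 1.0
    while True:
        prod *= random.random()
        if prod <= threshold:
            return k
        k += 1

# Simulated TagTime estimates g*N and their spread around the truth:
estimates = [g * poisson(t / g) for _ in range(100_000)]
mean = sum(estimates) / len(estimates)
sd = (sum((x - mean) ** 2 for x in estimates) / len(estimates)) ** 0.5
print(round(sd, 2), round(math.sqrt(g * t), 2))  # both come out near 2.6
```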


I think your table is what @harding made a graph of above, and expected error is what I was getting at here: