Philosophical setup instructions for TagTime


I was just chatting with @shanaqui about possibly setting up TagTime and pointed out that part of the setup is philosophical…

Namely, you have to form a deep appreciation for the concept of the mathematical expectation of a random variable. Otherwise your intuition will scream at you that TagTime is being totally unfair when you glance at Twitter for literally 2 seconds and TagTime pings right then and says you spent 45 minutes on Twitter that day. Or when you’re slaving away for hours with nary a ping and TagTime says you’ve done nothing.

Without understanding the math, I think it amounts to just having faith that if you wait long enough the averages will work out and it will end up fair. And resisting the temptation to correct for unfairnesses in the short term, because that ruins all the pretty math that makes it fair in the long term.

PS: For those just tuning in, TagTime assumes that you answer each ping with what the focus of your attention was at the exact moment that the ping pung. It may take some mental discipline to do that consistently and conscientiously.


I have nothing to say here (because I use toggl instead, avoiding all philosophical conundrums) except that I deeply enjoyed your use of the past tense of ping as pung, and I’m going to steal it.


Yes indeed. Using TagTime feels very weird because I have yet to be successful in retraining my intuition, no matter how hard I try. I am coming to the conclusion that human brains (well, at least mine) are fundamentally wired to respond to/look for things that are correlated over time and have a very hard time coming to grips with a truly memoryless process. The rational part of my brain has to constantly work to override the other parts of my brain that are trying to get me to do stupid/irrational things like (a) get upset when I get a ping right during a break, (b) avoid taking a break when I’ve been working for a long time without a ping because A PING IS PROBABLY ABOUT TO COME ANY SECOND NOW, (c) slack off right after getting a ping (especially one during a break) because it’s “too late”, etc.


Ha! Yes! Silly brains! I relate to all of those.

It’s mathematically true that RIGHT NOW is the most likely of all possible future moments for the next ping to ping. So I find that focusing on that helps to mitigate some of those incorrect intuitions.

Like, “I’ve gotten away with goofing off this long without getting pung, I can probably go a liiiiitle longer…”. (Logically true in the same way that “just one more piece of pie won’t matter in the long run” is true.) So I try to replace it with “OMG stay focused cuz the next ping is pinging ANY SECOND”. That works wonders on my brain.

But the right intuition to cultivate, I think, is just “start a tock and you’ll get 1 ping”. Exactly true, on average. It might be for naught or you might get double credit but just pretend it’s deterministic, start the 45-minute block of work, and expect your next ping.

PS: I also find that looking at the numbers helps a lot. For example, seeing that the next ping will very likely happen between 1 minute and 3 hours from now.
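
To put a number on "very likely", here is a minimal sketch. It assumes the gap until the next ping is exponentially distributed with a 45-minute mean (the default mentioned in this thread). The function name is made up for illustration.

```python
import math

MEAN_GAP_MIN = 45.0  # default average gap between pings, in minutes

def prob_next_ping_between(lo_min, hi_min, mean_gap=MEAN_GAP_MIN):
    """P(lo < time until next ping < hi) for an exponential gap."""
    return math.exp(-lo_min / mean_gap) - math.exp(-hi_min / mean_gap)

# Chance the next ping lands between 1 minute and 3 hours from now.
print(round(prob_next_ping_between(1, 180), 3))  # roughly 0.96
```

Because the process is memoryless, this answer is the same no matter how long it has been since the last ping.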


This week I am going to train my intuition by tracking everything both using Toggl AND Tagtime, and see how they compare. So far, even though it feels like it’s wrong, Tagtime almost exactly matched Toggl after a few hours. It felt uneven at first, but I can see that the longer I track it on both, the more Tagtime will prove correct on average.


Just did some math on the question of how much time you have to measure with TagTime for the estimate to be probably approximately correct:

  • If TagTime says you spent 45 minutes there’s an 86% chance it was really between 0 & 2 hours
  • If TagTime says you spent 3 hours there’s a >90% chance it was really between 1 & 6 hours
  • If TagTime says you spent 6 hours there’s a 71% chance that’s within 37% of the truth
  • If TagTime says you spent 8 hours there’s a 67% chance that’s within 33% of the truth
  • If TagTime says you spent 9 hours there’s a 74% chance it was between 6 & 12 hours
  • If TagTime says you spent 35 hours there’s an 80% chance that’s within 20% of the truth
  • If TagTime says you spent 40 hours there’s a 90% chance it was between 32 & 50 hours
  • If TagTime says you spent 80 hours there’s an 86% chance that’s within 15% of the truth
  • If TagTime says you spent 220 hours there’s a 90% chance that’s within 10% of the truth
  • If TagTime says you spent 50k hours there’s a 99% chance that’s within 1% of the truth

(Of course increasing the frequency of pings from the default of once per 45 minutes gives you tighter estimates faster.)
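
For anyone who wants to play with numbers like these, here is a rough sketch of a related calculation. It is an assumption on my part, not the method behind the list above: it fixes the true time and asks how likely the TagTime estimate is to land within some fraction of it, treating the ping count as Poisson. The list above goes the other direction (from estimate to truth), so the figures will not match exactly. The function names are hypothetical.

```python
import math

def poisson_pmf(k, mu):
    """Probability of exactly k pings when mu pings are expected (mu > 0)."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def prob_estimate_within(true_hours, frac, gap_hours=0.75):
    """P(|pings * gap - truth| <= frac * truth), with pings ~ Poisson(truth / gap)."""
    mu = true_hours / gap_hours
    # Smallest and largest ping counts whose estimate falls inside the band.
    lo = max(math.ceil((1 - frac) * mu - 1e-9), 0)
    hi = math.floor((1 + frac) * mu + 1e-9)
    return sum(poisson_pmf(k, mu) for k in range(lo, hi + 1))

# Same 220 true hours, within 10%: default 45-minute gap vs a 15-minute gap.
print(prob_estimate_within(220, 0.10))
print(prob_estimate_within(220, 0.10, gap_hours=0.25))
```

The second call illustrates the parenthetical above: a shorter gap tightens the estimate for the same amount of tracked time.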


Hah, I like this. Much more motivating than the other mathematically true fact that you won’t get a ping until 45 minutes from right now (on average).


Coming from Bitcoin where new blocks are found stochastically with an average interval of about 10 minutes, I found TagTime’s unpredictability over the short term but expected accuracy over the long term to be highly intuitive. Bitcoin even prepared me for weird things like the hitchhiker’s paradox which applies to TagTime as well. :slight_smile:

Still, on eep days, I’d really love it if it was possible to lower the interval time to reduce variance, get in my whatever number of minutes, and then dial it back up for everyday stuff.


One solution I’m playing with is, on eep days, you could ask TagTime to guarantee a ping in the next 45 minutes. This extra ping would be limited to - say - once a week or so.

Would that work or not for you?


I might be philosophically opposed to that, unless there’s a way to make the math work [1], ie, to not lose the property that what TagTime reports is a wholly unbiased estimate of how you spent your time.

The only way I’ve thought of for that to work is to explicitly pause TagTime and switch to manual tracking for some period of time. Here’s @byorgey spelling that out:

My opinion, though, is that this is a rabbit hole that it’s a bad idea for a new TagTime implementation to go down, until the basic functionality is solid.

[1] Example of why it’s hard: Suppose you guarantee exactly 1 ping in the next 45 minutes. You then goof off for 44 minutes and luck out with no pings. That’s when you notice that, by process of elimination, a ping is guaranteed in the next 60 seconds! So you shove your nose in a textbook, and, sure enough, you get your ping. The system has no way to distinguish that roguery from having diligently worked all 45 minutes! Not that goofing off till the last minute is a good strategy, but the point is that the longer you go without getting pung the more certain you are that a ping is about to happen, and that distorts the estimate.


Hmmm…yes that’s an excellent point. I’m still developing my intuition for the ping schedule but what you’re saying feels right. I like @byorgey’s idea of switching to manual tracking too.

Agreed - but once we’ve got the basic functionality down solid I think this may be the next thing to take up and solve.


I’m with @dreev in not wanting that feature unless the math works. Also, I don’t like the constraint of only being able to use a certain feature once a week—on certain goals I consider high priority, I might set an aggressive schedule so that every single day of the week ends up being an eep day. (Or I might just be lazy on an easy goal and get the same effect.)

I think the correct solution to the problem is the one I described: allow the ping interval to be changed and record that as part of the ping data. E.g. instead of,

1530711663 foo bar [2018.07.04 09:41:03 Wed]
1530712647 foo baz [2018.07.04 09:57:27 Wed]
1530715414 qux bar [2018.07.04 10:43:34 Wed]

You’d have additional metadata indicating what average interval time in minutes was used to calculate that ping time, e.g.:

1530711663 foo bar [2018.07.04 09:41:03 Wed] @45
1530712647 foo baz [2018.07.04 09:57:27 Wed] @15
1530715414 qux bar [2018.07.04 10:43:34 Wed] @120

It’s safe to change the ping interval as often as you want and to any length of time you want[1] as the Poisson process is progress-free. That is, no matter how much time you spend waiting for a ping on one interval (say 45 minutes), you haven’t made any progress towards achieving it. If you haven’t made any progress, you also don’t lose any progress by changing the ping interval.
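
A quick way to convince yourself of this is a simulation. The sketch below is hypothetical (it is not TagTime code, and the schedule of interval changes is made up). It switches the ping interval partway through a stretch of time, throws away the pending ping at each switch, credits each ping with the interval in force when it fired, and checks that the average credited time still matches the true time worked.

```python
import random

def simulate_credit(segments, work_end, rng):
    """Minutes credited to 'work' in one simulated run.

    segments: list of (start_min, end_min, gap_min); the ping interval is
    gap_min within each segment. Work happens from minute 0 to work_end.
    """
    credited = 0.0
    for start, end, gap in segments:
        # Start a fresh exponential wait at each switch (memorylessness).
        t = start + rng.expovariate(1.0 / gap)
        while t < end:
            if t < work_end:
                credited += gap  # each ping is worth its own interval
            t += rng.expovariate(1.0 / gap)
    return credited

rng = random.Random(0)  # fixed seed so the run is repeatable
segments = [(0, 120, 45), (120, 300, 15), (300, 600, 120)]
runs = 20000
avg = sum(simulate_credit(segments, 240, rng) for _ in range(runs)) / runs
print(avg)  # should hover near the 240 minutes actually worked
```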

I think the challenge here is not the math but the tooling. It’s very easy to get data out of the current TagTime file if you assume all pings have the same average interval, e.g. you can just grep for the tags you want, count lines, and multiply by 45 to get the amount of time you worked. All the current tools are built around that assumption, and so changing it breaks everything.

On the other hand, extracting a minutes parameter from each line and using it in calculations doesn’t seem particularly hard to me. The new algorithm becomes: grep for the tags you want, extract the @minutes parameters, and sum them.
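
As a minimal sketch of that algorithm, assuming log lines look exactly like the examples above (timestamp, tags, bracketed date, optional `@minutes` suffix). The `@minutes` suffix is only a proposal in this thread, and real log files may contain things this regex does not handle. The function name is made up.

```python
import re
from collections import defaultdict

# timestamp, tags, [human-readable date], optional @minutes
LINE_RE = re.compile(r"^(\d+)\s+(.*?)\s*\[[^\]]*\]\s*(?:@(\d+(?:\.\d+)?))?\s*$")

def minutes_per_tag(log_text, default_gap=45.0):
    """Sum the per-ping interval for every tag; lines without @minutes use default_gap."""
    totals = defaultdict(float)
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank or unrecognized lines
        gap = float(m.group(3)) if m.group(3) else default_gap
        for tag in m.group(2).split():
            totals[tag] += gap
    return dict(totals)

log = """\
1530711663 foo bar [2018.07.04 09:41:03 Wed] @45
1530712647 foo baz [2018.07.04 09:57:27 Wed] @15
1530715414 qux bar [2018.07.04 10:43:34 Wed] @120
"""
print(minutes_per_tag(log))  # foo: 60, bar: 165, baz: 15, qux: 120
```

Falling back to 45 minutes for lines with no suffix keeps old logs readable by the same code.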

So I think it’s up to people whether or not they want more control over their variance. I think I’d probably use that feature in two ways: to reduce variance for certain tasks on their eep days, and to decrease the typical number of pings (increasing the variance) all the rest of the time when getting pinged is a bit bothersome (but I definitely still want the data, which is why I put up with answering pings now).

[1] But it does introduce the side effect that you’ll be spending more or less time answering pings depending on whether you’ve, respectively, increased or decreased the ping frequency. If your goal is very careful experimentation, it would probably be desirable not to mess with that parameter. OTOH, allowing users to mess with the ping frequency parameter would increase the amount of information contained in the log, which could maybe be used to correlate with something.


In other words, the 45 minute interval is a balance between how often we are willing to be disturbed/pinged and the accuracy of the results…? If you’re right @harding then increasing the frequency would mean higher accuracy (quicker accuracy?) simply at the expense of being more frequently pinged which is a really good solution.

One issue with allowing the change of interval is it will throw off the “universal ping schedule”. @dreev what do you think?


Well, over the long-term, any reasonable ping setting is going to give you accurate results—it’s just that longer intervals will take longer to converge on the probabilistically correct result. Here’s a plot of the cumulative probability (CDF) of being pinged for the default setting of 45 minutes (0.75 hours), 15 minutes (1/3rd the default), and 2.25 hours (3 times the default):

Of particular note is the difference in the sharpness of the elbows (arc of the curves) in these different plots: this is the result of variance, which is square in nature. The variance of the estimate is proportional to the ping interval, so each time you double the sample rate (reduce the ping interval by half) you halve the variance, and you have to quadruple the sample rate to halve the typical error.

The obvious downside of increasing the sample rate in tagtime is that you spend more time entering data. I’m guessing that when @dreev and @bee were first creating tagtime, they either did the math or simply experimented to choose the default 45 minute interval, which is reasonably frequent to track things over the short term but infrequent enough that you don’t become a slave to time tracking.

Regarding accuracy, an important consideration is how many minutes you spend performing the thing you want to track. For example, if you track your sleeping hours with tagtime’s default 45 minute interval, the system is very likely to converge on an accurate result quickly for most people due to them spending 4-10 hours a night sleeping. But if you want to track, say, how much time you spend clipping your toenails (5 minutes a month?), it’ll probably take years to converge on reasonable accuracy (except by chance). However, if you can adjust the interval for different things, you could even out that imbalance (or aggravate it if you like gambling).

I haven’t looked at the code, but I assumed from the description I read somewhere of the universal ping schedule that tagtime just used a seeded Pseudo Random Number Generator (PRNG). If that’s the case, then everyone on each n minute interval will receive pings at the same schedule. For example if Alice and Bob are both on a 15 minute interval, they’ll both be pinged at the same time; if Alice then switches to 45 minutes (which Carl also uses), she and Carl will both receive their next ping at the same time.
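
To illustrate what I mean, here is a sketch of one way a seeded schedule could work. This is a guess for illustration only, not TagTime's actual algorithm. The shared seed, the shared start time, and the function name are all assumptions.

```python
import random

def ping_schedule(seed, gap_min, start_ts, count):
    """First `count` ping times (Unix seconds) for one average interval.

    Everyone who uses the same seed, start time, and interval computes
    exactly the same list, with no communication needed.
    """
    rng = random.Random(f"{seed}:{gap_min}")  # one stream per interval
    t = float(start_ts)
    pings = []
    for _ in range(count):
        t += rng.expovariate(1.0 / (gap_min * 60))  # exponential gap, in seconds
        pings.append(int(t))
    return pings

alice = ping_schedule("universal", 15, 1530700000, 5)
bob = ping_schedule("universal", 15, 1530700000, 5)
carl = ping_schedule("universal", 45, 1530700000, 5)
print(alice == bob)   # same interval, same schedule
print(alice == carl)  # different interval, different schedule
```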


I endorse @harding’s math and that does sound like an elegant solution (also a great write-up advocating for it!). I had not thought of the idea of including the ping frequency with each TagTime log entry. Super smart.

Here are my remaining reservations:

  1. As @charles99 predicted I’d say, the universal ping schedule is really nice. Like how literally 7 or more devices in our house all ping in unison. So handy! Most users won’t care about that but it’s a nice feature for people who do eventually get that hardcore about it and it’s a shame to break that.

  2. This one sounds ridiculous but I’m constantly amazed how true it is: Giving users choices is usually bad. I’m not sure how to articulate this right now if you don’t already know about the UI principle often pithified as “don’t make me think”. Or chapter 3 of Spolsky’s treatise on UI design. Ping frequency may be a very reasonable thing for users to choose but you’d have to be sure they’re equipped to make the tradeoff between accuracy and annoyingness of getting pinged in quick succession often.

  3. This one might be bogus but you can build intuition about ping frequency if you stick to 45 minutes. Maybe bogus because, as @byorgey was saying, your intuitions may persist in being wrong anyway.

Finally, a question for @harding: I agree that computing the expectation of the true amount of time spent is straightforward (just add up the intervals for each ping) but does it get much messier to compute confidence intervals on those estimates?


Would it work if:
a) We kept the universal schedule of 45 minutes in general but
b) A user could choose to “switch to” a shorter schedule of - say - 12 minutes for a while because they want more granularity for a period

I liked @harding’s idea of ping frequency so much I’ve already incorporated it into the alpha :sunglasses:. I think it could be a very nice solution in general.


Great question! Unfortunately, I don’t have a good answer. I thought about it a bit, and couldn’t think of a good way. Then I searched to see if anyone else had described a process for merging confidence intervals generated for Poisson point processes of different frequencies and didn’t find anything (I’m only a little surprised: that doesn’t seem like something anyone would need, but the Internet is full of things I didn’t think anyone would need).

I think maybe a Bayesian credible interval would be more tractable to the problem of different ping lengths and could possibly provide significantly improved estimates for historic data by allowing the more reliable estimates of events with many pings to affect the likelihoods for the range of times possible for events with few pings. E.g., the more credible my data is about me sleeping 6 hours a night, the less credible it should be about all my other times adding up to more than 18 hours per day. However, using it that way would almost certainly qualify as “much messier”.

OTOH, I don’t know how important either of confidence intervals or possibly-improved credible intervals are. The only place I personally use confidence scores for TagTime is via TagTime Minder, which should be easily updatable to give estimates for n pings all of the same m minutes length. I did read an older article somewhere on the site by @dreev where he gave confidence intervals for some of his data, but so far in my usage of TagTime I’ve just accepted as reasonable whatever number I get by multiplying the number of pings by 45 minutes. Are confidence intervals a commonly-used thing? What tool are people using to generate the intervals for their tags?


I think this is the answer:

Note that for a Poisson process like TagTime, the gaps between pings have an exponential distribution with a rate parameter equal to the reciprocal of the average gap length. If the rate parameter doesn’t change then the distribution of the sum of n gaps is a gamma distribution. That’s what TagTime Minder uses to give probabilities of getting n pings in t amount of time, and for confidence intervals. Generalizing this looks tricky but possible!

It’s also true that most of the time you’re just asking the question “if I set the ping frequency to x, what’s the confidence interval for the n pings I need?” And that’s a trivial generalization of the current TagTime Minder.
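
Here is a rough sketch of that simple case. It assumes a single fixed ping interval and is not TagTime Minder's actual code (the function name is made up). The probability that the sum of n exponential gaps fits within t is the gamma (Erlang) CDF, which for whole-number n equals one minus the chance of seeing fewer than n pings.

```python
import math

def prob_at_least_n_pings(n, minutes, gap_min=45.0):
    """P(at least n pings within `minutes`) for a fixed average gap.

    Equivalent to the Erlang CDF: P(sum of n exponential gaps <= minutes).
    """
    mu = minutes / gap_min   # expected number of pings
    term = math.exp(-mu)     # P(exactly 0 pings)
    below = 0.0              # running P(fewer than n pings)
    for k in range(n):
        below += term
        term *= mu / (k + 1)  # Poisson recurrence: P(k+1) from P(k)
    return 1.0 - below

# Chance of getting the 2 pings you need in the next 90 minutes,
# at the default 45-minute gap versus a 15-minute gap.
print(prob_at_least_n_pings(2, 90))
print(prob_at_least_n_pings(2, 90, gap_min=15))
```

Mixing pings recorded at different intervals is the tricky part, and this sketch does not attempt it.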