How to cope with delayed autodata

Fact: Some autodata sources (notoriously RescueTime and Apple Health) don’t give us a lot of control over when they sync.

Question: Should we implement a delay after the deadline to check for additional data that comes in after the deadline but is timestamped for the previous day?

Asking it that way, I think the answer has to be yes (and is already yes but in a gross, opaque, inconsistent way). Like RescueTime is explicitly vouching for you that you did X by time T – it just took them till 15 minutes after time T to tell us that.

To be clear, there’s no grace/leniency here, and there’s only as much magic as RescueTime forces on us in order to not be unfair to users. (I was confused about one or both of these last time we talked about this internally – apologies to @adamwolf!)

But ideally we should be explicit and transparent about this. I’m now thinking the ideal solution is like this:

First, every autodata integration should know how potentially stale it is. Namely, there’s a timestamp as of which we can count on the autodata being up to date. Call that autofresh. For RescueTime that’s always now minus 15 minutes (or maybe RescueTime has gotten laggier for free users lately? whatever the lag is). For autodata integrations that push data to us in real time, autofresh is always simply now. For Apple Health, we want the Beeminder iOS app to report an autofresh timestamp to the server every time the user opens the app.

Now when your deadline hits we can make the situation crystal clear to the user. The graph freezes until autofresh ≥ deadline with a message like “waiting for straggling RescueTime data” or “go open the Beeminder app” overlaid.

What do you think?

PS: implementation idea (probably ignore this)

The autofresh timestamp is either a normal unixtime in seconds – a strictly positive number – or it’s a nonpositive number (0 or negative). If it’s ≤ 0 then it’s taken to be a number of seconds before right now. That’s how Apple Health can just report a unixtime of last sync and RescueTime can just have autofresh=-15*60 and real-time autodata can have autofresh=0 and everything works. That means an awkward if-statement every time you use autofresh, converting it to an absolute timestamp, but it avoids the ugliness of separate fields for relative vs absolute time, I guess.

Said another way, autofresh is conceptually always an absolute unixtime. If it’s 0 or negative then first add the current unixtime to it before using it.
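To make that awkward if-statement concrete, here’s a minimal sketch in Python. The function name `resolve_autofresh` and the `now` parameter are made up for illustration; nothing like this exists in the codebase as far as this thread says.

```python
import time

def resolve_autofresh(autofresh, now=None):
    """Return autofresh as an absolute unixtime in seconds.

    Hypothetical helper: a strictly positive value is already an absolute
    unixtime; zero or a negative value is an offset in seconds from now.
    """
    if now is None:
        now = time.time()
    if autofresh > 0:
        return autofresh       # e.g. Apple Health reporting its last sync time
    return now + autofresh     # e.g. RescueTime's -15*60, or 0 for real-time data
```

So with RescueTime’s `autofresh = -15*60`, calling `resolve_autofresh(-15*60)` gives the unixtime 15 minutes ago, and real-time integrations with `autofresh = 0` resolve to right now.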

PPS: I now think freshAt is a better name for this field.

BeemiOS certainly could sync the last time it updated autodata on a per-goal basis. This would potentially be a pessimistic date (Apple pings us with new data, but doesn’t ping us to say “no new data”), but it would mean we only derail people when we are confident we should derail them.

It would be interesting to think through what this means for varying levels of autofresh age. For example, would it be ok for autofresh to be over a week in the past?

Interestingly, this behavior of “we won’t derail you if you haven’t put in data” is the direct opposite of the pessimistic presumption in manual do-less goals.

I am not sure about the “graph freezes” part but I think a lot of my objection would be related to how it would be implemented. /Shrug

As to the rest, I’ve thought some autodata should be delayed (like some already are!) from the beginning :slight_smile:

Some more ideas:

There’s a difference between making a commitment to

  1. doing a thing by a deadline
  2. doing a thing by a deadline as measured by tool X, and
  3. doing a thing by a deadline as measured by Beeminder measuring tool X.

Take a different autodata type that isn’t affected by this: if there were Fitbit downtime and a user said “I did the thing in Fitbit, but Fitbit was down so my data didn’t get into Beeminder”, we’d totally wait for Fitbit to be back and syncing before derailing them. (This is because 2 is more important than 3.)

This is a case of being able to do that automatically.

If we had sync downtime and weren’t able to sync to a provider, we wouldn’t want to derail and charge users before getting their data. (This is because 2 is more important than 3.)

There is a difference between a crisp hard line in your commitment and a crisp hard line for immediately charging for a derail. Delaying the derail until the data is in doesn’t soften the edges of the commitment.

It wasn’t too long ago that it could take 20, 30+ minutes to process all the jobs, and we didn’t think that derailing half an hour after the deadline was a soft edge… we’re down to 6 or 7 minutes. Adding a few minutes delay to some jobs to make sure the provider’s data is accurate would still derail folks closer to the deadline than we regularly derailed them just a few years ago!

Beyond the derail email, we already delay charges for everything and we don’t think it softens the commitment.

Ah, wait, in this proposal, BeemiOS sets autofresh=now not just when Apple pings us with new data but any time the user opens the app and Apple could’ve pinged us with new data. I have @bee and @morehavoc telling me out loud here that we need a better name for this field.

Maybe checkedat? I.e., it’s the last time we checked for new data and would’ve seen it if there’d been any.

Very good question. It would be great to notify the user that they need to open the app after some amount of staleness. But in principle maybe the graph really should stay in that frozen state indefinitely if the user doesn’t?

(This might be similar to our existing deadman switch. Thanks again to @morehavoc for noticing that!)

Ha, yeah! Now I’m wondering what it would be like if we did the same graph freezing for a do-less goal. I guess we can set that can of worms aside for another day. (We could add this to the collection at blog.bmndr.co/docdriven.)

Agreed. I think I buried the key part of my proposal here (I edited in bolding just now). I’m basically proposing that we explicitly have a delay, in a way that’s fully transparent to the user.

Absolutely!

There was a poll somewhere in bee space recently like “I think that ensuring a sync takes place before midnight is a) my job or b) the app’s job” and it’s absolutely the app’s job. I’ve always been uneasy with the fact that, if a third-party integration goes down, I’m on the hook, as if there’s anything I can do about that.

Anyway, yes. :stuck_out_tongue:

Ah, that’s this poll:

But this incipient freshAt spec aims to be the best of all worlds! Neither pushing responsibility on the user nor violating the anti-magic principle.

Here’s a review of some prototypical autodata integrations and how freshAt handles them:

(We’ll assume that freshAt can be set to an absolute time by setting it to a specific positive unixtime and can be set to a relative time – an offset from Right Now – by setting it to a nonpositive number of seconds.)

  1. Fitbit, which pushes data to us in real time, so freshAt = 0.
  2. Duolingo, which we have to query but which is always perfectly fresh when we do. This too can have freshAt = 0.
  3. RescueTime, which gives us data that’s stale by, say, 15 minutes, so freshAt = -15*60.
  4. Apple Health, which can be arbitrarily stale if the user hasn’t opened the app lately. In this case the iOS app updates freshAt to the current unixtime whenever the user opens the app, i.e., whenever the iOS app checks for new Apple Health data.
  5. Metaminder, which is as stale as its stalest source goal, so freshAt is the min of its source goals’ freshAts (after converting them all to absolute timestamps).
  6. A cron job set up by a user using the Beeminder API to automatically send data to Beeminder every hour. The API client can set freshAt to the current time when it sends data, or can set freshAt = -60*60.
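The Metaminder case is the only one above that combines several freshAt values, so here’s a rough Python sketch of it. The names `to_absolute` and `metaminder_fresh_at` are hypothetical, and the unixtimes in the example are made up.

```python
def to_absolute(fresh_at, now):
    """Positive freshAt is already absolute; nonpositive is an offset from now."""
    return fresh_at if fresh_at > 0 else now + fresh_at

def metaminder_fresh_at(source_fresh_ats, now):
    """A Metaminder goal is only as fresh as its stalest source goal."""
    return min(to_absolute(f, now) for f in source_fresh_ats)

# Made-up example: a Fitbit goal (0), a RescueTime goal (-15*60), and an
# Apple Health goal last checked at an absolute time a few hours ago.
now = 1_700_000_000
sources = [0, -15 * 60, 1_699_990_000]
stalest = metaminder_fresh_at(sources, now)  # the Apple Health timestamp wins
```

Note that the conversion to absolute time has to happen before taking the min. Otherwise a relative value like -900 would always look staler than any absolute unixtime.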

Now when the deadline hits there are 3 possibilities:

(Here we’ve converted freshAt to an absolute unixtime.)

  1. The data for the just-completed day ended on the wrong side of the red line and freshAt < deadline ⇒ data is stale and graph is in limbo till it’s fresh.
  2. Same but freshAt ≥ deadline ⇒ DERAIL.
  3. All good, redraw the graph for the new day.
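Those three cases could be sketched like this in Python. Everything here (`Outcome`, `deadline_outcome`, the `on_wrong_side` flag) is a name invented for illustration, not how Beeminder actually does it.

```python
from enum import Enum

class Outcome(Enum):
    LIMBO = "limbo"      # data is stale; graph stays frozen until it's fresh
    DERAIL = "derail"    # data is fresh and the day really did end in the red
    NEW_DAY = "new day"  # all good; redraw the graph for the new day

def deadline_outcome(on_wrong_side, fresh_at, deadline, now):
    """Decide what happens once the deadline has passed.

    on_wrong_side: True if the just-completed day ended on the wrong side
                   of the red line, according to the data we have so far.
    fresh_at:      freshAt as stored (positive = absolute unixtime,
                   nonpositive = offset in seconds from now).
    """
    if not on_wrong_side:
        return Outcome.NEW_DAY
    absolute = fresh_at if fresh_at > 0 else now + fresh_at
    return Outcome.DERAIL if absolute >= deadline else Outcome.LIMBO
```

For a RescueTime goal with freshAt = -15*60 that’s in the red, this returns LIMBO for the first 15 minutes after the deadline and DERAIL after that, assuming no new data arrives in the meantime to flip `on_wrong_side`.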

There are still holes in this spec, like how exactly to put the graph into a frozen state when the deadline hits but the data is stale, and maybe especially when and how often to recheck for freshness when that happens. (Also I can imagine issues with giving API clients free rein to set freshAt, like loopholes and makeshift grace periods, but starting out fully permissive and worrying about it when we see what shenanigans people come up with is probably fine.)

Anyway, I think this is all already working ok-ish for all but Apple Health and even there the workaround isn’t crazy (and, per the above poll, is already natural for most people, at least those who hang out in the Discord and take our polls, so, um, yeah). You just have to know that Apple Health can’t sync unless the Beeminder iOS app is open so if it’s a beemergency, make sure to open the app before the deadline and confirm that Beeminder agrees you’re out of the red.

I’ll note the Discord poll was reflective of reality, not aspirational. I voted :chart_with_upwards_trend: in that poll precisely because I’ve been derailed by the :hatching_chick: mindset in the past! I would rather live in a world where :hatching_chick: was reliable. I’m not sure that’s possible (there’s definitely a comfort in seeing “not red” that won’t go away no matter what) but I think heading in that direction is admirable.
