Android: how to always run in background and check which app is on top and whether the phone is actively being used?


I’m making a very simple rescuetime alternative with beeminder integration, but a bit nerdier and with features I want it to have.[1]

I have no issue getting data on what I’m doing on windows or linux (in windows’ case you need 30 lines of C code to make a working keylogger :stuck_out_tongue: ), but I have no idea how to do that in an android app.

Asking Google is also not that easy, as RescueTime-like software is basically spyware, except you’re spying on yourself :stuck_out_tongue:

Can anyone share a minimal example of how to do this?

[1] The main thing apart from beeminder integration is to always keep all the raw data, as a simple sequence of json files containing batches of second-level observations, so that I can analyze it retroactively as thoroughly as I want in ways I didn’t think of before, and trivially export into whatever format I want.
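
A minimal sketch of what that storage could look like, assuming each observation is a dict with a "t" timestamp and that each batch goes into its own file named after its first timestamp. The function names, the file naming scheme and the directory layout are all made up for illustration.

```python
import json
from pathlib import Path


def write_batch(observations, out_dir):
    """Write one batch of observations to its own JSON file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Name the file after the first timestamp in the batch (hypothetical scheme).
    first_t = int(observations[0]["t"])
    path = out_dir / f"observations-{first_t}.json"
    path.write_text(json.dumps(observations), encoding="utf-8")
    return path


def load_all(out_dir):
    """Read every batch back as one flat list, ordered by timestamp."""
    observations = []
    for path in Path(out_dir).glob("observations-*.json"):
        observations.extend(json.loads(path.read_text(encoding="utf-8")))
    observations.sort(key=lambda o: o["t"])
    return observations
```

Since the raw batches are never modified, any later analysis can start from load_all and be rerun whenever a new question comes up.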

One example of something trivial to code, but far from trivial to get from any existing service: I might want to beemind single-taskedness, with a single-task-violation do-less goal. Concrete example:

I section stuff into topics.
Music: reaper, melodyne, musescore, imslp
Coding: vscode, terminal, vmware, readthedocs, github
Break: youtube, twitch, twitter, facebook, discord

then each time I observe going from one to the other, that’s a single-task violation.

The idea behind this is that half an hour of chatting and tweeting followed by half an hour of coding is much, much more productive than alternating 5 minutes of each for one hour.
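
A rough sketch of how the violation count could be computed from a sequence of observed window names. It assumes that a topic keyword shows up as a substring of the window title, which will not hold for every app (real titles may need extra keywords), and the function names are invented for this example.

```python
# Topic lists taken from the example above.
TOPICS = {
    "music": ["reaper", "melodyne", "musescore", "imslp"],
    "coding": ["vscode", "terminal", "vmware", "readthedocs", "github"],
    "break": ["youtube", "twitch", "twitter", "facebook", "discord"],
}


def topic_of(window_name):
    """Return the first topic whose keyword appears in the title, else None."""
    name = window_name.lower()
    for topic, keywords in TOPICS.items():
        if any(keyword in name for keyword in keywords):
            return topic
    return None


def count_violations(window_names):
    """Count transitions from one known topic to a different known topic."""
    violations = 0
    last_topic = None
    for name in window_names:
        topic = topic_of(name)
        if topic is None:
            continue  # unknown windows neither count nor reset the current topic
        if last_topic is not None and topic != last_topic:
            violations += 1
        last_topic = topic
    return violations
```

Whether unknown windows should be ignored, as here, or treated as their own topic is a design choice that can be revisited later, since the raw data is kept.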


I don’t know all that much about Android app development, but if I were doing this I’d look into taking advantage of Android’s accessibility services. By their very nature they need to have access to just about every interaction the user has with every app, so Android more or less has to expose all sorts of information to them, at least if Android is going to allow custom accessibility services in the first place.

Some quick googling gives me this page, and it links to the AccessibilityEvent page. Among other things listed on that page, the “Windows changed” event seems like it could give you what you want. It seems that if you call the getWindows method every time that event is fired, you’ll be able to keep a log of which windows are active when. (Keep in mind that multiple windows might be active at once, because Android allows for a split screen, with two apps each taking up half of it!)


Can you elaborate? I’ve tried like a dozen programs for Windows to try to keep track of what I do but none works well.


Here’s a minimal(ish) example that just gets your topmost window every second and logs that to a file.

At the top we make some aliases for the windows APIs we want to use and give a name to a constant from msdn.

get_window_below gets either the topmost visible window, or the window below the argument passed.
get_n_windows just calls get_window_below until it gets enough windows.
is_window_name_blacklisted just checks whether the window name is empty for now.
Then there’s a while loop that dumps observations taken every second.

Obviously this is a simplified example, yet I think for my needs it will suffice - apart from needing another field in the observations for whether or not the cursor and keyboard are active.
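
For that extra field, one option on Windows is to ask the system how long ago the last keyboard or mouse input happened. The sketch below uses the Win32 calls GetLastInputInfo and GetTickCount through ctypes. The helper names and the 5-second threshold are assumptions for illustration, and get_idle_seconds only works on Windows.

```python
import ctypes


class LASTINPUTINFO(ctypes.Structure):
    # Mirrors the Win32 LASTINPUTINFO struct: UINT cbSize, DWORD dwTime.
    _fields_ = [("cbSize", ctypes.c_uint32), ("dwTime", ctypes.c_uint32)]


def idle_ms(now_tick, last_input_tick):
    """Milliseconds since the last input, tolerating 32-bit tick wraparound."""
    return (now_tick - last_input_tick) & 0xFFFFFFFF


def get_idle_seconds():
    """Seconds since the last keyboard or mouse input (Windows only)."""
    info = LASTINPUTINFO()
    info.cbSize = ctypes.sizeof(LASTINPUTINFO)
    if not ctypes.windll.user32.GetLastInputInfo(ctypes.byref(info)):
        raise ctypes.WinError()
    # Mask because ctypes treats the DWORD return value as a signed int.
    now = ctypes.windll.kernel32.GetTickCount() & 0xFFFFFFFF
    return idle_ms(now, info.dwTime) / 1000.0


def is_user_active(idle_seconds, threshold_seconds=5):
    """Treat the user as active if there was input within the threshold."""
    return idle_seconds < threshold_seconds
```

Storing the raw idle time in each observation, rather than a yes/no flag, keeps the threshold adjustable during postprocessing.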

Then you can do whatever you want with postprocessing, like counting time as procrastination if the topmost window is a video (regardless of whether your mouse and keyboard are active), or another procrastination website like reddit/twitter (if mouse and keyboard are active in the last K seconds). Or you can quite trivially count your own “context switches” as explained in the motivating post.

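import ctypes
import json
import time

# Aliases for the user32 functions we use (Windows only).
GetWindowText = ctypes.windll.user32.GetWindowTextW
GetWindowTextLength = ctypes.windll.user32.GetWindowTextLengthW
GetTopWindow = ctypes.windll.user32.GetTopWindow
GetWindow = ctypes.windll.user32.GetWindow
IsWindowVisible = ctypes.windll.user32.IsWindowVisible

# GetWindow command from MSDN: the window below the given one in the Z order.
GW_HWNDNEXT = 2


def get_window_below(below=None):
    # With no argument, start at the top of the Z order;
    # otherwise take the window directly below `below`.
    if not below:
        hwnd = GetTopWindow(None)
    else:
        hwnd = GetWindow(below, GW_HWNDNEXT)

    if not hwnd:
        return None  # no more windows

    # Skip invisible windows by moving further down.
    if not IsWindowVisible(hwnd):
        return get_window_below(hwnd)

    bufsize = GetWindowTextLength(hwnd) + 1
    buf = ctypes.create_unicode_buffer(bufsize)
    GetWindowText(hwnd, buf, bufsize)
    window_name = buf.value

    return {
        "window_name": window_name,
        "hwnd": hwnd
    }


def get_n_windows(n):
    windows = []
    for i in range(n):
        last_window = 0
        if len(windows) > 0:
            last_window = windows[-1]['hwnd']

        window = get_window_below(last_window)
        if window is None:
            break
        windows.append(window)

    return windows


def is_window_name_blacklisted(name):
    # For now, only windows without a title are ignored.
    if len(name.strip()) == 0:
        return True
    return False


while True:
    for w in get_n_windows(30):
        if is_window_name_blacklisted(w["window_name"]):
            continue

        observation = {
            "window_name": w["window_name"],
            "t": time.time()
        }
        # Append one JSON object per line; the file name is a placeholder.
        with open("observations.jsonl", "a", encoding="utf-8") as f:
            f.write(json.dumps(observation) + "\n")
        break  # only the topmost non-blacklisted window is logged

    time.sleep(1)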
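
The postprocessing rules described earlier could then look roughly like this. This is only a sketch: it assumes one observation per second, a hypothetical "idle_seconds" field holding the time since the last mouse or keyboard input, and keyword lists and a value of K picked purely for illustration.

```python
VIDEO_KEYWORDS = ["youtube", "twitch"]   # counted regardless of input activity
SITE_KEYWORDS = ["reddit", "twitter"]    # counted only while input is recent


def is_procrastination(observation, k_seconds=30):
    """Apply the two rules to a single observation."""
    name = observation["window_name"].lower()
    if any(keyword in name for keyword in VIDEO_KEYWORDS):
        return True
    if any(keyword in name for keyword in SITE_KEYWORDS):
        # A missing idle field is treated as "active" in this sketch.
        return observation.get("idle_seconds", 0) <= k_seconds
    return False


def procrastination_seconds(observations, k_seconds=30):
    """With one observation per second, the count equals seconds spent."""
    return sum(1 for o in observations if is_procrastination(o, k_seconds))
```

Because the rules run over stored observations, changing K or the keyword lists just means rerunning the function on the same files.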

I have something like this that I use for something else (screenshots for my old leaky brain). I think the accessibility things I had to do also allow me to get application details. It’ll probably be about a week before I can make it shareable.
