Beeminder Forum

Eugenio's language learning journal

Howdy!

The thing I love most about language learning is making fairly meaningless and arbitrary (but pretty) charts, so here we go.

Roughly, I’m working on a language learning platform, starting with Japanese. Updates will follow!

Interested to hear how you’re calculating this. Will you be manually inputting your score into Beeminder, or are you using the API or IFTTT somehow?

I’ve actually taken down that graph. It worked via the API, but since I want to keep tweaking how the score is calculated, I would have to delete all the datapoints and add them back every time. I’ll just post some charts once in a while.

As for how it’s calculated: every word gets a value from 0 to 1 depending on how well the system thinks I know it, each value is weighted by how common the word is, and the sum is normalized to go from 0 to 1.

The main issue with this is the weighting. As it is right now, uncommon words are weighted thousands of times less than the most common words. On the other end of the spectrum, I wouldn’t want thousands of obscure words to force the score to never get close to 1.

I might try weighting by percentile or by the log of the number of occurrences…
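Here's a rough sketch of how the choice of weighting changes the score. It is not the platform's actual code: the function names, the made-up learner, and the 1/rank word counts are all assumptions for illustration.

```python
import math

def vocabulary_score(words, weight_fn):
    """Weighted average of per-word knowledge, normalized to the range 0..1.

    `words` is a list of (occurrences, knowledge) pairs, knowledge in 0..1.
    """
    total_weight = sum(weight_fn(occ) for occ, _ in words)
    earned = sum(weight_fn(occ) * know for occ, know in words)
    return earned / total_weight

def raw_weight(occurrences):
    # Weight proportional to how often the word occurs.
    return occurrences

def log_weight(occurrences):
    # +1 so that a word seen exactly once still has a non-zero weight.
    return math.log10(occurrences) + 1.0

# Hypothetical learner: knows the 100 most common words perfectly and
# none of the other 900. Counts follow an assumed 1/rank (Zipf-like) shape.
words = [(1_000_000 // rank, 1.0 if rank <= 100 else 0.0)
         for rank in range(1, 1001)]

raw = vocabulary_score(words, raw_weight)
log = vocabulary_score(words, log_weight)
print(f"raw weighting: {raw:.2f}")
print(f"log weighting: {log:.2f}")
```

With raw counts, knowing only the top 10% of words already gives a score of roughly 0.69. With log weights the same learner scores roughly 0.12, so uncommon words count for a lot more.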

Let’s model a good “vocabulary knowledge score” algorithm! This algorithm will be responsible for “grading” a student on how much of the Japanese vocabulary they know. The score will range from 0% to 100%, and 100% should be obtainable by students who wish to get there.

Let’s start by creating some Monte Carlo students to test how various grading algorithms will treat them.

Our first Monte Carlo student will read a text consisting of 1M words, each one drawn from a vocabulary of 8000 words distributed according to Zipf’s law.
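A text like that can be generated in a few lines. This is only a sketch: it assumes the plain 1/rank form of Zipf's law, and `generate_zipf_text` is a name made up for this example.

```python
import random
from collections import Counter

def generate_zipf_text(vocab_size=8000, length=1_000_000, seed=0):
    """Return a list of word ranks (1 = most common), drawn with weight 1/rank."""
    rng = random.Random(seed)
    ranks = range(1, vocab_size + 1)
    weights = [1.0 / rank for rank in ranks]
    return rng.choices(ranks, weights=weights, k=length)

# A shorter text than the 1M words above, just to keep the demo quick.
text = generate_zipf_text(length=100_000)
counts = Counter(text)
print("distinct words seen:", len(counts))
print("three most common ranks:", counts.most_common(3))
```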

Our student has a daily time budget of 2 hours. He won’t study before 8 AM or after 8 PM. He first reads until he encounters 20 new words. Reading a word he has already seen takes him 1 second; reading one he has never seen takes him 12 seconds (this accounts for reading becoming easier as you read more). He then starts vocabulary review and continues until completion or until the 2 hours have passed. Each word review takes him 8 seconds. He has an 80% chance of getting a review right and a 20% chance of getting it wrong.
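One study day for a student like this could be sketched as below. The timings come from the description above, but the review scheduler is an assumption: I haven't said how intervals grow, so this sketch doubles the interval on a correct answer and resets it to one day on a wrong one. It also ignores the 8 AM to 8 PM window, and `simulate_day` is a name made up for this example.

```python
import random

READ_KNOWN_S = 1         # seconds to read an already-seen word
READ_NEW_S = 12          # seconds to read a never-seen word
REVIEW_S = 8             # seconds per review
NEW_PER_DAY = 20         # stop reading after this many new words
BUDGET_S = 2 * 60 * 60   # daily time budget
P_CORRECT = 0.8          # chance of getting a review right

def simulate_day(day, text, pos, cards, rng):
    """Run one study day; return (new position in text, seconds spent).

    `cards` maps word -> {"ivl": interval in days, "due": day number}.
    """
    spent = 0
    new_today = 0
    # Phase 1: read until 20 new words are met (or text or budget runs out).
    while pos < len(text) and new_today < NEW_PER_DAY and spent < BUDGET_S:
        word = text[pos]
        pos += 1
        if word in cards:
            spent += READ_KNOWN_S
        else:
            spent += READ_NEW_S
            cards[word] = {"ivl": 1, "due": day + 1}
            new_today += 1
    # Phase 2: review due cards until done or the budget is used up.
    for card in cards.values():
        if spent >= BUDGET_S:
            break
        if card["due"] > day:
            continue
        spent += REVIEW_S
        if rng.random() < P_CORRECT:
            card["ivl"] *= 2   # ASSUMPTION: double the interval when right
        else:
            card["ivl"] = 1    # ASSUMPTION: back to one day when wrong
        card["due"] = day + card["ivl"]
    return pos, spent

rng = random.Random(0)
vocab = range(1, 8001)
text = rng.choices(vocab, weights=[1 / r for r in vocab], k=200_000)
cards, pos = {}, 0
for day in range(30):
    pos, spent = simulate_day(day, text, pos, cards, rng)
print(f"after 30 days: {len(cards)} words seen, last day took {spent / 60:.0f} min")
```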

We will first tune our student so that he achieves satisfactory learning. After we have a good student (or some good students), we’ll try to make a fair grading for him (them).

Does he need to study the whole two hours, or does he start getting backlogged?

image

No. Furthermore, around day 300, he’s discovered all 8000 words, and every day begins getting easier and easier. Good student!

So, how does the chart of our student’s progress look if we just use the scoring already in my platform?

image

More than reasonable! I’m really, really surprised by its linearity as a function of days studied. I expected it to get really wonky at the start and towards the end.

Our model student gets a score of about 70% after one year of study. That certainly looks like a goal to try to achieve.

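Here’s the scoring for our Monte Carlo student in all of its naivety:

import math

def calculate_ln1score(self):
    # Best possible score: every one of the 8000 words at full weight.
    # A word of rank r is expected to occur about len(text) / r times.
    max_score = 0
    for word in range(1, 8001):
        max_score += math.log10(int(1 / word * len(self.text))) + 1.0

    # Review intervals are in seconds and compared on a log2 scale,
    # clamped between 2 days (counts as 0) and 356 days (counts as 1).
    max_ivl = math.log2(60 * 60 * 24 * 356)
    min_ivl = math.log2(60 * 60 * 24 * 2)

    score = 0
    for word in self.studying:
        # Log weighting: common words count more, but not overwhelmingly so.
        weight = math.log10(word["occurrences"]) + 1.0

        ivl = math.log2(word["ivl"])
        ivl = min(max_ivl, ivl)
        ivl = max(min_ivl, ivl)
        ivl = ivl - min_ivl

        score += weight * ivl / (max_ivl - min_ivl)

    return score / max_score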

There are about 190 days to the end of the year. To get to a 70% score before 2020, I’d need to add roughly 0.3% to the score every day… and we’re not nearly there yet.

image
image
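A back-of-the-envelope check of that 0.3% figure. The current score isn't stated in the post, so the 13% starting value below is a placeholder assumption, picked only because it makes the numbers line up.

```python
def required_daily_gain(current, target, days_left):
    """Score that must be added per day to reach `target` in `days_left` days."""
    return (target - current) / days_left

# ASSUMPTION: a current score of 13% (not a measured value).
gain = required_daily_gain(current=0.13, target=0.70, days_left=190)
print(f"needed per day: {gain:.2%}")
```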

Let’s see if I can bring up that daily delta or if I have to change my expectations :slight_smile:

All of my playing with Monte Carlo students seems to indicate that the only thing that regulates daily workload and progress is the number of new words added daily, which… duh:

This student adds 10 words every day. Its daily workload after 200 days is roughly half an hour:

image

This student adds 20 words every day. Its daily workload after 200 days is roughly 1 hour.

image

But also, the daily workload looks like it tends to a limit as the number of days goes to infinity… In the real world, that only holds until we run out of new words in the language to study. Once even flipping through hard-to-read books provides no new words to learn, the workload tends to 0!

Daily workload for 10K words in the lexicon of the language, 30 new words studied per day, 3 years of study:

image

It looks like the daily workload tends towards 1:15 to 1:30 hours, but as soon as we run out of words to practically add, be it from our self-chosen corpus or from a most-common-words list, the workload starts dropping dramatically right away.

A hardcore learner can learn a 10K word lexicon in 8 months by adding 40 words a day, and after those 8 months of intense study, the workload starts relaxing immediately, dropping to less than 20 minutes by year end:

image
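The point where the new words run out is plain division. A quick sketch for the daily rates used in this thread (`days_to_exhaust` is a made-up helper name, and a month is taken as 30.4 days):

```python
import math

def days_to_exhaust(lexicon_size, new_words_per_day):
    """Days of study before there are no new words left to add."""
    return math.ceil(lexicon_size / new_words_per_day)

for rate in (10, 20, 30, 40):
    days = days_to_exhaust(10_000, rate)
    print(f"{rate} words/day: {days} days (about {days / 30.4:.1f} months)")
```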

Bonus: what if we keep a fixed daily time budget and actually use it to self-regulate our learning? Let’s say one hour a day. Every day we’re not done after an hour, we stop immediately, and the next day we add no words. Otherwise, we add 30 words per day.

Using this very, very simple control algorithm, we get a fairly good result. We end up actually studying between 40 minutes and 1 hour a day, for about 18 months, before having studied our whole 10K word lexicon.

If we want any better, we’ll have to calibrate ourselves into a PID loop :slight_smile:
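The on/off rule from the bonus experiment can be sketched like this. It's a toy model, not the simulator behind the charts: the review scheduler (double the interval when right, reset to one day when wrong) is an assumption, new words are charged a flat 12 seconds each, and `simulate` is a name made up for this example.

```python
import random

REVIEW_S = 8          # seconds per review
NEW_WORD_S = 12       # seconds per new word (flat cost, an assumption)
BUDGET_S = 60 * 60    # one hour a day
NEW_PER_DAY = 30
LEXICON = 10_000
P_CORRECT = 0.8

def simulate(days, seed=0):
    """Return one (seconds_studied, words_added, finished) tuple per day."""
    rng = random.Random(seed)
    cards = []                   # each card is [interval_days, due_day]
    finished_yesterday = True
    history = []
    for day in range(days):
        # The control rule: add new words only if yesterday's queue was cleared.
        quota = NEW_PER_DAY if finished_yesterday else 0
        added = min(quota, LEXICON - len(cards))
        spent = added * NEW_WORD_S
        due = [card for card in cards if card[1] <= day]
        finished = True
        for card in due:
            if spent >= BUDGET_S:
                finished = False  # out of time with reviews still pending
                break
            spent += REVIEW_S
            # ASSUMPTION: double the interval when right, reset when wrong.
            card[0] = card[0] * 2 if rng.random() < P_CORRECT else 1
            card[1] = day + card[0]
        cards.extend([1, day + 1] for _ in range(added))
        history.append((spent, added, finished))
        finished_yesterday = finished
    return history

history = simulate(120)
skipped = sum(1 for _, added, _ in history if added == 0)
print(f"days with no new words: {skipped} of {len(history)}")
print(f"longest day: {max(s for s, _, _ in history) / 60:.0f} min")
```

A PID loop would replace the all-or-nothing `quota` line with a number of new words that varies smoothly with how far over or under budget the last few days were.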

Obvious caveats: this Monte Carlo student is very primitive. In this version, it gets things wrong 25% of the time, and that’s it. In reality, just as a single example of “weird” human behaviour, you’ll discover while learning that there are words you just can’t seem to remember for some reason, and words that click immediately. Because of differences like this, a human learner might find that the daily workload differs substantially from what is predicted by a Monte Carlo student with the same right/wrong rates.

I’ve been way too inconsistent with this. Time to cobble something together with Beeminder’s API :slight_smile:

image: deriv (2)

image: cumul (2)

And here’s the cobbled API-using goal set at a conservative half an hour a day.