…says a thousand words:
What is happening?
I am learning Norwegian.
I learned English by reading a half-translated Harry Potter fanfic and looking up every word in the dictionary. So, I want to try something similar with Norwegian.
My workflow right now looks like this:
- I am coming up with random English sentences,
- putting them through an online translator,
- and writing down Norwegian translations in a huge-ass notebook with colored pens.
This is pictured below, and tracked in norwegian-sentences – artyomkazak – beeminder.
Eventually I’ll know enough words that I’ll be able to read Et dukkehjem, and then I’ll be happy. But until then I want somebody to explain each word to me.
So, I wrote a small program that puts sentences through ChatGPT and asks it to explain each word. You’ve seen it above, already.
The source is available at GitHub - neongreen/forklar: Language learning explanations w/ ChatGPT.
To run it, you will need:
- A free DeepL account with API access.
- A paid OpenAI account (I’ve been using it for the whole day and so far spent only $0.18).
- A .env file in the same directory as the .ts file. It should look like this:
Start the program with deno run -A index.ts.
If you just give it a sentence, it will translate it and show explanations. If you start a command with /, it will run a ChatGPT query.
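For readers who want to picture that input handling, here is a minimal sketch of the dispatch described above. It is not the code from the forklar repo: the Action type and the parseInput function are hypothetical names made up for illustration.

```typescript
// Hypothetical names throughout; this is an illustration, not the forklar source.
type Action =
  | { kind: "chat"; query: string } // input started with "/"
  | { kind: "explain"; sentence: string }; // anything else

// Decide what to do with one line of user input.
function parseInput(line: string): Action {
  const trimmed = line.trim();
  if (trimmed.startsWith("/")) {
    // Everything after the leading slash is treated as a free-form ChatGPT query.
    return { kind: "chat", query: trimmed.slice(1).trim() };
  }
  // A plain sentence gets translated and then explained word by word.
  return { kind: "explain", sentence: trimmed };
}
```

The caller would then switch on kind and either send the query straight to ChatGPT or run the translate-and-explain path.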
You’ll probably want to modify the program. Maybe you are learning a different language, or maybe you want more/less detailed explanations.
There are several things to modify:
- deeplApi.translateText(prompt, "en", "nb"): instead of "nb", you can use a different target language. You’ll also have to replace “Norwegian” in the ChatGPT prompt.
- The ChatGPT prompt itself: go wild on this one. You can tell ChatGPT what kind of explanations you’d like to get, ask to provide similar sentences, ask to make your sentences more/less formal, whatever. I suppose it would be interesting to ask ChatGPT to also provide grammar notes, which might be very convenient if your target language is very different from English.
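One way to keep those two changes in sync is to put the language code and the language name in a single config object and build the prompt from it. The sketch below assumes nothing about how the actual program is structured: LearnerConfig, buildPrompt, and the prompt wording are all hypothetical.

```typescript
// Hypothetical helper; names and prompt wording are illustrative only.
interface LearnerConfig {
  targetLang: string; // DeepL target code, e.g. "nb" as in the post, or "fr"
  languageName: string; // human-readable name used in the prompt, e.g. "Norwegian"
  detail: "brief" | "detailed";
  grammarNotes: boolean;
}

// Build the ChatGPT prompt for one sentence and its translation.
function buildPrompt(sentence: string, translation: string, cfg: LearnerConfig): string {
  const parts = [
    `Here is an English sentence and its ${cfg.languageName} translation.`,
    `English: ${sentence}`,
    `${cfg.languageName}: ${translation}`,
    cfg.detail === "brief"
      ? `Explain each ${cfg.languageName} word in a few words.`
      : `Explain each ${cfg.languageName} word in detail, including its dictionary form.`,
  ];
  if (cfg.grammarNotes) {
    parts.push("Then add short notes on any grammar that differs from English.");
  }
  return parts.join("\n");
}

// Example config for someone learning French instead of Norwegian.
const french: LearnerConfig = {
  targetLang: "fr",
  languageName: "French",
  detail: "brief",
  grammarNotes: true,
};
```

With this shape, cfg.targetLang would be what gets passed as the third argument to deeplApi.translateText, so the translator and the prompt cannot drift apart.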
This is amazing!
What I would add would be generation of Anki flashcards to burn the sentences into my brain.
Also, it will be even more amazing when larger LLMs become accessible.
For instance, I can say with 100% certainty that it’s not really good at foreign languages… I am a native French speaker, and its French doesn’t feel super natural. A way to fix this might be to tweak the prompt to take this into account.
Also, all sorts of autocomplete in Anki. Imagine you create a new card and just type one field, get the rest autocompleted. I don’t have enough time to ship it, but imo it’s a great use case.
ChatGPT even in its native chat interface can be a great language teacher, doing repetitions, fixing your sentences, or just chatting. We can imagine “have at least 5 mins conversation” or “exchange at least 10 messages with my language tutor” goals on beeminder.
Yeah. I tried the “Norwegian tutor” and “Spanish tutor” characters at character.ai and they didn’t work well for me, but just asking ChatGPT or Bing random language questions works somewhat well in my experience.
A few days later, this is still going – although I think I’ll have to switch to something different because it’s getting less interesting.
In the meantime I have two technical questions that somebody might be able to answer:
Any text-to-speech APIs or open-source models you could recommend? I know there’s Whisper for speech-to-text, but for text-to-speech I only know Google Cloud and I don’t want to use their overengineered console/dashboard/etc if I can avoid it.
What’s the best way to get structured data out of ChatGPT? Currently I’m asking it to output JSON, and it’s working more or less fine, but I sometimes get malformed output, and JSON parsers are pretty unforgiving about that.
You might as well ask “what’s the best way to get sharp images using this blurry camera lens?” Simply because of the way it works, there’s no way to reliably get ChatGPT to yield correctly structured output. Check out this great New Yorker article that explains it well: ChatGPT Is a Blurry JPEG of the Web | The New Yorker
Yeah, I didn’t mean “get it to reliably output …”, but unfortunately I can’t phrase it well.
Tell it to return well-structured JSON with keys of … and set temperature to 0. That should be good enough.
Ask it to fix the JSON if it’s wrong?
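A rough sketch of that "ask it to fix the JSON" idea, combined with a forgiving extraction step. Everything here is an assumption for illustration: ask is a hypothetical stand-in for whatever function sends a prompt to ChatGPT and returns the reply text, and extractJson and askForJson are made-up names. It doesn't make the output reliable, it just retries a few times before giving up.

```typescript
// `ask` is a hypothetical placeholder for "send a prompt, get reply text back".
type Ask = (prompt: string) => Promise<string>;

// Pull the first {...} or [...] span out of a reply that may have
// extra prose or Markdown code fences wrapped around the JSON.
function extractJson(reply: string): string {
  const start = reply.search(/[{\[]/);
  const end = Math.max(reply.lastIndexOf("}"), reply.lastIndexOf("]"));
  return start >= 0 && end > start ? reply.slice(start, end + 1) : reply;
}

// Ask for JSON; on a parse failure, send the bad output back and ask for a fix.
async function askForJson(ask: Ask, prompt: string, maxAttempts = 3): Promise<unknown> {
  let reply = await ask(prompt);
  for (let attempt = 1; ; attempt++) {
    try {
      return JSON.parse(extractJson(reply));
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // give up after a few tries
      reply = await ask(
        `This was supposed to be valid JSON but failed to parse (${String(err)}).\n` +
          `Reply with the corrected JSON only:\n${reply}`,
      );
    }
  }
}
```

The result is typed as unknown on purpose: even when the JSON parses, the keys may not be the ones you asked for, so the caller still has to check the shape.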
Sorry, I replied too fast. Setting temperature to 0 tells GPT models to be very deterministic:
Remember that the model predicts which text is most likely to follow the text preceding it. Temperature is a value between 0 and 1 that essentially lets you control how confident the model should be when making these predictions. Lowering temperature means it will take fewer risks, and completions will be more accurate and deterministic. Increasing temperature will result in more diverse completions.
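To make the quoted description a bit more concrete, here is a toy sketch of how temperature is commonly described as working: the model's raw scores are divided by the temperature before being turned into probabilities. The function name is made up, and treating temperature 0 as "always pick the top token" is an assumption about the limit case, not something the quote states.

```typescript
// Toy illustration: turn raw scores (logits) into probabilities, scaled by temperature.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) {
    // Assumed limit case: all probability goes to the highest-scoring token.
    const best = logits.indexOf(Math.max(...logits));
    return logits.map((_, i) => (i === best ? 1 : 0));
  }
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}
```

Lower temperatures concentrate probability on the top-scoring token, which is why output gets more repeatable; higher temperatures flatten the distribution and give more varied completions.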