Python and Anki - Richard's Blog

Why use Anki?

I have been learning Korean, and the number of words to memorize has been quite overwhelming. Someone recommended the spaced-repetition software Anki for exactly this purpose. I was hesitant at first. (It's not that I doubted its usefulness. As a matter of fact, I have used Anki in the past to successfully memorize words in another language. It's just that setting it up was such a PITA that I kinda put it off for a while.) But I had reached the point where, if I did not become more efficient, I would not even have time to review everything I had memorized. The premise is this: reviewing every single word wastes my time. I only need to review the words that I struggle with.

Hence I decided to create my Anki decks from the words that I needed to study for this language semester (I was enrolled in the Hanyang University Language Program; perhaps I can write about it some day?). How could I make Anki decks without sinking obscene amounts of time into it? I found that I could use Python and a library called Genanki to generate Anki decks. Fortunately, I also found an amazing blog post on how to use this library.

Using Genanki

First, I create a TSV file of the words from a chapter of my Korean vocabulary book. It looks something like this:

hangul	meaning	type
영화배우	actor/actress	noun
계란	egg	noun
야채	vegetable	noun
일찍	early	adverb

I know that I still needed to type out the words, but keying this information into Emacs is a lot faster than keying the same information into Anki itself. Also, I had entered everything into Emacs well before doing this, because I wanted an electronic copy that I could review on the go. So it was incredibly helpful to just export these Emacs tables to TSV and start from there.
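Before generating a deck, it can help to sanity-check the exported file. The following is only a sketch, assuming exactly three tab-separated columns per line as in the sample above; find_bad_rows is a hypothetical helper name of my own, not part of any library.

```python
import csv
import io

EXPECTED_COLUMNS = 3  # hangul, meaning, type


def find_bad_rows(tsv_text):
    """Return (line_number, row) pairs whose column count is not three."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    bad_rows = []
    for line_number, row in enumerate(reader, start=1):
        if len(row) != EXPECTED_COLUMNS:
            bad_rows.append((line_number, row))
    return bad_rows


# The last line is missing its "type" column on purpose
sample = "hangul\tmeaning\ttype\n계란\tegg\tnoun\n일찍\tearly\n"
print(find_bad_rows(sample))
```

Any row it reports is one that would otherwise fail later when the fields are read by index.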

I recommend following along with the code below in a Jupyter notebook.

We will first import the relevant libraries.

import csv
import random
import genanki

We then need to generate a 10-digit model_id that will help Anki identify this specific “Note Type”. All decks that you generate that share the same format must use the same model_id to be classified under the same “Note Type”. You can find the “Note Type” when you open the “Browse” section of Anki. Not having the same model_id for decks of the same type will cause the creation of duplicate Note Types that differ by a “+” symbol.

Generate the model_id the first time you run this, then comment out that line and keep using the same model_id for subsequent deck generations.

# generate once and use the hardcoded values
model_id = random.randrange(1 << 30, 1 << 31)
# model_id = <replace this with the new model_id digits>
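If commenting lines in and out feels error-prone, one alternative is to store the generated ID in a small file and read it back on later runs. This is a rough sketch under my own assumptions: load_or_create_id and the file name anki_ids.json are hypothetical, and the demo writes to a temporary directory only so that it cleans up after itself.

```python
import json
import random
import tempfile
from pathlib import Path


def load_or_create_id(key, store_path="anki_ids.json"):
    """Return the ID saved under `key`, generating and saving one if absent."""
    path = Path(store_path)
    ids = json.loads(path.read_text()) if path.exists() else {}
    if key not in ids:
        ids[key] = random.randrange(1 << 30, 1 << 31)
        path.write_text(json.dumps(ids, indent=2))
    return ids[key]


# Demo: the second call reads the stored value instead of generating a new one
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "anki_ids.json"
    first = load_or_create_id("Hanyang_Korean", store)
    second = load_or_create_id("Hanyang_Korean", store)
    print(first == second)
```

In real use you would point store_path at a file that persists next to your notebook, so every chapter's deck picks up the same model_id.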

Then give names for the following variables:

# Filename of the data file
data_filename = <name>

# Filename of the Anki deck to generate
deck_filename = <name>

# Title of the deck as shown in Anki
anki_deck_title = <name>

# Name of the card model
anki_model_name = <name>

As for me, I like to use Python f-strings, which replace each variable enclosed in {} with its value. In the following example, I only have to define chapter_num and all the names are filled in. Ain’t that neat?

# chapter num
chapter_num = 9

# Filename of the data file
data_filename = f"chapter{chapter_num}"

# Filename of the Anki deck to generate
deck_filename = f"Level2_{data_filename}.apkg"

# Title of the deck as shown in Anki
anki_deck_title = f"Hanyang Level 2 {data_filename}"

# Name of the card model
anki_model_name = "Hanyang_Korean"

The next part creates the structure of the Anki cards:

# Create the deck model

style = """
.card {
 font-family: arial;
 font-size: 24px;
 text-align: center;
 color: black;
 background-color: white;
}
"""

anki_model = genanki.Model(
    model_id,
    anki_model_name,
    fields=[{"name": "hangul"}, {"name": "meaning"}, {"name": "type"}],
    templates=[
        {
            "name": "Card 1",
            "qfmt": '<p class="meaning">{{meaning}}</p>',
            "afmt": '{{FrontSide}}<hr id="answer"><p class="hangul">{{hangul}}</p><p class="type">{{type}}</p>',
        },
        {
            "name": "Card 2",
            "qfmt": '<p class="hangul">{{hangul}}</p>',
            "afmt": '{{FrontSide}}<hr id="answer"><p class="meaning">{{meaning}}</p><p class="type">{{type}}</p>',
        },
    ],
    css=style,
)

You can just re-use the style. Remember that the fields in my TSV file are these (in order): [ hangul, meaning, type ]. I believe qfmt stands for “question format” and afmt for “answer format”. Both use HTML to specify the template of the question and answer sides of the card. Put a field name inside double braces, such as {{hangul}}, to display that field’s value.
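A typo in a field name inside the braces is easy to miss. As a rough sketch (not something Genanki provides), the hypothetical helper unknown_fields below pulls the names out of a template with a regular expression and compares them with the declared fields. It assumes plain {{name}} references only and treats FrontSide as the one built-in name, so it does not handle other Anki template syntax.

```python
import re

FIELD_PATTERN = re.compile(r"\{\{([^{}]+)\}\}")
SPECIAL_NAMES = {"FrontSide"}  # built-in Anki reference, not a note field


def unknown_fields(template_html, field_names):
    """Return names referenced in the template that are not declared fields."""
    referenced = {name.strip() for name in FIELD_PATTERN.findall(template_html)}
    return sorted(referenced - set(field_names) - SPECIAL_NAMES)


fields = ["hangul", "meaning", "type"]
# "hangull" is a deliberate typo for the demo
afmt = '{{FrontSide}}<hr id="answer"><p class="hangul">{{hangull}}</p><p class="type">{{type}}</p>'
print(unknown_fields(afmt, fields))
```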

Here, I have 2 types of cards. Card 1 would prompt with the meaning of the word, and I will need to be able to remember how to write the hangul of the word. Card 2 is the opposite; it would prompt with the hangul word, and I will need to recall the meaning. The reason for this is that I realized that my memory is not always bi-directional. I might remember how to write a word when given its meaning, but somehow when being shown the actual word in hangul form, I might forget what it means. Card 2 prevents this scenario by ensuring that my memory would work in both directions. This has a downside of doubling my deck size :(
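To put a number on that doubling: each note produces one card per template. A quick back-of-the-envelope sketch using the four sample words from earlier, assuming every field is filled in so that no card comes out empty:

```python
sample_rows = [
    ("영화배우", "actor/actress", "noun"),
    ("계란", "egg", "noun"),
    ("야채", "vegetable", "noun"),
    ("일찍", "early", "adverb"),
]
template_names = ["Card 1", "Card 2"]

# One card per (note, template) pair
card_count = len(sample_rows) * len(template_names)
print(card_count)  # 8
```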

In the next step, we populate the list anki_notes with genanki.Note objects whose fields come from the TSV file. I use a “\t” delimiter because my file is tab-delimited. If you use a CSV, the delimiter would be “,”.

# The list of flashcards
anki_notes = []

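# utf-8 keeps the hangul intact; newline="" is what the csv module recommends
with open(data_filename, "r", encoding="utf-8", newline="") as csv_file:

    csv_reader = csv.reader(csv_file, delimiter="\t")

    # Skip the header row (hangul, meaning, type) shown in the sample file;
    # remove this line if your file has no header
    next(csv_reader, None)

    for row in csv_reader:
        anki_note = genanki.Note(
            model=anki_model,
            # hangul, meaning, type
            fields=[row[0], row[1], row[2]],
        )
        anki_notes.append(anki_note)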
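One nice side effect of tabs is that a meaning containing a comma needs no special treatment, whereas in a CSV it has to be quoted. A small in-memory sketch of the difference; read_rows is a hypothetical helper and the sample line is made up for illustration:

```python
import csv
import io


def read_rows(text, delimiter):
    """Parse delimited text into a list of rows (each row is a list of strings)."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


# Same data in both formats; the CSV version must quote the meaning
tsv_text = "영화배우\tactor, actress\tnoun\n"
csv_text = '영화배우,"actor, actress",noun\n'

print(read_rows(tsv_text, "\t") == read_rows(csv_text, ","))
```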

Shuffling the cards keeps words from the same part of the chapter from appearing right next to each other.

# Shuffle flashcards
random.shuffle(anki_notes)
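Note that random.shuffle reorders the list in place and gives a different order on every run. If you would rather get the same order each time you regenerate a deck, one option is a separately seeded random.Random instance. A minimal sketch, where the seed 42 is arbitrary and plain strings stand in for the note objects:

```python
import random

cards = ["영화배우", "계란", "야채", "일찍"]  # stand-ins for genanki notes

rng = random.Random(42)  # arbitrary seed, chosen only for illustration
shuffled = cards.copy()
rng.shuffle(shuffled)

# A second generator with the same seed reproduces the same order
rng_again = random.Random(42)
shuffled_again = cards.copy()
rng_again.shuffle(shuffled_again)

print(shuffled == shuffled_again)
```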

We will now create the deck and populate it with the anki_note objects that we placed in anki_notes (the naming convention is not the best).

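# create anki deck
# Use a deck ID that is separate from model_id and unique per deck;
# as with model_id, generate it once and then hardcode it
deck_id = random.randrange(1 << 30, 1 << 31)
anki_deck = genanki.Deck(deck_id, anki_deck_title)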

# Add flashcards to the deck
for anki_note in anki_notes:
    anki_deck.add_note(anki_note)

Finally, create a package from the deck and save it.

# create package object
anki_package = genanki.Package(anki_deck)
# Save the deck to a file
anki_package.write_to_file(deck_filename)

# output line for status (this counts notes; each note yields two cards)
print(f"Created deck with {len(anki_deck.notes)} notes")

You should now have a .apkg file in the working directory. Import this file into Anki and start studying!

Disclaimer: I did not write most of the code in this post. Credit goes to Charly Lersteau for his post on how he used Genanki to generate Anki decks for learning Mandarin. I merely adapted the code for learning Korean.