Why use Anki?
I have been learning Korean, and the number of words has been quite overwhelming. Someone recommended the spaced-repetition software Anki for the very purpose of memorizing vocabulary. I was hesitant, though not because I doubted its usefulness. As a matter of fact, I have used Anki in the past to successfully memorize words in another language. It's just that setting it up was such a PITA that I kinda put it off for a while. But I had reached the point where, if I did not become more efficient, I would not even have the time to fully review what I had memorized. The premise is this: reviewing every single word wastes my time. I only need to review the words that I struggle with.
Hence I decided to create my Anki decks from the words that I needed to study for this language semester (I was enrolled in the Hanyang University Language Program; perhaps I can write about it some day?). How could I make Anki decks without sinking obscene amounts of time? I found that I could use Python and a library called Genanki to generate Anki decks. Fortunately, I also found an amazing blog post on how to use this library.
Using Genanki
First, I create a TSV file of the words from a chapter of my Korean vocabulary book. It looks something like this:
hangul | meaning | type |
---|---|---|
영화배우 | actor/actress | noun |
계란 | egg | noun |
야채 | vegetable | noun |
일찍 | early | adverb |
I know that I still needed to type out the words, but keying this information into emacs is a lot faster than keying the same information into Anki itself. Also, I had entered everything into emacs well before doing this, because I wanted an electronic copy that I could review on the go. So it was incredibly helpful to just export these emacs tables to TSV and start from there.
For the code below, I recommend following along in a Jupyter notebook.
We will first import the relevant libraries.
import csv
import random
import genanki
We then need to generate a 10-digit model_id that will help Anki identify this specific “Note Type”. All decks that you generate that share the same format must use the same model_id to be classified under the same “Note Type”. You can find the “Note Type” when you open the “Browse” section of Anki. Not having the same model_id for decks of the same type will cause the creation of duplicate Note Types that differ by a “+” symbol.
Generate the model_id the first time that you run this, then comment out that line and keep using the same hardcoded model_id for subsequent deck generations.
# generate once and use the hardcoded values
model_id = random.randrange(1 << 30, 1 << 31)
# model_id = <replace this with the new model_id digits>
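If you would rather not comment lines in and out, one option is to keep the ID in a small file next to the notebook. This is only a sketch of that idea: the helper name load_or_create_id and the file name model_id.txt are made up for illustration and are not part of genanki.

```python
import random
from pathlib import Path


def load_or_create_id(path):
    """Return the ID stored at `path`, generating and saving it on first use."""
    id_file = Path(path)
    if id_file.exists():
        # Reuse the ID that an earlier run generated
        return int(id_file.read_text().strip())
    # First run: draw from the same range as the snippet above
    new_id = random.randrange(1 << 30, 1 << 31)
    id_file.write_text(str(new_id))
    return new_id


# Hypothetical file name; any writable path would do
model_id = load_or_create_id("model_id.txt")
```

Every later run reads the same number back, so decks keep sharing one Note Type as long as the file is kept around.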
Then give names for the following variables:
# Filename of the data file
data_filename = <name>
# Filename of the Anki deck to generate
deck_filename = <name>
# Title of the deck as shown in Anki
anki_deck_title = <name>
# Name of the card model
anki_model_name = <name>
As for me, I like to use Python f-strings, which replace each variable enclosed in {} with the value of that variable. In the following example, I only have to define chapter_num and all the names will be filled in. Ain’t that neat?
# chapter num
chapter_num = 9
# Filename of the data file
data_filename = f"chapter{chapter_num}"
# Filename of the Anki deck to generate
deck_filename = f"Level2_{data_filename}.apkg"
# Title of the deck as shown in Anki
anki_deck_title = "Hanyang Level 2 " + data_filename
# Name of the card model
anki_model_name = "Hanyang_Korean"
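If you generate several chapters in one go, the same f-strings can be wrapped in a small function. The sketch below is just one way to arrange it; the function name deck_names is hypothetical, and the strings simply mirror the ones above.

```python
def deck_names(chapter_num):
    """Build the data file name, deck file name, and deck title for one chapter."""
    data_filename = f"chapter{chapter_num}"
    return {
        "data_filename": data_filename,
        "deck_filename": f"Level2_{data_filename}.apkg",
        "anki_deck_title": f"Hanyang Level 2 {data_filename}",
    }


names = deck_names(9)
print(names["deck_filename"])  # Level2_chapter9.apkg
```

Calling deck_names(10) would give the names for the next chapter without touching any other line.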
The next part creates the structure of the Anki cards:
# Create the deck model
style = """
.card {
  font-family: arial;
  font-size: 24px;
  text-align: center;
  color: black;
  background-color: white;
}
"""
anki_model = genanki.Model(
    model_id,
    anki_model_name,
    fields=[{"name": "hangul"}, {"name": "meaning"}, {"name": "type"}],
    templates=[
        {
            "name": "Card 1",
            "qfmt": '<p class="meaning">{{meaning}}</p>',
            "afmt": '{{FrontSide}}<hr id="answer"><p class="hangul">{{hangul}}</p><p class="type">{{type}}</p>',
        },
        {
            "name": "Card 2",
            "qfmt": '<p class="hangul">{{hangul}}</p>',
            "afmt": '{{FrontSide}}<hr id="answer"><p class="meaning">{{meaning}}</p><p class="type">{{type}}</p>',
        },
    ],
    css=style,
)
You can just re-use the style. Remember that the fields in my TSV file are these, in order: [ hangul, meaning, type ]. I would think that qfmt is “question-format” and afmt is “answer-format”. These use HTML to specify the template of the question and answer sides of the card. To display the value of a field, put the field's name inside double braces, as in {{hangul}}. {{FrontSide}} is a special field that inserts the question side into the answer side.
Here, I have 2 types of cards. Card 1 prompts with the meaning of the word, and I need to remember how to write the word in hangul. Card 2 is the opposite: it prompts with the hangul word, and I need to recall the meaning. The reason for this is that I realized that my memory is not always bi-directional. I might remember how to write a word when given its meaning, but when shown the actual word in hangul, I might forget what it means. Card 2 prevents this scenario by ensuring that my memory works in both directions. This has the downside of doubling my deck size :(
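To get a feel for what the templates above do, here is a toy substitution in plain Python. It is not how Anki actually renders cards (the real template engine supports much more); the render function is a hypothetical stand-in that only swaps each {{name}} for a value, and the afmt string is shortened for the example.

```python
import re


def render(template, fields):
    """Replace each {{name}} with fields[name]; a toy stand-in for Anki's renderer."""
    return re.sub(r"\{\{(\w+)\}\}", lambda m: fields[m.group(1)], template)


note = {"hangul": "계란", "meaning": "egg", "type": "noun"}

# Card 1 style: prompt with the meaning, answer with the hangul
qfmt = '<p class="meaning">{{meaning}}</p>'
afmt = '{{FrontSide}}<hr id="answer"><p class="hangul">{{hangul}}</p>'

front = render(qfmt, note)
# {{FrontSide}} stands for the already-rendered question side
back = render(afmt, {**note, "FrontSide": front})

print(front)
print(back)
```

Swapping the roles of meaning and hangul in qfmt and afmt gives the Card 2 direction.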
The next step fills the list anki_notes with anki_note objects, one per row of the TSV file. I use a “\t” delimiter because my file is tab-delimited. If you use a CSV, it would be a “,” delimiter.
# The list of flashcards
anki_notes = []
# utf-8 so the hangul is read correctly on any platform
with open(data_filename, "r", encoding="utf-8", newline="") as csv_file:
    csv_reader = csv.reader(csv_file, delimiter="\t")
    for row in csv_reader:
        anki_note = genanki.Note(
            model=anki_model,
            # hangul, meaning, type
            fields=[row[0], row[1], row[2]],
        )
        anki_notes.append(anki_note)
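One thing to watch for: a blank line or a row with a missing column makes row[2] raise an IndexError. A possible safeguard is to check each row before building a note. The sketch below uses an in-memory sample in place of a real chapter file, and the helper name read_rows is hypothetical.

```python
import csv
import io

# Stand-in for a chapter file; fields are separated by tabs.
# The third line is blank and the fourth is missing its "type" column.
sample_tsv = "영화배우\tactor/actress\tnoun\n계란\tegg\tnoun\n\n일찍\tearly\n"


def read_rows(tsv_file, expected_fields=3):
    """Return (good_rows, skipped_row_numbers) from a tab-delimited file."""
    good, skipped = [], []
    reader = csv.reader(tsv_file, delimiter="\t")
    for row_num, row in enumerate(reader, start=1):
        if len(row) == expected_fields and all(cell.strip() for cell in row):
            good.append([cell.strip() for cell in row])
        else:
            # Remember which rows were dropped so they can be fixed by hand
            skipped.append(row_num)
    return good, skipped


rows, skipped = read_rows(io.StringIO(sample_tsv))
print(len(rows), "usable rows; skipped rows:", skipped)
```

With a real file, the StringIO object would be replaced by the opened csv_file, and each entry of rows could be passed to genanki.Note as before.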
Shuffling the cards helps keep similar cards from ending up too close to each other.
# Shuffle flashcards
random.shuffle(anki_notes)
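If you want the same order every time you regenerate a deck, a seeded generator is one option. This is a small sketch with plain strings standing in for the notes; the seed value 2024 is arbitrary.

```python
import random

cards = ["영화배우", "계란", "야채", "일찍"]

# A dedicated generator with a fixed seed gives the same order on every run
rng = random.Random(2024)
shuffled = cards[:]  # copy, so the original order is left untouched
rng.shuffle(shuffled)

print(shuffled)
```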
We will now create the deck and populate it with the anki_note objects that we placed in anki_notes (the naming convention is not the best).
# create anki deck
anki_deck = genanki.Deck(model_id, anki_deck_title)
# Add flashcards to the deck
for anki_note in anki_notes:
    anki_deck.add_note(anki_note)
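Note that the first argument of genanki.Deck is the deck's ID, and the code above passes model_id for it. If you would prefer each deck to have its own fixed ID, one possible approach is to derive it from the deck title. This is only a sketch: the helper name stable_id is hypothetical, and whether sharing one ID across decks causes any trouble on import is not something I have tested.

```python
import hashlib


def stable_id(name):
    """Map a name to a fixed integer in the range [1 << 30, 1 << 31)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # Fold the first 8 bytes of the hash into the same range used for model_id
    return (1 << 30) + int.from_bytes(digest[:8], "big") % (1 << 30)


deck_id = stable_id("Hanyang Level 2 chapter9")
print(deck_id)
```

The deck could then be created with genanki.Deck(stable_id(anki_deck_title), anki_deck_title), and the same title would give the same ID on every run.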
Finally, create a package from the deck and save it.
# create package object
anki_package = genanki.Package(anki_deck)
# Save the deck to a file
anki_package.write_to_file(deck_filename)
# output line for status
print("Created deck with {} flashcards".format(len(anki_deck.notes)))
You should now have a .apkg file in the working directory. Import this file into Anki and start studying!
Disclaimer: I did not write most of the code in this post. Credit goes to Charly Lersteau for his post on how he used Genanki to generate Anki decks for learning Mandarin. I merely adapted the code for learning Korean.