eReader Dictionary Creation

Foreword

One of the things I like most about ereaders (apart from the font size :-) ) is the built-in dictionaries. Being a linguist and an avid multilingual reader, I find the possibility of instantly refreshing my memory on a specific word priceless.
It is one of the (many) reasons I like the Kobo: it is by far the machine with access to the most dictionaries, for the simple reason that the creation process is well documented and fairly straightforward. On other machines, it is not quite so easy, which is a shame, because in my opinion it is definitely a big selling point. Anyway.
Recently I started learning Indonesian, and as my level got somewhat decent I wanted to start reading books, but of course my vocabulary is not very large yet. That's where dictionaries come in :-)
As I said, my main reader is a Kobo, but for several reasons (see here), I also have Kindle, Nook, and TrekStor readers.
So I started on a journey to produce dictionaries for all those machines, including the PocketBook for a friend of mine.
I have documented the whole process here. The idea is to provide a one-stop place for dictionary production :-)
Also, be creative. I didn't find an Indonesian-whatever dictionary, but I found an English-Indonesian one... It took a couple of scripts and a lot of patience to review/correct the file, but from that, I managed to produce an Indonesian-English dictionary :-)
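A minimal Python sketch of the inversion idea, not the actual scripts mentioned above: every translation becomes a headword pointing back to the original word(s). The function name and the sample data are made up for illustration, and the result of a real run still needs the manual review described above.

```python
from collections import defaultdict


def invert_glossary(entries):
    """Turn (headword, [translations]) pairs into the reverse mapping."""
    inverted = defaultdict(list)
    for headword, translations in entries:
        for translation in translations:
            key = translation.strip().lower()
            # Skip empty translations and avoid listing a headword twice.
            if key and headword not in inverted[key]:
                inverted[key].append(headword)
    # Sort by the new headword so the output is easy to review by hand.
    return dict(sorted(inverted.items()))


# Hypothetical English-Indonesian sample data.
english_indonesian = [
    ("book", ["buku", "kitab"]),
    ("house", ["rumah"]),
    ("home", ["rumah"]),
]

indonesian_english = invert_glossary(english_indonesian)
for word, meanings in indonesian_english.items():
    print(f"{word}: {', '.join(meanings)}")
```

Note that several source words can end up under the same new headword (here both "house" and "home"), which is one reason the inverted file needs proofreading.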
Don't hesitate to contact me on Mastodon (@FrankAuLux on polyglot.city) for changes / additions / corrections.

Generic

Creating a dictionary usually involves three steps:
  1. Finding a file with the language or language pair you are looking for.
    Good sites to start are:
  2. Converting that file to a format that your specific machine can use:
    Chances are that PyGlossary will have it.
    See also
  3. Finally, compiling that file (or files) into something your machine will understand.
For example, to create an English-French dictionary for the PocketBook:
* step 1: get an en-fr.txt file from dict.cc
* step 2: transform it into XDXF with PyGlossary
* step 3: compile it to the .dic format with converter.exe under Windows.
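Before converting anything, it helps to check what the source file actually contains. Here is a minimal Python sketch that reads a tab-separated word list into (source, target) pairs. It assumes a layout of one entry per line, source term and target term in the first two columns, and comment lines starting with #; that is an assumption about the download, so check your own file first. The function name and the sample lines are invented.

```python
import csv
import io


def read_tab_separated(handle):
    """Return (source, target) pairs from a tab-separated word list."""
    pairs = []
    # QUOTE_NONE: quotation marks in definitions are ordinary characters.
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Skip blank lines, comment lines and lines without a translation.
        if not row or row[0].startswith("#") or len(row) < 2:
            continue
        source, target = row[0].strip(), row[1].strip()
        if source and target:
            pairs.append((source, target))
    return pairs


# Hypothetical sample; with a real file use open(path, encoding="utf-8", newline="").
sample = io.StringIO(
    "# EN-FR vocabulary (hypothetical sample)\n"
    "\n"
    "house\tmaison\tnoun\n"
    "to read\tlire\tverb\n"
)
pairs = read_tab_separated(sample)
print(pairs)
```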

The compilation may not work immediately. For example, on the Kobo you can't have <w in a .df file, so I had to search for it and replace it every time...
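The search-and-replace can be scripted. The sketch below is one possible approach, not necessarily the replacement used above: it assumes that HTML-escaping the opening bracket is acceptable for your definitions. The function name and the sample entry are invented.

```python
def escape_w_sequences(text):
    """Replace every literal '<w' with '&lt;w' and report how many were found."""
    count = text.count("<w")
    return text.replace("<w", "&lt;w"), count


# Hypothetical .df fragment containing the problematic sequence.
sample = "@ lebar\nwide (<width>)\n"
cleaned, replaced = escape_w_sequences(sample)
print(replaced, "replacement(s)")
print(cleaned)
```

Reporting the count makes it easy to see whether a given source file was affected at all.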


The Kobo(s)

One of the most useful features of the Kobo is the built-in dictionaries. Not only do you have the ones provided for free by Kobo, but you can add whatever you want. This link has the links to the official ones available for free download. More dictionaries are available (firmware 4.24.15672+).
Otherwise, you can go to Mickaël Schoentgen's project for free French, Catalan, Danish, German, English, Spanish, Greek, Italian, Norwegian, Portuguese, Romanian, Russian and Swedish dictionaries.
You can also easily make your own dictionaries using either Penelope or PyGlossary and manage them with dictutil.
This script will convert a csv file into df format. I used it to create my own dictionary from my Indonesian vocabulary list:
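A minimal sketch of such a conversion, not the original script. It assumes a two-column CSV (word, definition) and the basic dictfile layout as I understand it from the dictutil documentation: a line starting with "@ " and the headword, followed by the definition. Check the dictutil documentation for the full format (variants, raw HTML, and so on). The function name and the sample rows are invented.

```python
import csv
import io


def csv_to_df(handle):
    """Convert rows of (word, definition) into dictfile-style text."""
    blocks = []
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # no definition column on this line
        word, definition = row[0].strip(), row[1].strip()
        if not word or not definition:
            continue
        # One entry: "@ headword" on the first line, definition below it.
        blocks.append(f"@ {word}\n{definition}\n")
    # A blank line between entries keeps the file readable.
    return "\n".join(blocks)


# Hypothetical vocabulary list; with a real file use open(path, encoding="utf-8", newline="").
sample = io.StringIO("makan,to eat\nminum,to drink\n")
df_text = csv_to_df(sample)
print(df_text)
```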



I also created Ukrainian/German/English dictionaries (UK=>DE, DE=>UK, UK=>EN, EN=>UK).
The process I used is:
  1. Get a source file from one of the resources listed above
  2. Transform it into a .df file. If you use PyGlossary on Debian, you may have to install python3-icu, python3-marisa-trie, python3-marisa, marisa, libmarisa0 and python3-mistune0.
  3. Transform the .df file into dicthtml-xx.zip with dictgen (part of dictutil).


The Kindle

To create your own dictionary for the Kindle, Jake McCrary's website is the place to start. There is some extra information on Amazon's website (mostly about inflections). Finally, you can use John Bent's Python script to help you convert the database. (One more source of information.)
However, Amazon no longer supports or distributes kindlegen for Linux, which is necessary to create the Kindle dictionary.
You can still download the old i386 version from the Internet Archive. To make it work on a recent amd64 system, you need to enable i386 support.
Ukrainian/German/English dictionaries also available.

Nook

The Nook uses an SQLite 3 format with entries in HTML for dictionaries. However, this varies according to the firmware release, so it can be a bit tricky.
Here is a good place to start, with plenty of pre-compiled dictionaries. Some more dictionaries here.
The procedure to compile your own dictionaries is described here, along with a Python script (note: Python 2.x, it won't run under Python 3).
Another Windows dictionary compiler.
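To illustrate the general idea of "SQLite with HTML entries" mentioned above, here is a small Python sketch. Careful: the table name, the column names and the HTML layout are hypothetical and are NOT the Nook's actual schema, which depends on the firmware; use the procedure linked above for a real dictionary.

```python
import sqlite3
from html import escape


def build_html_dictionary(connection, entries):
    """Store (term, definition) pairs as small HTML snippets."""
    # Hypothetical schema, for illustration only (not the Nook's real one).
    connection.execute(
        "CREATE TABLE IF NOT EXISTS entries (term TEXT PRIMARY KEY, html TEXT NOT NULL)"
    )
    rows = [
        (term, f"<b>{escape(term)}</b><p>{escape(definition)}</p>")
        for term, definition in entries
    ]
    connection.executemany(
        "INSERT OR REPLACE INTO entries (term, html) VALUES (?, ?)", rows
    )
    connection.commit()


def lookup(connection, term):
    """Return the stored HTML for a term, or None if it is missing."""
    row = connection.execute(
        "SELECT html FROM entries WHERE term = ?", (term,)
    ).fetchone()
    return row[0] if row else None


# In-memory database so the example leaves no file behind.
connection = sqlite3.connect(":memory:")
build_html_dictionary(connection, [("rumah", "house, home"), ("buku", "book")])
print(lookup(connection, "rumah"))
```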




PocketBook

If you have a Windows machine, creating a dictionary for the PocketBook is rather simple:
* step 1: get a source file from one of the sites above
* step 2: transform it into XDXF with makedict (or one of the scripts below)
* step 3: compile it to the .dic format with converter.exe under Windows.
That's it!
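If makedict or PyGlossary is not an option for step 2, a very simple XDXF file can also be written by hand. The Python sketch below uses the older "visual" XDXF layout (an ar element per entry, with the headword in a k element). Whether converter.exe also expects an XML declaration, a DOCTYPE or other metadata is not verified here, so treat this as a starting point. The function name and the sample entry are invented.

```python
import xml.etree.ElementTree as ET


def build_xdxf(pairs, lang_from, lang_to, name):
    """Build a minimal XDXF document from (headword, definition) pairs."""
    root = ET.Element(
        "xdxf", {"lang_from": lang_from, "lang_to": lang_to, "format": "visual"}
    )
    ET.SubElement(root, "full_name").text = name
    ET.SubElement(root, "description").text = name
    for headword, definition in pairs:
        article = ET.SubElement(root, "ar")
        key = ET.SubElement(article, "k")
        key.text = headword
        # The definition follows the closing </k> tag inside <ar>.
        key.tail = definition
    # ElementTree escapes characters such as < and & automatically.
    return ET.tostring(root, encoding="unicode")


# Hypothetical one-entry dictionary.
xdxf_text = build_xdxf([("house", "maison")], "ENG", "FRA", "English-French sample")
print(xdxf_text)
```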
Other resources if that doesn't work for you:


TrekStor Pyrus Mini

Unfortunately, I haven't been able (yet?) to find any information about built-in dictionaries for the TrekStor.



Envoi

That's all there is to say, really.
Don't hesitate to contribute to this page!

More websites:
  • xdxf
  • sdict
  • dictd



    My current collection:
