This plugin provides support for Japanese deinflection during lookup, and makes long-hold word selection select whole words properly. With this plugin, word lookups in Japanese text in KOReader become much easier, and no longer require users to use special dictionaries that have synonym-based deinflection rules defined (which were always fairly annoying to use). The basic idea and deinflection data for this plugin come from Yomichan (which is also a GPL-3.0+ project), but everything was implemented specifically for KOReader.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Japanese Support Plugin for KOReader
This plugin is heavily based on Yomichan's design, and makes use of Yomichan's deinflection data, but was written specifically for KOReader. There are two major features implemented by this plugin:
- Verb deinflection (aka deconjugation) support, based on Yomichan's very elegant rule-matching suffix replacement system using Yomichan's data.
- Text segmentation support without needing MeCab or any other binary helper, by re-using the users' installed dictionaries to exhaustively try every length of text and select the longest match which is present in the dictionary. This is similar to how Yomichan does MeCab-less segmentation.
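The longest-match idea behind the segmentation feature can be sketched roughly as follows. This is an illustration in Python, not the plugin's actual code: the `longest_match` function, the `max_len` cap, and the use of a plain set as a stand-in for the installed dictionaries are all assumptions made for the example.

```python
def longest_match(text, start, dictionary, max_len=20):
    """Return the longest substring beginning at `start` that is a dictionary entry.

    `dictionary` is a hypothetical stand-in (a set of headwords) for whatever
    lookup the installed dictionaries provide. `max_len` is an assumed cap on
    how far ahead to look.
    """
    end_limit = min(len(text), start + max_len)
    # Try the longest candidate first and shrink one character at a time.
    for end in range(end_limit, start, -1):
        candidate = text[start:end]
        if candidate in dictionary:
            return candidate
    # Nothing matched: fall back to the single character under the cursor.
    return text[start:start + 1]


# Toy data for illustration only.
toy_dictionary = {"日本", "日本語", "語"}
selected = longest_match("日本語を勉強する", 0, toy_dictionary)
fallback = longest_match("日本語を勉強する", 3, toy_dictionary)
```

With the toy dictionary above, `selected` is the three-character entry "日本語" rather than the shorter "日本", and `fallback` is the single character "を" because no entry starts there.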
On paper this plugin should also work with Chinese text if the user has Chinese dictionaries installed, though that is not its primary intended use-case.
The backbone of this plugin is the included yomichan-deinflect.json. This file is copied verbatim from Yomichan's ext/data/deinflect.json and can be updated when necessary by simply getting a newer copy.
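To give a feel for what rule-matching suffix replacement means, here is a minimal Python sketch. The rule tuples below are simplified and made up for illustration; they are not copied from the data file and do not reflect its exact schema, and the `deinflect` function is a hypothetical name, not the plugin's API.

```python
# Illustrative rules only: (inflected suffix, replacement suffix, reason).
RULES = [
    ("なかった", "ない", "past"),
    ("ない", "る", "negative"),
    ("ました", "ます", "past"),
    ("ます", "る", "polite"),
    ("た", "る", "past"),
]


def deinflect(word):
    """Return every candidate reachable by repeated suffix replacement.

    Each result is (candidate, reasons). Many candidates are not real words,
    so a caller would keep only those found in a dictionary.
    """
    results = [(word, [])]
    seen = {word}
    i = 0
    while i < len(results):
        term, reasons = results[i]
        i += 1
        for suffix_in, suffix_out, reason in RULES:
            # Require a non-empty stem in front of the matched suffix.
            if term.endswith(suffix_in) and len(term) > len(suffix_in):
                candidate = term[: len(term) - len(suffix_in)] + suffix_out
                if candidate not in seen:
                    seen.add(candidate)
                    results.append((candidate, [reason] + reasons))
    return results


candidates = [term for term, _ in deinflect("食べなかった")]
# A toy dictionary filters out the nonsense candidates.
known = [term for term in candidates if term in {"食べる"}]
```

The candidate list is deliberately over-generous: it contains the original text, intermediate forms such as "食べない", and some strings that are not words at all. Checking each candidate against the dictionary is what narrows it down to "食べる".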
Note that Yomichan and KOReader use the same license (GPL-3.0-or-later) so any theoretical licensing problems are a non-issue.