mirror of https://github.com/koreader/koreader synced 2024-11-18 03:25:46 +00:00
koreader/plugins/japanese.koplugin/README.md
Aleksa Sarai 3d4e54c7e6 plugins: add Japanese Support plugin
This plugin provides support for Japanese deinflection during lookup as
well as making long-hold word selection actually select whole words
properly. With this plugin, word lookups in Japanese text in KOReader
become much easier, and no longer require users to use special
dictionaries that have synonym-based deinflection rules defined (which
were always fairly annoying to use).

The basic idea and deinflection data for this plugin come from
Yomichan (which is also a GPL-3.0+ project), but everything was
implemented specifically for KOReader.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
2021-10-23 15:49:54 +02:00

Japanese Support Plugin for KOReader

This plugin is heavily based on Yomichan's design, and makes use of Yomichan's deinflection data, but was written specifically for KOReader. There are two major features implemented by this plugin:

  1. Verb deinflection (aka deconjugation) support, based on Yomichan's very elegant rule-matching suffix replacement system using Yomichan's data.

  2. Text segmentation support without needing MeCab or any other binary helper, by re-using the user's installed dictionaries to exhaustively try every length of text and select the longest match which is present in a dictionary. This is similar to how Yomichan does MeCab-less segmentation.

    On paper this plugin should also work with Chinese text if the user has Chinese dictionaries installed, though that is not its primary intended use-case.
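
The longest-match idea behind the segmentation feature can be sketched roughly as follows. This is an illustrative Python sketch, not the plugin's actual code (the plugin itself runs inside KOReader): the names `longest_match` and `segment`, the `max_len` cap, and the use of a plain set in place of the user's installed dictionaries are all assumptions made for the example.

```python
def longest_match(text, start, dictionary, max_len=20):
    """Return the longest substring of `text` starting at `start` that is
    present in `dictionary` (hypothetical stand-in for a dictionary lookup)."""
    limit = min(max_len, len(text) - start)
    # Try every candidate length, longest first, and stop at the first hit.
    for length in range(limit, 0, -1):
        candidate = text[start:start + length]
        if candidate in dictionary:
            return candidate
    # Nothing matched: fall back to a single character so callers always advance.
    return text[start:start + 1]


def segment(text, dictionary, max_len=20):
    """Greedily split `text` into words using repeated longest-match lookups."""
    words = []
    pos = 0
    while pos < len(text):
        word = longest_match(text, pos, dictionary, max_len)
        words.append(word)
        pos += len(word)
    return words


# Toy dictionary standing in for the user's installed dictionaries.
toy_dictionary = {"日本", "日本語", "を", "勉強", "する"}
words = segment("日本語を勉強する", toy_dictionary)
```

Trying the longest candidate first is what makes 日本語 win over the shorter 日本 in this example. Because nothing here is specific to Japanese, the same approach would apply to Chinese text given suitable dictionaries.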

The backbone of this plugin is the included yomichan-deinflect.json. This file is copied verbatim from Yomichan's ext/data/deinflect.json and can be updated when necessary by simply getting a newer copy.
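
As a rough illustration of rule-matching suffix replacement, the Python sketch below applies a tiny hand-written rule table to an inflected verb. The rule entries are illustrative only and are not copied from yomichan-deinflect.json. The field names (`kanaIn`, `kanaOut`, `rulesIn`, `rulesOut`) follow what I understand the Yomichan data format to be and should be treated as an assumption, as should the function name `deinflect`.

```python
# Illustrative rules only: each reason maps to a list of suffix replacements.
# "rulesIn" limits which word classes a rule may be chained onto;
# "rulesOut" records the word class of the result.
RULES = {
    "negative": [
        {"kanaIn": "ない", "kanaOut": "る", "rulesIn": ["adj-i"], "rulesOut": ["v1"]},
    ],
    "past": [
        {"kanaIn": "た", "kanaOut": "る", "rulesIn": [], "rulesOut": ["v1"]},
        {"kanaIn": "かった", "kanaOut": "い", "rulesIn": [], "rulesOut": ["adj-i"]},
    ],
}


def deinflect(term, rules=RULES):
    """Return candidate base forms of `term`, each with the reasons applied."""
    results = [{"term": term, "rules": set(), "reasons": []}]
    seen = {term}  # de-duplicate by term so the loop is guaranteed to finish
    i = 0
    while i < len(results):
        current = results[i]
        i += 1
        for reason, variants in rules.items():
            for variant in variants:
                # A derived candidate only accepts rules matching its word class.
                if current["rules"] and not (current["rules"] & set(variant["rulesIn"])):
                    continue
                if not current["term"].endswith(variant["kanaIn"]):
                    continue
                stem = current["term"][: len(current["term"]) - len(variant["kanaIn"])]
                new_term = stem + variant["kanaOut"]
                if not new_term or new_term in seen:
                    continue
                seen.add(new_term)
                results.append({
                    "term": new_term,
                    "rules": set(variant["rulesOut"]),
                    "reasons": current["reasons"] + [reason],
                })
    return results


candidates = deinflect("食べなかった")
# Keep only candidates that a (toy) dictionary actually contains.
found = [c for c in candidates if c["term"] in {"食べる"}]
```

The candidate list deliberately includes nonsense forms; they are discarded when no dictionary contains them, which is why the rules can stay purely mechanical.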

Note that Yomichan and KOReader use the same license (GPL-3.0-or-later) so any theoretical licensing problems are a non-issue.