I've found that some OPDS catalogs have multiple downloads of the same filetype, but optimized or formatted in different ways. The Title of the download is much more descriptive in this case, so I thought it would be better to display the title if available.
The OPDS catalog at https://standardebooks.org/opds is a good example. Note how entries in https://standardebooks.org/opds/new-releases have three different epub downloads, titled "Recommended compatible epub", "Advanced epub", and "Kobo Kepub epub".
Previously getTextFromBoxes would just pass the first and last three
bytes of the current and previous words when trying to detect CJK
characters (which shouldn't have spaces inserted).
However, this handling was not correct because CJK characters can be
longer than 3 bytes, and internally BaseUtil.utf8charcode doesn't ensure
that it was only given a single utf8 character (it blindly does the bit
operations on whatever length code you give it).
As a result, before this patch selections in PDF documents would have
lots of spaces stripped because getTextFromBoxes would think that almost
all characters were CJK characters.
Fixes: 6f1b70e5eb ("util.utf8: improve CJK character detection")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This layout is far more commonly used on mobile devices, and allows for
much easier typing. The keyboard primarily functions through gestures in
the four cardinal directions to select which vowel kana to select. In
addition, users can cycle through each kana row by tapping the key
within a 2-second window (this is the equivalent to T9 input for
Japanese phone keyboards).
This also resolves the long-standing issue that the old keyboard did not
correctly handle dakuten (there was a standalone dakuten key which added
a stray dakuten mark, and the umlat mode which added dakuten to all of
the keys it could) and could not input handakuten characters at all.
In order to allow adding dakuten and cycling through the various
modifiers for the previous kana, we need to wrap the input-box (similar
to korean) but luckily we don't need any state machine magic since we
just need to modify the last character in the character buffer. However
because the tap timeout for T9-like-cycling needs to be reset after any
non-tap key we need to add some basic wrappers around a few other
input-box methods.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
A layout might want to make some specific feature configurable, so
create an addToMainMenu-like system for allowing layouts to add their
own configuration sub-menu to the keyboard configuration menu.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This allows for InputText wrappers (namely the Japanese keyboard which
needs to be able to apply modifiers to the character before the cursor)
to nicely access the character list.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
In some cases, it's useful to be able to wrap a function and either
replace its contents entirely or have some callback be run before
calling the underlying function.
The most obvious users for this feature are the Japanese and Korean
keyboards (both of which need to wrap the inputbox methods with either
their own versions or have basic callbacks be run before the method is
executed).
This is loosely based on how busted/luassert spies work.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
It's possible for the user to have selected nothing, and trying to
operate on the nil highlight can cause confusion or crashes. This
restores the behaviour before commit 7a0e3d5e68 ("readerhighlight:
remove selected_word and use selected_text everywhere"), which missed
this case.
In addition, add some debug guards to ReaderHighlight methods which
cannot handle selected_text being nil (or at least, shouldn't be called
with selected_text being nil).
Fixes: 7a0e3d5e68 ("readerhighlight: remove selected_word and use selected_text everywhere")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
The previous version of JMdict comes from 2009 and doesn't appear to
work at all when trying to do basic lookups (likely due to some kind of
encoding problem). In addition, the license information and sourcing was
not really in line with the requirements specified by the JMdict
license. This version is far more up-to-date and also includes synonym-based
deinflection (though because KOReader has a Japanese plugin now, this is
technically not necessary).
Since there didn't exist a nicely-maintained place to download these
dictionaries (because StarDict is not widely used for Japanese
dictionaries), I've set up a personal GitHub repository where I've
hosted them. Note that we're intentionally not pinning the commit hash
because GitHub only recommends we use gh-pages for CDN purposes, and one
of the requirements of the JMdict license is that you need to be able to
update to later versions.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
If we had a local prefered AP with a higher RSSI,
we attempted to associate with it over wpa_supplicant
being already attempting to associate with its own preferred AP.
That... failed horribly.
Also adapt to the new lj-wpaclient API, fixing a few other edge-cases,
and making the whole thing slightly faster because we no longer
uselessly sleep.
And more reliable because we now actually wait for replies to our
requests.
Bump base
https://github.com/koreader/koreader-base/pull/1424
screen.
Otherwise, on the Sage, weird flash glitches may happen, depending on
what was on screen...
(e.g., there's some weird update merging shenanigans going on
despite those updates being flagged NO_MERGE...).
Switch to flashing GC16, because the screen is so fast that using GL16
simply leaves us with an unreadable mess of ghosting.
I'm halfway considering rewriting this in Lua so that I can do a proper
batched update...
The second argument is a ddjvu_render_mode_t
Try to actually honor the user settings instead of enforcing COLOR
while we're there.
Fix#8376
Regression since #8250
Now that FileManager registers its UI modules in the same way as Reader,
this shouldn't be necessary but this protects us against some other app
creating a ReaderDictionary instance without having ui.languagesupport
registered properly.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
With the addition of the language support module, ReaderDictionary
tries to use modules registered with the UI instance, but the
FileManager doesn't provide key-based registration of its UI modules.
In order to allow the same code to be used by both FileManager and
Reader seamlessly, copy the :registerPlugin() method from Reader and use
it with FileManager. This will ensure any other hidden assumptions about
UI module registration are handled properly.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
- Add an icon to distinguish between page bookmarks, plain
highlights, and highlights with an added note
- Bookmark details: show both highlighted text and added note
- Bookmark list: allow filtering by type and/or by keyword
- New bookmark selection mode, to allow multiple removals
- New option: show separator line
This primarily consists of some spies added to ensure that the
LanguageSupport plugin is actually being called at the right time.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This plugin provides support for Japanese deinflection during lookup as
well as making long-hold word selection actually select whole words
properly. With this plugin, word lookups in Japanese text in KOReader
become much easier, and no longer requires users to use special
dictionaries that have synonym-based deinflection rules defined (which
were always fairly annoying to use).
The basic idea and deinflection data for this plugin come from
Yomichan (which is also a GPL-3.0+ project), but everything was
implemented specifically for KOReader.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This creates a new plugin system which hooks into a handful of reader
operations in order to allow plugins to add language-specific support
where the default reader falls short. The two hooks added are:
* During hold-without-pan taps, language plugins can modify the
selection in order to better match what users expect koreader to
highlight when selecting a single word.
The vast majority of CJK language words are more than one character,
but KOReader treats all CJK characters as a single word by default,
so adding this hook means that readers no longer need to manually
select the whole word every time they need to look something.
* During dictionary lookup, language plugins can propose alternative
candidate words to look up if the selected word could not be found in
the dictionary.
This is pretty necessary for Japanese and Korean, both of which are
highly agglutinative languages and the fuzzy searching system of
StarDict is simply not usable because often the inflection of the
word is so much longer than the dictionary form that sdcv decides to
chop off the actual word and search for the inflection (which yields
useless results).
This system is of particular interest for readers of CJK languages
(without this, looking up words using KOReader was fairly painful) but
this system is designed to be minimal and language-agnostic enough that
other languages could make use of it by creating their own plugins if
the default "whole word" highlight and fuzzy-search system doesn't match
their needs.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This is necessary in order to allow the language support module to be
added to the menu outside of the common_settings menu table definition.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
In order to make startSdcv usable for plugins that might need to do
dictionary lookups in order to work, it is necessary to split out the
core of startSdcv into another method which returns the raw data from
sdcv.
In addition, in order to make it possible to amortise the cost of each
lookup (which could be fairly expensive) make it possible to pass
multiple words to rawSdcv at the same time. Sdcv supports passing
multiple words as arguments (which it then looks up in order and returns
a separate JSON array per line for each word) so we just need to tweak
the return style accordingly.
All of the deduplication and dummy results handling remains in startSdcv
because plugins might strongly depend on whether sdcv returned actual
results.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
There were two ways of specifing selected text for a highlight depending
on whether it was a "single word" or text selected using hold-and-pan.
In addition to being a bit more complicated than is necessary, with the
addition of the language support plugin system (where the "single word"
selected might be expanded), it makes more sense to simply use the same
logic and table structure for both cases.
The dictionary lookup special case (hold-without-pan triggering a
dictionary lookup by default) still works as before.
In addition, this patch fixes a minor inefficiency during dictionary
quick lookup -- before this patch, the highlight would be re-selected
because the quick lookup window is run concurrently and tries to fetch
ReaderHighlight.selected_text but this is set to nil immediately after
triggering the lookup. This is unnecessary because :clear() will be
called anyway when the quick pop-up closes, and so clearing this can be
left until then.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
There were a handful of cases where if there was no cached kctx there
was no fallback and several KoptInterface methods would return nil,
causing issues in various parts of KOReader (this happened with the
migration to selected_text everywhere but it's unclear how that change
caused this regression).
In any case, from a correctness perspective it makes sense to have the
corresponding fallback paths to create a new kctx if we couldn't find a
cached one.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>