In order to make startSdcv usable for plugins that might need to do
dictionary lookups in order to work, it is necessary to split out the
core of startSdcv into another method which returns the raw data from
sdcv.
In addition, in order to make it possible to amortise the cost of each
lookup (which could be fairly expensive) make it possible to pass
multiple words to rawSdcv at the same time. Sdcv supports passing
multiple words as arguments (which it then looks up in order and returns
a separate JSON array per line for each word) so we just need to tweak
the return style accordingly.
All of the deduplication and dummy results handling remains in startSdcv
because plugins might strongly depend on whether sdcv returned actual
results.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
There were two ways of specifing selected text for a highlight depending
on whether it was a "single word" or text selected using hold-and-pan.
In addition to being a bit more complicated than is necessary, with the
addition of the language support plugin system (where the "single word"
selected might be expanded), it makes more sense to simply use the same
logic and table structure for both cases.
The dictionary lookup special case (hold-without-pan triggering a
dictionary lookup by default) still works as before.
In addition, this patch fixes a minor inefficiency during dictionary
quick lookup -- before this patch, the highlight would be re-selected
because the quick lookup window is run concurrently and tries to fetch
ReaderHighlight.selected_text but this is set to nil immediately after
triggering the lookup. This is unnecessary because :clear() will be
called anyway when the quick pop-up closes, and so clearing this can be
left until then.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
There were a handful of cases where if there was no cached kctx there
was no fallback and several KoptInterface methods would return nil,
causing issues in various parts of KOReader (this happened with the
migration to selected_text everywhere but it's unclear how that change
caused this regression).
In any case, from a correctness perspective it makes sense to have the
corresponding fallback paths to create a new kctx if we couldn't find a
cached one.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
With the latest koreader-base update, we can now create native
selections using getTextFromXPointers. In order to make the wrapper less
annoying to use, always enable segmented selection if selections are
enabled (to match getTextFromPositions).
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
It is a bit cleaner to do all of the necessary looping over lists of
Geoms within a straight-forward Geom.boundingBox function rather than
looping over :combine every time (or reimplementing :combine in some
cases). Geom:combine can be trivially reimplemented in terms of
Geom.boundingBox as well.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Previously the CJK character detection defined only characters in the
range U+4000..U+AFFF as "CJK characters". This excludes an incredibly
large number of CJK characters within the BMP, let alone the whole two
planes dedicated to rarer CJK characters (the SIP and TIP). As a result,
a very large number of Chinese, Japanese, and Korean characters were not
detected as being CJK characters.
While slightly less elegant-looking, it is far more accurate to compute
the codepoint from the utf8 character and then see if it falls within
one of the defined CJK blocks. This is not future-proof against future
CJK ideograph extensions in future Unicode versions, but there is no
real way to accurately predict such changes so this is the best we can
do without accidentally treating characters explicitily defined as being
non-CJK in Unicode as CJK.
While we're at it, copy Lua 5.3's utf8.charpattern constant definition
so that we can more easily write utf8 iterators with string.gmatch (at
least in the interim until there is a rework of utf8 handling in
KOReader and everything is rebuilt on top of utf8proc).
Some unit tests are added for Korean and Japanese text, and the existing
unit tests needed a minor adjustment to handle the fact that
isSplittable now correctly detects CJK punctuation as a character to
compare against the forbidden split rules.
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
* Use paintRect and plain colors instead of lightenRect and a weird
dimming factor.
* Update call sites to the new API
* Handle FP maths properly (i.e., floor coordinates & ceil dimensions at
the latest possible time).
* Fix border handling in the fill bar (make sure we actually honor it
when computin the x position, and that we won't overflow into it when
computing the width).
* Update docs
Reword "Purge .sdr" to "Reset settings".
When purging, remove only the known document metadata
files, and not those for a document with the same name but
a different suffix.
Bookmarks list:
- page numbers are displayed
- page bookmarks are marked with a star
- new setting: Sort by largest page number (default: checked)
New bookmark setting: Add page number / timestamp to bookmark
- If enabled (default), bookmark name is 'Page # notes @ time'.
- If disabled, bookmark name is equal to the notes field.
Rename bookmark dialog:
- page number and timestamp are displayed in the input
dialog description
- blank input renames bookmark to the default name in
accordance with the new setting
Also fix: changing boundaries of the highlight: the name of the
highlight is not changed if it was previously edited by the user.