koreader/frontend/document
Aleksa Sarai 5709b4c2f1
kopt: correctly handle CJK character detection for space insertion (#8438)
Previously getTextFromBoxes would just pass the first and last three
bytes of the current and previous words when trying to detect CJK
characters (which shouldn't have spaces inserted).

However, this handling was not correct: CJK characters can be longer
than 3 bytes, and BaseUtil.utf8charcode does not check that it was given
a single UTF-8 character (it blindly applies its bit operations to
however many bytes it receives).

As a result, before this patch selections in PDF documents would have
lots of spaces stripped because getTextFromBoxes would think that almost
all characters were CJK characters.
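
The failure mode described above can be modelled in a few lines. This is a hedged sketch in Python, not KOReader's Lua code: naive_charcode, is_cjk, looks_cjk_buggy, looks_cjk_fixed and join_words are hypothetical names, the CJK ranges are only three Unicode blocks, and the "no space next to a CJK character" rule is an assumption about how the detection result is used.

```python
def naive_charcode(raw: bytes) -> int:
    """Decode by byte count alone, never validating lead/continuation bytes.

    Assumed model of a decoder that trusts its caller to pass one character.
    """
    if len(raw) == 1:
        return raw[0] & 0x7F
    if len(raw) == 2:
        return ((raw[0] & 0x1F) << 6) | (raw[1] & 0x3F)
    if len(raw) == 3:
        return ((raw[0] & 0x0F) << 12) | ((raw[1] & 0x3F) << 6) | (raw[2] & 0x3F)
    raise ValueError("sketch only handles 1 to 3 bytes")


def is_cjk(code: int) -> bool:
    """Simplified range check (assumption): Extension A, the main block, Extension B."""
    return (0x3400 <= code <= 0x4DBF
            or 0x4E00 <= code <= 0x9FFF
            or 0x20000 <= code <= 0x2A6DF)


def looks_cjk_buggy(word: str, from_end: bool) -> bool:
    """Old behaviour: slice three raw bytes and decode them as one character."""
    raw = word.encode("utf-8")
    if not raw:
        return False
    chunk = raw[-3:] if from_end else raw[:3]
    return is_cjk(naive_charcode(chunk))


def looks_cjk_fixed(word: str, from_end: bool) -> bool:
    """Fixed behaviour: take exactly one whole character, whatever its byte length."""
    if not word:
        return False
    return is_cjk(ord(word[-1] if from_end else word[0]))


def join_words(words, looks_cjk) -> str:
    """Join words, omitting the space when either neighbour looks CJK (assumed rule)."""
    out = ""
    for i, word in enumerate(words):
        if i > 0 and not (looks_cjk(words[i - 1], True) or looks_cjk(word, False)):
            out += " "
        out += word
    return out


print(join_words(["the", "word"], looks_cjk_buggy))  # prints "theword"
print(join_words(["the", "word"], looks_cjk_fixed))  # prints "the word"
```

In this model the three ASCII bytes of "the" decode to 0x4A25, which falls inside CJK Extension A, so the space is dropped. A 4-byte ideograph such as U+20000 shows the opposite problem: its first three bytes decode to a non-CJK value, so only the whole-character version classifies it correctly.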

Fixes: 6f1b70e5eb ("util.utf8: improve CJK character detection")
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
3 years ago
canvascontext.lua     add hasSystemFonts device property (#7535)  3 years ago
credocument.lua       credocument: update getTextFromXPointers wrapper to support selections  3 years ago
djvudocument.lua      Kobo/Elipsa: More fine-grained control over the amount of online CPU  3 years ago
doccache.lua          DocCache: Only compute cache size once  3 years ago
document.lua          Kobo/Elipsa: More fine-grained control over the amount of online CPU  3 years ago
documentregistry.lua  DocumentRegistry: Downgrade refcount warnings to debug logging.  3 years ago
koptinterface.lua     kopt: correctly handle CJK character detection for space insertion (#8438)  3 years ago
pdfdocument.lua       Kobo/Elipsa: More fine-grained control over the amount of online CPU  3 years ago
picdocument.lua       DocumentRegistry: Downgrade refcount warnings to debug logging.  3 years ago
tilecacheitem.lua     PDF written highlights: fix boxes, trash cached tiles  3 years ago