[Lazarus] Encoding agnostic functions for codepoints + an iterator

Martin Frb lazarus at mfriebe.de
Mon Jun 20 14:42:53 CEST 2016


On 20/06/2016 12:41, Juha Manninen wrote:
> 2. How to implement an iterator for Unicode glyphs + decomposed
> accented characters? It is the most complex part of Unicode. Has
> anybody made such code?
>

Glyphs and (de-)composed characters are different things.

A glyph may be a ligature (https://en.wikipedia.org/wiki/Typographic_ligature)
covering 2 or more chars, or even whole words, depending on the font you use.

As for (de-)composed: SynEdit has some code to detect combining
codepoints. I am not sure it is still complete (new ones may have been
added since). But if you just want (de-)composed handling, and not a
complete Unicode library (with all the properties for each codepoint),
then feel free to look at the code, and use/copy it. That is: check svn
blame to see if anyone else committed to that particular code. If I am
the only one who committed it, then feel free to copy it, even if the
license changes.
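
To illustrate the idea only (this is NOT the SynEdit code, and it is in
Python rather than Pascal just to keep it short): a minimal sketch of an
iterator that keeps a base codepoint together with the combining
codepoints that follow it. The names is_combining and iter_clusters are
made up for this example, and "combining" is assumed to mean any
codepoint in a Unicode Mark category (Mn, Mc, Me). It does not handle
ligatures, ZWJ sequences or Hangul jamo, so it is much simpler than full
grapheme cluster segmentation.

```python
import unicodedata


def is_combining(ch):
    # Assumption: a codepoint counts as "combining" if its general
    # category is a Mark (Mn, Mc or Me).
    return unicodedata.category(ch).startswith("M")


def iter_clusters(text):
    # Yield each base codepoint together with the combining codepoints
    # that directly follow it.
    cluster = ""
    for ch in text:
        if cluster and is_combining(ch):
            cluster += ch
        else:
            if cluster:
                yield cluster
            cluster = ch
    if cluster:
        yield cluster
```

For example, iterating over 'e' + U+0301 (combining acute) + 'a' gives
two items: the 'e' with its accent, then the 'a'. A combining codepoint
at the very start of the text has no base, so it is returned on its own.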


