[Lazarus] How to use strings properly with fixes_1_6 and FPC 3.0.0?

Sat Oct 22 23:21:09 CEST 2016

On Sat, 22 Oct 2016 13:25:30 +0300
Juha Manninen via Lazarus <lazarus at lists.lazarus-ide.org> wrote:

>[...]
> I guess the biggest complexity is in glyphs and ligatures. I still
> don't understand their details.

There is nothing to understand. Some languages have irregular letters,
just as English has irregular verbs. You don't "understand" them, you
simply learn them.
As a programmer you don't need to learn them, but you should be aware
that many languages can't be mapped to simple arrays of characters.

> However for a program that must care about Unicode, like a text layout
> app, the rules for combining codepoints and glyphs are equally
> important. Codepoints for one glyph should never be split or copied
> separately. Isn't it so?

"Never" is too strong here. For example, some editors allow you to
select the individual letters of a ligature. Also, when comparing words
you may want to ignore the diacritical marks, using the decomposed form
of Unicode.
But afaik you are right that most programs never have an issue with
ligatures.
Btw, we need a wiki page about collation.
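
To illustrate the point about comparing words via the decomposed form,
here is a minimal sketch in Python (not FPC/LCL code). The helper names
strip_diacritics and equal_ignoring_diacritics are made up for this
example. It assumes that dropping all nonspacing marks (category "Mn")
after NFD decomposition is acceptable, which is a simplification and no
replacement for real collation.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Return text with nonspacing combining marks removed (hypothetical helper)."""
    # NFD splits precomposed letters such as U+00E9 into a base letter
    # plus a combining mark (U+0065 followed by U+0301).
    decomposed = unicodedata.normalize("NFD", text)
    # Keep everything except nonspacing marks (general category "Mn").
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


def equal_ignoring_diacritics(a: str, b: str) -> bool:
    """Compare two words while ignoring diacritical marks (hypothetical helper)."""
    return strip_diacritics(a) == strip_diacritics(b)


if __name__ == "__main__":
    print(strip_diacritics("r\u00e9sum\u00e9"))
    print(equal_ignoring_diacritics("na\u00efve", "naive"))
```

Letters that have no canonical decomposition (for example "ø") pass
through unchanged, so this is only a rough approximation of what a
proper collation algorithm does.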

>[...]
> Despite problems and incompleteness of our Unicode support, it is
> actually better than most other solutions out there.
> Ok, most programming tools support Unicode somehow but people use them wrong.
> A good example is our forum SMF software. It deals with text layout
> and definitely should handle Unicode but it does not.
> Not even single Codepoints beyond BMP which should be the most easy
> case! No combining rules needed or anything.

Yes, that is basic Unicode encoding. No ligatures, no bidi. I agree
that this is the minimum for supporting Unicode.
SynEdit goes much further.
And the native widgets often have pretty good support for the language
of the user. So the LCL controls using native widgets automatically
have good Unicode support.
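
As a rough illustration of the "basic Unicode encoding" case mentioned
above (a single code point beyond the BMP), here is a small Python
sketch. The helper names utf16_code_units and is_beyond_bmp are made up
for this example. It only shows that one such code point occupies two
UTF-16 code units or four UTF-8 bytes, so software that treats one unit
as one character will break it.

```python
def utf16_code_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as integers (hypothetical helper)."""
    data = text.encode("utf-16-le")
    # Each UTF-16 code unit is two bytes, stored little-endian here.
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def is_beyond_bmp(ch: str) -> bool:
    """True if the single code point ch lies outside the Basic Multilingual Plane."""
    return ord(ch) > 0xFFFF


if __name__ == "__main__":
    smiley = "\U0001F600"  # one code point, U+1F600
    print(is_beyond_bmp(smiley))
    print([hex(u) for u in utf16_code_units(smiley)])  # a surrogate pair
    print(len(smiley.encode("utf-8")))                 # number of UTF-8 bytes
```

Splitting the string between the two surrogates, or between the UTF-8
bytes, yields invalid text. That is the kind of mistake the quoted forum
example makes.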

Mattias