[Lazarus] UTF8 RTL for Windows

Hans-Peter Diettrich DrDiettrich1 at aol.com
Mon Nov 24 22:53:44 CET 2014


Graeme Geldenhuys wrote:

> How are ThousandSeparator and DecimalSeparator supposed to work in
> TFormatSettings? If you switched the RTL to UTF-8 or UTF-16 a Russian
> thousand separator (4-byte non-breaking white space character) for
> example will not fit into a Char type.

The Char type is quite useless with Unicode, at least if it has fewer
than 3 bytes (4 for UTF-8). Many other places in the RTL/LCL have the
same flaw: they assume that a character always fits into a Char (like
the Pos overload...).
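
To illustrate the size problem, here is a minimal sketch, assuming FPC
in objfpc mode with single-byte Char and separators written out as raw
UTF-8 bytes. The program and constant names are made up for this
example; only SizeOf, Length and Pos are standard RTL routines.

```pascal
program CharSize;
{$mode objfpc}{$H+}

const
  // U+00A0 NO-BREAK SPACE as raw UTF-8 bytes (2 bytes)
  NBSP = #$C2#$A0;
  // U+202F NARROW NO-BREAK SPACE as raw UTF-8 bytes (3 bytes)
  NNBSP = #$E2#$80#$AF;

var
  S: AnsiString;
begin
  // In this mode a Char is a single byte
  WriteLn('SizeOf(Char)  = ', SizeOf(Char));

  // Each separator is one character to the user but several Chars
  S := NBSP;
  WriteLn('Length(NBSP)  = ', Length(S));
  S := NNBSP;
  WriteLn('Length(NNBSP) = ', Length(S));

  // Searching works when the separator is passed as a substring;
  // it could not be passed as a single Char at all
  S := '1' + NNBSP + '000';
  WriteLn('Pos(NNBSP, S) = ', Pos(NNBSP, S));
end.
```

Neither separator can be assigned to a one-byte Char field, which is
why a Char-typed ThousandSeparator cannot carry it in a UTF-8 RTL.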

In the best case Char could be retyped into a string (substring), so
that it can hold any Unicode character *and* its encoding. Unicode
string handling in general should always use substrings, for the same
reasons. Until then, 99.9% of occurrences of Char in UTF-8 aware library
or application code can be considered bugs :-(
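
A rough sketch of what the substring approach could look like for the
thousand separator. TStringFormatSettings and GroupDigits are
hypothetical names invented for this example, not existing RTL
declarations, and the separator is again assumed to be raw UTF-8 bytes
in an AnsiString.

```pascal
program StringSeparator;
{$mode objfpc}{$H+}

type
  // Hypothetical record, NOT the RTL's TFormatSettings:
  // the separator is a string, so it can hold any UTF-8 sequence
  TStringFormatSettings = record
    ThousandSeparator: AnsiString;
  end;

// Inserts the separator between groups of three digits, counted
// from the right. Digits is assumed to contain ASCII digits only,
// so indexing it by Char is safe here.
function GroupDigits(const Digits: AnsiString;
  const FS: TStringFormatSettings): AnsiString;
var
  I, N: Integer;
begin
  Result := '';
  N := Length(Digits);
  for I := 1 to N do
  begin
    Result := Result + Digits[I];
    if (I < N) and ((N - I) mod 3 = 0) then
      Result := Result + FS.ThousandSeparator;
  end;
end;

var
  FS: TStringFormatSettings;
  S: AnsiString;
begin
  // U+202F NARROW NO-BREAK SPACE as raw UTF-8 bytes (3 bytes)
  FS.ThousandSeparator := #$E2#$80#$AF;
  S := GroupDigits('1234567', FS);
  // 7 digits plus 2 separators of 3 bytes each
  WriteLn('Length = ', Length(S));
end.
```

The formatting code never looks inside the separator; it only
concatenates it, so the number of bytes it occupies does not matter.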

The FPC team can sort out the real low-level code (most probably only
the string conversion routines); the rest will become
Delphi-incompatible when fixed.

DoDi




