[Lazarus] Does Lazarus support a complete Unicode Component Library?

Hans-Peter Diettrich DrDiettrich1 at aol.com
Thu Feb 17 14:56:25 CET 2011


Michael Schnell wrote:

>> That's why such loops should be disallowed with Unicode strings, as
>> a kind of low-level string handling.
> Not only this, but the normal user would like to do
> 
>   MyChar := MyString[Length(MyString)];
> 
> to get the last character of a string.

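Which is simply nonsense with UTF-8/16 strings :-(
The last array element is only the last code unit, which may be just
one part of a multi-byte sequence or of a surrogate pair.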
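
To illustrate the point, here is a minimal sketch, assuming the string
holds valid UTF-8 in a plain AnsiString. LastUTF8Codepoint is a
hypothetical helper made up for this example, not an RTL or LCL
function. It steps back over continuation bytes (bit pattern 10xxxxxx)
to find where the last codepoint starts.

```pascal
program LastCharDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: returns the bytes of the last codepoint of a
// UTF-8 string. Assumes S is valid UTF-8; no validation is done.
function LastUTF8Codepoint(const S: string): string;
var
  i: Integer;
begin
  Result := '';
  i := Length(S);
  if i = 0 then
    Exit;
  // Walk back while we are on a continuation byte (10xxxxxx).
  while (i > 1) and ((Ord(S[i]) and $C0) = $80) do
    Dec(i);
  Result := Copy(S, i, Length(S) - i + 1);
end;

var
  S: string;
begin
  S := 'caf'#$C3#$A9;                     // "cafe" with e-acute stored as C3 A9
  WriteLn(Length(S));                     // 5: number of bytes
  WriteLn(Ord(S[Length(S)]));             // 169 ($A9): a continuation byte only
  WriteLn(Length(LastUTF8Codepoint(S)));  // 2: the last codepoint needs two bytes
end.
```

Even this only finds the last codepoint; a trailing combining mark
would still be returned without its base letter.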


> Thus IMHO the Length function name should be dumped and two new 
> functions (such as CharacterCount and ByteCount) should be introduced.

Length() has always returned the number of *physical* elements.


>> For-Each loops may be acceptable as high level string handling, but 
>> with what type of the loop variable???
>>
> We obviously would need a UnicodeChar type that holds the 32-bit encoding.

Iff we ever want to support such functionality.
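
For what it's worth, such a loop could look roughly like the sketch
below, assuming valid UTF-8 input and using Cardinal as the 32-bit
loop value. NextCodepoint is a hypothetical helper invented for this
example and does no error checking.

```pascal
program ForEachCodepoint;
{$mode objfpc}{$H+}

// Hypothetical helper: decodes the codepoint that starts at byte index i
// and advances i past it. Assumes S is valid UTF-8; malformed input is
// not detected.
function NextCodepoint(const S: string; var i: Integer): Cardinal;
var
  b: Byte;
  extra: Integer;
begin
  b := Ord(S[i]);
  if b < $80 then
  begin
    Result := b;            // 0xxxxxxx: single byte
    extra := 0;
  end
  else if (b and $E0) = $C0 then
  begin
    Result := b and $1F;    // 110xxxxx: one continuation byte follows
    extra := 1;
  end
  else if (b and $F0) = $E0 then
  begin
    Result := b and $0F;    // 1110xxxx: two continuation bytes follow
    extra := 2;
  end
  else
  begin
    Result := b and $07;    // 11110xxx: three continuation bytes follow
    extra := 3;
  end;
  Inc(i);
  while extra > 0 do
  begin
    // Each continuation byte contributes its low 6 bits.
    Result := (Result shl 6) or (Ord(S[i]) and $3F);
    Inc(i);
    Dec(extra);
  end;
end;

var
  S: string;
  i: Integer;
begin
  S := 'a'#$C3#$A9#$E2#$82#$AC;    // "a", U+00E9, U+20AC
  i := 1;
  while i <= Length(S) do
    WriteLn(NextCodepoint(S, i));  // prints 97, 233, 8364
end.
```

The loop variable is a codepoint here, not a "character" in the
user's sense.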


> But the said "quirks" can't be handled by this. Up to now I don't
> understand whether - technically - these "quirks" are seen as a single
> Unicode character or as a sequence of Unicode characters. Nor do I
> understand how they can be used in a decent way and whether they are
> necessary or just legacy.

The big mess starts with combinations of codepoints. No problems as long 
as the RTL functions deal with the physical storage of the codepoints, 
and nothing else.
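
A small sketch of why the combinations are the messy part, assuming
valid UTF-8 in an AnsiString. CodepointCount is a hypothetical helper
for this example, not an existing RTL function. The same visible
letter (e with acute accent) is stored once precomposed and once as
base letter plus combining mark:

```pascal
program CountDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: counts codepoints by counting every byte that is
// not a continuation byte (10xxxxxx). Assumes S is valid UTF-8.
function CodepointCount(const S: string): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  Precomposed, Decomposed: string;
begin
  Precomposed := #$C3#$A9;     // U+00E9, e with acute
  Decomposed  := 'e'#$CC#$81;  // U+0065 followed by U+0301, combining acute
  WriteLn(Length(Precomposed), ' ', CodepointCount(Precomposed));  // 2 1
  WriteLn(Length(Decomposed), ' ', CodepointCount(Decomposed));    // 3 2
end.
```

Both strings display as one character, yet neither the byte count nor
the codepoint count agrees between them.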

DoDi