[Lazarus] Does Lazarus support a complete Unicode Component Library?

Sergei Gorelkin sergei_gorelkin at mail.ru
Wed Feb 16 19:25:18 CET 2011


Sven Barth wrote:
> On 16.02.2011 11:52, Hans-Peter Diettrich wrote:
>>> I must add, that I would be very surprised if Embarcadero doesn't use
>>> native encoded string types for the "unicode string" support in the
>>> upcoming Delphi under Windows (UTF-16), Linux (UTF-8), Mac (UTF-8) etc..
>>> I'm not 100% sure about the default Mac encoding, but seeing that it
>>> comes from FreeBSD, I would guess UTF-8 there too.
>>
>> AFAIK the UnicodeString allows for any dynamic encoding, be SBCS, MBCS
>> or UTF-8/16. The element (char) size and encoding have become part of
>> every Unicode string descriptor.
> 
> This is wrong.
> 
> The following compiles:
> 
> type
>   UTF8String = type AnsiString(65001);
> 
> but the following does not:
> 
> type
>   UTF8String = type UnicodeString(65001); // ';' expected, but '(' found
> 
> Tested using Delphi XE (65001 is the codepage for UTF-8 on Windows).
> 
You are right. Likewise, type AnsiString(1200) can be declared, but it won't work (1200 is the UTF-16 
codepage).
In Delphi, UnicodeString is a completely separate type, close to the current FPC design.
It has BytesPerChar and Encoding attributes, but they are fixed to 2 and 1200 respectively, and their 
purpose is unclear (to consume memory? to make it look compatible with AnsiString?)

This has a lot of consequences in the RTL: passing such a string in an 'array of const' uses the type 
field vtUnicodeString, not vtAnsiString; assigning it to a Variant uses varUString; a published 
property of type UnicodeString has typekind=tkUString, and so on. Some of these are already 
implemented in the FPC RTL for compatibility reasons.
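
A minimal sketch of how these RTL consequences could be observed from user code. The program name 
and the ShowKind helper are made up for illustration. It assumes Delphi 2009 or later, or an FPC 
version whose RTL already declares vtUnicodeString, varUString and tkUString. No output is 
promised here, since the Variant result in particular may differ between compilers and versions.

```pascal
program UStrKinds; // hypothetical name, for illustration only
{$ifdef FPC}{$mode delphi}{$else}{$APPTYPE CONSOLE}{$endif}

uses
  TypInfo, Variants;

// Hypothetical helper: reports which TVarRec tag the compiler chose
// for the first element of an 'array of const'.
procedure ShowKind(const Args: array of const);
begin
  case Args[0].VType of
    vtAnsiString:    Writeln('vtAnsiString');
    vtUnicodeString: Writeln('vtUnicodeString');
  else
    Writeln('other VType: ', Args[0].VType);
  end;
end;

var
  U: UnicodeString;
  A: AnsiString;
  V: Variant;
begin
  U := 'abc';
  A := 'abc';

  // 'array of const' tags the two string types differently.
  ShowKind([U]);
  ShowKind([A]);

  // Delphi stores a UnicodeString in a Variant as varUString;
  // other RTLs may pick a different tag, so only print the comparison.
  V := U;
  Writeln('VarType(V) = varUString: ', VarType(V) = varUString);

  // RTTI reports a type kind of its own for UnicodeString.
  Writeln(GetEnumName(TypeInfo(TTypeKind),
    Ord(PTypeInfo(TypeInfo(UnicodeString))^.Kind)));
end.
```

Printing the tags rather than asserting them keeps the sketch usable for comparing Delphi and FPC 
behaviour side by side.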

I'm afraid that because of this "compatibility" we're doomed to clone the Delphi implementation, 
however crappy it is :(

Sergei



