[Lazarus] Removed use of UTF8String in Lazarus to work with cpstrnew
Paul Ishenin
ip at kmiac.ru
Mon Sep 19 11:35:34 CEST 2011
19.09.2011 17:06, cobines wrote:
> 2011/9/19 Paul Ishenin<webpirat at mail.ru>:
>> Lazarus must use either UTF8String everywhere or nowhere.
> I thought that Lazarus will continue to use UTF-8 on all platforms and
> String will mean String<65001> and it will be interchangeable with
> UTF8String. Is that not so?
Whether it will or won't does not matter. If you use UTF8String in one
place and AnsiString in another, the compiler will convert the
UTF8String to the default system code page wherever it is assigned or
passed to an AnsiString. For example, on my Windows machine it converted
a UTF-8 string to code page 1251. Declaring all strings with the default
code page prevents this automatic conversion.
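To show at the byte level why such a conversion hurts, here is a small
sketch in Python (not Pascal, and not what the compiler literally does).
The helper name to_system_codepage is made up for illustration, and the
'?' replacement is Python's policy; the substitution on a real system may
differ.

```python
def to_system_codepage(utf8_bytes: bytes, codepage: str = "cp1251") -> bytes:
    """Transcode UTF-8 bytes to a single-byte code page (hypothetical helper)."""
    text = utf8_bytes.decode("utf-8")
    # Characters missing from the target code page are replaced with '?'.
    return text.encode(codepage, errors="replace")


russian = "Привет".encode("utf-8")  # Cyrillic letters take 2 bytes each in UTF-8
polish = "żaba".encode("utf-8")     # 'ż' has no slot in code page 1251

converted = to_system_codepage(russian)
print(len(russian), len(converted))   # 12 6
print(to_system_codepage(polish))     # b'?aba'

# Code that still expects UTF-8 can no longer read the converted bytes.
try:
    converted.decode("utf-8")
    still_utf8 = True
except UnicodeDecodeError:
    still_utf8 = False
print(still_utf8)                     # False
```

So even when no character is lost, the bytes are no longer UTF-8, and
text outside the system code page is damaged for good.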
> Why change UTF8String to AnsiString and not String, like almost every
> other string parameter?
The argument was UTF8String. UTF8String was previously declared as type
AnsiString in the RTL. Why should I choose a "string" type, which depends
on compiler switches and source code modes?
> Why not apply the same to AnsiString and change all to String since
> Lazarus does not work with Ansi code pages anyway?
Lazarus works with strings that have 1 byte per element. If FPC later
switches the default string type to UnicodeString, Lazarus will suddenly
get many problems.
> For example, if UTF8ToUTF16 was left to accept UTF8String I would
> think it would force the parameter to have UTF-8 code page, which
> would be more correct. And this is what I don't understand, how will
> it break when UTF8String is left.
The compiler adds an implicit code page conversion for string arguments.
I had to avoid that. The better choice would be to use the RawByteString
type, but it is not defined in FPC 2.4.4, which we need to support.
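As a rough toy model of the argument-passing rule (again in Python, and
only an approximation of the compiler's behaviour), the sketch below
tags each string with one code page label standing in for its declared
type. TaggedString and pass_argument are hypothetical names, and None
stands in for a raw parameter that accepts any code page unchanged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaggedString:
    data: bytes      # the stored bytes
    codepage: str    # label standing in for the declared string type


def pass_argument(arg: TaggedString, param_codepage: Optional[str]) -> TaggedString:
    """Model passing arg to a parameter declared with param_codepage."""
    # A raw parameter (None) or a matching label means no conversion.
    if param_codepage is None or param_codepage == arg.codepage:
        return arg
    # Otherwise the bytes are transcoded on the way in.
    text = arg.data.decode(arg.codepage)
    return TaggedString(text.encode(param_codepage, errors="replace"), param_codepage)


utf8_bytes = "Привет".encode("utf-8")

# Same label on both sides: the UTF-8 bytes pass through untouched.
same = pass_argument(TaggedString(utf8_bytes, "cp1251"), "cp1251")
print(same.data == utf8_bytes)    # True

# Mixed labels: the bytes are converted and are no longer UTF-8.
mixed = pass_argument(TaggedString(utf8_bytes, "utf-8"), "cp1251")
print(mixed.data == utf8_bytes)   # False

# Raw parameter: accepts the UTF-8 string as-is.
raw = pass_argument(TaggedString(utf8_bytes, "utf-8"), None)
print(raw.data == utf8_bytes)     # True
```

The first case is the situation with one string type everywhere, the
second is the mixed UTF8String/AnsiString case, and the third is what a
raw parameter type would give.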
Best regards,
Paul Ishenin.