[Lazarus] Converting all code to use UnicodeString

Mon Sep 25 22:18:05 CEST 2017

Hi Sven,
First of all, thanks for taking the time to answer me.

On Mon, Sep 25, 2017 at 4:43 PM, Sven Barth via Lazarus
<lazarus at lists.lazarus-ide.org> wrote:
> On 25.09.2017 20:51, Marcos Douglas B. Santos via Lazarus wrote:
>> I understand using IFDEF to compile on different platforms like Windows
>> vs... err... Haiku. Or Linux vs Nintendo Wii...
>> But why should I use IFDEF in code that should be the same in both
>> compilers (FPC vs Delphi)?
>
> Because they *aren't* the same. In Delphi String = UnicodeString while
> in the RTL, the FCL and the LCL String = AnsiString(CP_ACP) and using a
> different modeswitch *does not* change that, cause modes are unit specific.

Yes, but with {$modeswitch unicodestrings}, at least within a given
unit, the same code should work in both compilers, because "string"
in that unit is UnicodeString, just as it is in Delphi, no?
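
To make the question concrete, here is a minimal sketch of what I
mean. It assumes FPC 3.x and a Unicode Delphi (2009 or later); the
program name is made up for illustration.

```pascal
program StringAliasCheck;

{$ifdef FPC}
  {$mode objfpc}
  {$modeswitch unicodestrings} // FPC: in this unit only, string = UnicodeString
{$else}
  {$APPTYPE CONSOLE}           // Delphi 2009+: string is already UnicodeString
{$endif}

var
  S: string;
begin
  S := 'abc';
  // Both compilers should report 2 bytes per Char here,
  // because Char follows the unit-local meaning of "string".
  WriteLn('SizeOf(Char) = ', SizeOf(Char));
  WriteLn('Length(S)    = ', Length(S));
end.
```

The IFDEF here only selects compiler directives; the body is the same
for both compilers. Every call from this unit into the RTL, FCL or LCL
still crosses into code where string = AnsiString, though.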

> Especially the RTL is not ready for String = UnicodeString. So your best
> bet is to use UTF8String or set the default code page to UTF8 (the LCL
> units do that by default if I remember correctly, but Ondrej can confirm
> or deny that).

Yes, Lazarus does that by default. But did you see, in the examples in
my first email, how many inconsistencies I got, using only Lazarus
and changing the characters in one simple constant?

>> Will it be slower than now? Yes, maybe... but we already use objects!
>> If you want 500% performance, use pointers, records and procedures
>> with whatever encoding you want. But if you use objects, the overhead
>> already exists... and who cares? 1ms... 2ms... even 2s that you may
>> lose using UTF16? (or UTF8, but make it all the same!) So? The world is
>> using Ruby and they don't care... or Python, Java... and they store in
>> UTF16 too, which requires double the space... but if it works and the
>> code is clean, that should be more important, don't you agree?
>
> For FPC also more restricted targets are to be kept in mind (AVR, DOS,
> etc.).

I know almost nothing about compilers. But IMHO, the compiler should
keep what it already has: "string", which is an alias.
Then, for each OS, we would pass one argument like (simplifying):
-S=UnicodeString  or -S=AnsiString... something like that (I hope you
understand).
I mean, we should not have overloaded functions, but only one string
type, even if that type is RawByteString.

Once compiled, we would have an RTL that works according to the "-S"
argument.

> So the RTL will be adjusted in a way that it can be easily
> compiled with String = UnicodeString or as is now with String =
> AnsiString(CP_ACP). But we are not there yet.

Now we're talking.
Almost everyone who knows how to work with "the group of strings",
making them compatible between FPC and Delphi, is saying that Unicode
support is already done and everything is fine. You are the first one
to say that it is not complete yet. Thank you. I'm glad to know that
I'm not crazy.

Best regards,
Marcos Douglas