[Lazarus] Converting all code to use UnicodeString

Mattias Gaertner nc-gaertnma at netcologne.de
Mon Sep 25 22:45:03 CEST 2017


On Mon, 25 Sep 2017 17:18:05 -0300
"Marcos Douglas B. Santos via Lazarus" <lazarus at lists.lazarus-ide.org>
wrote:

>[...]
> Yes, but using {$modeswitch unicodestrings}, at least in a certain
> unit, should work with the same code between compilers because
> "string", for that unit, is UnicodeString as Delphi string is, no?

The important thing is "in a certain unit". As soon as you access
strings from other units, you have to consider their type.
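
A minimal sketch of that boundary, assuming FPC 3.x. The program and
variable names are made up for illustration; the AnsiString variable only
stands in for a value coming from a unit compiled without the modeswitch:

```pascal
program StringTypeDemo;
{$mode objfpc}{$H+}
{$modeswitch unicodestrings}  // affects this source file only

var
  U: String;      // UnicodeString here, because of the modeswitch
  A: AnsiString;  // stands in for a string from a unit compiled without it
begin
  A := 'abc';
  U := A;         // implicit AnsiString -> UnicodeString conversion
  WriteLn('element size of String:     ', SizeOf(U[1]));
  WriteLn('element size of AnsiString: ', SizeOf(A[1]));
end.
```

Every assignment like U := A is a conversion, so the type on the other
side of a unit boundary matters even when the code reads the same.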

 
> > Especially the RTL is not ready for String = UnicodeString. So your best
> > bet is to use UTF8String or set the default code page to UTF8 (the LCL
> > units do that by default if I remember correctly, but Ondrej can confirm
> > or deny that).  

Unit LazUtf8 does it.
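
A rough sketch of what "setting the default code page to UTF8" means at
the RTL level, assuming FPC 3.x. That LazUtf8 ends up making exactly this
call is an assumption here, not something checked against its sources:

```pascal
program CodePageDemo;
{$mode objfpc}{$H+}
begin
  // Code page the RTL currently assumes for AnsiString(CP_ACP)
  WriteLn('before: ', DefaultSystemCodePage);
  // Assumption: roughly what happens once at startup in an LCL application
  SetMultiByteConversionCodePage(CP_UTF8);
  WriteLn('after:  ', DefaultSystemCodePage);
end.
```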

 
> Yes, Lazarus does that by default. But did you see in the examples in
> my first email how many inconsistencies I got, using just Lazarus
> and changing characters in one simple constant?

Your first email does not contain a simple Lazarus+string example. I
see an example for LCL+unicodestring.

 
>[...]
> I know almost nothing about compilers. But IMHO, the compiler should
> keep what it already has: "string", which is an alias.
> Then, for each OS, we should pass one argument like (simplifying):
> -S=UnicodeString  or -S=AnsiString... something like that (I hope you
> understood).

The flags are -MDelphiUnicode, -MDelphi or -MObjFPC.
But they only apply to units that are compiled from sources in the unit
path, which excludes all FPC units. Also keep in mind that the system
unit and the RTL need a lot of low-level functions, which require
separate versions for each string type.


> I mean, we should not have overload functions, but only one type of
> string. Even if that type may be RawByteString.

From a user's point of view: Yes, that's what Lazarus recommends: Simply
use one string type, and that is String. The confusion starts when you
start using different string types.


> Once compiled, we would have an RTL that works according to the "-S" argument.

The RTL already has a lot of IFDEFs for the coming UnicodeString RTL.

 
> > So the RTL will be adjusted in a way that it can be easily
> > compiled with String = UnicodeString or as is now with String =
> > AnsiString(CP_ACP). But we are not there yet.  
> 
> Now we're talking.
> Almost everyone who knows how to work with "the group of strings",
> making them compatible between FPC and Delphi, is saying that Unicode
> is already done and everything is fine. You are the first one to say
> that it is not complete yet. Thank you. I'm glad to know that I'm not
> crazy.

Unicode <> UnicodeString
Unicode already works, using UTF-8.
If you want a Delphi-compatible UTF-16 RTL and packages, you are welcome
to help the FPC team.


Mattias
