[Lazarus] Lazarus (UTF8) and Windows: SysToUTF8, UTF8ToSys... Is there a better solution?
Hans-Peter Diettrich
DrDiettrich1 at aol.com
Fri Dec 27 10:16:56 CET 2013
Juha Manninen wrote:
> It happened again. The word "Unicode" was mentioned and the result is
> an endless debate of how it should be done. Now > 100 messages and
> counting ...
Now that strings with a stored encoding are in pre-release, the debate
is entering a whole new round.
> I personally don't care much what the default encoding will be, but I
> wonder how easy it will be to use UTF-8 for my employer's code.
> The situation with FPC will be better than with Delphi because FPC
> does not convert automatically to default encoding ALWAYS. It only
> converts when the conversion is needed.
> For example TStringList can be used for UTF8Strings and it does not
> trigger automatic conversion.
> Isn't it so? Please correct me if I still got it wrong.
That's the old state, where strings have no stored Encoding. As soon as
AnsiStrings have an encoding, the default encoding becomes important for
the reduction of automatic conversions. When the RTL is converted to
UTF-16, you'll have to accept either this new default encoding, or any
number of automatic conversions between Ansi and UnicodeStrings.
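
To make the cost of a mismatched default visible, here is a minimal
sketch in Python rather than Pascal. It is only a model, not FPC code:
EncodedString and rtl_routine are hypothetical names invented for this
illustration, and the real RTL performs such conversions implicitly on
assignment or parameter passing. The sketch counts one conversion each
time a string crosses into a routine that expects another encoding,
and shows that a conversion to a narrow code page can lose characters.

```python
from dataclasses import dataclass

# Counts the implicit transcodes performed so far (illustration only).
stats = {"conversions": 0}


@dataclass(frozen=True)
class EncodedString:
    """Hypothetical model of a string that stores its own encoding."""
    data: bytes
    encoding: str

    def to(self, encoding: str) -> "EncodedString":
        """Return the string in `encoding`, transcoding only if needed."""
        if encoding == self.encoding:
            return self  # encodings match: the bytes pass through untouched
        stats["conversions"] += 1
        text = self.data.decode(self.encoding)
        # errors="replace" mimics a lossy conversion to a narrow code page
        return EncodedString(text.encode(encoding, errors="replace"), encoding)


def rtl_routine(s: EncodedString, rtl_encoding: str) -> EncodedString:
    """Hypothetical RTL call that works in exactly one encoding."""
    return s.to(rtl_encoding)


app = EncodedString("Grüße".encode("utf-8"), "utf-8")

rtl_routine(app, "utf-8")        # default matches the application: no conversion
same = stats["conversions"]

rtl_routine(app, "utf-16-le")    # UTF-16 RTL, UTF-8 application:
rtl_routine(app, "utf-16-le")    # one conversion per call
mismatched = stats["conversions"]

# Characters outside CP1252 cannot survive a conversion to it.
lossy = EncodedString("日本".encode("utf-8"), "utf-8").to("cp1252")

print(same, mismatched, lossy.data)  # prints: 0 2 b'??'
```

As long as the application and the library agree on the encoding, the
counter stays at zero. Once they differ, every call pays for a
conversion, which is why the choice of the default matters.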
> It means UTF-8 with FPC will be easier than UTF-8 with Delphi, even if
> UTF-16 was the default.
Delphi suffers from the use of CP_ACP, which was the only supported
encoding before, and still is the only explicitly supported encoding
when the AnsiStrings unit is used. In Lazarus we had the same "only one
encoding" philosophy, except that here the default string type is UTF-8.
With the encoded AnsiStrings the problem of other encodings and
automatic conversion arises. Delphi solved most problems by changing
"string" to UTF-16, so that only the forced use of AnsiString will ever
result in automatic conversions due to different string encodings.
In FPC/Lazarus the situation is somewhat different, because now the
default string type could be UTF-8, UTF-16 or even CP_ACP, with a number
of users voting for each of them. Technically the simplest solution
would be to keep the de-facto standard UTF-8, as assumed by Lazarus. But
when "string" becomes UTF-16, as in recent Delphi versions, Lazarus and
the LCL will need heavy refactoring. That's the top discussion topic
right now.
DoDi