[Lazarus] Lazarus (UTF8) and Windows: SysToUTF8, UTF8ToSys... Is there a better solution?

Juha Manninen juha.manninen62 at gmail.com
Wed Dec 18 19:25:45 CET 2013


On Wed, Dec 18, 2013 at 5:19 PM, Marcos Douglas <md at delfire.net> wrote:
> Here too, more or less... I'm thinking of switching all my own packages
> to UTF-8. But in your code, how do you work with Delphi -- or with
> Lazarus on Windows -- using your core parts? Are there many calls to
> SysToUTF8 and/or UTF8ToSys between the core and Windows?

If you need to call WinAPI, then obviously you must convert.
In our case the core program does not need API calls. It is
cross-platform code. However, using a new Unicode Delphi would cause
many problems because all VCL functions and classes, including
TStringList, expect UTF-16 strings. When using UTF8String, the compiler
converts between encodings all the time.
UTF-8 is needed in many places, so we would need to duplicate much
of the VCL code for UTF-8. No good.
Using UTF-8 with FPC/Lazarus would simplify the task: LCL classes and
functions work with UTF-8 strings as expected.
I was even presented with the possibility of a hybrid Ansi/UTF-8 system
and a gradual data conversion plan.
If Length(s) = UTF8Length(s), meaning no multi-byte UTF-8 sequences were
found, then the string is an AnsiString, and so on...
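
A rough sketch of that length comparison, written in Python rather than
Pascal so the byte handling is easy to see. The helper names
codepoint_count and treat_as_ansi are hypothetical, and the fallback for
invalid UTF-8 (count every byte as one character) is an assumption about
how such a check would behave, not a description of UTF8Length itself.

```python
def codepoint_count(data: bytes) -> int:
    """Hypothetical stand-in for a UTF-8 length function.

    Assumption: valid UTF-8 is counted in code points; anything that is
    not valid UTF-8 is counted byte by byte.
    """
    try:
        return len(data.decode("utf-8"))
    except UnicodeDecodeError:
        return len(data)


def treat_as_ansi(data: bytes) -> bool:
    """True when byte count equals code-point count, i.e. no multi-byte
    UTF-8 sequence was recognized in the data."""
    return len(data) == codepoint_count(data)


ascii_only = "plain".encode("ascii")    # same bytes in Ansi and UTF-8
legacy = "café".encode("cp1252")        # 4 bytes, not yet converted
converted = "café".encode("utf-8")      # 5 bytes, 4 code points

for sample in (ascii_only, legacy, converted):
    print(sample, "Ansi" if treat_as_ansi(sample) else "UTF-8")
```

The check is only a heuristic: an Ansi string whose bytes happen to form
valid UTF-8 sequences would be taken for UTF-8, so a real conversion plan
would need some extra safeguard.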

If you call WinAPI a lot, then with UTF-8 you must convert encodings.
But if you are calling WinAPI a lot, then you are in trouble anyway.

As Michael Van Canneyt wrote, backwards compatibility with UTF-8 is
good. For example, all our lower-ASCII data will work without
conversions.
Also, lots of code which was not designed for Unicode will continue to
work with UTF-8 but not with UTF-16. For example, parsers for common
markup languages (HTML, XML, BB-code) still magically work because all
tags are in the lower-ASCII range.
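
To make the point concrete, here is a small Python sketch (Python only
for illustration; strip_tags is a hypothetical helper, not code from our
project). It is a byte-oriented tag stripper that knows nothing about
Unicode, run once on UTF-8 input and once on UTF-16 input.

```python
import re

# Byte-oriented tag matcher: it only knows the ASCII bytes '<' and '>'.
TAG = re.compile(rb"<[^>]*>")


def strip_tags(data: bytes) -> bytes:
    """Remove anything that looks like a tag, working on raw bytes."""
    return TAG.sub(b"", data)


# Lower-ASCII text has the same bytes in ASCII, CP1252 and UTF-8.
plain = "Hello, Lazarus"
same_bytes = (
    plain.encode("ascii") == plain.encode("cp1252") == plain.encode("utf-8")
)

markup = "<b>Grüße, 世界</b>"
expected = "Grüße, 世界"

# UTF-8: every byte of a multi-byte character is >= 0x80, so it can
# never be mistaken for '<' or '>'.
utf8_ok = strip_tags(markup.encode("utf-8")) == expected.encode("utf-8")

# UTF-16: each ASCII character carries an extra zero byte, so the matcher
# cuts through code units and leaves stray bytes behind.
utf16_ok = strip_tags(markup.encode("utf-16-le")) == expected.encode("utf-16-le")

print(same_bytes, utf8_ok, utf16_ok)
```

The first two results are true and the last one is false: the old
byte-based code survives a switch to UTF-8 unchanged, while for UTF-16
it would have to be rewritten to work on 16-bit code units.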

Juha



