[Lazarus] Lazarus (UTF8) and Windows: SysToUTF8, UTF8ToSys... Is there a better solution?

Juha Manninen juha.manninen62 at gmail.com
Wed Dec 18 10:48:52 CET 2013


On Wed, Dec 18, 2013 at 3:52 AM, Marcos Douglas <md at delfire.net> wrote:
> I would like to understand: why do Java, .NET and others use UTF-16 as
> their default encoding, and why did the Lazarus team choose UTF-8?

... and don't forget the Windows API.
I believe the decision to use UTF-16 was made by people who didn't know
the issue well enough, and some decision had to be pushed through quickly.
I am slowly learning the issues around Unicode. UTF-8 seems to be the
best encoding for most purposes.
The only benefit of UTF-16 was supposed to be its fixed-size characters,
but in the end that did not happen: not all of the world's characters
fit into a 16-bit space, so some need two 16-bit units (a surrogate pair).
It means UTF-16 wastes space without any real benefit. Only UTF-32
would bring the fixed-size character benefit.
What's more, UTF-16 is confusing because it has variations. It is all
well explained here:
  http://www.utf8everywhere.org/
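
To make the point about character sizes concrete, here is a minimal
sketch in Python (used only for illustration, it is not Lazarus code).
The helper name code_units and the sample characters are my own
assumptions. It counts how many code units each encoding needs for one
character, and shows that the little-endian and big-endian variants of
UTF-16 produce different bytes for the same text.

```python
# Count the code units each encoding needs for a single character.
def code_units(ch):
    utf8 = len(ch.encode("utf-8"))             # units of 1 byte
    utf16 = len(ch.encode("utf-16-le")) // 2   # units of 2 bytes
    utf32 = len(ch.encode("utf-32-le")) // 4   # units of 4 bytes
    return utf8, utf16, utf32

# Illustrative samples: U+0041 (ASCII), U+00E4 (a with diaeresis),
# U+20AC (euro sign), U+1F600 (an emoji outside the 16-bit range).
for ch in ["A", "\u00e4", "\u20ac", "\U0001F600"]:
    print(repr(ch), code_units(ch))

# The same text gives different byte sequences per UTF-16 variant.
print("A".encode("utf-16-le"))  # little-endian, no byte order mark
print("A".encode("utf-16-be"))  # big-endian, no byte order mark
```

The last sample needs two UTF-16 units, so UTF-16 is a variable-length
encoding just like UTF-8. Only the UTF-32 count stays at one unit for
every sample.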

At my work we must switch to Unicode but the details of how to do it
are still open. The code now works with both Delphi and FPC.
There is a highly optimized DB engine where most data fits in a cache
at run-time making it lightning fast. UTF-16 would almost double the
space requirement and thus is out of the question. The core parts must use
UTF-8 anyway. One choice is to dump Delphi completely and use
FPC+Lazarus for everything. Let's see...
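
A rough sketch of the space argument, again in Python purely for
illustration. The helper encoded_sizes and the sample strings are
hypothetical and have nothing to do with the actual DB engine. For
text that is mostly ASCII, the UTF-16 size comes out at or near twice
the UTF-8 size.

```python
def encoded_sizes(text):
    """Return (UTF-8 bytes, UTF-16 bytes) for the given text."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

# Hypothetical sample values, not taken from any real database.
for value in ["Lazarus", "Jyv\u00e4skyl\u00e4"]:
    utf8_size, utf16_size = encoded_sizes(value)
    ratio = round(utf16_size / utf8_size, 2)
    print(value, utf8_size, utf16_size, ratio)
```

The ratio shrinks as the share of non-ASCII characters grows, so the
"almost double" estimate holds only for data that is mostly ASCII.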

Juha



