[Lazarus] Lazarus (UTF8) and Windows: SysToUTF8, UTF8ToSys... Is there a better solution?

Graeme Geldenhuys graeme at geldenhuys.co.uk
Wed Dec 25 11:05:13 CET 2013


On 2013-12-24 17:13, Jürgen Hestermann wrote:
> All units used should use the same string encoding IMO.
> But which?

UTF-8 of course!  It avoids the main problems found in the other Unicode
encodings (no byte-order issues, for one). It is also the only Unicode
encoding that is backwards compatible with ASCII - hence the W3C and the
rest of the Internet standardised on it. It is also future proof: the
scheme can (again) be extended beyond the current 4-byte limit to 5- or
6-byte sequences [1]. Performance-wise, it is also NOT any slower than
any of the other Unicode encodings.
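
To make the ASCII-compatibility and variable-length points concrete,
here is a small Python sketch (the sample characters and the helper name
utf8_lengths are just illustrative choices, not anything from Lazarus or
FPC). It shows that pure ASCII text is byte-for-byte identical in UTF-8,
and that other characters take 2 to 4 bytes:

```python
# Illustrative sample characters (arbitrary choices):
# "A" (U+0041), u-umlaut (U+00FC), euro sign (U+20AC), an emoji (U+1F600)
samples = ["A", "\u00fc", "\u20ac", "\U0001F600"]


def utf8_lengths(chars):
    """Map each character to the number of bytes UTF-8 needs for it."""
    return {ch: len(ch.encode("utf-8")) for ch in chars}


lengths = utf8_lengths(samples)

# Plain ASCII text encodes to exactly the same bytes in UTF-8.
ascii_text = "Lazarus"
ascii_compatible = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

for ch, n in lengths.items():
    print("U+%04X needs %d byte(s)" % (ord(ch), n))
print("ASCII bytes unchanged in UTF-8:", ascii_compatible)
```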

Probably the only reason UTF-16 is still being used is Windows - which
used to use UCS-2, so moving to UTF-16 was the easier step at the time
(and UTF-8, designed in late 1992, was barely around when Windows
committed to 16-bit characters).



[1] RFC 3629 (2003) limited the range of UTF-8 to U+10FFFF (at most 4
bytes) so that it stays compatible for now with the limited range of
UTF-16. But the original UTF-8 scheme can actually go all the way to 6
bytes per code point, which covers 31 bits - an absolutely massive range.
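
As a rough illustration of that extended scheme, here is a Python sketch
of an encoder following the original pre-2003 bit layout (up to 6 bytes,
code points up to 0x7FFFFFFF). The function name encode_utf8_extended is
made up for this example, and the sketch deliberately skips checks a real
encoder would need (for example, it does not reject surrogates). Modern
decoders will refuse the 5- and 6-byte forms.

```python
def encode_utf8_extended(cp):
    """Encode a code point with the original UTF-8 scheme (1 to 6 bytes).

    Hypothetical helper for illustration only; no validation beyond range.
    """
    if cp < 0 or cp > 0x7FFFFFFF:
        raise ValueError("code point outside the 31-bit range")
    if cp < 0x80:
        return bytes([cp])  # ASCII range: a single byte, unchanged

    # (exclusive upper limit, total bytes, lead-byte marker bits)
    table = (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),
        (0x4000000, 5, 0xF8),
        (0x80000000, 6, 0xFC),
    )
    for limit, nbytes, lead in table:
        if cp < limit:
            break

    out = []
    # Emit continuation bytes (10xxxxxx), lowest 6 bits first.
    for _ in range(nbytes - 1):
        out.append(0x80 | (cp & 0x3F))
        cp >>= 6
    # The remaining high bits go into the lead byte.
    out.append(lead | cp)
    return bytes(reversed(out))


for cp in (0x41, 0x20AC, 0x10FFFF, 0x200000, 0x7FFFFFFF):
    print("0x%X -> %d byte(s)" % (cp, len(encode_utf8_extended(cp))))
```

Up to U+10FFFF the output matches what any current UTF-8 encoder
produces; only the 5- and 6-byte forms fall outside today's standard.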


Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/



