[Lazarus] Beyond Compare 4 built with Lazarus 1.2

Graeme Geldenhuys mailinglists at geldenhuys.co.uk
Sat Jan 4 11:08:07 CET 2014


On 2014-01-04 04:34, Kostas Michalopoulos wrote:
> Is there a way to ignore all these and make everything to work with
> UTF-8? Like setting some global variable that makes all strings
> (ansistrings) "UTF-8 codepage" or something

We will have to wait for FPC 2.8.0 (or 3.0) which should have much
better built-in Unicode support. String encoding conversion should then
be taken care of automatically. Unfortunately it seems that the FPC RTL
(there will be two of them) will be AnsiString or UTF-16 only. The RTL
encoding is not configurable!

So under all Unix-like systems (Linux, Mac OS X, FreeBSD - basically every
platform except the Microsoft ones) there will be lots of string conversions
between the OS or any libraries (which are normally UTF-8) and the FPC
RTL, which is going to be UTF-16. The constant conversion will also kick
in when you stream to/from files or do any TCP/IP communication -
both of which normally use UTF-8.
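
To make that round trip concrete, here is a minimal sketch of what a UTF-8
value has to go through when it crosses into a UTF-16 API and back. It only
uses UTF8Decode and UTF8Encode from the System unit; the program name and
variable names are made up for illustration, and it says nothing about how
the future RTL will actually do its conversions.

```pascal
program Utf8RoundTrip;
{$mode objfpc}{$H+}

var
  FromOS: UTF8String;     // what the OS or a library hands us (hypothetical name)
  InRTL: UnicodeString;   // what a UTF-16 only API would want
  BackToOS: UTF8String;   // what we have to hand back to the OS
begin
  // Build "e with acute accent" (U+00E9) byte by byte, so the result does
  // not depend on the source file encoding: in UTF-8 it is $C3 $A9.
  SetLength(FromOS, 2);
  FromOS[1] := #$C3;
  FromOS[2] := #$A9;

  InRTL := UTF8Decode(FromOS);     // conversion 1: UTF-8 -> UTF-16
  BackToOS := UTF8Encode(InRTL);   // conversion 2: UTF-16 -> UTF-8

  WriteLn(Length(FromOS), ' byte(s) in, ',
          Length(InRTL), ' UTF-16 code unit(s) inside, ',
          Length(BackToOS), ' byte(s) out');
end.
```

Each of those two calls allocates and copies a new string. That is the cost
you pay at every OS, file or socket boundary on a UTF-8 platform.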

I would have thought the Free Pascal team would improve on Delphi's
design. E.g., seeing that automatic encoding conversion is seamless,
I thought it shouldn't be hard to have native encodings on each
platform, and the RTL could then be a dynamic Unicode implementation (it
shouldn't care what encoding is used, as long as it is one of the
Unicode encodings). By that I mean UTF-8 is used under Unix-like
systems, and UTF-16 under Windows. The UnicodeString type should have
lived up to its name, and not been an alias for UTF16String. But alas,
this is not going to happen.

So we as developers have to use UTF-16 everywhere, or define our own
dynamic types (which really should have been done at RTL level). For
example:

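  type
  {$IFDEF Unix}
   // UTF-8 is the native encoding on Unix-like systems
   RealUnicodeString = UTF8String;
  {$ENDIF}
  {$IFDEF Windows}
   // FPC has no UTF16String type; UnicodeString is the UTF-16 one
   RealUnicodeString = UnicodeString;
  {$ENDIF}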

Then use the RealUnicodeString type in your applications and frameworks
to minimise encoding conversions. But like I said, when you do this
under Unix-like systems, you are still going to get conversions when
talking to the UTF-16 only RTL. Sad, but that is the way the Free Pascal
team is going.

Once that FPC release is out, we will start seeing what performance
impact it has on all systems. Right now it is too early to tell.

Regards,
  - Graeme -

-- 
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/



