[Lazarus] UTF8 RTL for Windows

Mattias Gaertner nc-gaertnma at netcologne.de
Thu Nov 20 17:21:01 CET 2014


Hi all, especially Windows users,

The development version of FPC (2.7.1) has extended its string types, and
many RTL functions now work with code pages other than the system code page.

This means Lazarus can now be compiled in two modes:

1. The old mode: LCL treats all "String" as UTF-8 encoded. When
accessing RTL and WinAPI functions you have to use the UTF8 functions.
For example aStringList.LoadFromFile(UTF8ToSys(Filename)) and
FileExistsUTF8. Note that UTF8ToSys only supports characters in the
Windows code page, while FileExistsUTF8 supports the full Unicode range.

2. The new mode: The LCL, FCL and RTL treat all "String" as UTF-8
encoded. Most RTL file functions now work with full Unicode.
For example FileExists and aStringList.LoadFromFile(Filename) now
support full Unicode.
AnsiToUTF8, UTF8ToAnsi, SysToUTF8 and UTF8ToSys have no effect. Many
UTF8Encode and UTF8Decode calls are no longer needed, because the
compiler converts automatically when you assign a UnicodeString to a
String and vice versa.
When accessing the WinAPI you must use the W functions or use
UTF8ToWinCP and WinCPToUTF8.
You can enable the new mode by compiling Lazarus clean with
-dEnableUTF8RTL.
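
To make the difference between the two modes concrete, here is a minimal
sketch, not a definitive recipe. It assumes the project depends on the
LazUtils package (units LazUTF8 and LazFileUtils) and is itself compiled
with the same -dEnableUTF8RTL define as Lazarus. The helper LoadLines and
the file name example.txt are hypothetical, chosen only for illustration.

```pascal
program Utf8RtlSketch;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, LazUTF8, LazFileUtils;

// Hypothetical helper: loads a text file whose name is UTF-8 encoded.
procedure LoadLines(const Filename: String; Lines: TStrings);
begin
  {$IFDEF EnableUTF8RTL}
  // New mode: the plain RTL/FCL calls accept UTF-8 file names.
  if FileExists(Filename) then
    Lines.LoadFromFile(Filename);
  {$ELSE}
  // Old mode: use the UTF8 variant, or convert to the system code page.
  // UTF8ToSys only covers characters in the Windows code page.
  if FileExistsUTF8(Filename) then
    Lines.LoadFromFile(UTF8ToSys(Filename));
  {$ENDIF}
end;

var
  Lines: TStringList;
  U: UnicodeString;
  S: String;
begin
  Lines := TStringList.Create;
  try
    LoadLines('example.txt', Lines);  // hypothetical file name
    WriteLn('Lines read: ', Lines.Count);
  finally
    Lines.Free;
  end;

  U := 'abc';
  {$IFDEF EnableUTF8RTL}
  S := U;              // new mode: the compiler converts UTF-16 to UTF-8
  {$ELSE}
  S := UTF8Encode(U);  // old mode: explicit conversion
  {$ENDIF}
  WriteLn('Bytes in S: ', Length(S));
end.
```

Keeping both branches behind the define lets the same source build in
either mode while you test; once only the new mode matters, the old
branches can simply be dropped.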

More information about the new FPC Unicode Support:
http://wiki.freepascal.org/FPC_Unicode_support

RTL functions that now support Unicode under Windows:
http://wiki.freepascal.org/FPC_Unicode_support#RTL_changes

The above links are about the default RTL with system code page.
I want to create a Wiki page to gather all information about
the UTF8 RTL for Lazarus users and how to adapt their code.

Please test and report what you find.


Mattias



