[Lazarus] UTF8 RTL for Windows

Mattias Gaertner nc-gaertnma at netcologne.de
Mon Nov 24 12:15:03 CET 2014


On Sun, 23 Nov 2014 21:37:56 -0300
luiz americo pereira camara <luizmed at oi.com.br> wrote:

> 2014-11-20 13:21 GMT-03:00 Mattias Gaertner <nc-gaertnma at netcologne.de>:
>[...]

First of all: Thanks for testing.

> Without {$codepage utf8} directive String constants will get Code Page 0
> (CP_ACP) and not the 1200 (UTF16 - UnicodeString).

Beware: There are different types of string constants.

 
> String variables assigned to those constants will also have Code Page = 0
> 
> This is because the constant string code page is evaluated at compile time
> 
> Not sure if there's a compiler command line param with same effect as
> {$codepage utf8}
> 
> The attached program shows how data loss can occur

The program uses writeln, which converts the strings to the console
code page. To see what the strings really contain, save them to a file
or write out their byte values.
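
A minimal sketch of such a byte dump, assuming FPC 2.7.1 or later (for
StringCodePage and RawByteString) and a source file saved as UTF-8.
DumpBytes and DumpWords are hypothetical helper names, not taken from
the attached program. Only ASCII hex digits reach the console, so the
console code page conversion cannot hide anything:

```pascal
program dumpbytes;
{$mode objfpc}{$H+}
// Compile once as-is and once with -Fcutf8 (or {$codepage utf8})
// and compare the two outputs.

// Hypothetical helper: prints the dynamic code page and the raw bytes.
// The RawByteString parameter avoids any implicit code page conversion.
procedure DumpBytes(const Name: string; const S: RawByteString);
var
  i: Integer;
begin
  Write(Name, ' (CP ', StringCodePage(S), '):');
  for i := 1 to Length(S) do
    Write(' ', HexStr(Ord(S[i]), 2));
  WriteLn;
end;

// Hypothetical helper: prints the UTF-16 code units of a UnicodeString.
procedure DumpWords(const Name: string; const W: UnicodeString);
var
  i: Integer;
begin
  Write(Name, ':');
  for i := 1 to Length(W) do
    Write(' ', HexStr(Ord(W[i]), 4));
  WriteLn;
end;

var
  S: string;
  W: UnicodeString;
begin
  S := 'João'; // constant to AnsiString
  W := 'João'; // constant directly to UnicodeString
  DumpBytes('S', S);
  DumpWords('W', W);
end.
```

If the code units printed for W differ between the two builds, the
conversion of the constant at compile time is where the data was lost.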

This works with or without {$codepage utf8}:

S := 'João'; // constant to (Ansi or Short)string
W := S;
SUTF8 := S;

const c: string = 'João';
W := c; // constant to Wide/Unicode/UTF8String

This requires {$codepage utf8} or -Fcutf8:

W := 'João'; // constant to Wide/Unicode/UTF8String

const c = 'João';
W := c;

I guess it would be a good idea to pass -Fcutf8 with FPC 2.7.1, in
both modes.


Mattias



