[Lazarus] TProcess, UTF8, Windows

Marcos Douglas md at delfire.net
Sat Apr 14 22:36:02 CEST 2012


On Sat, Apr 14, 2012 at 2:20 PM, Sven Barth <pascaldragon at googlemail.com> wrote:
> On 14.04.2012 15:39, Marcos Douglas wrote:
>>
>> On Sat, Apr 14, 2012 at 6:36 AM, Sven Barth <pascaldragon at googlemail.com> wrote:
>>>
>>> On 14.04.2012 07:20, Martin Schreiber wrote:
>>>>
>>>>
>>>> On Friday 13 April 2012 20:32:23 Marcos Douglas wrote:
>>>>>
>>>>>
>>>>> On Fri, Apr 13, 2012 at 2:59 PM, Martin Schreiber <mse00000 at gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On Friday 13 April 2012 18:51:42 Michael Van Canneyt wrote:
>>>>>>>
>>>>>>>
>>>>>>> Also:
>>>>>>> Lazarus happened to choose UTF-8 as the encoding of their LCL.
>>>>>>> By contrast, MSEgui chose UTF-16 as its encoding.
>>>>>>
>>>>>>
>>>>>>
>>>>>> In order to differentiate from other string types and encodings MSEgui
>>>>>> uses "msestring" in properties, variables and parameters. msestring is
>>>>>> defined as
>>>>>> "
>>>>>> type
>>>>>>  msestring = UnicodeString;
>>>>>> "
>>>>>
>>>>>
>>>>>
>>>>> But even so, we can still pass an AnsiString (UTF8String, string,
>>>>> whatever) to a function:
>>>>>
>>>>> type
>>>>>   MyString = UnicodeString;
>>>>>
>>>>> procedure Foo(const s: MyString);
>>>>> begin
>>>>>   ShowMessage(s);
>>>>> end;
>>>>>
>>>>> procedure TForm1.Button1Click(Sender: TObject);
>>>>> var
>>>>>   s: AnsiString;
>>>>> begin
>>>>>   s := 'hi';
>>>>>   Foo(s);
>>>>> end;
>>>>>
>>>> Yes. And if "s" is not in UTF-8 but in the current system encoding, it
>>>> will even be translated correctly in FPC 2.6.0. ;-)
>>>
>>>
>>>
>>> In that case it will also be translated correctly in 2.7.1. It will also
>>> be translated correctly if the AnsiString is defined as
>>> "AnsiString(CP_UTF8)" and has a UTF-8 encoded string in it.
>>
>>
>> I understand.
>> I coded a console app (just FPC) on Windows:
>>
>> program Project1;
>>
>> {$mode objfpc}{$H+}
>>
>> uses
>>   Classes, SysUtils;
>>
>> var
>>   fname: UnicodeString;
>>   list: TStringList;
>> begin
>>   fname := 'c:\á b ç\á.txt';
>>   list := TStringList.Create;
>>   try
>>     list.LoadFromFile(fname);
>>     writeln(list.Text);
>>   finally
>>     list.Free;
>>   end;
>> end.
>>
>> Should this work? It did not work here...
>
>
> It won't work when interfacing with system functionality (like file loading)
> for now. At least on Windows, because there FPC still uses the Ansi (single
> character encoding) functions of the Windows API, but for your example to
> work it would need the WideString functions. This change was discussed some
> time ago, but I don't know what the final decision was...

Well, it works if I change this line:
  fname := 'c:\á b ç\á.txt';
to this:
  fname := UTF8Decode('c:\á b ç\á.txt');

And it doesn't matter whether fname is declared as UnicodeString or
string -- although the debugger hint for 'UnicodeString' looks nicer
than the one for 'string', because the compiler translates it.
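
A likely explanation, assuming the source file is saved as UTF-8
without a BOM and no {$codepage} directive is set: the literal reaches
the program as raw UTF-8 bytes, and a plain assignment to a
UnicodeString converts those bytes using the system ANSI code page,
which mangles 'á'. UTF8Decode interprets the bytes as UTF-8 instead.
A minimal sketch (the program name is made up, and the bytes are
spelled out so the result does not depend on how this file is saved):

```pascal
program Utf8DecodeSketch;

{$mode objfpc}{$H+}

uses
  SysUtils;

var
  raw: AnsiString;
  wide: UnicodeString;
begin
  // The two bytes that encode 'á' (U+00E1) in UTF-8.
  raw := #$C3#$A1;

  // Interpret the bytes as UTF-8 rather than as the system ANSI code page.
  wide := UTF8Decode(raw);

  writeln('bytes in raw: ', Length(raw));
  writeln('UTF-16 code units in wide: ', Length(wide));
  writeln('is U+00E1: ', Ord(wide[1]) = $E1);
end.
```

If this assumption holds, adding {$codepage utf8} to the program should
be another way to make the compiler read the literal as UTF-8.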

But I ask:
Does a simple console program on Windows, using only the RTL, use
UTF-8? I think not, so why do I have to use UTF-8 functions?

If the RTL is AnsiString-based (for now), why do we have UTF-8
functions in the RTL? Are they part of the RTL or just utilities?
Wouldn't it be better to have these functions in a unit with a clearer
name, like utf8utils.pas or something like that?

Marcos Douglas