[Lazarus] Utf8ToSys on Linux and cwstring in uses clause
Vladimir Zhirov
vvzh.lists at gmail.com
Fri Nov 12 18:44:12 CET 2010
Hi,
I've just tried to use Utf8ToConsole on my Linux box and was
very surprised by the result.
If I run a simple program like this:
> program project1;
> {$mode objfpc}{$H+}
> uses
>   FileUtil;
> begin
>   WriteLn(UTF8ToConsole('Text in Russian:'));
>   WriteLn(UTF8ToConsole('Текст на русском'));
> end.
It produces the following output:
> Text in Russian:
> ????? ?? ???????
My LANG environment variable is ru_RU.UTF-8.
If I add cwstring to my uses clause it works OK.
While trying to figure out what happens, I noticed an inconsistency
in the FileUtil.NeedRTLAnsi function. The comment at line 186 says
NeedRTLAnsi is "true if system encoding is not UTF-8", but the
function itself contains the following code:
> FNeedRTLAnsi:=(SysUtils.CompareText(Encoding,'UTF-8')=0)
> or (SysUtils.CompareText(Encoding,'UTF8')=0);
So it looks like the reverse: NeedRTLAnsi is true if the system
encoding IS UTF-8. This causes a redundant Utf8ToAnsi call in
Utf8ToSys, which turns non-ASCII text into question marks when
no widestring manager (cwstring) is installed.
Is it a bug in NeedRTLAnsi? If it is, the fix would be trivial:
> FNeedRTLAnsi:=(SysUtils.CompareText(Encoding,'UTF-8')<>0)
> and (SysUtils.CompareText(Encoding,'UTF8')<>0);
With this change everything works as expected, at least for me.
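To make the intended behavior concrete, here is a minimal standalone sketch of the check as the comment describes it. It is not the FileUtil code: IsUtf8Encoding and NeedsAnsiConversion are hypothetical names used only for illustration, and the encoding names are passed in by hand instead of being read from the system.

```pascal
program needrtlansi_sketch;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper (not part of FileUtil): true when the
// encoding name denotes UTF-8, compared case-insensitively.
function IsUtf8Encoding(const Encoding: string): Boolean;
begin
  Result := (CompareText(Encoding, 'UTF-8') = 0)
         or (CompareText(Encoding, 'UTF8') = 0);
end;

// Hypothetical helper: conversion through the RTL ansi routines
// should be needed only when the system encoding is NOT UTF-8,
// which is what the comment on NeedRTLAnsi states.
function NeedsAnsiConversion(const Encoding: string): Boolean;
begin
  Result := not IsUtf8Encoding(Encoding);
end;

begin
  // Sample encoding names chosen for illustration only.
  WriteLn('UTF-8:  ', NeedsAnsiConversion('UTF-8'));
  WriteLn('utf8:   ', NeedsAnsiConversion('utf8'));
  WriteLn('KOI8-R: ', NeedsAnsiConversion('KOI8-R'));
end.
```

By De Morgan's law, "not (A = 0 or B = 0)" is the same condition as "(A <> 0) and (B <> 0)", so this matches the corrected expression quoted above.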
Should I also report this to Mantis in this case?
Or is this the expected behavior of NeedRTLAnsi and just a misprint
in the comment? In that case, should I always use cwstring and put
up with the libc/iconv dependency?
Thanks in advance.