[Lazarus] Utf8ToSys on Linux and cwstring in uses clause

Mattias Gaertner nc-gaertnma at netcologne.de
Fri Nov 12 21:58:32 CET 2010


On Fri, 12 Nov 2010 20:44:12 +0300
Vladimir Zhirov <vvzh.lists at gmail.com> wrote:

> Hi,
> 
> I've just tried to use Utf8ToConsole on my Linux box and was
> very surprised by the result.
> 
> If I run a simple program like this:
> > program project1;
> > {$mode objfpc}{$H+}
> > uses
> >   FileUtil;
> > begin
> >   WriteLn(UTF8ToConsole('Text in Russian:'));
> >   WriteLn(UTF8ToConsole('Текст на русском'));
> > end.
> 
> It produces the following output:
> > Text in Russian:
> > ????? ?? ???????
> 
> My LANG environment variable is ru_RU.UTF-8
> If I add cwstring to my uses clause it works OK.
> 
> Trying to figure out what happens, I noticed an inconsistency
> in the FileUtil.NeedRTLAnsi function. The comment at line 186 says
> NeedRTLAnsi is "true if system encoding is not UTF-8", but the
> function itself contains the following code:
> > FNeedRTLAnsi:=(SysUtils.CompareText(Encoding,'UTF-8')=0)
> >              or (SysUtils.CompareText(Encoding,'UTF8')=0);
> 
> So it looks like the reverse: NeedRTLAnsi is true if the system
> encoding IS UTF-8. This causes a redundant Utf8ToAnsi call in
> Utf8ToSys that turns non-ASCII text into question marks in the
> absence of a widestring manager (cwstring).
> 
> Is it a bug in NeedRTLAnsi? If it is, the fix would be trivial:
> > FNeedRTLAnsi:=(SysUtils.CompareText(Encoding,'UTF-8')<>0)
> >             and (SysUtils.CompareText(Encoding,'UTF8')<>0);
> With this change everything works as expected, at least for me.
> Should I also report this to Mantis in this case?

Thanks. I fixed it in svn.
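
For reference, the corrected condition can be sketched as a small
standalone program. This is only an illustration of the logic discussed
above, not the actual FileUtil code: EncodingNeedsConversion is a
hypothetical helper name, and the encoding is passed in as a parameter
instead of being read from the system.

```pascal
program needansi_sketch;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper (not part of FileUtil): returns True when the
// given encoding name is anything other than UTF-8, i.e. when a UTF-8
// string would have to be converted before being handed to the RTL.
// CompareText is case-insensitive, so 'utf-8' and 'UTF8' both match.
function EncodingNeedsConversion(const Encoding: string): Boolean;
begin
  Result := (CompareText(Encoding, 'UTF-8') <> 0)
        and (CompareText(Encoding, 'UTF8') <> 0);
end;

begin
  // UTF-8 spellings need no conversion; any other name does.
  WriteLn('UTF-8:      ', EncodingNeedsConversion('UTF-8'));
  WriteLn('utf8:       ', EncodingNeedsConversion('utf8'));
  WriteLn('ISO-8859-1: ', EncodingNeedsConversion('ISO-8859-1'));
end.
```

With the original "= 0 ... or" form the results are inverted, which
matches the symptom reported for a ru_RU.UTF-8 locale.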

 
> Or is it the expected behavior of NeedRTLAnsi and just a misprint in
> the comment? In this case, should I always use cwstring and live
> with the libc/iconv dependency?

If your system encoding is UTF-8, there is no need for cwstring.


Mattias



