[Lazarus] TStringList.LoadFromFile encoding parameter
Ondrej Pokorny
lazarus at kluug.net
Sun Jul 10 07:29:00 CEST 2016
On 09.07.2016 18:16, Michael Van Canneyt wrote:
> So what is the encoding argument supposed to do ?
> - Convert to current codepage (as set in
> http://www.freepascal.org/docs-html/rtl/system/defaultsystemcodepage.html
> ) with possible loss of characters during read ?
> - Keep all strings in the codepage that was passed on ?
>
>>
>> I see now that FPC's TEncoding uses UnicodeString and PUnicodeChar.
>> For me it's a strange decision. I would expect it to be String (
>> http://wiki.freepascal.org/Character_and_string_types#String ) and
>> use UTF8 since AnsiString(CP_UTF8) is compatible to String.
>
> AFAIK It is Delphi compatible, and there it is unicodestring (well,
> widestring).
I see now. You need UnicodeString support for the delphiunicode mode. So
what is missing now are String/PChar overloads in TEncoding for FPC's
default ansi mode. Once TEncoding has String/PChar support, adding
TEncoding support to the ansi RTL is easy.
IMO the best solution is for the String-based TEncoding routines to handle
passed/returned strings in the current DefaultSystemCodePage, with possible
character loss. If DefaultSystemCodePage is not UTF8, two conversions will
probably have to be executed (SOURCE->UTFxxx->TARGET). If
DefaultSystemCodePage is UTF8, one conversion should be enough. LazUtils
has the LConvEncoding unit for UTF8<->CP conversions.
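
To illustrate the two-step path (SOURCE->UTF16->TARGET), here is a rough sketch of what a String-based wrapper could look like on top of the existing UnicodeString API. BytesToDefaultCPString is a hypothetical helper name, not something that exists in the RTL, and code page 1250 with the byte $E8 is only sample input. It assumes FPC 3.0 or later.

```pascal
program EncodingSketch;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cwstring,{$endif} // widestring manager, needed for code page conversions on Unix
  SysUtils;

// Hypothetical helper (not part of the RTL): decodes Bytes with the given
// encoding and returns the text in the current DefaultSystemCodePage.
function BytesToDefaultCPString(Encoding: TEncoding; const Bytes: TBytes): AnsiString;
var
  U: UnicodeString;
begin
  // Step 1: SOURCE -> UTF-16, using the existing UnicodeString-based API.
  U := Encoding.GetString(Bytes);
  // Step 2: UTF-16 -> DefaultSystemCodePage. Characters that do not exist
  // in the target code page are lost here.
  Result := AnsiString(U);
end;

var
  Enc: TEncoding;
  Bytes: TBytes;
  S: AnsiString;
begin
  // Sample input: one byte, $E8, which is "c with caron" in Windows-1250.
  SetLength(Bytes, 1);
  Bytes[0] := $E8;

  Enc := TEncoding.GetEncoding(1250); // the caller owns this instance
  try
    S := BytesToDefaultCPString(Enc, Bytes);
    WriteLn('DefaultSystemCodePage: ', DefaultSystemCodePage);
    WriteLn('Result length in bytes: ', Length(S));
  finally
    Enc.Free;
  end;
end.
```

When DefaultSystemCodePage is UTF8, step 2 is only a UTF-16 to UTF-8 transcoding, so a direct SOURCE->UTF8 routine such as the ones in LConvEncoding could replace both steps.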
Ondrej