[Lazarus] TStringList.LoadFromFile encoding parameter

Ondrej Pokorny lazarus at kluug.net
Sun Jul 10 07:29:00 CEST 2016


On 09.07.2016 18:16, Michael Van Canneyt wrote:
> So what is the encoding argument supposed to do ?
> - Convert to current codepage (as set in
> http://www.freepascal.org/docs-html/rtl/system/defaultsystemcodepage.html
>   ) with possible loss of characters during read ?
> - Keep all strings in the codepage that was passed on ?
>
>>
>> I see now that FPC's TEncoding uses UnicodeString and PUnicodeChar. 
>> For me it's a strange decision. I would expect it to be String ( 
>> http://wiki.freepascal.org/Character_and_string_types#String ) and 
>> use UTF8 since AnsiString(CP_UTF8) is compatible to String.
>
> AFAIK It is Delphi compatible, and there it is unicodestring (well, 
> widestring).

I see now. You need UnicodeString support for the delphiunicode mode. So 
what is missing now are String/PChar overloads in TEncoding for FPC's 
default ansi mode. Once TEncoding has String/PChar support, adding 
TEncoding support to the ansi RTL is easy.

IMO the best approach is for the String-based TEncoding routines to 
handle passed/returned strings in the current DefaultSystemCodePage, 
with possible character loss. If DefaultSystemCodePage is not UTF8, that 
will probably mean two conversions have to be executed 
(SOURCE->UTFxxx->TARGET). If DefaultSystemCodePage is UTF8, one 
conversion should be enough. LazUtils has the LConvEncoding unit for 
UTF8<>CP conversions.
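
To illustrate the two-step path, here is a minimal sketch of what such a 
String-based routine could look like. BytesToDefaultString is a 
hypothetical helper, not an existing RTL function; the code page 1250 
and the sample bytes are only assumptions for the example. It assumes 
FPC 3.0 or later, and that a widestring manager is installed (cwstring 
on Unix).

```pascal
program EncodingSketch;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cwstring,{$endif} // widestring manager needed for conversions on Unix
  SysUtils;

// Hypothetical helper (not part of the RTL): decode Bytes with Enc and
// return the text as an AnsiString in DefaultSystemCodePage.
function BytesToDefaultString(const Bytes: TBytes; Enc: TEncoding): AnsiString;
var
  U: UnicodeString;
begin
  U := Enc.GetString(Bytes); // step 1: SOURCE -> UTF-16
  Result := AnsiString(U);   // step 2: UTF-16 -> DefaultSystemCodePage,
                             // characters missing in the target may be lost
end;

var
  Bytes: TBytes;
  Enc: TEncoding;
begin
  SetLength(Bytes, 2);
  Bytes[0] := $C8; // U+010C (C with caron) in Windows-1250
  Bytes[1] := $61; // 'a'

  Enc := TEncoding.GetEncoding(1250); // sample source code page (assumption)
  try
    WriteLn(BytesToDefaultString(Bytes, Enc));
  finally
    Enc.Free;
  end;
end.
```

With DefaultSystemCodePage set to UTF8 the second step is lossless; with 
a single-byte code page, any character that has no mapping there cannot 
survive the round trip.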

Ondrej

