[Lazarus] TStringList.LoadFromFile encoding parameter

Michael Van Canneyt michael at freepascal.org
Mon Jul 11 13:23:30 CEST 2016



On Sun, 10 Jul 2016, Ondrej Pokorny wrote:

> On 09.07.2016 18:16, Michael Van Canneyt wrote:
>> So what is the encoding argument supposed to do ?
>> - Convert to current codepage (as set in
>> http://www.freepascal.org/docs-html/rtl/system/defaultsystemcodepage.html
>>   ) with possible loss of characters during read ?
>> - Keep all strings in the codepage that was passed on ?
>>
>>>
>>> I see now that FPC's TEncoding uses UnicodeString and PUnicodeChar. 
>>> For me it's a strange decision. I would expect it to be String ( 
>>> http://wiki.freepascal.org/Character_and_string_types#String ) and 
>>> use UTF8 since AnsiString(CP_UTF8) is compatible with String.
>>
>> AFAIK It is Delphi compatible, and there it is unicodestring (well, 
>> widestring).
>
> I see now. You need UnicodeString support for the delphiunicode mode. So 
> now there are String/PChar overloads missing in TEncoding for FPC 
> default ansi mode. Once you have String/PChar support in TEncoding, 
> adding TEncoding support to ansi RTL is easy.

They were not necessarily missing; they were perceived as 'not necessary'.

As far as I can see, they are still unnecessary (see below). 
One could add them for completeness or symmetry reasons, but I don't see the point.

>
> The best IMO is that the String-based TEncoding routines handle 
> passed/returned strings in current DefaultSystemCodePage with possible 
> character loss.

I am not sure that this 'possible character loss' is a good idea.

But if the general consensus is that this is acceptable, then why not.
If you already have a helper class, then this can probably be easily
integrated in the TStrings class.

> If DefaultSystemCodePage is not UTF8 it will probably 
> mean that 2 conversions have to be executed (SOURCE->UTFxxx->TARGET). If 
> DefaultSystemCodePage is UTF8, one conversion must be enough. LazUtils 
> has the LConvEncoding unit for UTF8<>CP conversions.

This conversion should already be fully automatic if the widestring manager
and the 'SetCodePage' function and friends are used.

One does not need TEncoding for that. TEncoding is just a wrapper around the
widestring manager with some utility functions, implemented for Delphi compatibility.
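
To illustrate the two-step conversion (SOURCE->Unicode->TARGET) and the
'possible character loss' discussed above, here is a minimal sketch. It is
deliberately language-neutral Python, not FPC code: the helper name
convert_via_unicode and the chosen code pages are assumptions made for the
example, and it does not show how the widestring manager or SetCodePage are
implemented.

```python
def convert_via_unicode(data: bytes, source_cp: str, target_cp: str) -> bytes:
    """Re-encode bytes from source_cp to target_cp through a Unicode pivot.

    Hypothetical helper for illustration only.
    """
    # Step 1: SOURCE -> Unicode (no loss if the bytes are valid in source_cp).
    text = data.decode(source_cp)
    # Step 2: Unicode -> TARGET. Characters that the target code page cannot
    # represent are replaced by '?', which is the character loss in question.
    return text.encode(target_cp, errors="replace")


if __name__ == "__main__":
    utf8_bytes = "caf\u00e9 \u0159".encode("utf-8")  # e-acute and r-caron as UTF-8

    # Windows-1250 (Central European) can represent both characters.
    print(convert_via_unicode(utf8_bytes, "utf-8", "cp1250"))

    # Windows-1252 (Western European) has no r-caron, so it is lost.
    print(convert_via_unicode(utf8_bytes, "utf-8", "cp1252"))
```

When the target is itself a Unicode encoding such as UTF-8, the second step
cannot lose characters, which is why a UTF-8 DefaultSystemCodePage avoids the
problem.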

Michael.

