[Lazarus] Making sources compatible with Delphi (but Lazarus is priority)
Sven Barth
pascaldragon at googlemail.com
Sun May 7 17:59:58 CEST 2017
On 07.05.2017 12:17, Florian Klaempfl via Lazarus wrote:
> On 07.05.2017 at 12:11, Sven Barth via Lazarus wrote:
>> On 07.05.2017 at 12:07, "Florian Klaempfl via Lazarus"
>> <lazarus at lists.lazarus-ide.org> wrote:
>>>
>>> On 07.05.2017 at 11:57, Graeme Geldenhuys via Lazarus wrote:
>>>> On 2017-05-07 09:10, Florian Klaempfl via Lazarus wrote:
>>>>>> Yeah, that would be the logical thing to do.
>>>>>
>>>>> Why? What makes a string literal UTF-8?
>>>>>
>>>>
>>>> As Mattias said, the fact that the source unit is UTF-8 encoded.
>>>> Defined by a BOM marker, or -Fcutf8 or {$codepage utf8}. If the source
>>>> unit is UTF-8 encoded, the literal string constant can't (and
>>>> shouldn't) be in any other encoding.
>>>>
>>>> I would say the same if the source unit was stored in UTF-16
>>>> encoding. Then string literals would be treated as UTF-16.
>>>
>>> And if an ISO/Ansi codepage is given? Things would probably fail.
>>>
>>> The point is: FPC is consistent in this regard: also sources with a
>>> given iso/ansi codepage are handled the same way. If there is a string
>>> literal with non-ascii chars, it is converted to UTF-16 using the
>>> codepage of the source. Very simple, very logical. It is a matter of
>>> preference if UTF-8, -16, -32 are chosen at this point, but FPC uses
>>> UTF-16. If it uses UTF-8, the problem would occur the other way around.
>>>
>>> If no codepage is given (by directive, command line, BOM), string
>>> literals are handled byte-wise as raw strings.
>>
>> Small correction: FPC only does this conversion if the codepage is
>> UTF-8, no other.
>
> Then something is wrong/broken :)
>
Well, the code in tscannerfile.readtoken() converts a string literal to
UTF-16 only if the source codepage is UTF-8; for any other codepage it
converts to UTF-16 only if the string is already a UTF-16 string.
So it is probably not broken, since this looks deliberate; if anything,
it's wrong...
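
As a rough sketch of that rule (in Python, not the actual scanner code;
the function name, the parameter names and the codepage spellings are
made up for illustration, and only the decision itself is modelled):

```python
from typing import Optional


def converts_to_utf16(source_codepage: Optional[str], already_utf16: bool) -> bool:
    """Model of the rule described above for a string literal.

    source_codepage: codepage given by directive, command line or BOM,
                     or None if none was given (hypothetical spelling).
    already_utf16:   True if the literal is already a UTF-16 string at
                     the point where the decision is made.
    """
    # Case 1: the source codepage is UTF-8, so the literal is converted.
    if source_codepage is not None:
        normalized = source_codepage.replace("-", "").lower()
        if normalized == "utf8":
            return True
    # Case 2: any other codepage (or none at all); the literal only ends
    # up as UTF-16 if it already is a UTF-16 string.
    return already_utf16


if __name__ == "__main__":
    for cp, wide in [("utf8", False), ("cp1252", False), ("cp1252", True), (None, False)]:
        print(cp, wide, converts_to_utf16(cp, wide))
```

Under this reading, a literal in a source with an ISO/Ansi codepage is
not converted to UTF-16 on its own, which is the difference from the
behaviour Florian described.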
Regards,
Sven