[Lazarus] Feature Request: Insert {codepage UTF8} per default

Michael W. Vogel m-w-vogel at gmx.de
Thu Mar 31 00:16:13 CEST 2016


On 30.03.2016 at 21:46, Graeme Geldenhuys wrote:
> Just thought I would let you know that with or without the {$codepage
> utf8}, your code works just fine here. Source code is saved in a UTF-8
> encoding with no BOM marker.
>
> ========================================================
> [tmp]$ fpc test.pas
> Free Pascal Compiler version 3.0.0 [2015/11/16] for x86_64
> Copyright (c) 1993-2015 by Florian Klaempfl and others
>
> [tmp]$ ./test
> DefaultSystemcodePage = 0
> TestUtf8 = C38441C384
> S1       = C38441C384 [0]
> ÄAÄ
>
> [.... adding $codepage and testing again.....]
>
> [tmp]$ fpc test.pas
> Free Pascal Compiler version 3.0.0 [2015/11/16] for x86_64
> Copyright (c) 1993-2015 by Florian Klaempfl and others
>
> [tmp]$ ./test
> DefaultSystemcodePage = 0
> TestUtf8 = C38441C384
> S1       = C38441C384 [65001]
> ÄAÄ
> ========================================================
>
> Regards,
>    - Graeme -
>
I tested the example too and got different results depending on the
options. I varied three things:
- BOM / no BOM at the beginning of the source file
- {$codepage UTF8} present or not
- fpc -MObjFPC -Sh test.pas, with / without -Sh (use reference-counted
strings)
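
As a side note on the first variable: whether a source file carries a BOM can be checked by looking at its first three bytes (EF BB BF). A minimal Python sketch, not part of the attached test files; the function name has_utf8_bom is made up for this illustration:

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


# The test string from the thread, written with escapes so that this
# script does not itself depend on how it was saved.
payload = "\u00c4A\u00c4".encode("utf-8")

print(has_utf8_bom(codecs.BOM_UTF8 + payload))  # file saved with BOM
print(has_utf8_bom(payload))                    # file saved without BOM
```

For a real file, the same check would be applied to the first bytes read in binary mode.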


The results:

with BOM / with defined codepage / no -Sh,
no BOM / with defined codepage / no -Sh,
no BOM / no defined codepage / no -Sh:

DefaultSystemcodePage = 1252
TestUtf8 = C3 84 41 C3 84
S1       = C4 41 C4 [1252]
ÄAÄ
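
The S1 bytes in this group look like the UTF-8 literal after conversion to the system code page 1252. The Python sketch below only illustrates that byte-level relationship; it is not part of the test files, and the helper name hexdump is made up for this example:

```python
def hexdump(data: bytes) -> str:
    """Format bytes as space-separated upper-case hex, like the output above."""
    return " ".join(f"{b:02X}" for b in data)


utf8_bytes = bytes.fromhex("C3 84 41 C3 84")   # TestUtf8 as reported
text = utf8_bytes.decode("utf-8")              # three characters
cp1252_bytes = text.encode("cp1252")           # one byte per character

print(hexdump(utf8_bytes))
print(hexdump(cp1252_bytes))
```

The second line printed matches the S1 value reported for this group.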


with BOM / with defined codepage / with -Sh,
no BOM / with defined codepage / with -Sh,
no BOM / no defined codepage / with -Sh:

DefaultSystemcodePage = 1252
TestUtf8 = C3 84 41 C3 84
S1       = C3 84 41 C3 84 [65001]
ÄAÄ


with BOM / no defined codepage / with -Sh:

DefaultSystemcodePage = 1252
TestUtf8 = C3 84 41 C3 84
S1       = C3 84 41 C3 84 [0]
Ã"AÃ"
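
In this case S1 still holds the UTF-8 bytes but is tagged with code page 0, so the bytes are presumably interpreted in the system code page (1252 here) on output. A small Python sketch of that misinterpretation, given as an illustration under that assumption and not taken from the test files:

```python
raw = bytes.fromhex("C3 84 41 C3 84")  # S1 as reported for this case

as_utf8 = raw.decode("utf-8")      # the intended reading: 3 characters
as_cp1252 = raw.decode("cp1252")   # the mistaken reading: 5 characters

# Print code points instead of the characters themselves so the result
# does not depend on the console encoding.
print([f"U+{ord(c):04X}" for c in as_utf8])
print([f"U+{ord(c):04X}" for c in as_cp1252])
```

Each two-byte UTF-8 sequence for Ä turns into two separate characters under code page 1252, which is the kind of garbled line shown above.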


with BOM / no defined codepage / no -Sh:

DefaultSystemcodePage = 1252
TestUtf8 = C3 84 41 C3 84
S1       = C3 84 41 C3 84 [1252]
ÄAÄ


So it is really more complex than I thought...

In summary, I would say that a UTF-8 encoded file with a BOM but without
{$codepage UTF8} set is a showstopper here (Windows 7, FPC 3.1.1, Lazarus 1.7).

If anybody is interested, the test files and results are attached.

Regards

Michl
-------------- next part --------------
A non-text attachment was scrubbed...
Name: testfiles.zip
Type: application/x-zip-compressed
Size: 3808 bytes
Desc: not available
URL: <http://lists.lazarus-ide.org/pipermail/lazarus/attachments/20160331/0e983fa2/attachment-0003.bin>