[Lazarus] Windows and German umlauts: why do I have to use PWideChar(UTF8Decode(String)) and not UTF8ToAnsi(String)?

Mattias Gaertner nc-gaertnma at netcologne.de
Tue Oct 18 11:31:53 CEST 2016


On Tue, 18 Oct 2016 10:46:59 +0200
Landmesser John via Lazarus <lazarus at lists.lazarus-ide.org> wrote:

> On 18.10.2016 at 09:36, Mattias Gaertner via Lazarus wrote:
> > That is NOT the same as PWideChar(UTF8Decode(String))?
> > Obviously, because one produces an 8-bit string and the other a 16-bit
> > string. How do you compare the result?  
> 
> I compare the results by opening the *.csv file in Excel 2003, which
> does not show a correct "Grünberg".
> Note: my editor "TextPad" shows "PC UTF8" as the file encoding if I use
> UTF8ToAnsi.

Did you start the csv file with the UTF8 BOM, so that Excel knows that
the file is UTF-8 encoded?

UTF8BOM = #$EF#$BB#$BF
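
A minimal sketch of writing such a file, in case it helps. The file name
test.csv, the column names and the program name are made up for
illustration, and the umlaut is spelled out as explicit UTF-8 bytes so
that the result does not depend on how the source file itself is saved.
This assumes the source is compiled without a {$codepage} directive.

```pascal
program WriteCsvWithBom;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

const
  UTF8BOM = #$EF#$BB#$BF;  // the three BOM bytes given above

var
  Stream: TFileStream;
  Data: RawByteString;     // raw bytes, no implicit code page conversion
begin
  // #$C3#$BC is the UTF-8 encoding of "ü" (hypothetical sample row)
  Data := UTF8BOM
        + 'Name;City' + LineEnding
        + 'Test;Gr'#$C3#$BC'nberg' + LineEnding;

  // 'test.csv' is a placeholder file name
  Stream := TFileStream.Create('test.csv', fmCreate);
  try
    // write the bytes exactly as they are, BOM first
    Stream.WriteBuffer(Data[1], Length(Data));
  finally
    Stream.Free;
  end;
end.
```

Whether Excel 2003 picks up the BOM is something to check on your side.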

 
> > What compiler version?  
> 
> Sorry, I forgot that:
> 
> Lazarus 1.7 latest trunk FPC 3.0.0 i386-win32-win32/win64
> 
> Windows XP
> 
> > Are you using LazUTF8?  
> 
> Where to look for that?

LazUTF8 is a unit that sets the global variable
DefaultSystemCodePage to CP_UTF8. DefaultSystemCodePage defines
the default encoding of AnsiString, so with LazUTF8 in use
UTF8ToAnsi returns UTF-8, not the Windows ANSI code page.
LazUTF8 is used by LCL applications.
If unsure, use writeln or the debugger to find out the value of
DefaultSystemCodePage.
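
For example, a small console program along these lines should print the
value. This is only a sketch: the program name is made up, and it
assumes the LazUtils package (which contains LazUTF8) is available to
the project. Remove LazUTF8 from the uses clause to see the value
without it.

```pascal
program ShowCodePage;
{$mode objfpc}{$H+}

uses
  LazUTF8;  // assumption: LazUtils is in the project's requirements

begin
  // 65001 is CP_UTF8; another value (for example 1252) means that
  // AnsiString defaults to a legacy Windows ANSI code page
  WriteLn('DefaultSystemCodePage = ', DefaultSystemCodePage);
end.
```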


> "All files already have the correct encoding" -> UTF8
> 
> >   What is your DefaultSystemCodePage?
> >
> > Mattias  
> 
> In the system settings the language is: "Automatic (or English)"
> 
> 
> So the answer seems to be:
> "one produces an 8-bit string and the other a 16-bit "

And PWideChar is a pointer, while UTF8ToAnsi creates a String.

Mattias

