[Lazarus] Unicode support in source editor

Mattias Gaertner nc-gaertnma at netcologne.de
Sun Apr 20 13:30:51 CEST 2008


On Sun, 20 Apr 2008 07:54:57 -0300
"Felipe Monteiro de Carvalho" <felipemonteiro.carvalho at gmail.com> wrote:

> >  I believe that it is a wrong way to do it. For example the 12xx are
> >  Windows language codes. Sometimes you wish to use them as-is, so
> > this will actually create a bug imho.
> 
> That's trivial to solve. Don't add a BOM to your file and use:
> 
> {%encoding utf-8}
> 
> (By the way, is this supported? I don't see this option in the wiki
> article)

Yes. The example in the wiki is {%encoding cp1250}.

http://wiki.lazarus.freepascal.org/IDE_Development#UTF-8_Sources

 
> This way both the IDE and the Compiler will ignore your encoding, and
> no conversion takes place. On the bad side, you also won't be able to
> edit non-utf-8 strings in the editor, but it's always impossible to
> edit more than one encoding at the same time.
> 
> About asking for each string, I think this would not be a viable
> solution.

I will add a more complete explanation to the wiki:

Formerly the IDE supported only one encoding. Under Windows this was
the Windows code page; under gtk2 it was UTF-8. Now the IDE can edit
files with various encodings (one encoding per file). For instance,
the user opens a unit1.pas file created on a German Windows. The
German umlauts will then be encoded in the Windows code page, cp1252.
The IDE now "sees" that this is not UTF-8, assumes it is the system
encoding and converts it to UTF-8 (only in memory, not on disk) before
passing it to SynEdit. Now SynEdit can edit it. When the file is saved,
the IDE converts it back to cp1252. This way the file on disk can keep
the Windows system encoding, while all IDE functions only need to
support UTF-8.
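
As a rough illustration of that round trip, here is a small Python
sketch. It is not the IDE's actual (Pascal) code: the function names
are made up for the example, and the default code page is only an
assumption standing in for whatever the system encoding is.

```python
def to_editor_buffer(disk_bytes: bytes, system_encoding: str = "cp1252"):
    """Return (utf8_bytes, disk_encoding) for the bytes read from disk.

    If the bytes are already valid UTF-8 they are passed through.
    Otherwise they are assumed to be in the system encoding and are
    converted to UTF-8 in memory only.
    """
    try:
        disk_bytes.decode("utf-8")  # validity check only
        return disk_bytes, "utf-8"
    except UnicodeDecodeError:
        text = disk_bytes.decode(system_encoding)
        return text.encode("utf-8"), system_encoding


def to_disk_bytes(editor_bytes: bytes, disk_encoding: str) -> bytes:
    """Convert the editor's UTF-8 buffer back to the file's own encoding."""
    return editor_bytes.decode("utf-8").encode(disk_encoding)


if __name__ == "__main__":
    # A line with German umlauts as a code-page file would store it.
    on_disk = "s := 'Grüße';".encode("cp1252")

    buffer, encoding = to_editor_buffer(on_disk)  # what the editor works on
    saved = to_disk_bytes(buffer, encoding)       # what is written on save

    print(encoding, saved == on_disk)
```

The editor only ever handles the UTF-8 buffer; the remembered encoding
is used solely when writing the file back.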


Mattias


