[Lazarus] rewriting of LConvEncoding

merlin352 at globe.lu merlin352 at globe.lu
Mon Sep 20 22:37:17 CEST 2010


--- Original Message ---
From: Mattias Gaertner <nc-gaertnma at netcologne.de>
To: lazarus at lists.lazarus.freepascal.org
Subject: Re: [Lazarus] rewriting of LConvEncoding

> On Mon, 20 Sep 2010 20:51:20 +0200
> <merlin352 at globe.lu> wrote:
>
> > Hello
> >
> > I had some issues with LConvEncoding, as I wrote in bugtracker issue
> > #0017212.
> >
> > My mistake was essentially a misunderstanding of what the character
> > encoding "ANSI" means. Sorry for that.
> >
> > So I took a deeper look at this unit. It is far from complete
> > regarding existing codepages and Unicode support.
>
> Its goal is to convert the most common codepages and do the
> rest with the existing OS libraries (iconv).
>
> A complete conversion unit should be part of FPC.

I agree, but that should be decided by the core developers.

>
>
> > Further, the implementation with string-based encoding descriptions
> > is not optimal regarding speed and the possibility of extending the
> > unit. The encoding and decoding of Asian codepages is really
> > suboptimal. (A quick and dirty performance check showed that it is
> > about 3000 times slower than single-byte codepages like ISO-8859-1.)
>
> True.
> And the tables are quite big. Maybe the unit should be split into
> several units.

I think with the right data model the tables can be reduced a lot in size. Nevertheless, splitting the unit may be a good idea for those who don't need the Asian codepages.
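
A minimal sketch of one such data model, assuming a single-byte codepage whose lower half ($00..$7F) is plain ASCII, so that only the upper 128 bytes need a table (128 entries of 2 bytes, i.e. 256 bytes per codepage). The names THighTable and SingleByteToUTF8 are hypothetical and not part of LConvEncoding; this is an illustration, not the existing implementation.

```pascal
program SingleByteSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical table type: Unicode code point for each byte $80..$FF.
  // Bytes $00..$7F are assumed to be ASCII and need no table entry.
  THighTable = array[$80..$FF] of Word;

// Convert a single-byte encoded string to UTF-8 by direct table lookup.
function SingleByteToUTF8(const S: string; const Table: THighTable): string;
var
  Buf: string;
  i, n: Integer;
  cp: Word;

  // Append one byte to the preallocated buffer.
  procedure Put(b: Word);
  begin
    Inc(n);
    Buf[n] := Chr(Byte(b));
  end;

begin
  SetLength(Buf, Length(S) * 3);  // worst case: 3 UTF-8 bytes per input byte
  n := 0;
  for i := 1 to Length(S) do
  begin
    cp := Ord(S[i]);
    if cp >= $80 then
      cp := Table[cp];            // one array access, no string search
    if cp < $80 then
      Put(cp)
    else if cp < $800 then
    begin
      Put($C0 or (cp shr 6));
      Put($80 or (cp and $3F));
    end
    else
    begin
      Put($E0 or (cp shr 12));
      Put($80 or ((cp shr 6) and $3F));
      Put($80 or (cp and $3F));
    end;
  end;
  SetLength(Buf, n);
  Result := Buf;
end;

var
  Latin1: THighTable;
  b: Integer;
begin
  // ISO-8859-1 maps every byte to the code point of the same value, so the
  // table can be filled in a loop here; other codepages would use generated
  // constant tables instead.
  for b := $80 to $FF do
    Latin1[b] := b;
  // #$E9 is "e acute" in ISO-8859-1 and becomes the two bytes $C3 $A9.
  WriteLn(Length(SingleByteToUTF8('caf'#$E9, Latin1)));  // prints 5
end.
```

Multi-byte Asian codepages would need a different layout (for example a two-level table indexed by lead byte), which this sketch does not cover.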

>
> > I would rewrite the entire unit, including most of the codepages
> > from ftp.unicode.org and full support for UTF-8, UTF-16 and UTF-32,
> > including the LE and BE variants. As a further utility, I would add a
> > conversion program for the mentioned codepage files that creates a
> > complete Pascal include file with all the necessary tables and
> > conversion functions.
>
> What do you mean by full support for UTF-8, UTF-16 and UTF-32?
>
> Mattias
>

I mean full conversion between legacy codepages and the UTF encodings, and between the different UTF encodings themselves, always in both directions.
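
To illustrate one of these directions, here is a rough sketch of a UTF-16 to UTF-8 conversion that takes the byte order (LE or BE) as a parameter. The function names are hypothetical, the input is treated as raw bytes in a string, BOM detection is left out, and unpaired surrogates are simply replaced by U+FFFD; a real unit would have to decide on its own error policy.

```pascal
program Utf16ToUtf8Sketch;
{$mode objfpc}{$H+}

// Read one 16-bit code unit from the raw bytes of S at 1-based index Pos.
function ReadUnit(const S: string; Pos: Integer; BigEndian: Boolean): Cardinal;
begin
  if BigEndian then
    Result := (Ord(S[Pos]) shl 8) or Ord(S[Pos + 1])
  else
    Result := Ord(S[Pos]) or (Ord(S[Pos + 1]) shl 8);
end;

// Convert raw UTF-16 bytes (LE or BE, no BOM handling) to UTF-8.
function UTF16BytesToUTF8(const S: string; BigEndian: Boolean): string;
var
  Buf: string;
  i, n: Integer;
  cp, lo: Cardinal;

  // Append one byte to the preallocated buffer.
  procedure Put(b: Cardinal);
  begin
    Inc(n);
    Buf[n] := Chr(Byte(b));
  end;

begin
  SetLength(Buf, (Length(S) div 2) * 3);  // worst case: 3 bytes per code unit
  n := 0;
  i := 1;
  while i + 1 <= Length(S) do
  begin
    cp := ReadUnit(S, i, BigEndian);
    Inc(i, 2);
    // Combine a high surrogate with the low surrogate that follows it.
    if (cp >= $D800) and (cp <= $DBFF) and (i + 1 <= Length(S)) then
    begin
      lo := ReadUnit(S, i, BigEndian);
      if (lo >= $DC00) and (lo <= $DFFF) then
      begin
        cp := $10000 + ((cp - $D800) shl 10) + (lo - $DC00);
        Inc(i, 2);
      end;
    end;
    // Anything still in the surrogate range is unpaired: use U+FFFD.
    if (cp >= $D800) and (cp <= $DFFF) then
      cp := $FFFD;
    if cp < $80 then
      Put(cp)
    else if cp < $800 then
    begin
      Put($C0 or (cp shr 6));
      Put($80 or (cp and $3F));
    end
    else if cp < $10000 then
    begin
      Put($E0 or (cp shr 12));
      Put($80 or ((cp shr 6) and $3F));
      Put($80 or (cp and $3F));
    end
    else
    begin
      Put($F0 or (cp shr 18));
      Put($80 or ((cp shr 12) and $3F));
      Put($80 or ((cp shr 6) and $3F));
      Put($80 or (cp and $3F));
    end;
  end;
  SetLength(Buf, n);
  Result := Buf;
end;

begin
  // "A" as UTF-16LE is the byte pair $41 $00.
  WriteLn(UTF16BytesToUTF8(#$41#$00, False));                 // prints A
  // U+1F600 as UTF-16BE is the surrogate pair D83D DE00.
  WriteLn(Length(UTF16BytesToUTF8(#$D8#$3D#$DE#$00, True)));  // prints 4
end.
```

The reverse direction and the UTF-32 variants follow the same pattern: decode to a code point first, then encode it in the target form.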

I know that parts of this functionality are also in LCLProc, but that is not complete either.


