[Lazarus] rewriting of LConvEncoding

Mattias Gaertner nc-gaertnma at netcologne.de
Mon Sep 20 22:16:39 CEST 2010


On Mon, 20 Sep 2010 20:51:20 +0200
<merlin352 at globe.lu> wrote:

> Hello
> 
> I had some issues with LConvEncoding as I wrote in bugtracker issue #0017212
> 
> My mistake was essentially the meaning of character encoding "ANSI". Sorry for that.
> 
> So I took a deeper look at this unit. It is far from complete with respect to existing codepages and Unicode support.

Its goal is to convert the most common codepages and handle the
rest with the existing OS libraries (iconv).

A complete conversion unit should be part of FPC.
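The two-tier strategy described here (local tables for common codepages, a system library like iconv for everything else) can be sketched as follows. This is an illustrative Python sketch, not the unit's actual Pascal code; the function names are invented, and Python's `codecs` module stands in for the iconv fallback:

```python
# Sketch of the two-tier conversion strategy: a fast local path for a
# common single-byte codepage, delegation to a system library otherwise.
import codecs

def latin1_to_unicode(data: bytes) -> str:
    # ISO-8859-1 maps bytes 0..255 directly onto the first 256 code points,
    # so no table lookup is needed at all.
    return "".join(chr(b) for b in data)

def convert(data: bytes, encoding: str) -> str:
    if encoding.lower() in ("iso-8859-1", "latin1"):
        return latin1_to_unicode(data)      # fast local path
    return codecs.decode(data, encoding)    # fallback, like iconv
```

The design point is the same as in LConvEncoding: keeping the frequent cases local avoids a library round-trip, while the rare codepages stay the OS's problem.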


> Further, the implementation with string-based encoding descriptions is
> not optimal regarding speed and the possibility of extending the unit.
> Encoding and decoding of Asian codepages is particularly suboptimal: a
> quick-and-dirty performance check showed it to be about 3000 times
> slower than for single-byte codepages like ISO-8859-1.

True.
And the tables are quite big. Maybe the unit should be split into
several units.
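The performance gap reported above is plausible when a double-byte codepage is decoded by scanning a string table per character instead of using a precomputed map. The following Python sketch shows the fast variant; it is not LConvEncoding's code, and the two GBK pairs are just a tiny illustrative excerpt of a real table:

```python
# Illustrative decoder for a double-byte codepage (GBK-style): a dict
# gives O(1) lookups per character, where scanning a flat string-based
# table would cost a search per character.
DBCS_TO_UNICODE = {
    b"\xb0\xa1": "\u554a",  # GBK 0xB0A1 -> U+554A
    b"\xb0\xa2": "\u963f",  # GBK 0xB0A2 -> U+963F
}

def decode_dbcs(data: bytes) -> str:
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b >= 0x81 and i + 1 < len(data):
            # Lead byte starts a two-byte pair; unknown pairs become "?".
            out.append(DBCS_TO_UNICODE.get(data[i:i + 2], "?"))
            i += 2
        else:
            out.append(chr(b))  # ASCII range maps directly
            i += 1
    return "".join(out)
```

A generator program of the kind proposed below would emit exactly such lookup tables (as Pascal arrays) from the mapping files on ftp.unicode.org.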
 
> I would rewrite the entire unit, including most of the codepages from ftp.unicode.org and full support for UTF-8, UTF-16 and UTF-32, including the LE and BE variants. As a further utility, I would write a conversion program for the mentioned codepage files that creates a complete Pascal include file with all the necessary tables and conversion functions.

What do you mean by full support for UTF-8, UTF-16 and UTF-32?
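For context on the question: the three UTF forms differ in code-unit size and byte order, and "full support" usually includes detecting the byte-order mark (BOM). A minimal Python sketch of BOM sniffing (illustrative only, not proposed code for the unit):

```python
# BOM table: UTF-32 entries must come before UTF-16, because the
# UTF-16-LE BOM (FF FE) is a prefix of the UTF-32-LE BOM (FF FE 00 00).
BOMS = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),
    (b"\xff\xfe\x00\x00", "utf-32-le"),
    (b"\xfe\xff",         "utf-16-be"),
    (b"\xff\xfe",         "utf-16-le"),
    (b"\xef\xbb\xbf",     "utf-8"),
]

def sniff_and_decode(data: bytes, default: str = "utf-8") -> str:
    for bom, enc in BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(enc)  # strip BOM, decode rest
    return data.decode(default)  # no BOM: fall back to a default
```

Without a BOM the byte order cannot be detected reliably, which is one reason the question of what "full support" means is worth pinning down.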

Mattias

