[Lazarus] rewriting of LConvEncoding

Guy Fink merlin352 at globe.lu
Fri Sep 24 21:23:38 CEST 2010

> I'm not going into detail about your rant, I don't have time for ego
> management, I'll just summarize my remarks again, slightly trying to
> explain them when necessary:
> - first, try to reuse as much as possible. Specially the interfaces of
>   charset. Since otherwise this becomes the 4th or 5th charset conversion
>   unit, and yet another maintenance burden. Lconv or charset, doesn't
>   matter.
>     - charset
>     - ccharset (a copy of charset cut down for compiler use)
>     - iconvenc
>     - lconvencoding
>     - your solution

OK, I fully agree. Since I started from LConvEncoding, I planned to keep at least the existing interface, so it will not be a completely new solution anyway (as far as the interface is concerned). I see no problem integrating the existing charset interface. Internally it will be completely different, especially faster and smaller (I hope :-))

> - Keep in mind that in practice such units are used mostly by embedded
>   targets. And maybe a few people that have special codepage needs will
>   use it in addition. Size (code+ tables) is of importance here. Users
>   must be able to only add a few codepages.
> - Keep the minimal dependencies as low as possible, allowing it to be
>   positioned as deep as possible in the FPC/Lazarus system. This means
>   both libraries and footprint. Make tables pluggable.

That's what I have in mind. The only dependencies I have are the UTF8 and UTF16 functions from LProc. But as I understand it, integrating these routines into the RTL is also foreseen. I think this would be the time to do that as well and create a proper and complete character-conversion library.

> - I've never seen UTF32 files in the wild. I assume however that
>   smartlinking will adequately kill those routines, so it is not that
>   much of a problem unless you use it internally.

If there is support for UTF16, UTF32 does not really blow up the code. Two lines of code to decode surrogate pairs, that's it.
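To illustrate how little is involved, here is a minimal sketch of the surrogate-pair arithmetic. It is written in Python purely for illustration and is not the planned unit's code; the function names are made up for this example, and lone surrogates are simply passed through unchanged.

```python
def decode_surrogate_pair(high, low):
    """Combine a UTF-16 high/low surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # The two lines that matter: 10 bits from each half, offset by 0x10000.
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def utf16_units_to_utf32(units):
    """Convert a sequence of UTF-16 code units to a list of code points."""
    out = []
    i = 0
    while i < len(units):
        u = units[i]
        has_low = i + 1 < len(units) and 0xDC00 <= units[i + 1] <= 0xDFFF
        if 0xD800 <= u <= 0xDBFF and has_low:
            out.append(decode_surrogate_pair(u, units[i + 1]))
            i += 2
        else:
            # BMP code unit; a lone surrogate is passed through in this sketch.
            out.append(u)
            i += 1
    return out
```

For example, the code units 0xD83D 0xDE00 combine to the single code point 0x1F600, while plain BMP units are copied one to one.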

> We all want to get rid of lconvencoding, or at least break it up into
> pieces and move it from the LCL to the RTL.
> That's why starting with 2.2.4 I added an iconvenc unit to pull the
> iconv support out of lconvencoding. Having a table driven package
> would pull even more out of it (I guess lconvencoding will persist for a
> while to allow the lazarus team to deal with FPC versioning, but it will
> be mostly empty), so I'm all for it. But be a bit flexible and keep an
> eye on the usage scenarios.
> Note that units that are very large can't be in the RTL. (The RTL is
> compiled three-five times each bootstrap.) If large then it must go to
> packages/

I know that. The core unit, which supports UTF8, UTF16 and UTF32, does not need tables, so size does not matter there.

Support for codepages will be in separate units which will plug into the core unit. I still prefer a one-unit-per-codepage solution; this will give the most flexibility and will not stress smartlinking too much.
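The plug-in idea could look roughly like the following sketch. It is Python for illustration only, not the actual design: `register_codepage`, `decode` and the 128-entry table layout are assumptions made for this example. The core holds only a registry, and each codepage unit registers its own table when it is pulled in.

```python
# Hypothetical core: it knows nothing about any specific codepage.
_codepages = {}


def register_codepage(name, high_table):
    """Called by a codepage unit.

    high_table maps bytes 0x80..0xFF to Unicode code points (128 entries);
    bytes below 0x80 are assumed to be plain ASCII in this sketch.
    """
    if len(high_table) != 128:
        raise ValueError("expected 128 entries for bytes 0x80..0xFF")
    _codepages[name.lower()] = tuple(high_table)


def decode(data, name):
    """Decode single-byte encoded data using a registered codepage."""
    table = _codepages[name.lower()]  # KeyError if the unit was never plugged in
    return "".join(chr(b) if b < 0x80 else chr(table[b - 0x80]) for b in data)


# Hypothetical codepage "unit" for ISO-8859-1, whose upper half maps
# byte values directly to the same code points.
register_codepage("ISO-8859-1", range(0x80, 0x100))
```

With this layout a program that only needs ISO-8859-1 carries one 128-entry table, and `decode(b"caf\xe9", "ISO-8859-1")` yields "café"; asking for a codepage that was never registered fails instead of silently dragging in every table.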

