[Lazarus] Using fcl-xml with lcl and constant Utf8Encode/Decode conversion

Marco van de Voort marcov at stack.nl
Mon May 10 18:48:34 CEST 2010


On Mon, May 10, 2010 at 07:15:31PM +0500, Vladimir Zhirov wrote:
> I used to process XML files in my Lazarus applications using the SimpleXML
> library for Delphi. Recently I tried to switch to fcl-xml (because of its
> better performance) and ran into the following problem.
> 
> Since LCL uses UTF-8 as its internal string encoding, I use it for all
> string data in my applications too. But the DOM/XMLRead/XMLWrite units use
> UCS-2/UTF-16 (WideString), so when working with TXmlDocument I have
> to do conversions like this:
> 
> myUtf8String := UTF8Encode(myElement['name']);
> myElement['name'] := UTF8Decode(myUtf8String);
> 
> What I would like is to get rid of the UTF8Encode/UTF8Decode calls in my
> source. I guess CodeTools uses modified fcl-xml units to achieve exactly
> that, but these units seem to be pretty outdated compared to the recent
> fcl-xml version in FPC 2.4.0.

This will not be easy: FPC currently has no native UTF-8 string type,
so automatic conversion is not possible.

Work has started on such a feature in FPC (cpstrnew, base types for D2009
compatibility), to be merged into 2.5.1 or even later, but nothing has been
committed in the last 5 months.
 
> * Does someone else use fcl-xml with LCL? Are there any tricks to avoid
> excessive UTF8Encode/UTF8Decode calls?

Not that I know.
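
The calls can at least be confined to one place by hiding them behind small
helpers. A minimal sketch, assuming the FPC 2.4.0 DOM unit; GetAttrUtf8 and
SetAttrUtf8 are hypothetical names, not part of fcl-xml, and the conversion
still happens on every call:

```pascal
program Utf8DomDemo;
{$mode objfpc}{$H+}

uses
  DOM;

// Hypothetical helper (not part of fcl-xml): read an attribute as UTF-8.
function GetAttrUtf8(El: TDOMElement; const AName: AnsiString): AnsiString;
begin
  // GetAttribute takes and returns DOMString (UTF-16).
  Result := UTF8Encode(El.GetAttribute(UTF8Decode(AName)));
end;

// Hypothetical helper (not part of fcl-xml): write an attribute from UTF-8.
procedure SetAttrUtf8(El: TDOMElement; const AName, AValue: AnsiString);
begin
  El.SetAttribute(UTF8Decode(AName), UTF8Decode(AValue));
end;

var
  Doc: TXMLDocument;
  El: TDOMElement;
begin
  Doc := TXMLDocument.Create;
  try
    El := Doc.CreateElement('item');
    Doc.AppendChild(El);
    // Application code now deals with UTF-8 strings only.
    SetAttrUtf8(El, 'name', 'value');
    WriteLn(GetAttrUtf8(El, 'name'));
  finally
    Doc.Free;
  end;
end.
```

This only tidies the source; it does not remove the conversion cost.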
 
> * Are there any plans to update the CodeTools version and/or merge it into
> FCL? Well, according to the header of the modified units there were such
> plans, but is that still true, or was the idea abandoned for some
> reason?

AFAIK the Lazarus units were originally branched off because fcl-xml was not
fast enough. Since then, though, Sergei has invested massive amounts of
time in fcl-xml, and we have moved to Unicode-based Windows versions (Win2000, XP
and later). And of course, the average computer has gotten faster too.

The last time I talked to Mattias about this, a few years back, IIRC he said
that the current situation was stable, and there was no reason to change it.

> * If the problem is only lack of time, it would be great to know the
> fcl-xml revision against which the modified CodeTools version was made.
> Then updating it to the most recent fcl-xml version would be much
> easier for volunteers.

Since, as you noticed, this all depends on a solution to the manual
Unicode conversion problem, I think this discussion is better saved for when
FPC's Unicode solution is in production. Right now it is only guesswork.
