[Lazarus] Adding codepage-support to the RTL (making LConvEncoding obsolete)

Guy Fink merlin352 at globe.lu
Fri Dec 3 13:05:15 CET 2010

> It does not compile under 2.4.2:
> cp_ISO88591.pas(69,37) Error: Constant strings can't be longer than 255
> chars

Uh-oh... OK, those are the test strings; they contain ALL the unique defined characters for the codepage. I developed under 2.5.1.

Undefine cpCnvGenerateTeststrings in codepagesdef.inc, and it should compile. I only use the test strings to verify the conversion process.
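
As a sketch of what that change could look like, assuming codepagesdef.inc switches the test strings on with an ordinary {$DEFINE} line (the actual contents of the file are not shown in this thread, so the lines below are an assumption):

```pascal
{ codepagesdef.inc (sketch; the real file may look different) }

{ Option 1: put a dot before the dollar sign. The compiler then treats }
{ the line as a plain comment and the symbol is never defined.         }
{.$DEFINE cpCnvGenerateTeststrings}

{ Option 2: leave the original define untouched and cancel it with an  }
{ explicit undefine placed right after it:                             }
{   $UNDEF cpCnvGenerateTeststrings   (written as a real directive)    }
```

Either way, code guarded by {$IFDEF cpCnvGenerateTeststrings} is skipped, so the long string constants never reach the 2.4.2 compiler.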

> >  - Widestringsupport (configurable)
> >  - UTF8 and UTF16 support (UTF16 needs widestrings)
> Great.

And I will add UTF-32 support once FPC implements a 4-byte character type :-)

> >  - Direct conversion from CP to CP without intermediate string
> Nice.
... and fast

> >    pascal unit. The cp_* units are entirely generated by this app.
> >  - Conversion up to 80% faster for SBCS.
> Ehm, you made many functions inline. Even those that are more than a
> few lines of code. This will enlarge the executables and can cost
> performance in normal applications (e.g. Lazarus).

I will have a look at this.

> You call for each character a conversion function. But most real world
> texts contain a big part of ASCII characters, where no conversion is
> needed for UTF-8. My guess is that for most texts this approach is
> slower. But I have to wait till it compiles before I can test.

My tests don't show this so far. The conversion function is used in DirectConversion only, or when the codepage has no translation tables (as for SHIFT_JIS, or rather all the DBCS codepages). The SingleByteToxxx functions use a table lookup, and also use the ASCII optimisation if appropriate for the codepage. But as you see, I will also include EBCDIC codepages...
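
To illustrate the idea of a table lookup combined with an ASCII shortcut, here is a minimal sketch. It is not the code from the cp_* units: the names TCodepageTable, Latin1Table, InitLatin1Table and SingleByteToUTF8Sketch are hypothetical, and the AsciiCompatible flag is an assumed way to switch the shortcut off for codepages such as EBCDIC, where bytes below $80 are not ASCII.

```pascal
program SbcsToUtf8Sketch;
{$mode objfpc}{$H+}

type
  { Hypothetical table type: one Unicode code point (BMP only) per byte value. }
  TCodepageTable = array[Byte] of Word;

var
  Latin1Table: TCodepageTable;
  S: AnsiString;

{ Fill the table for ISO-8859-1, where the byte value equals the code point. }
procedure InitLatin1Table;
var
  b: Integer;
begin
  for b := 0 to 255 do
    Latin1Table[b] := b;
end;

{ Convert a single-byte encoded string to UTF-8 using a lookup table. }
function SingleByteToUTF8Sketch(const Src: AnsiString;
  const Table: TCodepageTable; AsciiCompatible: Boolean): AnsiString;
var
  i, Len: Integer;
  c: Byte;
  cp: Word;
begin
  { Worst case for BMP code points: 3 UTF-8 bytes per input byte. }
  SetLength(Result, Length(Src) * 3);
  Len := 0;
  for i := 1 to Length(Src) do
  begin
    c := Ord(Src[i]);
    { ASCII shortcut: copy the byte unchanged, no table access needed. }
    if AsciiCompatible and (c < $80) then
    begin
      Inc(Len); Result[Len] := Chr(c);
      Continue;
    end;
    cp := Table[c];
    if cp < $80 then
    begin
      Inc(Len); Result[Len] := Chr(cp);
    end
    else if cp < $800 then
    begin
      { Two-byte sequence: 110xxxxx 10xxxxxx }
      Inc(Len); Result[Len] := Chr($C0 or (cp shr 6));
      Inc(Len); Result[Len] := Chr($80 or (cp and $3F));
    end
    else
    begin
      { Three-byte sequence: 1110xxxx 10xxxxxx 10xxxxxx }
      Inc(Len); Result[Len] := Chr($E0 or (cp shr 12));
      Inc(Len); Result[Len] := Chr($80 or ((cp shr 6) and $3F));
      Inc(Len); Result[Len] := Chr($80 or (cp and $3F));
    end;
  end;
  SetLength(Result, Len);
end;

begin
  InitLatin1Table;
  { 'caf' followed by byte $E9 (e with acute accent in ISO-8859-1). }
  S := SingleByteToUTF8Sketch('caf'#$E9, Latin1Table, True);
  { The three ASCII bytes are copied as-is; $E9 becomes the bytes C3 A9. }
  if (Length(S) = 5) and (S[4] = #$C3) and (S[5] = #$A9) then
    WriteLn('ok')
  else
    Halt(1);
end.
```

For an EBCDIC codepage the same routine would be called with AsciiCompatible set to False, so every byte goes through the table.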

