[Lazarus] Does Lazarus support a complete Unicode Component Library?

Mon Feb 14 21:14:42 CET 2011

Hello Lazarus-List,

Monday, February 14, 2011, 8:29:04 PM, you wrote:

>> I'm unable to see the "great" problems with "UnicodeString". The
>> conversions should be the minimum needed, and they will be. Problem
>> would be in the RTL, but not at user level.
MG> Yes, since for example Linux allows invalid UTF-8 in file names,
MG> so any auto conversion of file names to UTF-16 is an error.

Hmmm... To me that looks like a Linux "problem"/"bug". For that kind of
access it seems logical to use the low-level APIs. OK, so with automatic
conversion you cannot access those files? True, but Windows has similar
problems: some files cannot be accessed using the regular APIs and some
tricks must be used.
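
A small sketch of the file name issue MG describes, written in Python
only because it is quick to run. It is not FPC or RTL code, and the helper
names (strict_utf16_roundtrip, lossy_utf16_roundtrip) are made up for this
illustration. It assumes a converter either rejects invalid UTF-8 or
replaces the bad bytes with U+FFFD; a real RTL may behave differently.

```python
# A byte sequence that is a legal Linux file name but not valid UTF-8.
raw_name = b"report_\xff\xfe.txt"


def strict_utf16_roundtrip(name: bytes):
    """UTF-8 -> UTF-16 -> UTF-8; returns None when the name is not valid UTF-8."""
    try:
        text = name.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return text.encode("utf-16-le").decode("utf-16-le").encode("utf-8")


def lossy_utf16_roundtrip(name: bytes) -> bytes:
    """Same round trip, but invalid bytes are replaced with U+FFFD on the way in."""
    text = name.decode("utf-8", errors="replace")
    return text.encode("utf-16-le").decode("utf-16-le").encode("utf-8")


valid_name = "José.txt".encode("utf-8")

# The strict converter refuses the name, so the file cannot be reached at all.
print(strict_utf16_roundtrip(raw_name))
# The lossy converter hands back a different name, so the wrong file is addressed.
print(lossy_utf16_roundtrip(raw_name) == raw_name)
# A valid UTF-8 name survives the round trip unchanged.
print(strict_utf16_roundtrip(valid_name) == valid_name)
```

Either way the original bytes are lost once the name has gone through
UTF-16, which is why such files need the byte-oriented low-level calls.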

>> Many people are concerned about "speed" due to hidden conversions, so
>> can anybody tell me why? Maybe I'm blind and I cannot see something
>> that is absolutely a problem (except some pieces of the RTL).
MG> For instance searching needs a lot of compares. Comparing two
MG> strings normally fails on the very first characters. An auto conversion
MG> will always convert the whole string including allocating and releasing
MG> memory, easily slowing down the comparison by an order of magnitude.

These are the "corner cases" which cannot be handled with the usual
convert, operate, convert back approach, but I think there are not many
cases like this. Of course, there are cases like a TStringList with
100000 items in UTF-16 that is searched using a UTF8String. Then either
you ask the string list to convert all its elements in one go, or you
keep your own string in the default Unicode format for the platform.
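
To make the cost argument concrete, here is a rough model in Python (not
Pascal, and not how the FPC RTL works internally). It counts bytes
instead of measuring time: one search converts every UTF-16 item before
comparing, the other converts the UTF-8 key once and then compares with
an early exit. All function names and the sample data are assumptions
made for this sketch.

```python
def bytes_examined(a: bytes, b: bytes) -> int:
    """Number of bytes looked at before an equality test is decided."""
    n = min(len(a), len(b))
    for i in range(n):
        if a[i] != b[i]:
            return i + 1  # stopped at the first difference
    return n


def search_converting_each_item(items_utf16, key_utf8):
    """Hidden-conversion model: every item is converted in full, then compared."""
    hits, converted = [], 0
    for idx, item in enumerate(items_utf16):
        as_utf8 = item.decode("utf-16-le").encode("utf-8")
        converted += len(item)
        if as_utf8 == key_utf8:
            hits.append(idx)
    return hits, converted


def search_converting_key_once(items_utf16, key_utf8):
    """Convert the key to the list's format once, then compare byte by byte."""
    key_utf16 = key_utf8.decode("utf-8").encode("utf-16-le")
    hits, examined = [], 0
    for idx, item in enumerate(items_utf16):
        examined += bytes_examined(item, key_utf16)
        if item == key_utf16:
            hits.append(idx)
    return hits, len(key_utf8), examined


items = [f"item{n:06d} with a fairly long tail of text".encode("utf-16-le")
         for n in range(1000)]
key = "zebra, which differs from every item at the first character".encode("utf-8")

hits_each, converted_each = search_converting_each_item(items, key)
hits_once, converted_once, examined_once = search_converting_key_once(items, key)

print("converted per item :", converted_each, "bytes")
print("converted key once :", converted_once, "bytes,",
      examined_once, "bytes examined")
```

In this model the first variant converts the whole list on every search,
while the second touches one byte per item when the first character
already differs. That is the case MG describes, and also why converting
the key (or the whole list, once) avoids it.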

I would like to see an example of such a problem (a snippet) that could
be a headache, but maybe that belongs on the fpc mailing lists?

-- 
Best regards,
 José