[Lazarus] UTF8 string compare with correct locale sorting

Michael Schnell mschnell at lumino.de
Mon Oct 21 10:24:31 CEST 2013


On 10/18/2013 06:16 PM, Jürgen Hestermann wrote:
>
> Who claims this?
Sorry if I over-interpreted your wording.
>
>
> > If this is not the case, why then use Unicode ?
>
> I thought Unicode is just for international *coding* of characters but 
> not for sort order definition.

In a Unicode-aware programming language, the handling of Unicode-encoded 
strings needs to provide comparison (besides many other string 
operations, potentially including conversion between multiple Unicode 
and non-Unicode encoding schemes).

If string compare only allows for "equal" vs "not equal" results (in 
some imaginary language), this is complicated enough, as there can be 
multiple different encodings for the same "visual character". 
Additionally, it might be viable to do a "case aware" and/or a "not case 
aware" operation. To me it's not clear what "case aware" might mean with 
characters for the ancient Egyptian language.
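
A minimal sketch of the "equal" vs "not equal" case, written in Python 
only for brevity (none of this is FPC or Lazarus API, and the helper 
name same_text is made up for illustration). It assumes that "same 
visual character" means canonical equivalence, so both strings are 
normalized to NFC before comparing, and that "not case aware" means 
Unicode case folding:

```python
import unicodedata


def same_text(a: str, b: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: equality up to canonical equivalence."""
    # NFC composes a base letter plus combining mark into a single
    # code point where one exists, so both spellings become identical.
    a_norm = unicodedata.normalize("NFC", a)
    b_norm = unicodedata.normalize("NFC", b)
    if ignore_case:
        # Case folding can undo normalization, so normalize again.
        a_norm = unicodedata.normalize("NFC", a_norm.casefold())
        b_norm = unicodedata.normalize("NFC", b_norm.casefold())
    return a_norm == b_norm


PRECOMPOSED = "\u00e4"   # "ä" as one code point
DECOMPOSED = "a\u0308"   # "a" followed by COMBINING DIAERESIS
HIEROGLYPH = "\U00013000"  # an Egyptian hieroglyph, which has no case

if __name__ == "__main__":
    print(PRECOMPOSED == DECOMPOSED)            # raw code point compare
    print(same_text(PRECOMPOSED, DECOMPOSED))   # normalized compare
    print(same_text("STRASSE", "stra\u00dfe", ignore_case=True))
    print(HIEROGLYPH.casefold() == HIEROGLYPH)
```

For scripts without case, such as the hieroglyph above, case folding 
simply leaves the text unchanged, so the "not case aware" compare 
degenerates to the plain one.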

If string compare also allows for "greater" vs "smaller" results, the 
programming language needs to impose some sort order (and maybe a lot 
more complex, "locale"-dependent algorithms). This seems horribly 
complicated to me. Rather obviously, you can't define a single natural 
sort order for the complete set of Unicode characters. Thus some kind of 
"localization" is necessary, and it presumably needs to be 
selectable/definable by the user via "locale" or whatever.
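
To illustrate why one fixed order cannot serve everybody, here is a toy 
sketch, again in Python. The two "tailorings" are heavily simplified 
assumptions (German dictionary order treating umlauts like their base 
letter, Swedish placing å, ä, ö after z), and make_ranks and sort_key are 
made-up names, not any real library's API. A real implementation would 
use full collation data rather than a hand-written table:

```python
import unicodedata

BASE = "abcdefghijklmnopqrstuvwxyz"


def make_ranks(locale_name: str) -> dict:
    """Hypothetical, simplified per-locale letter ranking."""
    ranks = {ch: i for i, ch in enumerate(BASE)}
    if locale_name == "de":
        # Simplified German dictionary order: umlaut sorts like base letter.
        ranks.update({"\u00e4": ranks["a"],   # ä
                      "\u00f6": ranks["o"],   # ö
                      "\u00fc": ranks["u"]})  # ü
    elif locale_name == "sv":
        # Simplified Swedish order: å, ä, ö are extra letters after z.
        ranks.update({"\u00e5": 26, "\u00e4": 27, "\u00f6": 28})
    else:
        raise ValueError("no toy tailoring for " + locale_name)
    return ranks


def sort_key(word: str, locale_name: str):
    ranks = make_ranks(locale_name)
    text = unicodedata.normalize("NFC", word).casefold()
    # Letters outside the table sort after all known ones, by code point.
    primary = [ranks.get(ch, 1000 + ord(ch)) for ch in text]
    # The normalized text breaks ties between equal primary keys.
    return (primary, text)


WORDS = ["zebra", "\u00e4pple", "apfel", "\u00e4rger", "arm"]

if __name__ == "__main__":
    for name in ("de", "sv"):
        print(name, sorted(WORDS, key=lambda w: sort_key(w, name)))
```

The same five words come out in a different order for each toy locale, 
which is the reason the order has to be a parameter chosen by the user 
rather than a property of the encoding.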

-Michael



