[Lazarus] Adding codepage-support to the RTL (making LConvEncoding obsolete)
Marc Weustink
marc at dommelstein.net
Sat Dec 4 15:12:30 CET 2010
On 3-12-2010 21:51, Guy Fink wrote:
>> In some languages some Unicode code points have a different
>> uppercase/lowercase pair. For example, "i" is uppercased to "I" in
>> English (and most other languages), while in Turkish it is
>> uppercased to capital I with dot above, U+0130 (I cannot write it here).
>>
>> Take a look over: "Why Applications Fail With The Turkish
>> Language" at
>> http://www.i18nguy.com/unicode/turkish-i18n.htm
>
> There is no information about the language in a string, not even in a
> UnicodeString. So it is impossible to react to this point here.
IMO there is no need to have a language encoded in the string. Strings
won't get autoconverted to upper/lowercase. It's always an explicit call
by the user to Upper/Lowercase(S).
> The uppercase/lowercase tables have been generated purely from the
> official Unicode character descriptions. Characters having "SMALL"
> in their description are replaced by the one having "CAPITAL" in that
> place and vice versa (only if the counterpart exists). You can't do
> more on this level. Please feel free to implement the functionality
> you mention, I'm sure it will be appreciated.
To take the language into account when converting, functions like
Upper/Lowercase should have a second, optional parameter indicating for
which language the conversion should be done.
Then the default conversion can still take place, but based on the
specified language, the exceptions can be implemented (if there aren't
many exceptions, a simple case will do).
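A minimal sketch of what such a function could look like. This is not RTL code: the names UpperCaseLang and TCaseLanguage are hypothetical, and the default conversion here only handles ASCII a..z to keep the example self-contained. A real version would fall back to the generated Unicode tables instead.

```pascal
program LangUpperSketch;

{$mode objfpc}{$H+}

type
  // Hypothetical language selector; not part of the RTL.
  TCaseLanguage = (clDefault, clTurkish);

// Hypothetical function: uppercases S, applying language-specific
// exceptions first and the default conversion otherwise.
function UpperCaseLang(const S: UnicodeString;
  Lang: TCaseLanguage = clDefault): UnicodeString;
var
  i: Integer;
  C: WideChar;
begin
  Result := S;
  for i := 1 to Length(Result) do
  begin
    C := Result[i];

    // Language-specific exceptions.
    case Lang of
      clDefault: ;  // no exceptions
      clTurkish:
        if C = 'i' then
        begin
          // Turkish: "i" uppercases to capital I with dot above.
          Result[i] := WideChar($0130);
          Continue;
        end;
    end;

    // Default conversion (assumption: ASCII only in this sketch).
    if (C >= 'a') and (C <= 'z') then
      Result[i] := WideChar(Ord(C) - 32);
  end;
end;

begin
  // Without a language the default mapping applies: "i" becomes "I".
  WriteLn(UpperCaseLang('i') = 'I');
  // With clTurkish the exception applies: "i" becomes U+0130.
  WriteLn(UpperCaseLang('i', clTurkish) = WideChar($0130));
end.
```

Existing callers that pass only the string keep the current behaviour, since the language parameter defaults to clDefault.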
Marc