[Lazarus] cwstring in arm-linux

Felipe Monteiro de Carvalho felipemonteiro.carvalho at gmail.com
Thu Oct 20 22:26:05 CEST 2011


On Thu, Oct 20, 2011 at 2:54 PM, Michael Schnell <mschnell at lumino.de> wrote:
> And thus functions like pos(), length() and myString[i] work on UTF-8 code
> bytes rather than on (displayed) characters.

A single displayed character can be composed of separate code points
for the accent and the base character (so at least 4 bytes in UTF-16).
So if you write code which depends on [] indexing whole characters,
your code will fail miserably in this case, in UTF-16 just as in UTF-8.
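
To make this concrete, here is a small sketch in Python (not Pascal,
chosen only because it is easy to run; the variable names are mine).
It assumes "é" written as a base letter plus U+0301 COMBINING ACUTE
ACCENT and counts what each encoding sees:

```python
# One displayed character, "é", stored in decomposed form:
# base letter 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"

# Number of code points: the letter and the accent are separate.
code_points = len(decomposed)

# Size in bytes when encoded (no byte order mark with the -le variant).
utf16_bytes = len(decomposed.encode("utf-16-le"))
utf8_bytes = len(decomposed.encode("utf-8"))

# Indexing picks out a code point, not a displayed character,
# so the accent is silently dropped here.
first = decomposed[0]

print(code_points, utf16_bytes, utf8_bytes, repr(first))
```

Indexing by code point or by UTF-16 code unit both split the accent
from its letter, so a wider encoding alone does not fix the problem.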

Mac OS X uses the decomposed form in UTF-8 to store filenames, which
is rather unpleasant. If you convert this to UTF-16 for further work,
the text will not magically get composed, although one could pass it
through a composing pre-processor (that is, normalize it to NFC).

-- 
Felipe Monteiro de Carvalho



