[Lazarus] logic bug in many (or most) TSynEdit
Felipe Monteiro de Carvalho
felipemonteiro.carvalho at gmail.com
Sat Jun 5 10:55:10 CEST 2010
2010/6/5 ik <idokan at gmail.com>:
> How is so ? here is a multi-byte char: א . It takes more then a word to be
> used,
UTF-8 implements its text support in such a way that characters
which require more than 1 byte are formed using only byte values in
the range 128 to 255, which are all valid Extended ASCII values. For
example, save your א in a text file and in a UNIX shell use
"od -x file.txt"
You will see that this character is represented by the bytes d7 90 in
UTF-8 (od -x groups bytes into 16-bit words, so on a little-endian
machine it is printed as 90d7).
Or in Decimal: 215 144
Now go to the extended ASCII table and you will see that both are
valid Extended ASCII values: http://www.asciitable.com/
The same is valid for all other UTF-8 characters.
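If you don't have a UNIX shell at hand, the same check can be sketched in a few lines of Python (used here only for illustration, it is not Lazarus code; the variable names are made up for the example):

```python
# U+05D0 HEBREW LETTER ALEF, the character from the example above
alef = "\u05d0"
encoded = alef.encode("utf-8")

hex_bytes = [format(b, "02x") for b in encoded]  # bytes in file order, as hex
dec_bytes = list(encoded)                        # the same bytes in decimal
all_high = all(b >= 0x80 for b in encoded)       # True if no byte is plain ASCII

print(hex_bytes, dec_bytes, all_high)
```

Both bytes come out above 127, so neither can be confused with a plain ASCII character such as a space or a semicolon.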
> so you can not do S[i] because it will provide you only part of the
> char (one byte).
S[i] returns a byte, not a character. If your character has 2 bytes
then S[i] will return the first byte and S[i+1] will return the second
byte.
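A rough illustration of the byte indexing, again in Python rather than Pascal. Assume the string is held as raw UTF-8 bytes, as a Lazarus string would be; note that Python indexes from 0 while Pascal strings index from 1, and the variable names are only for this sketch:

```python
# "a", then the 2-byte character from the example, then "b"
s = "a\u05d0b".encode("utf-8")

length_in_bytes = len(s)  # counts bytes, not characters
first = s[1]              # first byte of the 2-byte character
second = s[2]             # second byte of the same character

# Putting the two bytes back together gives the original character.
rejoined = bytes([first, second]).decode("utf-8")

print(length_in_bytes, first, second, rejoined)
```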
So, it doesn't matter if this part of SynEdit thinks that your
identifier is actually 2 characters which read "×", the corresponding
Extended ASCII for your original character. It works just the same.
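To make the argument concrete, here is a minimal sketch of a byte-wise identifier scanner. This is not SynEdit's actual code: is_ident_byte and scan_identifier are hypothetical names, and the sketch assumes the scanner accepts every byte above 127 as part of an identifier.

```python
def is_ident_byte(b: int) -> bool:
    # Assumption: ASCII letters, digits, underscore, or any byte >= 0x80
    return b >= 0x80 or chr(b).isalnum() or b == ord("_")


def scan_identifier(data: bytes, start: int) -> bytes:
    # Walk forward one byte at a time until a non-identifier byte is found.
    end = start
    while end < len(data) and is_ident_byte(data[end]):
        end += 1
    return data[start:end]


# An identifier containing the 2-byte character, followed by " := 1"
line = "x\u05d0y := 1".encode("utf-8")
ident = scan_identifier(line, 0)

print(len(ident), ident.decode("utf-8"))
```

The scanner sees 4 bytes where the user sees 3 characters, but it stops at the same place (the space), so the identifier it returns is still intact.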
--
Felipe Monteiro de Carvalho