[Lazarus] Unable to display symbols
Mattias Gaertner
nc-gaertnma at netcologne.de
Sun Mar 9 09:36:28 CET 2014
On Sun, 09 Mar 2014 01:30:36 +0100
Giuliano Colla <giuliano.colla at fastwebnet.it> wrote:
>[...]
> I was aware of that. My problem is that the char I must add to the UTF-8
> string is calculated at run time, and is in the Unicode range $A0-$BF.
The Unicode ranges are given in "code points". These are abstract
values that must be encoded in bytes. The most common
encodings are UTF-8 and UTF-16.
Code point $A0 is encoded as two bytes in UTF-8: $C2 $A0.
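
As a quick check of the code point vs. bytes distinction, here is a minimal
sketch. Python is used only as a neutral way to print the bytes; it is not
part of the original discussion, and the helper name utf8_bytes is
hypothetical.

```python
# Encode a few code points from the range U+00A0..U+00BF as UTF-8
# and print the resulting byte sequences.

def utf8_bytes(code_point: int) -> bytes:
    """Return the UTF-8 byte sequence for a single Unicode code point."""
    return chr(code_point).encode("utf-8")

for cp in (0xA0, 0xA9, 0xBF):
    # Every code point in this range needs two bytes in UTF-8.
    print(f"U+{cp:04X} -> {utf8_bytes(cp).hex(' ').upper()}")
# prints:
# U+00A0 -> C2 A0
# U+00A9 -> C2 A9
# U+00BF -> C2 BF
```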
> I had assumed (wrongly) that the compiler was smart enough to convert a
> type "char" to UTF8,
A char is not a code point. A char is an element of a string, i.e. one byte.
Every byte encoding is a sequence of chars, and so is UTF-8.
> when concatenating it to a UTF-8 string. Instead it
> turns out that the character is appended as it is, which leads to an
> invalid UTF8 character (above 127), which displays as a crossed box.
> IMHO that's an FPC bug.
It's not a bug.
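
The byte-level effect described in the quoted text can be reproduced outside
FPC. The following is only an illustrative sketch in Python, not FPC code;
the names raw, encoded and is_valid_utf8 are hypothetical. It appends the
single byte $A0 to a UTF-8 byte string and compares that with appending the
UTF-8 encoding of code point $A0.

```python
# A byte string that is valid UTF-8 (pure ASCII here).
text = "x = ".encode("utf-8")

# Appending the value $A0 as one raw byte, unchanged.
raw = text + bytes([0xA0])

# Appending code point U+00A0 after encoding it to UTF-8 ($C2 $A0).
encoded = text + "\u00a0".encode("utf-8")

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

print(is_valid_utf8(raw))      # False: a lone $A0 is not a valid sequence
print(is_valid_utf8(encoded))  # True
```

The lone byte $A0 is a continuation byte without a lead byte, which is why a
UTF-8 aware display shows a replacement glyph for it.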
> When I realized that, I then tried to explicitly convert the Unicode
> char to UTF8, but again I failed, this time because of the default
> behavior which is to map char <-> Unicode only in the range 0-127.
That's because UTF-8 maps Unicode 0-127 to one byte with the same
value as the code point.
Above that it uses multi-byte sequences, e.g. two bytes for $80-$7FF.
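
To make the mapping concrete, here is a minimal sketch of the one-byte and
two-byte cases written out by hand. It is in Python purely for illustration,
the function name encode_utf8_up_to_7ff is hypothetical, and it deliberately
does not handle code points above $7FF (three- and four-byte sequences).

```python
def encode_utf8_up_to_7ff(cp: int) -> bytes:
    """Encode a code point in the range 0..$7FF as UTF-8 by hand."""
    if cp < 0:
        raise ValueError("code point must not be negative")
    if cp < 0x80:
        # 0..127: one byte with the same value as the code point.
        return bytes([cp])
    if cp < 0x800:
        # $80..$7FF: two bytes, 110xxxxx 10xxxxxx.
        lead = 0xC0 | (cp >> 6)      # upper 5 bits of the code point
        cont = 0x80 | (cp & 0x3F)    # lower 6 bits of the code point
        return bytes([lead, cont])
    raise ValueError("this sketch only covers code points up to $7FF")

print(encode_utf8_up_to_7ff(0x41).hex().upper())  # 41
print(encode_utf8_up_to_7ff(0xA0).hex().upper())  # C2A0
```

For $A0 the upper bits give $C0 or 2 = $C2 and the lower six bits give
$80 or $20 = $A0, which matches the $C2 $A0 sequence mentioned earlier.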
> Anything above 127 becomes a question mark.
> Therefore my symbol displays as a question mark.
> IMHO that's a silly FPC limitation.
Maybe you underestimate FPC.
FPC supports various source encodings. Lazarus uses UTF-8 by default.
>[...]
There are some useful UTF-8 functions in the units LazUTF8 and LazFileUtils.
Mattias