[Lazarus] Unable to display symbols

Mattias Gaertner nc-gaertnma at netcologne.de
Sun Mar 9 09:36:28 CET 2014


On Sun, 09 Mar 2014 01:30:36 +0100
Giuliano Colla <giuliano.colla at fastwebnet.it> wrote:

>[...]
> I was aware of that. My problem is that the char I must add to the UTF8 
> string is calculated at run time, and is in the range Unicode $A0-$BF.

The Unicode ranges are given in "code points". These are abstract
values that must be encoded in bytes. The most common
encodings are UTF-8 and UTF-16.
Code point $A0 is encoded as two bytes in UTF-8: $C2 $A0.
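
To make the difference between a code point and its encodings concrete,
here is a minimal sketch in Python (chosen only for illustration, it is
not FPC code). It uses the built-in codecs and the variable names are
my own:

```python
# Code point U+00A0, the low end of the $A0-$BF range discussed here.
code_point = 0xA0

# The same abstract value, encoded in two different ways.
utf8_bytes = chr(code_point).encode("utf-8")
utf16_bytes = chr(code_point).encode("utf-16-le")

print(utf8_bytes.hex())   # c2a0
print(utf16_bytes.hex())  # a000
```

The code point is the same in both cases; only the byte sequence differs.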

 
> I had assumed (wrongly) that the compiler was smart enough to convert a 
> type "char" to UTF8,

A char is not a code point. A char is an element of a string.
Every byte-based encoding consists of chars, and so does UTF-8.


> when concatenating it to a UTF8 string. Instead it 
> turns out that the character is appended as it is, which leads to an 
> invalid UTF8 character (above 127), which displays as a crossed box.
> IMHO that's an FPC bug.

It's not a bug.
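
A rough sketch of what happens, again in Python and only as an
illustration (the helper is_valid_utf8 is a hypothetical name of mine,
and this is not a model of FPC's string handling): appending the code
point value as a single byte gives an invalid UTF-8 sequence, while
appending its two-byte encoding gives a valid one.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

text = "x = ".encode("utf-8")                # a valid UTF-8 byte string

raw = text + bytes([0xA0])                   # value $A0 appended as one byte
encoded = text + chr(0xA0).encode("utf-8")   # UTF-8 encoding $C2 $A0 appended

print(is_valid_utf8(raw))      # False
print(is_valid_utf8(encoded))  # True
```

A lone byte $A0 is a continuation byte without a lead byte, which is
why a viewer may show it as a replacement box.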

 
> When I realized that, I then tried to explicitly convert the Unicode 
> char to UTF8, but again I failed, this time because of the default 
> behavior which is to map char <-> Unicode only in the range 0-127. 

That's because UTF-8 maps Unicode 0-127 to one byte with the same
value as the code point.
Above that it uses sequences of two to four bytes.
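
The mapping itself is simple bit arithmetic. A hand-written sketch in
Python, covering only code points below $800 (one or two bytes), with
encode_utf8 being a hypothetical helper name and not a library function:

```python
def encode_utf8(code_point: int) -> bytes:
    """Encode a code point below $800 as UTF-8 (illustrative helper)."""
    if code_point < 0x80:
        # 0-127: one byte with the same value as the code point.
        return bytes([code_point])
    if code_point < 0x800:
        # $80-$7FF: lead byte 110xxxxx, continuation byte 10xxxxxx.
        return bytes([0xC0 | (code_point >> 6), 0x80 | (code_point & 0x3F)])
    raise ValueError("only code points below $800 are handled in this sketch")

for cp in (0x41, 0x7F, 0xA0, 0xBF):
    print(f"${cp:02X} -> {encode_utf8(cp).hex()}")
```

For the whole range $A0-$BF the result is the lead byte $C2 followed by
a byte equal to the code point value.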


> Anything above 127 becomes a question mark.
> Therefore my symbol displays as a question mark.
> IMHO that's a silly FPC limitation.

Maybe you underestimate FPC.
FPC supports various source encodings. Lazarus uses UTF-8 by default.

 
>[...]

There are some useful UTF-8 functions in the units LazUTF8 and LazFileUtils.

Mattias



