[Lazarus] Unable to display symbols

Flávio Etrusco flavio.etrusco at gmail.com
Sun Mar 9 07:17:36 CET 2014


On Sat, Mar 8, 2014 at 9:30 PM, Giuliano Colla
<giuliano.colla at fastwebnet.it> wrote:
> Il 08/03/2014 14:05, Mattias Gaertner ha scritto:
>>
>> On Sat, 08 Mar 2014 13:52:12 +0100
>> Giuliano Colla <giuliano.colla at fastwebnet.it> wrote:
>> [snip]
>>
>>> What am I doing wrong?
>>
>> The solution lies in your own mail. Your mail is UTF-8 and so are the
>> above characters. Just copy the above characters from your mail to the
>> IDE:
>>
>> aText := 'SomeString¥';
>>
>> Note: Keep in mind that "char" is a single byte, but ¥ in UTF-8 is
>> multiple bytes.
>>
>
> I was aware of that. My problem is that the char I must add to the UTF-8
> string is calculated at run time, and is in the Unicode range $A0-$BF.
>
> I had assumed (wrongly) that the compiler was smart enough to convert a
> value of type "char" to UTF-8 when concatenating it to a UTF-8 string.
> Instead, the character is appended as is, which produces an invalid UTF-8
> byte sequence (a lone byte above 127) that displays as a crossed box.
> IMHO that's an FPC bug.
>
> When I realized that, I tried to convert the Unicode char to UTF-8
> explicitly, but again I failed, this time because the default behavior is
> to map char <-> Unicode only in the range 0-127. Anything above 127 becomes
> a question mark.
> Therefore my symbol displays as a question mark.
> IMHO that's a silly FPC limitation.
>
> The only way out I found was to build a table of UTF-8 strings in the correct
> range (corresponding to the Unicode symbols from $A0 to $BF), and use my
> calculated value as an index into that table.
>
> Rather cumbersome and inelegant, considering that the unit ustrings has
> a UnicodeToUtf8 routine which performs exactly the required conversion,
> but which can't be used because of the above silly limitation.
>
> Giuliano

Are you aware of the $CODEPAGE directive? And that Lazarus, unless told
otherwise, saves source files in UTF-8 and tells FPC they are
encoded in UTF-8?

Regards,
Flávio
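
For reference, the run-time case from the quoted message (appending a
code point in the range $A0-$BF to a UTF-8 string) can be sketched by
encoding the code point by hand. This is only a minimal illustration,
not anything proposed in the thread: CodePointToUTF8 is a hypothetical
helper name, Symbol stands in for the calculated value, and the sketch
covers code points below $800 only. The source is pure ASCII, so it
should not depend on the $CODEPAGE setting.

```pascal
program AppendCodePoint;

{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper (not an FPC or Lazarus routine): encodes a code
// point below $800 as one or two UTF-8 bytes.
function CodePointToUTF8(CodePoint: Cardinal): string;
begin
  if CodePoint < $80 then
  begin
    // ASCII range: one byte, identical to the code point.
    SetLength(Result, 1);
    Result[1] := Chr(Byte(CodePoint));
  end
  else if CodePoint < $800 then
  begin
    // Two-byte form: 110xxxxx 10xxxxxx.
    SetLength(Result, 2);
    Result[1] := Chr(Byte($C0 or (CodePoint shr 6)));
    Result[2] := Chr(Byte($80 or (CodePoint and $3F)));
  end
  else
    Result := '?'; // longer sequences are outside the scope of this sketch
end;

var
  aText: string;
  Symbol: Cardinal;
  i: Integer;
begin
  // Assumption: U+00A5 (the yen sign) stands in for the run-time value.
  Symbol := $A5;
  aText := 'SomeString' + CodePointToUTF8(Symbol);

  // Print the bytes in hex so the output does not depend on the
  // console encoding.
  for i := 1 to Length(aText) do
    Write(IntToHex(Ord(aText[i]), 2), ' ');
  WriteLn;
end.
```

For every code point from $80 to $BF the two-byte form works out to the
byte $C2 followed by the code point's own value, so U+00A5 becomes
C2 A5. That is why appending the bare char (a lone $A5 byte) yields
invalid UTF-8.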