[Lazarus] Some questions on Unicode wiki page

Reinier Olislagers reinierolislagers at gmail.com
Fri Jun 22 11:03:56 CEST 2012


Hi list,

I'm diving into properly dealing with UTF-8 encoded data coming in from
my Twitter application/library (see FPC list announcement yesterday).

I went over the Lazarus wiki page on Unicode and cleaned up some
typos etc.:
http://wiki.lazarus.freepascal.org/LCL_Unicode_Support

I've got some questions:
- Widestrings and Ansistrings section
What is a widestring? A UTF-16 encoded string? Or a string whose
elements span multiple bytes but whose encoding is otherwise undefined?

- Searching a substring section
Searching for UTF-8 substrings has this code:
uses lazutf8;
...
  BytePos:=Pos(SearchFor,aText);
  CharacterPos:=UTF8Length(PChar(aText),BytePos-1);
  writeln('The substring "',SearchFor,'" is in the text "',aText,'"',
    ' at byte position ',BytePos,
    ' and at character position ',CharacterPos);

It says:
"Due to the special nature of UTF8 you can simply use the normal string
functions"
=> Is the special nature the fact that a UTF-8 byte sequence can be
decoded in only one way, because the bytes of a multi-byte character
have different high bits set (as explained in the description of UTF-8
at the bottom of the page)?
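
To make concrete what I think is meant, here is a minimal sketch of my
own (not taken from the wiki). It assumes that continuation bytes always
look like 10xxxxxx, so a byte-wise Pos can never match in the middle of
a character. CountCodePoints is a hypothetical helper I wrote for
illustration, not a lazutf8 function, and the sample text is made up.

```pascal
program Utf8PosSketch;
{$mode objfpc}{$H+}

// Hypothetical helper (not part of lazutf8): counts the UTF-8 code points
// in the first ByteCount bytes of S by skipping continuation bytes
// (bit pattern 10xxxxxx). Assumes ByteCount <= Length(S) and valid UTF-8.
function CountCodePoints(const S: string; ByteCount: Integer): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to ByteCount do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  aText, SearchFor: string;
  BytePos, CharacterPos: Integer;
begin
  // e-acute written as its two UTF-8 bytes ($C3 $A9), so the encoding of
  // this source file does not matter
  aText := 'caf' + #$C3#$A9 + ' au lait';
  SearchFor := 'au';

  // Plain byte-wise search; safe because a valid UTF-8 substring cannot
  // start or end inside another character
  BytePos := Pos(SearchFor, aText);

  // Code points before the match, plus 1 to get a 1-based position
  // (the +1 is my choice here, to mirror the 1-based BytePos)
  CharacterPos := CountCodePoints(aText, BytePos - 1) + 1;

  writeln('byte position ', BytePos, ', character position ', CharacterPos);
end.
```

If I have this right, the byte position comes out one higher than the
character position, because the e-acute occupies two bytes.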

- No Unicode support on Win9x section
"Windows platforms <=Win9x [..] only partially support Unicode"
however:
"Win 9x and NT offer two parallel sets of API functions:[..]the new,
Unicode enabled *W."
Presumably not all *A functions are available as *W functions on Win9x,
which is why Unicode is not fully supported?
(I can also imagine the default fonts etc do not support showing Unicode
characters)

Thanks for any clarification; I'll update the wiki...

Thanks,
Reinier



