[Lazarus] Some questions on Unicode wiki page
Reinier Olislagers
reinierolislagers at gmail.com
Fri Jun 22 11:03:56 CEST 2012
Hi list,
I'm diving into properly dealing with UTF-8 encoded data coming in from
my Twitter application/library (see yesterday's announcement on the FPC
list). I decided to go over the Lazarus wiki page on Unicode and cleaned
up some typos etc.:
http://wiki.lazarus.freepascal.org/LCL_Unicode_Support
I've got some questions:
- Widestrings and Ansistrings section
What is a widestring? A UTF-16 encoded string? Or a string made up of
multi-byte elements whose encoding is otherwise undefined?
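A small experiment that may help narrow this down. It is only a sketch in
plain FPC (no LCL units), and it shows nothing more than the element size
and how Length counts; it does not prove what the RTL or LCL assume about
the encoding. The program name is made up for illustration.

```pascal
program widelen;
{$mode objfpc}{$H+}
var
  w: WideString;
begin
  // U+1D11E (musical symbol G clef) lies outside the Basic Multilingual
  // Plane, so UTF-16 encodes it as the surrogate pair D834 DD1E.
  SetLength(w, 2);
  w[1] := WideChar($D834);
  w[2] := WideChar($DD1E);
  // Each element of a WideString is a 16-bit WideChar.
  writeln('SizeOf(WideChar) = ', SizeOf(WideChar));
  // Length counts 16-bit elements, not code points.
  writeln('Length(w)        = ', Length(w));
end.
```

If Length reports 2 for this single code point, that is at least consistent
with a widestring being a sequence of UTF-16 code units.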
- Searching a substring section
The part on searching UTF-8 substrings has this code:
uses lazutf8;
...
BytePos:=Pos(SearchFor,aText);
CharacterPos:=UTF8Length(PChar(aText),BytePos-1);
writeln('The substring "',SearchFor,'" is in the text "',aText,'"',
' at byte position ',BytePos,' and at character position ',CharacterPos);
It says:
"Due to the special nature of UTF8 you can simply use the normal string
functions"
=> Is the special nature the fact that there is only one way to decode a
UTF-8 byte sequence, because the bytes of multi-byte characters have
distinctive high bits set (as explained in the description of UTF-8 at
the bottom of the page)?
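To make the question concrete, here is a minimal sketch of the property I
have in mind, in plain FPC without lazutf8. CountCodePoints is a
hypothetical helper written for this example, not a function from any
unit; it relies only on the fact that UTF-8 continuation bytes have the
bit pattern 10xxxxxx and so can never be mistaken for the start of a
character.

```pascal
program utf8search;
{$mode objfpc}{$H+}

// Hypothetical helper (not part of lazutf8): counts the code points in
// the first ByteCount bytes of s by skipping continuation bytes.
// Assumes s is valid UTF-8 and ByteCount <= Length(s).
function CountCodePoints(const s: string; ByteCount: Integer): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to ByteCount do
    if (Ord(s[i]) and $C0) <> $80 then  // not 10xxxxxx: starts a character
      Inc(Result);
end;

var
  aText, SearchFor: string;
  BytePos: Integer;
begin
  // The UTF-8 bytes are spelled out with Chr() so that the encoding of
  // the source file does not matter: C3 A9 is e-acute, C3 B6 is o-umlaut.
  aText := 'caf' + Chr($C3) + Chr($A9) + ' sch' + Chr($C3) + Chr($B6) + 'n';
  SearchFor := 'sch' + Chr($C3) + Chr($B6) + 'n';

  // The ordinary byte-wise Pos finds the substring.
  BytePos := Pos(SearchFor, aText);
  writeln('byte position: ', BytePos);
  writeln('characters before the match: ',
          CountCodePoints(aText, BytePos - 1));
end.
```

Here the match starts at byte 7, with 5 characters before it, because the
e-acute takes two bytes. Note that the second number is a count of the
characters in front of the match, so the 1-based character position is
one higher.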
- No Unicode support on Win9x section
"Windows platforms <=Win9x [..] only partially support Unicode"
however:
"Win 9x and NT offer two parallel sets of API functions:[..]the new,
Unicode enabled *W."
Presumably not all *A functions are available as *W functions on Win9x,
which is why Unicode is not fully supported?
(I can also imagine the default fonts etc do not support showing Unicode
characters)
Thanks for any clarification; I'll update the wiki...
Thanks,
Reinier