[Lazarus] cwstring in arm-linux
Graeme Geldenhuys
graemeg.lists at gmail.com
Fri Oct 21 09:03:51 CEST 2011
On 2011-10-21 00:20, Hans-Peter Diettrich wrote:
> your legacy code can assume that every (visible) character is a Char, in
> an SBCS codepage, this is not different in UTF-16.
Rookie mistake!!! You forgot surrogate pairs in UTF-16. Think outside
the Unicode BMP, where a "visible" character takes 4 bytes, thus two
UTF-16 Char values. As I mentioned earlier, most programmers using
UTF-16 treat it like UCS-2, forgetting that they need to check for
surrogate pairs too.
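
To make the point concrete, here is a rough sketch of the check that
UCS-2-style code skips. It is written in Python only for brevity, the
helper names are made up for this example, and it illustrates the idea
rather than any particular library's behaviour:

```python
import struct

def utf16_code_units(text: str) -> list:
    """Encode text as UTF-16 (little-endian, no BOM) and return its 16-bit code units."""
    raw = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))

def count_code_points(units: list) -> int:
    """Count code points, treating a high+low surrogate pair as one."""
    count = 0
    i = 0
    while i < len(units):
        # A high surrogate (0xD800..0xDBFF) followed by a low surrogate
        # (0xDC00..0xDFFF) encodes a single code point outside the BMP.
        if (0xD800 <= units[i] <= 0xDBFF and i + 1 < len(units)
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            i += 2
        else:
            i += 1
        count += 1
    return count

units = utf16_code_units("A\U0001D11E")  # 'A' plus MUSICAL SYMBOL G CLEF (U+1D11E)
print(len(units))                # 3 code units
print(count_code_points(units))  # 2 code points
```

Code that assumes one Char per character would report 3 here instead of 2.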
Now in UTF-8, this is not a problem at all. Finding a visible character
in the BMP or a Supplementary Plane is an identical process; no special
checking is required. That makes UTF-8 much easier and safer to use.
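
As a rough illustration (again in Python for brevity, with a made-up
function name, and assuming the input is well-formed UTF-8), one rule
covers every code point regardless of which plane it lives in:

```python
def count_code_points_utf8(data: bytes) -> int:
    """Count code points by counting bytes that are not continuation bytes."""
    # Continuation bytes have the bit pattern 10xxxxxx; every other byte
    # starts a new code point, whether it is ASCII, BMP, or supplementary.
    return sum(1 for b in data if (b & 0xC0) != 0x80)

sample = "A\u00e9\u20ac\U0001D11E".encode("utf-8")  # 1-, 2-, 3- and 4-byte sequences
print(len(sample))                     # 10 bytes
print(count_code_points_utf8(sample))  # 4 code points
```

The 4-byte sequence for U+1D11E goes through exactly the same test as
the 2-byte and 3-byte BMP sequences.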
I've ported enough Delphi code to FPC + fpGUI where UTF-8 is used for
Unicode support. I fully agree with Felipe: using UTF-8 is much easier
with legacy code than UTF-16.
Regards,
- Graeme -
--
fpGUI Toolkit - a cross-platform GUI toolkit using Free Pascal
http://fpgui.sourceforge.net/