[Lazarus] Does Lazarus support a complete Unicode Component Library?

Juha Manninen juha.manninen62 at gmail.com
Sat Jan 1 22:29:08 CET 2011


Vladimir Zhirov wrote on Saturday, 1 January 2011 22:14:32:
> Sven Barth wrote:
> > You need to convert the UTF8 string to a different one, e.g.
> > UTF16:
> > 
> > var
> >    us: UnicodeString;
> > begin
> >    us := UTF8Encode(s);
> > end;
> > 
> > Now us[2] will return the a-umlaut.
> 
> I would suggest using Utf8Copy(s, 2, 1) instead. It helps
> to avoid conversion and works correctly even for characters
> that take 4 bytes in UnicodeString/WideString (i.e. 2
> wide characters). Utf8Copy is declared in LCLProc unit.

So the conversion is only needed if a character inside the string is 
accessed by index?

I understand the principle, but I don't understand how the functions 
UTF8Encode and UTF8Decode work. Of course I don't need to understand such 
details because I am not an FPC developer, but anyway ...

UTF8Encode returns a UTF8String, and its AnsiString parameter is internally 
typecast to UnicodeString. How can that work?

Maybe Sven's example should use UTF8Decode instead. It returns a 
UnicodeString. According to the debugger, both functions convert the string 
to uppercase and add some garbage to the beginning and end, but that may be 
a debugger error.
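
To make the question concrete, here is a minimal sketch of the indexing 
example with UTF8Decode in place of UTF8Encode. It is only an illustration 
under my own assumptions: the program name and the sample string 
("h", a-umlaut, "t") are made up, and the UTF-8 bytes are written one by one 
so that the result does not depend on the source file encoding.

```pascal
program Utf8IndexSketch;  // hypothetical name, for illustration only
{$mode objfpc}{$H+}

var
  s: UTF8String;      // holds UTF-8 encoded bytes
  us: UnicodeString;  // holds UTF-16 code units
begin
  // Build "h", a-umlaut, "t" byte by byte.
  // In UTF-8 the a-umlaut (U+00E4) is the two bytes $C3 $A4.
  SetLength(s, 4);
  s[1] := 'h';
  s[2] := #$C3;
  s[3] := #$A4;
  s[4] := 't';

  // Decode UTF-8 to UTF-16; a-umlaut becomes a single code unit.
  us := UTF8Decode(s);

  WriteLn(Length(s));         // 4: byte count of the UTF-8 string
  WriteLn(Length(us));        // 3: UTF-16 code unit count
  WriteLn(Ord(us[2]) = $E4);  // TRUE: us[2] is the a-umlaut
end.
```

Indexing s[2] directly would give only the first byte of the a-umlaut, which 
is why some conversion, or a helper such as the Utf8Copy that Vladimir 
mentions, is needed for access by character position.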


Regards,
Juha



