[Lazarus] UTF8String and UTF8Delete
Sven Barth
pascaldragon at googlemail.com
Sat Dec 12 18:20:37 CET 2015
On 12.12.2015 12:46, Jürgen Hestermann wrote:
> On 2015-12-11 at 19:14, Sven Barth wrote:
> > Windows uses multi byte strings (one byte per character or more)
> > and UTF-16 (which is mostly 2 Byte and 4 for surrogate pairs).
> > The functions WideCharToMultiByte and MultiByteToWideChar which
> > are also used inside FPC for string conversions both take a
> > CodePage parameter that can also be CP_UTF8.
>
> As far as I know, (current) Windows versions only use UTF16 internally.
> But it provides the old legacy ANSI functions too (which convert to UTF16).
> MultiByteToWideChar and WideCharToMultiByte are just helper
> functions to convert from arbitrary encodings to UTF16 (and back).
> But UTF-8 is not used anywhere internally in Windows (not even ANSI anymore,
> except the legacy functions which convert to and from UTF16) and
> you cannot use UTF8 as string encoding for WIN API functions.
> Otherwise we would not have this problem and could use UTF-8 as
> a standard for everything.
Yes, internally Windows uses UTF-16. But if you set your Windows Ansi
code page, or at least the current thread's locale, to UTF-8 (indirectly,
by choosing a locale that has UTF-8 as its code page; I don't know of one
right now, though), then the *A functions *do* work with UTF-8. They simply
use the current locale's code page to convert from Ansi to Unicode, and in
that case "Ansi" includes UTF-8.
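
To illustrate the point about the code page deciding how the bytes are
read, here is a rough sketch in Python. It is only a model of the
conversion step, not the Windows API: the helper name ansi_to_utf16 is
made up for this example, and Python's codecs ("cp1252", "utf-8") stand
in for the Ansi code page that an *A function would consult.

```python
# Model of the Ansi -> UTF-16 step: decode the incoming bytes with the
# active code page, then re-encode as UTF-16 (little endian).
# ansi_to_utf16 is a hypothetical helper, not a Windows or FPC function.

def ansi_to_utf16(data: bytes, codepage: str) -> bytes:
    """Interpret `data` in `codepage` and return the UTF-16LE bytes."""
    return data.decode(codepage).encode("utf-16-le")


# UTF-8 encoded input: the 'ü' occupies two bytes (C3 BC).
raw = "Jürgen".encode("utf-8")

# If the code page is UTF-8, the bytes are read as intended.
as_utf8 = ansi_to_utf16(raw, "utf-8")

# If the code page is Windows-1252, each byte of 'ü' becomes its own
# character, so the text arrives garbled.
as_cp1252 = ansi_to_utf16(raw, "cp1252")

print(as_utf8.decode("utf-16-le"))
print(as_cp1252.decode("utf-16-le"))

# A character outside the BMP needs a surrogate pair, i.e. 4 bytes in UTF-16.
print(len("\U0001F600".encode("utf-16-le")))
```

The same byte string yields correct text under one code page and garbled
text under the other, which is why the *A functions can only handle UTF-8
when the code page they convert from is itself UTF-8.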
Regards,
Sven