[Lazarus] API-Import of "PUTF8Char"
Mattias Gaertner
nc-gaertnma at netcologne.de
Sun Sep 4 10:33:18 CEST 2016
On Sat, 3 Sep 2016 23:55:44 +0200
Martok <listbox at martoks-place.de> wrote:
> Hi List,
>
> I'm writing an API interface that passes #0-terminated cstrings with encoding
> UTF8. What data type should be used to declare these parameters so that I may be
> able to use as much of 3.0+'s automatic encoding conversion as possible?
>
> Some example declarations would look like:
> procedure SetUserID(const NewValue: PUTF8Char);
> function GetUserID(const Buf: PUTF8Char; const BufLength: UInt32): UInt32;
>
> If I read the wiki correctly, PAnsiChar would not be clear as it is always
> assumed to be CP_ACP, causing the compiler to generate conversions to
> DefaultSystemCodePage. I'm posting this to the Lazarus list instead of
> fpc-pascal because I already use LazUTF8 so CP_ACP really is CP_UTF8, but I want
> to be sure that the header always works whether LazUTF8 is used or not.
PAnsiChar is usually the same type as PChar.
They differ only when using $mode delphiunicode, in which case PChar
becomes PWideChar.
So PAnsiChar is always a pointer to a CP_ACP char.
Thus assigning a PAnsiChar (=PChar) to a String, AnsiString,
RawByteString or ShortString does not add conversion code and therefore
does no conversion, with or without LazUTF8.
Assigning it to another string type (UnicodeString, UTF8String or
AnsiString(cp)) will add conversion code. With LazUTF8 this means your
PChar will be treated as CP_UTF8; without it, the PChar will be treated
as being in the runtime system codepage.
The other way round - assigning a string to a PChar - is not supported
by FPC, and a type cast does no conversion. So a simple type cast is
enough only with LazUTF8. Without LazUTF8 you must convert the string
to UTF-8 first, before type casting.
If your header should work whether LazUTF8 is used or not then you can
provide a helper function:
procedure SetUserIDUTF8(const NewValue: PChar);
begin
  ...
end;

procedure SetUserID(const NewValue: UTF8String);
begin
  SetUserIDUTF8(PChar(NewValue));
end;
Alternatively you can use a more optimized version in case LazUTF8 is
used:
procedure SetUserID(const NewValue: AnsiString);
var
  uValue: String;
begin
  if (DefaultSystemCodePage=CP_UTF8) then
    SetUserIDUTF8(PChar(NewValue))
  else begin
    uValue:=AnsiToUTF8(NewValue);
    SetUserIDUTF8(PChar(uValue));
  end;
end;
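A further variant, not part of the reply above and only a sketch: if the
caller already holds a UnicodeString, the conversion to UTF-8 can be made
explicit with UTF8Encode, so the result should not depend on
DefaultSystemCodePage (that is, on whether LazUTF8 is in use). The body of
SetUserIDUTF8 below is a hypothetical stand-in for the real API call, and
the UnicodeString signature of SetUserID is an assumption made for this
example.

```pascal
program SetUserIdDemo;

{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical stand-in for the imported API routine. It only reports
// how many bytes precede the terminating #0.
procedure SetUserIDUTF8(const NewValue: PAnsiChar);
begin
  WriteLn('bytes received: ', StrLen(NewValue));
end;

// Wrapper taking UTF-16 input. The conversion is explicit, so no
// implicit codepage conversion is involved.
procedure SetUserID(const NewValue: UnicodeString);
var
  Utf8Value: UTF8String;
begin
  Utf8Value := UTF8Encode(NewValue);    // explicit UTF-16 -> UTF-8
  SetUserIDUTF8(PAnsiChar(Utf8Value));  // type cast only, no conversion
end;

begin
  SetUserID('user42');
end.
```

The local Utf8Value keeps the converted string alive for the duration of
the call, so the pointer passed to SetUserIDUTF8 stays valid until the
call returns.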
> Is there a good way to do what I want, or would it be easier to use PUnicodeChar
> and pass the strings as UTF-16? How well would other languages work with that?
Whether it's easier totally depends on the other language.
Mattias