[Lazarus] UTF8String and UTF8Delete

Sven Barth pascaldragon at googlemail.com
Wed Dec 16 20:55:50 CET 2015


On 16.12.2015 18:12, Bart wrote:
> On 12/10/15, Jürgen Hestermann <juergen.hestermann at gmx.de> wrote:
>
>> But now I have a problem with UTF8Strings:
>> With this declaration
>>
>> var S : UTF8String;
>>
>> I want to delete a character
>>
>> UTF8Delete(S,1,1);
>>
>> but I get an error that the (var) parameter mismatches.
>
> Fixed in r50850.
> Proposed for merging to 1.6RC2.

Better read Jonas' comments on that commit first:

> The code that was committed for UTF8Delete(utf8string) in http://svn.freepascal.org/cgi-bin/viewvc.cgi/trunk/components/lazutils/lazutf8.pas?root=lazarus&r1=50850&r2=50849&pathrev=50850 makes no sense:
> * it's the "string" version that needs this wrapper, not the utf8string one (it's probably done because the "string" version is more commonly called, but that doesn't help if the current approach is wrong)
> * changing the codepage of tmp at the start is a no-op, since it's an empty string at that point and you can't change the code page of an empty string (it's a nil pointer)
> * if you can reasonably expect any custom string type to already have a utf8 code page, it's utf8string. It's not guaranteed of course, but it's much more likely than for a random "string"-typed variable (even if defaultsystemcodepage = CP_UTF8, because system unit routines may return rawbytestrings that have a different code page) -- so if you want to change the code page of strings to make sure it's utf8, that should be done in the plain string version rather than in the utf8string version.
>
> I also made an error in my original code (apart from the missing rawbytestring typecasts when calling SetStringCodePage). The "s:=tmp;" at the end needs to be "s:=RawByteString(tmp);" to avoid a codepage conversion in that assignment.
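
To illustrate two of the points above (the empty-string no-op and the RawByteString cast), here is a small untested sketch. It is not the lazutf8 code; the program and variable names are made up for the example, and it assumes FPC 3.0 or later, where SetCodePage, StringCodePage and CP_UTF8 come from the system unit.

```pascal
program CodePageSketch;
{$mode objfpc}{$H+}

var
  tmp: RawByteString;
  u: UTF8String;
  s: string;
begin
  // 1. An empty string is a nil pointer, so there is no string header
  //    in which a code page could be stored. The call has nothing to tag.
  tmp := '';
  SetCodePage(tmp, CP_UTF8, False);
  WriteLn('empty string is still nil: ', Pointer(tmp) = nil);

  // 2. Once the string has content, the same call can tag it.
  //    Convert = False only relabels the bytes, it does not transcode them.
  tmp := 'abc';
  SetCodePage(tmp, CP_UTF8, False);
  WriteLn('tagged as UTF-8: ', StringCodePage(tmp) = CP_UTF8);

  // 3. The RawByteString cast on the right-hand side is what Jonas
  //    suggests for "s:=RawByteString(tmp);": it is meant to make the
  //    assignment a plain copy instead of a code page conversion.
  u := 'abc';
  s := RawByteString(u);
  WriteLn('payload shared with source: ', Pointer(s) = Pointer(u));
end.
```

Without the cast in step 3, the compiler is free to insert a conversion whenever the declared code pages of "string" and "utf8string" differ, which is the situation the quoted comment warns about.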

Regards,
Sven



