[Lazarus] Improving UTF8CharacterLength?

Thu Aug 13 14:19:43 CEST 2015

On Thu, 13 Aug 2015 14:05:19 +0200
Jürgen Hestermann <juergen.hestermann at gmx.de> wrote:

>[...]
> Still, I think it would be better to return 3 when the lead byte actually
> indicates 3, because 1 byte does not form a valid UTF-8 character.
> If I relied on this result, I would try to use this 1 byte as a valid UTF-8 character,
> which would be wrong, so I have to apply further checks to cope with this situation anyway.

Do you mean like UTF8CharacterStrictLength?

> Then I can also check whether the 3 or 4 bytes of the correct result exist.
> I would not lose anything for invalid UTF-8 strings, but I would gain performance if
> I can guarantee valid UTF-8 strings.

For this the UTF8QuickCharLen function would suffice, would it not?
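
To make the two approaches concrete, here is a rough sketch in Python (used only for illustration). It is not the LazUTF8 code: the names quick_char_len and checked_char_len are hypothetical, and returning 0 for a broken sequence is an assumption of this sketch. The check covers only bounds and continuation bytes, not overlong encodings or surrogates.

```python
def quick_char_len(lead: int) -> int:
    """Length implied by the lead byte alone; only safe on valid UTF-8."""
    if lead < 0x80:
        return 1            # 0xxxxxxx: ASCII
    if lead >> 5 == 0b110:
        return 2            # 110xxxxx
    if lead >> 4 == 0b1110:
        return 3            # 1110xxxx
    if lead >> 3 == 0b11110:
        return 4            # 11110xxx
    return 1                # stray continuation or invalid lead: skip one byte


def checked_char_len(buf: bytes, pos: int = 0) -> int:
    """Length of the character at pos, or 0 if truncated or malformed.

    Assumption of this sketch: 0 signals an error. Overlong forms and
    surrogates are not rejected here.
    """
    if pos >= len(buf):
        return 0
    lead = buf[pos]
    if 0x80 <= lead < 0xC0 or lead >= 0xF8:
        return 0            # not a lead byte at all
    n = quick_char_len(lead)
    if pos + n > len(buf):
        return 0            # the bytes promised by the lead byte are missing
    for b in buf[pos + 1 : pos + n]:
        if b & 0xC0 != 0x80:
            return 0        # every following byte must look like 10xxxxxx
    return n
```

For a lone lead byte such as 0xE2, the quick variant reports 3 and the caller has to verify that those bytes exist, while the checked variant reports the problem directly.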

> And if no zero byte exists (for whatever reason) it currently fails anyway.

So far the Lazarus code has not had such a case.

Mattias