[Lazarus] How to use strings properly with fixes_1_6 and FPC 3.0.0?
Juha Manninen
juha.manninen62 at gmail.com
Fri Oct 21 14:59:38 CEST 2016
On Fri, Oct 21, 2016 at 3:24 PM, Gabor Boros via Lazarus
<lazarus at lists.lazarus-ide.org> wrote:
> Why is the example below better than a for loop with UTF8Length and UTF8Copy
> for going through the string?
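Because it is MUCH faster. It scales linearly, O(n).
Each UTF8Length() or UTF8Copy() call has to scan the string from the
start to find a codepoint position, so calling them inside the loop
makes the whole thing quadratic, O(n^2), with a constant factor that
grows with the number of UTF8...() calls you have there.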
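To make the difference concrete, here is a minimal sketch of a linear walk over a UTF-8 string. It is not the wiki example itself. CodePointSize is a hypothetical hand-written helper (not a LazUTF8 function) that reads the lead byte to find how many bytes the codepoint takes, and it assumes well-formed UTF-8. The test string is written as raw bytes so that the source file encoding does not matter.

```pascal
program Utf8Walk;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of LazUTF8: byte length of the codepoint
// that starts with lead byte b. Assumes well-formed UTF-8.
function CodePointSize(b: Byte): Integer;
begin
  if b < $80 then
    Result := 1                      // 0xxxxxxx: ASCII
  else if (b and $E0) = $C0 then
    Result := 2                      // 110xxxxx
  else if (b and $F0) = $E0 then
    Result := 3                      // 1110xxxx
  else if (b and $F8) = $F0 then
    Result := 4                      // 11110xxx
  else
    Result := 1;                     // invalid lead byte: step over one byte
end;

var
  s, cp: string;
  i, len: Integer;
begin
  // 'a', U+00E4 and U+20AC as raw UTF-8 bytes
  s := 'a'#$C3#$A4#$E2#$82#$AC;
  i := 1;
  while i <= Length(s) do
  begin
    len := CodePointSize(Ord(s[i]));
    cp := Copy(s, i, len);           // byte-based Copy, no rescan from the start
    WriteLn('ch=', cp, ' at byte ', i, ' (', len, ' bytes)');
    Inc(i, len);                     // every byte is visited once: O(n)
  end;
end.
```

The loop never goes back to the start of the string. A version that calls UTF8Copy(s, i, 1) for each i would rescan from byte 1 on every iteration, which is where the O(n^2) comes from.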
Yes, we have seen complaints that UTF-8 is unusable because you must
use the slow UTF8Length() and UTF8Copy(), and UTF-16 is better because
you can use fixed width S[i] indexing.
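That is obviously based on a misunderstanding of both encodings:
UTF-16 is variable width too (codepoints outside the BMP take two
code units), and UTF-8 can be walked sequentially without those calls.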
Hint: if you need to iterate codepoints, you can also use the
enumerator from the LazUnicode unit. It uses the same concept as the
example on the wiki page. It allows this code:
  for ch in s do
    writeln('ch=',ch);
and the same code even works in Delphi with UTF-16. Cool, ha!?
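For completeness, a sketch of the snippet above as a whole program. It assumes the LazUtils package is on the unit search path and that the loop variable is declared as String rather than Char, since one codepoint can span several bytes. The program name and the test string are only illustrative.

```pascal
program CodePointDemo;
{$mode objfpc}{$H+}

uses
  LazUnicode;   // assumed to be available from the LazUtils package

var
  s, ch: String;   // ch is a String: a codepoint may be more than one byte
begin
  // 'a', U+00E4 and U+20AC as raw UTF-8 bytes
  s := 'a'#$C3#$A4#$E2#$82#$AC;
  for ch in s do
    WriteLn('ch=', ch);
end.
```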
Juha