[Lazarus] Faster than popcnt [[Re: UTF8LengthFast returning incorrect results on AARCH64 (MacOS)]]

Martin Frb lazarus at mfriebe.de
Wed Dec 29 10:16:39 CET 2021

On 29/12/2021 02:10, Marco van de Voort via lazarus wrote:
> On 28-12-2021 23:35, Martin Frb via lazarus wrote:
>> "nx" has a single "1" in each of the 8 bytes in a Qword (based on 
>> 64bit).
>> If we regard each of these bytes as an entity of its own, then we can 
>> keep adding those "1".
> I also was thinking in that direction, but more about how to optimize 
> that loop using SSE2
good idea...
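
For readers of the archive, here is a minimal sketch of the "keep adding
those 1" idea quoted above. It is written in C rather than Pascal, and it
is not the actual LazUTF8 routine. It assumes that "nx" flags UTF-8
continuation bytes (10xxxxxx) with a 0 or 1 in each byte. The names
continuation_flags, sum_lanes and utf8_codepoints_lanes are made up for
this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HIGHS UINT64_C(0x8080808080808080)
#define LANES UINT64_C(0x00FF00FF00FF00FF)

/* "nx": 0 or 1 in each byte; 1 where the byte is 10xxxxxx
   (bit 7 set and bit 6 clear). */
static uint64_t continuation_flags(uint64_t x)
{
    return ((x & HIGHS) >> 7) & (~x >> 6);
}

/* Add up the eight byte counters (each at most 255, total at most 2040). */
static uint64_t sum_lanes(uint64_t acc)
{
    /* Widen to four 16-bit counters first so the total cannot overflow. */
    uint64_t pairs = (acc & LANES) + ((acc >> 8) & LANES);
    return (pairs * UINT64_C(0x0001000100010001)) >> 48;
}

/* Number of code points = bytes minus continuation bytes. */
size_t utf8_codepoints_lanes(const char *s, size_t len)
{
    size_t cont = 0, i = 0;
    assert(s != NULL || len == 0);

    while (len - i >= 8) {
        uint64_t acc = 0;
        size_t blocks = (len - i) / 8;
        if (blocks > 255)
            blocks = 255;            /* a byte counter holds at most 255 */
        for (size_t b = 0; b < blocks; b++, i += 8) {
            uint64_t x;
            memcpy(&x, s + i, 8);    /* unaligned-safe 8-byte load */
            acc += continuation_flags(x);  /* eight additions in one */
        }
        cont += (size_t)sum_lanes(acc);    /* one reduction per <=255 blocks */
    }
    for (; i < len; i++)             /* remaining 0..7 bytes */
        cont += ((unsigned char)s[i] & 0xC0) == 0x80;
    return len - cont;
}
```

The per-qword popcnt (or multiply) disappears from the inner loop. The
byte counters are reduced once per run of at most 255 qwords, because a
byte counter would overflow on the 256th addition.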

> // Martin's routine that should be replaced by some punpkl magic, but 
> it is too late now.

Why too late?

There is a place for both. My routine works fine for any CPU (as soon as 
32-bit, and maybe 16-bit, variants are added).

Then, for known CPUs, special handling can be added.
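
One possible shape for "generic routine plus special handling", again as
a C sketch and not the Lazarus code: a compile-time check selects an SSE2
path where the compiler reports SSE2, and a plain byte loop covers every
other CPU. The function names are hypothetical, and the choice of
_mm_sad_epu8 for the final sum is my assumption, not the punpckl approach
hinted at in the quote.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Portable fallback: count continuation bytes (10xxxxxx) one at a time. */
static size_t count_continuation_scalar(const unsigned char *p, size_t n)
{
    size_t c = 0;
    for (size_t i = 0; i < n; i++)
        c += (p[i] & 0xC0) == 0x80;
    return c;
}

size_t utf8_codepoints_dispatch(const char *s, size_t len)
{
    const unsigned char *p = (const unsigned char *)s;
    size_t cont = 0, i = 0;
    assert(s != NULL || len == 0);

#if defined(__SSE2__)
    /* As signed bytes, 0x80..0xBF is -128..-65, i.e. "less than -64". */
    const __m128i limit = _mm_set1_epi8(-64);
    const __m128i zero = _mm_setzero_si128();

    while (len - i >= 16) {
        __m128i acc = zero;
        size_t blocks = (len - i) / 16;
        if (blocks > 255)
            blocks = 255;                 /* byte counters hold at most 255 */
        for (size_t b = 0; b < blocks; b++, i += 16) {
            __m128i v = _mm_loadu_si128((const __m128i *)(p + i));
            /* The compare yields 0xFF (-1) per match; subtracting adds 1. */
            acc = _mm_sub_epi8(acc, _mm_cmplt_epi8(v, limit));
        }
        /* Sum of absolute differences against zero adds each group of
           8 byte counters into one 64-bit half of the register. */
        __m128i sums = _mm_sad_epu8(acc, zero);
        cont += (size_t)_mm_cvtsi128_si32(sums)
              + (size_t)_mm_extract_epi16(sums, 4);
    }
#endif
    /* Tail on SSE2 builds, the whole string everywhere else. */
    cont += count_continuation_scalar(p + i, len - i);
    return len - cont;
}
```

The SSE2 path keeps the same trick of accumulating per-byte counters, only
with 16 counters at a time; builds without SSE2 simply use the byte loop.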
