[Lazarus] Faster than popcnt [[Re: UTF8LengthFast returning incorrect results on AARCH64 (MacOS)]]
Martin Frb
lazarus at mfriebe.de
Thu Dec 30 15:09:04 CET 2021
On 30/12/2021 14:43, Marco van de Voort via lazarus wrote:
> Compile with -O4 -Cpcoreavx2 , the others (non asm) will become
> faster, my guess is "add" will be about double of asm.
Tested on a Core i7 8700K, with:
  fpc 3.3.1 from Dec 10th
  fpc 3.2.3 from Dec 9th

With fpc 3.3.1, compared to 3.2.3:
- fst is worse? (about 690 vs. about 580)
- add gets better (about 440 vs. about 585)

-O4 -Cpcoreavx2

fpc 3.2.3      fpc 3.3.1
fst 594        fst 688
fst 578        fst 703
fst 578        fst 687
fst 562        fst 688

pop 485        pop 485
pop 500        pop 500
pop 500        pop 484
pop 484        pop 500

add 594        add 422
add 578        add 438
add 578        add 437
add 594        add 453

asm 250        asm 250
asm 250        asm 250
asm 250        asm 250
asm 250        asm 266
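
To put rough numbers on the two observations, the four runs per variant in the -O4 -Cpcoreavx2 table above can be averaged and compared between compiler versions. The following is only a small sketch for that arithmetic, not part of the benchmark itself. The names `runs` and `summarize` are made up for illustration, and the timings are copied as reported, in whatever unit the benchmark printed.

```python
from statistics import mean

# Timings copied from the -O4 -Cpcoreavx2 table (unit as printed by the benchmark).
# Keys "3.2.3" and "3.3.1" are the two fpc versions compared there.
runs = {
    "fst": {"3.2.3": [594, 578, 578, 562], "3.3.1": [688, 703, 687, 688]},
    "pop": {"3.2.3": [485, 500, 500, 484], "3.3.1": [485, 500, 484, 500]},
    "add": {"3.2.3": [594, 578, 578, 594], "3.3.1": [422, 438, 437, 453]},
    "asm": {"3.2.3": [250, 250, 250, 250], "3.3.1": [250, 250, 250, 266]},
}


def summarize(runs):
    """Return {variant: (mean for 3.2.3, mean for 3.3.1, ratio 3.3.1 / 3.2.3)}."""
    out = {}
    for name, by_version in runs.items():
        old = mean(by_version["3.2.3"])
        new = mean(by_version["3.3.1"])
        out[name] = (old, new, new / old)
    return out


if __name__ == "__main__":
    for name, (old, new, ratio) in summarize(runs).items():
        # A ratio above 1 means the variant got slower with fpc 3.3.1.
        print(f"{name}: 3.2.3 mean {old:.1f}, 3.3.1 mean {new:.1f}, ratio {ratio:.2f}")
```

With these figures, fst averages 578 under 3.2.3 and 691.5 under 3.3.1, while add goes from 586 to 437.5. Under 3.3.1, add is therefore roughly 1.7 times the asm average of 254, which is in the neighbourhood of the "about double" guess quoted above.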

fpc 3.2.3

-O4 -Cpcoreavx     -O4 -CpCOREI
fst 594            fst 593
fst 578            fst 579
fst 578            fst 562
fst 594            fst 578

pop 500            pop 500
pop 515            pop 500
pop 500            pop 500
pop 485            pop 485

add 593            add 593
add 579            add 578
add 578            add 594
add 593            add 594

asm 250            asm 250
asm 250            asm 250
asm 235            asm 250
asm 250            asm 250