[Lazarus] Faster than popcnt [[Re: UTF8LengthFast returning incorrect results on AARCH64 (MacOS)]]
John Landmesser
jmlandmesser at gmx.de
Thu Dec 30 14:17:40 CET 2021
Perhaps useful test information from my PC:
******************************************
[john1 at manjaro sdb2]$ ./utf8lentest
234526968
fst:128406168
pop:128406168
add:128406168
asm:128406168
29315871
fst 1365
fst 1367
fst 1366
fst 1366
pop 9990
pop 9990
pop 9997
pop 9981
add 1386
add 1382
add 1386
add 1390
asm 346
asm 346
asm 346
asm 349
fst 1357
fst 1368
fst 1372
fst 1371
pop 10681
pop 6886
pop 6895
pop 6916
add 1247
add 1248
add 1250
add 1248
asm 295
asm 291
asm 291
asm 293
[john1 at manjaro sdb2]$
[john1 at manjaro sdb2]$ inxi -F
System:
Host: manjaro Kernel: 5.10.84-1-MANJARO x86_64 bits: 64
Desktop: Xfce 4.16.0 Distro: Manjaro Linux
Machine:
Type: Laptop System: LENOVO product: 81RS v: Lenovo Yoga S740-14IIL
serial: <superuser required>
Mobo: LENOVO model: LNVNB161216 v: SDK0J40709 WIN
serial: <superuser required> UEFI: LENOVO v: BYCN39WW date: 05/28/2021
Battery:
ID-1: BAT0 charge: 62.4 Wh (95.6%) condition: 65.3/62.0 Wh (105.3%)
CPU:
Info: quad core model: Intel Core i7-1065G7 bits: 64 type: MT MCP cache:
L2: 2 MiB
Speed (MHz): avg: 3520 min/max: 400/3900 cores: 1: 3543 2: 3890 3: 2319
4: 3513 5: 3709 6: 3650 7: 3792 8: 3749
Graphics:
Device-1: Intel Iris Plus Graphics G7 driver: i915 v: kernel
Device-2: NVIDIA GP108M [GeForce MX250] driver: nvidia v: 495.44
Device-3: Chicony Integrated Camera type: USB driver: uvcvideo
Display: x11 server: X.Org 1.21.1.2 driver: loaded: modesetting,nvidia
unloaded: nouveau resolution: 1: 1920x1080~60Hz 2: 1920x1080~60Hz
Message: Unable to show advanced data. Required tool glxinfo missing.
Audio:
Device-1: Intel Ice Lake-LP Smart Sound Audio driver: sof-audio-pci
Sound Server-1: ALSA v: k5.10.84-1-MANJARO running: yes
Sound Server-2: PipeWire v: 0.3.40 running: yes
Network:
Device-1: Intel Ice Lake-LP PCH CNVi WiFi driver: iwlwifi
IF: wlp0s20f3 state: up mac: 04:33:c2:02:de:51
Device-2: Realtek RTL8153 Gigabit Ethernet Adapter type: USB
driver: r8152
IF: enp0s13f0u1u4 state: up speed: 1000 Mbps duplex: full
mac: 4c:e1:73:42:1f:6b
IF-ID-1: pan1 state: down mac: 7a:5c:6a:f4:06:56
Bluetooth:
Device-1: Intel AX201 Bluetooth type: USB driver: btusb
Report: rfkill ID: hci0 state: up address: see --recommends
Drives:
Local Storage: total: 1.86 TiB used: 317.16 GiB (16.7%)
ID-1: /dev/nvme0n1 vendor: Micron model: MTFDHBA1T0TCK size: 953.87 GiB
ID-2: /dev/sda type: USB vendor: Western Digital model: WD10EARX-00N0YB0
size: 931.51 GiB
ID-3: /dev/sdb type: USB vendor: Kingston model: DataTraveler 2.0
size: 14.54 GiB
Partition:
ID-1: / size: 57.9 GiB used: 35.88 GiB (62.0%) fs: ext4 dev:
/dev/nvme0n1p8
ID-2: /boot/efi size: 259.5 MiB used: 114.1 MiB (44.0%) fs: vfat
dev: /dev/nvme0n1p1
Swap:
ID-1: swap-1 type: partition size: 16.67 GiB used: 0 KiB (0.0%)
dev: /dev/nvme0n1p9
Sensors:
System Temperatures: cpu: 58.0 C mobo: N/A
Fan Speeds (RPM): N/A
Info:
Processes: 289 Uptime: 9m Memory: 15.2 GiB used: 2.19 GiB (14.4%)
Shell: Bash inxi: 3.3.11
*****************************************
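For readers without the test program at hand: judging by the subject line, the timed routines are variants of a UTF-8 length count, i.e. counting code points rather than bytes. The following is only a minimal scalar sketch of that idea in Free Pascal, not the code from the thread. The name Utf8LengthNaive is hypothetical, and it assumes well-formed UTF-8 input. It counts every byte that is not a continuation byte (bit pattern 10xxxxxx).

```pascal
program utf8lensketch;
{$mode objfpc}{$H+}

// Hypothetical reference routine (not from the thread): counts UTF-8
// code points by skipping continuation bytes, i.e. bytes of the form
// 10xxxxxx. Assumes the input is well-formed UTF-8.
function Utf8LengthNaive(p: PChar; ByteCount: SizeInt): SizeInt;
var
  i: SizeInt;
begin
  Result := 0;
  for i := 0 to ByteCount - 1 do
    if (Ord(p[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  s: string;
begin
  // 'a' (1 byte), U+00E4 (2 bytes), U+20AC (3 bytes): 6 bytes, 3 code points
  s := 'a'#$C3#$A4#$E2#$82#$AC;
  WriteLn(Utf8LengthNaive(PChar(s), Length(s)));
end.
```

A byte-at-a-time loop like this is mainly useful as a correctness baseline for checking faster word-wise or SIMD variants against the same input.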
On 30.12.21 at 13:58, Marco van de Voort via lazarus wrote:
>
> On 30-12-2021 10:15, Florian Klämpfl via lazarus wrote:
>>
>> Linux uses different calling conventions, please check with the patch
>> below.
>>
> Linux is quite generous with the volatile registers, so luckily it
> matches quite closely.
>
> I first tried the approach of your patch, but [s] has problems on
> Windows and would require an ifdef on every use of "s", so I simply
> move [s] to rcx
>
> {$ifndef Windows}
> // we can't use [s] as an alias for the pointer parameter, because
> // the non assembler procedure on Windows
> // changes that into a stack reference. FPC doesn't support non
> // volatile frame management for assembler procs like Delphi does.
> mov rcx,s // rdi
> mov edx,len // rsi
> {$endif}
>
> and the ifdeffing of the assembler procedure on linux vs inline asm
> block on Windows. Then it works on Linux x86_64.
>
> Funnily, our server AMD Athlon 200GE (Zen1, 3.2GHz?) has nearly the exact
> same timings as my i7-3770 3.4GHz
>
> I did some other minor work after last post, so here is now the entire
> program:
>