Commit 3dc426b6 authored Aug 07, 2024 by Wilco Dijkstra

AArch64: Improve generic strlen

Improve performance by handling another 16 bytes before entering the loop.
Use ADDHN in the loop to avoid SHRN+FMOV when it terminates. Change final
size computation to avoid increasing latency. On Neoverse V1 performance
of the random strlen benchmark improves by 4.6%.
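
As a rough illustration of the ADDHN point above, here is a portable scalar C model. It is not the assembly from this commit. It assumes little-endian lane order and assumes the loop passes the same CMEQ result as both ADDHN operands. All helper names (cmeq_zero, shrn4, addhn_self, first_nul_shrn, first_nul_addhn, count_mismatches) are hypothetical. With SHRN #4 each input byte maps to one nibble of the 64-bit syndrome, so the index is ctz >> 2. With ADDHN of a value with itself, an even-byte match sets bit 0 of the narrowed byte and an odd-byte match sets bits 1 to 7, so in this model the index becomes (ctz + 3) >> 2.

```c
#include <stdint.h>

/* Model of CMEQ against zero: 0xFF where the byte is NUL, else 0x00. */
static void cmeq_zero(const unsigned char chunk[16], uint8_t has_nul[16])
{
    for (int i = 0; i < 16; i++)
        has_nul[i] = (chunk[i] == 0) ? 0xFF : 0x00;
}

/* Read 16-bit lane k of the vector (little-endian lane order assumed). */
static uint16_t lane16(const uint8_t m[16], int k)
{
    return (uint16_t)(m[2 * k] | (m[2 * k + 1] << 8));
}

/* Model of SHRN #4: shift each 16-bit lane right by 4, keep the low 8 bits. */
static uint64_t shrn4(const uint8_t m[16])
{
    uint64_t synd = 0;
    for (int k = 0; k < 8; k++)
        synd |= (uint64_t)(uint8_t)(lane16(m, k) >> 4) << (8 * k);
    return synd;
}

/* Model of ADDHN with both operands equal: high byte of (lane + lane). */
static uint64_t addhn_self(const uint8_t m[16])
{
    uint64_t synd = 0;
    for (int k = 0; k < 8; k++) {
        uint32_t sum = 2u * lane16(m, k);
        synd |= (uint64_t)(uint8_t)(sum >> 8) << (8 * k);
    }
    return synd;
}

/* Count trailing zero bits; the caller guarantees x != 0. */
static unsigned ctz64(uint64_t x)
{
    unsigned n = 0;
    while ((x & 1) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Index of the first NUL in a 16-byte chunk via the SHRN mask, or -1. */
static int first_nul_shrn(const unsigned char chunk[16])
{
    uint8_t m[16];
    cmeq_zero(chunk, m);
    uint64_t synd = shrn4(m);
    return synd ? (int)(ctz64(synd) >> 2) : -1;
}

/* Same result via the ADDHN mask; the +3 compensates for the bit layout. */
static int first_nul_addhn(const unsigned char chunk[16])
{
    uint8_t m[16];
    cmeq_zero(chunk, m);
    uint64_t synd = addhn_self(m);
    return synd ? (int)((ctz64(synd) + 3) >> 2) : -1;
}

/* Check both decodings against every nonzero pattern of NUL positions. */
static int count_mismatches(void)
{
    int bad = 0;
    for (unsigned pat = 1; pat < 65536; pat++) {
        unsigned char chunk[16];
        int expect = -1;
        for (int i = 0; i < 16; i++) {
            chunk[i] = ((pat >> i) & 1) ? 0 : 'x';
            if (chunk[i] == 0 && expect < 0)
                expect = i;
        }
        if (first_nul_shrn(chunk) != expect || first_nul_addhn(chunk) != expect)
            bad++;
    }
    return bad;
}
```

In this model the ADDHN result serves both as the loop's "any NUL?" test (it is nonzero exactly when some byte matched) and as the syndrome for the final index, which is the property the commit message relies on to drop the extra SHRN+FMOV on loop exit.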

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

parent d5ce0e96
