AArch64: Optimize memcmp
Rewrite memcmp to improve performance. On small and medium inputs performance
is 10-20% better. Large inputs use a SIMD loop processing 64 bytes per
iteration, which is 30-50% faster depending on the size.
Reviewed-by:
Szabolcs Nagy <szabolcs.nagy@arm.com>
Loading