aarch64: Optimized implementation of memcmp
The loop body is expanded from a 16-byte comparison to a 64-byte
comparison, and the usage of ldp is replaced by the Post-index
mode to the Base plus offset mode. Hence, compare can faster 18%
around > 128 bytes in all.
Checked on aarch64-linux-gnu.
Reviewed-by:
Wilco Dijkstra <Wilco.Dijkstra@arm.com>
Loading
Please register or sign in to comment