Skip to content
Commit e941e0ae authored by Tulio Magno Quites Machado Filho's avatar Tulio Magno Quites Machado Filho
Browse files

powerpc64le: Optimize memcpy for POWER10

This implementation is based on __memcpy_power8_cached and integrates
suggestions from Anton Blanchard.
It benefits from loads and stores with length for short lengths and for
tail code, simplifying the code.

All unaligned memory accesses use instructions that do not generate
alignment interrupts on POWER10, making it safe to use on
caching-inhibited memory.

The main loop has also been modified in order to increase instruction
throughput by reducing the dependency on updates from previous iterations.

On average, this implementation provides around 30% improvement when
compared to __memcpy_power7 and 10% improvement in comparison to
__memcpy_power8_cached.
parent dd59655e
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment