Commit 5d56ee94 authored by Noah Goldstein, committed by Sunil K Pandey

x86: Optimize memmove-vec-unaligned-erms.S

No bug.

The optimizations are as follows:

1) Always align entry to 64 bytes. This makes behavior more
   predictable and makes other frontend optimizations easier.

2) Make the L(more_8x_vec) cases 4k aliasing aware. This can have
   significant benefits in the case that:
        0 < (dst - src) < [256, 512]

3) Align before `rep movsb`. For ERMS this is roughly a [0, 30%]
   improvement and for FSRM [-10%, 25%].
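
For readers unfamiliar with items 2 and 3, the following C sketch illustrates the two address checks involved. It is not taken from the assembly in this commit: the helper names `bytes_to_align` and `may_4k_alias`, the 4 KiB page size, and the 512-byte window are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u    /* assumption: 4 KiB pages */
#define ALIAS_WINDOW 512u  /* hypothetical window, upper end of the range in item 2 */

/* Number of bytes to copy before `p` reaches the next `align` boundary
   (0 if it is already aligned).  `align` must be a power of two.  */
static size_t bytes_to_align(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size_t)(-(uintptr_t)p & (align - 1));
}

/* Heuristic: true when dst sits a small positive distance ahead of src
   modulo the page size, so that in a forward copy later loads share
   their low 12 address bits with recent stores.  */
static bool may_4k_alias(const void *dst, const void *src)
{
    uintptr_t off = ((uintptr_t)dst - (uintptr_t)src) & (PAGE_SIZE - 1);
    return off != 0 && off < ALIAS_WINDOW;
}
```

A copy routine could, under these assumptions, use `bytes_to_align(dst, 64)` to size a short head copy before the bulk copy, and use `may_4k_alias` to choose a different copy direction for large sizes.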

In addition to these primary changes there is general cleanup
throughout to optimize the aligning routines and control flow logic.

Signed-off-by: Noah Goldstein <goldstein.w.n@gmail.com>
Reviewed-by: H.J. Lu <hjl.tools@gmail.com>
(cherry picked from commit a6b7502e)
parent e36de6a3