Commit b412213e authored Oct 18, 2022 by Noah Goldstein

x86: Optimize strrchr-evex.S and implement with VMM headers

Optimization is:
1. Cache the latest result in the "fast path" loop with `vmovdqu` instead of
  `kunpckdq`.  This helps if there is more than one match.
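
As a rough illustration of the idea (a scalar model only, not a translation of
the assembly), the sketch below scans in fixed-size blocks, remembers only the
latest block that contained a match, and defers locating the exact position
until the terminator is found. The name `strrchr_sketch` and the `BLOCK`
constant are hypothetical and stand in for the real routine and one vector
width.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK 32 /* hypothetical: stands in for one vector of bytes */

/* Scalar model of "cache the latest result": inside the loop we only
   record which block last contained a match (cheap), and we resolve
   the exact position once, after the end of the string is reached. */
char *strrchr_sketch(const char *s, int c)
{
    const char ch = (char)c;
    const char *cached = NULL; /* start of the latest block with a match */
    size_t cached_len = 0;     /* number of bytes scanned in that block */
    int done = 0;

    while (!done) {
        size_t n = 0;
        int has_match = 0;

        /* Scan one block, stopping at (and including) the terminator. */
        while (n < BLOCK) {
            char cur = s[n++];
            if (cur == ch)
                has_match = 1;
            if (cur == '\0') {
                done = 1;
                break;
            }
        }
        if (has_match) {
            /* Overwrite the cache; no merging with earlier results. */
            cached = s;
            cached_len = n;
        }
        s += n;
    }

    if (cached == NULL)
        return NULL;

    /* Deferred work: find the last match inside the cached block. */
    for (size_t i = cached_len; i-- > 0;)
        if (cached[i] == ch)
            return (char *)&cached[i];
    return NULL;
}
```

With several matches spread over the string, the per-block cost in the loop
stays at a single overwrite of the cache, which is the case the commit
message says benefits.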

Code Size Changes:
strrchr-evex.S       :  +30 bytes (Same number of cache lines)

Net perf changes:

Reported as the geometric mean of all improvements / regressions from N=10
runs of the benchtests. Values are New Time / Old Time, so < 1.0 is an
improvement and > 1.0 is a regression.
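
For reference, the metric can be written as below. The symbols are
assumptions introduced here: $k$ is the number of benchmark cases and $r_i$
is the time ratio for case $i$. How the N=10 runs are combined into each
time is not stated in this message.

```latex
r_i = \frac{T_{\mathrm{new},i}}{T_{\mathrm{old},i}}, \qquad
G = \left( \prod_{i=1}^{k} r_i \right)^{1/k}
```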

strrchr-evex.S       : 0.932 (From cases with higher match frequency)

Full results attached in email.

Full check passes on x86-64.

parent 4af6844a


mirror @mirror
mentioned in commit c25eb94a
· Oct 22, 2022
