Commit fa2c4239 authored Jun 21, 2024 by Jan Beulich

x86: optimize left-shift-by-1

A left shift by 1 can be replaced by an ADD of the register to itself
when acting on a register operand.

While for the scalar forms there's no gain in encoding size, ADD
generally has higher throughput than SHL. EFLAGS set by ADD are a
superset of those set by SHL (AF in particular is undefined there).
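
A minimal sketch of why the scalar replacement is safe, assuming a
32-bit register operand. It is a plain C model, not the assembler change
itself, and the names result32, shl_by_1, add_to_self and self_check are
hypothetical. Both forms yield the same value and the same carry-out;
for reference, "shl $1, %eax" encodes as D1 E0 and "add %eax, %eax" as
01 C0, two bytes each.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model type: 32-bit result plus the carry flag (CF). */
typedef struct {
    uint32_t value;
    int carry;
} result32;

/* Models SHL by 1: CF receives the bit shifted out of the top. */
result32 shl_by_1(uint32_t x) {
    result32 r = { (uint32_t)(x << 1), (int)(x >> 31) };
    return r;
}

/* Models ADD of a register to itself: CF is the carry out of bit 31. */
result32 add_to_self(uint32_t x) {
    uint32_t sum = (uint32_t)(x + x);  /* wraps modulo 2^32 */
    result32 r = { sum, sum < x };     /* unsigned wrap means a carry-out */
    return r;
}

/* Returns 1 when both models agree on value and carry for x. */
int same_result(uint32_t x) {
    result32 a = shl_by_1(x);
    result32 b = add_to_self(x);
    return a.value == b.value && a.carry == b.carry;
}

/* Checks a few edge cases, including both top-bit states. */
void self_check(void) {
    assert(same_result(0u));
    assert(same_result(1u));
    assert(same_result(0x7fffffffu));
    assert(same_result(0x80000000u));
    assert(same_result(0xffffffffu));
}
```

The model covers only the value and CF; the remaining flags are not
modelled here.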

For the SIMD cases the transformation also reduces code size, by
eliminating the 1-byte immediate from the resulting encoding. Note
that this transformation is not applied by GCC 13 (according to my
observations), so it would - as of now - even improve
compiler-generated code.
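
As an illustration of the SIMD case, here is a small C model using
packed 16-bit words. The choice of PSLLW/PADDW is an assumption made for
the example, since the message does not list the affected instructions,
and the function names are hypothetical. With legacy SSE encodings,
"psllw $1, %xmm0" is 66 0F 71 F0 01 (5 bytes) while
"paddw %xmm0, %xmm0" is 66 0F FD C0 (4 bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { LANES = 8 };  /* eight 16-bit words in one 128-bit XMM register */

/* Models a packed word shift left by an immediate of 1. */
void shift_words_left_1(uint16_t v[LANES]) {
    for (size_t i = 0; i < LANES; i++)
        v[i] = (uint16_t)(v[i] << 1);    /* top bit of each lane is dropped */
}

/* Models a packed word add of a register to itself (wrapping per lane). */
void add_words_to_self(uint16_t v[LANES]) {
    for (size_t i = 0; i < LANES; i++)
        v[i] = (uint16_t)(v[i] + v[i]);  /* truncated to 16 bits */
}

/* Returns 1 when both models produce identical lanes for the input. */
int lanes_match(const uint16_t in[LANES]) {
    uint16_t a[LANES], b[LANES];
    memcpy(a, in, sizeof a);
    memcpy(b, in, sizeof b);
    shift_words_left_1(a);
    add_words_to_self(b);
    return memcmp(a, b, sizeof a) == 0;
}

/* Checks one vector that mixes small values and lane-overflow cases. */
void simd_self_check(void) {
    const uint16_t in[LANES] = { 0, 1, 2, 0x1234, 0x7fff, 0x8000, 0xabcd, 0xffff };
    assert(lanes_match(in));
}
```

SIMD instructions do not write EFLAGS, so only the lane values need to
agree.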

parent 87860ef6
