Commit 525de033 authored by Xuelei Zhang, committed by Adhemerval Zanella

aarch64: Optimized memset for Kunpeng processor.



Due to a branch prediction issue on the Kunpeng processor, we found that
memset_generic performs poorly when setting medium sizes. We therefore
restructured the logic and unrolled the loop in set_long by 4 times to
solve the problem; even sizes below 1K benefit.

Another change is that DC ZVA does not seem to help when setting zero, so
we discarded it and used set_long to set zero instead. Fewer branches and
predictions also slightly improve the zero case.
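
For illustration only, the idea of a main loop unrolled by 4 times, with the zero case taking the same path as any other fill value, can be sketched in C. This is not the AArch64 assembly from this commit: the name memset_unrolled4, the 16-byte block size, and the tail handling are all assumptions made for the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the glibc routine: fill n bytes at dst with c.
   The main loop is unrolled 4 times (four 16-byte stores per iteration). */
void *memset_unrolled4(void *dst, int c, size_t n)
{
    unsigned char *p = dst;
    unsigned char pattern[16];

    /* Build one 16-byte block of the fill value. */
    for (size_t i = 0; i < sizeof pattern; i++)
        pattern[i] = (unsigned char)c;

    /* Main loop: 64 bytes per iteration, so the loop branch is evaluated
       a quarter as often as in a 16-byte-per-iteration loop. */
    while (n >= 64) {
        memcpy(p,      pattern, 16);
        memcpy(p + 16, pattern, 16);
        memcpy(p + 32, pattern, 16);
        memcpy(p + 48, pattern, 16);
        p += 64;
        n -= 64;
    }

    /* Tail: the remaining 0..63 bytes. */
    while (n >= 16) {
        memcpy(p, pattern, 16);
        p += 16;
        n -= 16;
    }
    if (n > 0)
        memcpy(p, pattern, n);

    /* No special case for c == 0: zero goes through the same loop. */
    return dst;
}
```

Calling memset_unrolled4(buf, 0, len) exercises the same loop as any other value, which mirrors the choice above to drop the separate zeroing path.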

Checked on aarch64-linux-gnu.

Reviewed-by: Wilco Dijkstra <Wilco.Dijkstra@arm.com>
parent c2150769