Skip to content
Commit 75ded18a authored by WANG Xuerui's avatar WANG Xuerui Committed by Huacai Chen
Browse files

LoongArch: Add SIMD-optimized XOR routines



Add LSX and LASX implementations of xor operations, operating on 64
bytes (one L1 cache line) at a time, for a balance between memory
utilization and instruction mix. Huacai confirmed that all future
LoongArch implementations by Loongson (that we care) will likely also
feature 64-byte cache lines, and experiments show no throughput
improvement with further unrolling.

Performance numbers measured during system boot on a 3A5000 @ 2.5GHz:

> 8regs           : 12702 MB/sec
> 8regs_prefetch  : 10920 MB/sec
> 32regs          : 12686 MB/sec
> 32regs_prefetch : 10918 MB/sec
> lsx             : 17589 MB/sec
> lasx            : 26116 MB/sec

Acked-by: default avatarSong Liu <song@kernel.org>
Signed-off-by: default avatarWANG Xuerui <git@xen0n.name>
Signed-off-by: default avatarHuacai Chen <chenhuacai@loongson.cn>
parent 2478e4b7
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment