Skip to content
Commit 14318efb authored by Rob Herring's avatar Rob Herring Committed by Russell King
Browse files

ARM: 7587/1: implement optimized percpu variable access



Use the previously unused TPIDRPRW register to store percpu offsets.
TPIDRPRW is only accessible in PL1, so it can only be used in the kernel.

This replaces 2 loads with a mrc instruction for each percpu variable
access. With hackbench, the performance improvement is 1.4% on Cortex-A9
(highbank). Taking an average of 30 runs of "hackbench -l 1000" yields:

Before: 6.2191
After: 6.1348

Will Deacon reported similar delta on v6 with 11MPCore.

The asm "memory clobber" are needed here to ensure the percpu offset
gets reloaded. Testing by Will found that this would not happen in
__schedule() which is a bit of a special case as preemption is disabled
but the execution can move cores.

Signed-off-by: default avatarRob Herring <rob.herring@calxeda.com>
Acked-by: default avatarWill Deacon <will.deacon@arm.com>
Acked-by: default avatarNicolas Pitre <nico@linaro.org>
Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
parent 3e99675a
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment