Skip to content
Commit 1f1a89ac authored by Chris Wilson's avatar Chris Wilson Committed by Thomas Gleixner
Browse files

x86/mm: Micro-optimise clflush_cache_range()



Whilst inspecting the asm for clflush_cache_range() and some perf profiles
that required extensive flushing of single cachelines (from part of the
intel-gpu-tools GPU benchmarks), we noticed that gcc was reloading
boot_cpu_data.x86_clflush_size on every iteration of the loop. We can
manually hoist that read which perf regarded as taking ~25% of the
function time for a single cacheline flush.

Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
Acked-by: default avatar"H. Peter Anvin" <hpa@zytor.com>
Cc: Toshi Kani <toshi.kani@hpe.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sai Praneeth <sai.praneeth.prakhya@intel.com>
Link: http://lkml.kernel.org/r/1452246933-10890-1-git-send-email-chris@chris-wilson.co.uk


Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
parent 2039e6ac
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment