Skip to content
Commit c4781e37 authored by Alexei Starovoitov's avatar Alexei Starovoitov
Browse files

selftests/bpf: Add BPF trampoline performance test



Add a test that benchmarks different ways of attaching BPF program to a kernel function.
Here are the results for 2.4Ghz x86 cpu on a kernel without mitigations:
$ ./test_progs -n 49 -v|grep events
task_rename base	2743K events per sec
task_rename kprobe	2419K events per sec
task_rename kretprobe	1876K events per sec
task_rename raw_tp	2578K events per sec
task_rename fentry	2710K events per sec
task_rename fexit	2685K events per sec

On a kernel with retpoline:
$ ./test_progs -n 49 -v|grep events
task_rename base	2401K events per sec
task_rename kprobe	1930K events per sec
task_rename kretprobe	1485K events per sec
task_rename raw_tp	2053K events per sec
task_rename fentry	2351K events per sec
task_rename fexit	2185K events per sec

All 5 approaches:
- kprobe/kretprobe in __set_task_comm()
- raw tracepoint in trace_task_rename()
- fentry/fexit in __set_task_comm()
are roughly equivalent.

__set_task_comm() by itself is quite fast, so any extra instructions add up.
Until BPF trampoline was introduced the fastest mechanism was raw tracepoint.
kprobe via ftrace was second best. kretprobe is slow due to trap. New
fentry/fexit methods via BPF trampoline are clearly the fastest and the
difference is more pronounced with retpoline on, since BPF trampoline doesn't
use indirect jumps.

Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
Acked-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20191122011515.255371-1-ast@kernel.org
parent 161f3cbc
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment