Commit d5c2ede5 authored by Jiri Olsa's avatar Jiri Olsa Committed by sanglipeng
Browse files

bpf: Disable preemption in bpf_event_output

stable inclusion
from stable-v5.10.190
commit 3048cb0dc0cc9dc74ed93690dffef00733bcad5b
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I928UI

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3048cb0dc0cc9dc74ed93690dffef00733bcad5b

--------------------------------

commit d62cc390 upstream.

We received report [1] of kernel crash, which is caused by
using nesting protection without disabled preemption.

The bpf_event_output can be called by programs executed by
bpf_prog_run_array_cg function that disabled migration but
keeps preemption enabled.

This can cause task to be preempted by another one inside the
nesting protection and lead eventually to two tasks using same
perf_sample_data buffer and cause crashes like:

  BUG: kernel NULL pointer dereference, address: 0000000000000001
  #PF: supervisor instruction fetch in kernel mode
  #PF: error_code(0x0010) - not-present page
  ...
  ? perf_output_sample+0x12a/0x9a0
  ? finish_task_switch.isra.0+0x81/0x280
  ? perf_event_output+0x66/0xa0
  ? bpf_event_output+0x13a/0x190
  ? bpf_event_output_data+0x22/0x40
  ? bpf_prog_dfc84bbde731b257_cil_sock4_connect+0x40a/0xacb
  ? xa_load+0x87/0xe0
  ? __cgroup_bpf_run_filter_sock_addr+0xc1/0x1a0
  ? release_sock+0x3e/0x90
  ? sk_setsockopt+0x1a1/0x12f0
  ? udp_pre_connect+0x36/0x50
  ? inet_dgram_connect+0x93/0xa0
  ? __sys_connect+0xb4/0xe0
  ? udp_setsockopt+0x27/0x40
  ? __pfx_udp_push_pending_frames+0x10/0x10
  ? __sys_setsockopt+0xdf/0x1a0
  ? __x64_sys_connect+0xf/0x20
  ? do_syscall_64+0x3a/0x90
  ? entry_SYSCALL_64_after_hwframe+0x72/0xdc

Fixing this by disabling preemption in bpf_event_output.

[1] https://github.com/cilium/cilium/issues/26756


Cc: stable@vger.kernel.org
Reported-by: default avatarOleg "livelace" Popov <o.popov@livelace.ru>
Closes: https://github.com/cilium/cilium/issues/26756


Fixes: 2a916f2f ("bpf: Use migrate_disable/enable in array macros and cgroup/lirc code.")
Acked-by: default avatarHou Tao <houtao1@huawei.com>
Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20230725084206.580930-3-jolsa@kernel.org


Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: default avatarsanglipeng <sanglipeng1@jd.com>
parent 2e6a7710
Loading
Loading
Loading
Loading
+5 −1
Original line number Diff line number Diff line
@@ -970,7 +970,6 @@ static DEFINE_PER_CPU(struct bpf_trace_sample_data, bpf_misc_sds);
u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
		     void *ctx, u64 ctx_size, bpf_ctx_copy_t ctx_copy)
{
	int nest_level = this_cpu_inc_return(bpf_event_output_nest_level);
	struct perf_raw_frag frag = {
		.copy		= ctx_copy,
		.size		= ctx_size,
@@ -987,8 +986,12 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
	};
	struct perf_sample_data *sd;
	struct pt_regs *regs;
	int nest_level;
	u64 ret;

	preempt_disable();
	nest_level = this_cpu_inc_return(bpf_event_output_nest_level);

	if (WARN_ON_ONCE(nest_level > ARRAY_SIZE(bpf_misc_sds.sds))) {
		ret = -EBUSY;
		goto out;
@@ -1003,6 +1006,7 @@ u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
	ret = __bpf_perf_event_output(regs, map, flags, sd);
out:
	this_cpu_dec(bpf_event_output_nest_level);
	preempt_enable();
	return ret;
}