Commit db55174d authored by Daniel Borkmann's avatar Daniel Borkmann
Browse files

Merge branch 'bpf-kptr-rcu'



Alexei Starovoitov says:

====================
v4->v5:
fix typos, add acks.

v3->v4:
- patch 3 got much cleaner after BPF_KPTR_RCU was removed as suggested by David.

- make KF_RCU stronger and require that bpf program checks for NULL
before passing such pointers into kfunc. The prog has to do that anyway
to access fields and it aligns with BTF_TYPE_SAFE_RCU allowlist.

- New patch 6: refactor RCU enforcement in the verifier.
The patches 2,3,6 are part of one feature.
The 2 and 3 alone are incomplete, since RCU pointers are barely useful
without bpf_rcu_read_lock/unlock in GCC compiled kernel.
Even if GCC lands support for btf_type_tag today it will take time
to mandate that version for kernel builds. Hence go with allow list
approach. See patch 6 for details.
This allows to start strict enforcement of TRUSTED | UNTRUSTED
in one part of PTR_TO_BTF_ID accesses.
One step closer to KF_TRUSTED_ARGS by default.

v2->v3:
- Instead of requiring bpf progs to tag fields with __kptr_rcu
teach the verifier to infer RCU properties based on the type.
BPF_KPTR_RCU becomes kernel internal type of struct btf_field.
- Add patch 2 to tag cgroups and dfl_cgrp as trusted.
That bug was spotted by BPF CI on clang compiler kernels,
since patch 3 is doing:
static bool in_rcu_cs(struct bpf_verifier_env *env)
{
        return env->cur_state->active_rcu_lock || !env->prog->aux->sleepable;
}
which makes all non-sleepable programs behave like they have implicit
rcu_read_lock around them. Which is the case in practice.
It was fine on gcc compiled kernels where task->cgroup deference was producing
PTR_TO_BTF_ID, but on clang compiled kernels task->cgroup deference was
producing PTR_TO_BTF_ID | MEM_RCU | MAYBE_NULL, which is more correct,
but selftests were failing. Patch 2 fixes this discrepancy.
With few more patches like patch 2 we can make KF_TRUSTED_ARGS default
for kfuncs and helpers.
- Add comment in selftest patch 5 that it's verifier only check.

v1->v2:
Instead of agressively allow dereferenced kptr_rcu pointers into KF_TRUSTED_ARGS
kfuncs only allow them into KF_RCU funcs.
The KF_RCU flag is a weaker version of KF_TRUSTED_ARGS. The kfuncs marked with
KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier guarantees
that the objects are valid and there is no use-after-free, but the pointers
maybe NULL and pointee object's reference count could have reached zero, hence
kfuncs must do != NULL check and consider refcnt==0 case when accessing such
arguments.
No changes in patch 1.
Patches 2,3,4 adjusted with above behavior.

v1:
The __kptr_ref turned out to be too limited, since any "trusted" pointer access
requires bpf_kptr_xchg() which is impractical when the same pointer needs
to be dereferenced by multiple cpus.
The __kptr "untrusted" only access isn't very useful in practice.
Rename __kptr to __kptr_untrusted with eventual goal to deprecate it,
and rename __kptr_ref to __kptr, since that looks to be more common use of kptrs.
Introduce __kptr_rcu that can be directly dereferenced and used similar
to native kernel C code.
Once bpf_cpumask and task_struct kfuncs are converted to observe RCU GP
when refcnt goes to zero, both __kptr and __kptr_untrusted can be deprecated
and __kptr_rcu can become the only __kptr tag.
====================

Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
parents 944459e8 6fcd486b
Loading
Loading
Loading
Loading
+2 −2
Original line number Diff line number Diff line
@@ -314,7 +314,7 @@ Q: What is the compatibility story for special BPF types in map values?
Q: Users are allowed to embed bpf_spin_lock, bpf_timer fields in their BPF map
values (when using BTF support for BPF maps). This allows to use helpers for
such objects on these fields inside map values. Users are also allowed to embed
pointers to some kernel types (with __kptr and __kptr_ref BTF tags). Will the
pointers to some kernel types (with __kptr_untrusted and __kptr BTF tags). Will the
kernel preserve backwards compatibility for these features?

A: It depends. For bpf_spin_lock, bpf_timer: YES, for kptr and everything else:
@@ -324,7 +324,7 @@ For struct types that have been added already, like bpf_spin_lock and bpf_timer,
the kernel will preserve backwards compatibility, as they are part of UAPI.

For kptrs, they are also part of UAPI, but only with respect to the kptr
mechanism. The types that you can use with a __kptr and __kptr_ref tagged
mechanism. The types that you can use with a __kptr_untrusted and __kptr tagged
pointer in your struct are NOT part of the UAPI contract. The supported types can
and will change across kernel releases. However, operations like accessing kptr
fields and bpf_kptr_xchg() helper will continue to be supported across kernel
+2 −2
Original line number Diff line number Diff line
@@ -51,7 +51,7 @@ For example:
.. code-block:: c

        struct cpumask_map_value {
                struct bpf_cpumask __kptr_ref * cpumask;
                struct bpf_cpumask __kptr * cpumask;
        };

        struct array_map {
@@ -128,7 +128,7 @@ Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map:

	/* struct containing the struct bpf_cpumask kptr which is stored in the map. */
	struct cpumasks_kfunc_map_value {
		struct bpf_cpumask __kptr_ref * bpf_cpumask;
		struct bpf_cpumask __kptr * bpf_cpumask;
	};

	/* The map containing struct cpumasks_kfunc_map_value entries. */
+8 −6
Original line number Diff line number Diff line
@@ -249,11 +249,13 @@ added later.
2.4.8 KF_RCU flag
-----------------

The KF_RCU flag is used for kfuncs which have a rcu ptr as its argument.
When used together with KF_ACQUIRE, it indicates the kfunc should have a
single argument which must be a trusted argument or a MEM_RCU pointer.
The argument may have reference count of 0 and the kfunc must take this
into consideration.
The KF_RCU flag is a weaker version of KF_TRUSTED_ARGS. The kfuncs marked with
KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier guarantees
that the objects are valid and there is no use-after-free. The pointers are not
NULL, but the object's refcount could have reached zero. The kfuncs need to
consider doing refcnt != 0 check, especially when returning a KF_ACQUIRE
pointer. Note as well that a KF_ACQUIRE kfunc that is KF_RCU should very likely
also be KF_RET_NULL.

.. _KF_deprecated_flag:

@@ -544,7 +546,7 @@ Here's an example of how it can be used:

	/* struct containing the struct task_struct kptr which is actually stored in the map. */
	struct __cgroups_kfunc_map_value {
		struct cgroup __kptr_ref * cgroup;
		struct cgroup __kptr * cgroup;
	};

	/* The map containing struct __cgroups_kfunc_map_value entries. */
+1 −1
Original line number Diff line number Diff line
@@ -2279,7 +2279,7 @@ struct bpf_core_ctx {

bool btf_nested_type_is_trusted(struct bpf_verifier_log *log,
				const struct bpf_reg_state *reg,
				int off);
				int off, const char *suffix);

bool btf_type_ids_nocast_alias(struct bpf_verifier_log *log,
			       const struct btf *reg_btf, u32 reg_id,
+0 −1
Original line number Diff line number Diff line
@@ -537,7 +537,6 @@ struct bpf_verifier_env {
	bool bypass_spec_v1;
	bool bypass_spec_v4;
	bool seen_direct_write;
	bool rcu_tag_supported;
	struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
	const struct bpf_line_info *prev_linfo;
	struct bpf_verifier_log log;
Loading