Commit d17b557e authored by Alexei Starovoitov's avatar Alexei Starovoitov
Browse files

Merge branch 'bpf: cgroup_sock lsm flavor'



Stanislav Fomichev says:

====================

This series implements new lsm flavor for attaching per-cgroup programs to
existing lsm hooks. The cgroup is taken out of 'current', unless
the first argument of the hook is 'struct socket'. In this case,
the cgroup association is taken out of socket. The attachment
looks like a regular per-cgroup attachment: we add new BPF_LSM_CGROUP
attach type which, together with attach_btf_id, signals per-cgroup lsm.
Behind the scenes, we allocate trampoline shim program and
attach to lsm. This program looks up cgroup from current/socket
and runs cgroup's effective prog array. The rest of the per-cgroup BPF
stays the same: hierarchy, local storage, retval conventions
(return 1 == success).

Current limitations:
* haven't considered sleepable bpf; can be extended later on
* not sure the verifier does the right thing with null checks;
  see latest selftest for details
* total of 10 (global) per-cgroup LSM attach points

v11:
- Martin: address selftest memory & fd leaks
- Martin: address moving into root (instead have another temp leaf cgroup)
- Martin: move tools/include/uapi/linux/bpf.h change from libbpf patch
  into 'sync tools' patch

v10:
- Martin: reword commit message, drop outdated items
- Martin: remove rcu_real_lock from __cgroup_bpf_run_lsm_current
- Martin: remove CONFIG_BPF_LSM from cgroup_bpf_release
- Martin: fix leaking shim reference in bpf_cgroup_link_release
- Martin: WARN_ON_ONCE for bpf_trampoline_lookup in bpf_trampoline_unlink_cgroup_shim
- Martin: sync tools/include/linux/btf_ids.h
- Martin: move progs/flags closer to the places where they are used in __cgroup_bpf_query
- Martin: remove sk_clone_security & sctp_bind_connect from bpf_lsm_locked_sockopt_hooks
- Martin: try to determine vmlinux btf_id in bpftool
- Martin: update tools header in a separate commit
- Quentin: do libbpf_find_kernel_btf from the ops that need it
- lkp@intel.com: another build failure

v9:
Major change since last version is the switch to bpf_setsockopt to
change the socket state instead of letting the progs poke socket directly.
This, in turn, highlights the challenge that we need to care about whether
the socket is locked or not when we call bpf_setsockopt. (with my original
example selftest, the hooks are running early in the init phase for this
not to matter).

For now, I've added two btf id lists:
* hooks where we know the socket is locked and it's safe to call bpf_setsockopt
* hooks where we know the socket is _not_ locked, but the hook works on
  the socket that's not yet exposed to userspace so it should be safe
  (for this mode, special new set of bpf_{s,g}etsockopt helpers
   is added; they don't have sock_owned_by_me check)

Going forward, for the rest of the hooks, this might be a good motivation
to expand lsm cgroup to support sleeping bpf and allow the callers to
lock/unlock sockets or have a new bpf_setsockopt variant that does the
locking.

- ifdef around cleanup in cgroup_bpf_release
- Andrii: a few nits in libbpf patches
- Martin: remove unused btf_id_set_index
- Martin: bring back refcnt for cgroup_atype
- Martin: make __cgroup_bpf_query a bit more readable
- Martin: expose dst_prog->aux->attach_btf as attach_btf_obj_id as well
- Martin: reorg check_return_code path for BPF_LSM_CGROUP
- Martin: return directly from check_helper_call (instead of goto err)
- Martin: add note to new warning in check_return_code, print only for void hooks
- Martin: remove confusing shim reuse
- Martin: use bpf_{s,g}etsockopt instead of poking into socket data
- Martin: use CONFIG_CGROUP_BPF in bpf_prog_alloc_no_stats/bpf_prog_free_deferred

v8:
- CI: fix compile issue
- CI: fix broken bpf_cookie
- Yonghong: remove __bpf_trampoline_unlink_prog comment
- Yonghong: move cgroup_atype around to fill the gap
- Yonghong: make bpf_lsm_find_cgroup_shim void
- Yonghong: rename regs to args
- Yonghong: remove if(current) check
- Martin: move refcnt into bpf_link
- Martin: move shim management to bpf_link ops
- Martin: use cgroup_atype for shim only
- Martin: go back to arrays for managing cgroup_atype(s)
- Martin: export bpf_obj_id(aux->attach_btf)
- Andrii: reorder SEC_DEF("lsm_cgroup+")
- Andrii: OPTS_SET instead of OPTS_HAS
- Andrii: rename attach_btf_func_id
- Andrii: move into 1.0 map

v7:
- there were a lot of comments last time, hope I didn't forget anything,
  some of the bigger ones:
  - Martin: use/extend BTF_SOCK_TYPE_SOCKET
  - Martin: expose bpf_set_retval
  - Martin: reject 'return 0' at the verifier for 'void' hooks
  - Martin: prog_query returns all BPF_LSM_CGROUP, prog_info
    returns attach_btf_func_id
  - Andrii: split libbpf changes
  - Andrii: add field access test to test_progs, not test_verifier (still
    using asm though)
- things that I haven't addressed, stating them here explicitly, let
  me know if some of these are still problematic:
  1. Andrii: exposing only link-based api: seems like the changes
     to support non-link-based ones are minimal, couple of lines,
     so seems like it worth having it?
  2. Alexei: applying cgroup_atype for all cgroup hooks, not only
     cgroup lsm: looks a bit harder to apply everywhere that I
     originally thought; with lsm cgroup, we have a shim_prog pointer where
     we store cgroup_atype; for non-lsm programs, we don't have a
     trace program where to store it, so we still need some kind
     of global table to map from "static" hook to "dynamic" slot.
     So I'm dropping this "can be easily extended" clause from the
     description for now. I have converted this whole machinery
     to an RCU-managed list to remove synchronize_rcu().
- also note that I had to introduce new bpf_shim_tramp_link and
  moved refcnt there; we need something to manage new bpf_tramp_link

v6:
- remove active count & stats for shim program (Martin KaFai Lau)
- remove NULL/error check for btf_vmlinux (Martin)
- don't check cgroup_atype in bpf_cgroup_lsm_shim_release (Martin)
- use old_prog (instead of passed one) in __cgroup_bpf_detach (Martin)
- make sure attach_btf_id is the same in __cgroup_bpf_replace (Martin)
- enable cgroup local storage and test it (Martin)
- properly implement prog query and add bpftool & tests (Martin)
- prohibit non-shared cgroup storage mode for BPF_LSM_CGROUP (Martin)

v5:
- __cgroup_bpf_run_lsm_socket remove NULL sock/sk checks (Martin KaFai Lau)
- __cgroup_bpf_run_lsm_{socket,current} s/prog/shim_prog/ (Martin)
- make sure bpf_lsm_find_cgroup_shim works for hooks without args (Martin)
- __cgroup_bpf_attach make sure attach_btf_id is the same when replacing (Martin)
- call bpf_cgroup_lsm_shim_release only for LSM_CGROUP (Martin)
- drop BPF_LSM_CGROUP from bpf_attach_type_to_tramp (Martin)
- drop jited check from cgroup_shim_find (Martin)
- new patch to convert cgroup_bpf to hlist_node (Jakub Sitnicki)
- new shim flavor for 'struct sock' + list of exceptions (Martin)

v4:
- fix build when jit is on but syscall is off

v3:
- add BPF_LSM_CGROUP to bpftool
- use simple int instead of refcnt_t (to avoid use-after-free
  false positive)

v2:
- addressed build bot failures
====================

Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
parents c5c7358e dca85aac
Loading
Loading
Loading
Loading
+16 −8
Original line number Diff line number Diff line
@@ -1770,6 +1770,10 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
			   struct bpf_tramp_link *l, int stack_size,
			   int run_ctx_off, bool save_ret)
{
	void (*exit)(struct bpf_prog *prog, u64 start,
		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_exit;
	u64 (*enter)(struct bpf_prog *prog,
		     struct bpf_tramp_run_ctx *run_ctx) = __bpf_prog_enter;
	u8 *prog = *pprog;
	u8 *jmp_insn;
	int ctx_cookie_off = offsetof(struct bpf_tramp_run_ctx, bpf_cookie);
@@ -1788,14 +1792,20 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
	 */
	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_1, -run_ctx_off + ctx_cookie_off);

	if (p->aux->sleepable) {
		enter = __bpf_prog_enter_sleepable;
		exit = __bpf_prog_exit_sleepable;
	} else if (p->expected_attach_type == BPF_LSM_CGROUP) {
		enter = __bpf_prog_enter_lsm_cgroup;
		exit = __bpf_prog_exit_lsm_cgroup;
	}

	/* arg1: mov rdi, progs[i] */
	emit_mov_imm64(&prog, BPF_REG_1, (long) p >> 32, (u32) (long) p);
	/* arg2: lea rsi, [rbp - ctx_cookie_off] */
	EMIT4(0x48, 0x8D, 0x75, -run_ctx_off);

	if (emit_call(&prog,
		      p->aux->sleepable ? __bpf_prog_enter_sleepable :
		      __bpf_prog_enter, prog))
	if (emit_call(&prog, enter, prog))
		return -EINVAL;
	/* remember prog start time returned by __bpf_prog_enter */
	emit_mov_reg(&prog, true, BPF_REG_6, BPF_REG_0);
@@ -1840,9 +1850,7 @@ static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
	emit_mov_reg(&prog, true, BPF_REG_2, BPF_REG_6);
	/* arg3: lea rdx, [rbp - run_ctx_off] */
	EMIT4(0x48, 0x8D, 0x55, -run_ctx_off);
	if (emit_call(&prog,
		      p->aux->sleepable ? __bpf_prog_exit_sleepable :
		      __bpf_prog_exit, prog))
	if (emit_call(&prog, exit, prog))
		return -EINVAL;

	*pprog = prog;
+11 −2
Original line number Diff line number Diff line
@@ -10,6 +10,13 @@

struct bpf_prog_array;

#ifdef CONFIG_BPF_LSM
/* Maximum number of concurrently attachable per-cgroup LSM hooks. */
#define CGROUP_LSM_NUM 10
#else
#define CGROUP_LSM_NUM 0
#endif

enum cgroup_bpf_attach_type {
	CGROUP_BPF_ATTACH_TYPE_INVALID = -1,
	CGROUP_INET_INGRESS = 0,
@@ -35,6 +42,8 @@ enum cgroup_bpf_attach_type {
	CGROUP_INET4_GETSOCKNAME,
	CGROUP_INET6_GETSOCKNAME,
	CGROUP_INET_SOCK_RELEASE,
	CGROUP_LSM_START,
	CGROUP_LSM_END = CGROUP_LSM_START + CGROUP_LSM_NUM - 1,
	MAX_CGROUP_BPF_ATTACH_TYPE
};

@@ -47,8 +56,8 @@ struct cgroup_bpf {
	 * have either zero or one element
	 * when BPF_F_ALLOW_MULTI the list can have up to BPF_CGROUP_MAX_PROGS
	 */
	struct list_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
	u32 flags[MAX_CGROUP_BPF_ATTACH_TYPE];
	struct hlist_head progs[MAX_CGROUP_BPF_ATTACH_TYPE];
	u8 flags[MAX_CGROUP_BPF_ATTACH_TYPE];

	/* list of cgroup shared storages */
	struct list_head storages;
+8 −1
Original line number Diff line number Diff line
@@ -23,6 +23,13 @@ struct ctl_table;
struct ctl_table_header;
struct task_struct;

unsigned int __cgroup_bpf_run_lsm_sock(const void *ctx,
				       const struct bpf_insn *insn);
unsigned int __cgroup_bpf_run_lsm_socket(const void *ctx,
					 const struct bpf_insn *insn);
unsigned int __cgroup_bpf_run_lsm_current(const void *ctx,
					  const struct bpf_insn *insn);

#ifdef CONFIG_CGROUP_BPF

#define CGROUP_ATYPE(type) \
@@ -95,7 +102,7 @@ struct bpf_cgroup_link {
};

struct bpf_prog_list {
	struct list_head node;
	struct hlist_node node;
	struct bpf_prog *prog;
	struct bpf_cgroup_link *link;
	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE];
+38 −6
Original line number Diff line number Diff line
@@ -56,6 +56,8 @@ typedef u64 (*bpf_callback_t)(u64, u64, u64, u64, u64);
typedef int (*bpf_iter_init_seq_priv_t)(void *private_data,
					struct bpf_iter_aux_info *aux);
typedef void (*bpf_iter_fini_seq_priv_t)(void *private_data);
typedef unsigned int (*bpf_func_t)(const void *,
				   const struct bpf_insn *);
struct bpf_iter_seq_info {
	const struct seq_operations *seq_ops;
	bpf_iter_init_seq_priv_t init_seq_private;
@@ -792,6 +794,10 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start, struct bpf_tramp_
u64 notrace __bpf_prog_enter_sleepable(struct bpf_prog *prog, struct bpf_tramp_run_ctx *run_ctx);
void notrace __bpf_prog_exit_sleepable(struct bpf_prog *prog, u64 start,
				       struct bpf_tramp_run_ctx *run_ctx);
u64 notrace __bpf_prog_enter_lsm_cgroup(struct bpf_prog *prog,
					struct bpf_tramp_run_ctx *run_ctx);
void notrace __bpf_prog_exit_lsm_cgroup(struct bpf_prog *prog, u64 start,
					struct bpf_tramp_run_ctx *run_ctx);
void notrace __bpf_tramp_enter(struct bpf_tramp_image *tr);
void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);

@@ -879,8 +885,7 @@ struct bpf_dispatcher {
static __always_inline __nocfi unsigned int bpf_dispatcher_nop_func(
	const void *ctx,
	const struct bpf_insn *insnsi,
	unsigned int (*bpf_func)(const void *,
				 const struct bpf_insn *))
	bpf_func_t bpf_func)
{
	return bpf_func(ctx, insnsi);
}
@@ -909,8 +914,7 @@ int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs);
	noinline __nocfi unsigned int bpf_dispatcher_##name##_func(	\
		const void *ctx,					\
		const struct bpf_insn *insnsi,				\
		unsigned int (*bpf_func)(const void *,			\
					 const struct bpf_insn *))	\
		bpf_func_t bpf_func)					\
	{								\
		return bpf_func(ctx, insnsi);				\
	}								\
@@ -921,8 +925,7 @@ int arch_prepare_bpf_dispatcher(void *image, s64 *funcs, int num_funcs);
	unsigned int bpf_dispatcher_##name##_func(			\
		const void *ctx,					\
		const struct bpf_insn *insnsi,				\
		unsigned int (*bpf_func)(const void *,			\
					 const struct bpf_insn *));	\
		bpf_func_t bpf_func);					\
	extern struct bpf_dispatcher bpf_dispatcher_##name;
#define BPF_DISPATCHER_FUNC(name) bpf_dispatcher_##name##_func
#define BPF_DISPATCHER_PTR(name) (&bpf_dispatcher_##name)
@@ -1061,6 +1064,7 @@ struct bpf_prog_aux {
	struct user_struct *user;
	u64 load_time; /* ns since boottime */
	u32 verified_insns;
	int cgroup_atype; /* enum cgroup_bpf_attach_type */
	struct bpf_map *cgroup_storage[MAX_BPF_CGROUP_STORAGE_TYPE];
	char name[BPF_OBJ_NAME_LEN];
#ifdef CONFIG_SECURITY
@@ -1168,6 +1172,11 @@ struct bpf_tramp_link {
	u64 cookie;
};

struct bpf_shim_tramp_link {
	struct bpf_tramp_link link;
	struct bpf_trampoline *trampoline;
};

struct bpf_tracing_link {
	struct bpf_tramp_link link;
	enum bpf_attach_type attach_type;
@@ -1246,6 +1255,9 @@ struct bpf_dummy_ops {
int bpf_struct_ops_test_run(struct bpf_prog *prog, const union bpf_attr *kattr,
			    union bpf_attr __user *uattr);
#endif
int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
				    int cgroup_atype);
void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog);
#else
static inline const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
{
@@ -1269,6 +1281,14 @@ static inline int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map,
{
	return -EINVAL;
}
static inline int bpf_trampoline_link_cgroup_shim(struct bpf_prog *prog,
						  int cgroup_atype)
{
	return -EOPNOTSUPP;
}
static inline void bpf_trampoline_unlink_cgroup_shim(struct bpf_prog *prog)
{
}
#endif

struct bpf_array {
@@ -2366,9 +2386,13 @@ extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto;
extern const struct bpf_func_proto bpf_sk_setsockopt_proto;
extern const struct bpf_func_proto bpf_sk_getsockopt_proto;
extern const struct bpf_func_proto bpf_unlocked_sk_setsockopt_proto;
extern const struct bpf_func_proto bpf_unlocked_sk_getsockopt_proto;
extern const struct bpf_func_proto bpf_find_vma_proto;
extern const struct bpf_func_proto bpf_loop_proto;
extern const struct bpf_func_proto bpf_copy_from_user_task_proto;
extern const struct bpf_func_proto bpf_set_retval_proto;
extern const struct bpf_func_proto bpf_get_retval_proto;

const struct bpf_func_proto *tracing_prog_func_proto(
  enum bpf_func_id func_id, const struct bpf_prog *prog);
@@ -2522,4 +2546,12 @@ void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
int bpf_dynptr_check_size(u32 size);

#ifdef CONFIG_BPF_LSM
void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype);
void bpf_cgroup_atype_put(int cgroup_atype);
#else
static inline void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype) {}
static inline void bpf_cgroup_atype_put(int cgroup_atype) {}
#endif /* CONFIG_BPF_LSM */

#endif /* _LINUX_BPF_H */
+7 −0
Original line number Diff line number Diff line
@@ -42,6 +42,8 @@ extern const struct bpf_func_proto bpf_inode_storage_get_proto;
extern const struct bpf_func_proto bpf_inode_storage_delete_proto;
void bpf_inode_storage_free(struct inode *inode);

void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog, bpf_func_t *bpf_func);

#else /* !CONFIG_BPF_LSM */

static inline bool bpf_lsm_is_sleepable_hook(u32 btf_id)
@@ -65,6 +67,11 @@ static inline void bpf_inode_storage_free(struct inode *inode)
{
}

static inline void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
					   bpf_func_t *bpf_func)
{
}

#endif /* CONFIG_BPF_LSM */

#endif /* _LINUX_BPF_LSM_H */
Loading