Commits · ef078600eec20f20eb7833cf597d4a5edf2953c1 · Mirrors / git.kernel.org / pub_scm_linux_kernel_git_stable_linux

Mar 21, 2022

bpf: Select proper size for bpf_prog_pack · ef078600

Song Liu authored Mar 11, 2022

Using HPAGE_PMD_SIZE as the size for bpf_prog_pack is not ideal in some
cases. Specifically, for NUMA systems, __vmalloc_node_range requires
PMD_SIZE * num_online_nodes() to allocate huge pages. Also, if the system
does not support huge pages (i.e., with cmdline option nohugevmalloc), it
is better to use PAGE_SIZE packs.

Add logic to select proper size for bpf_prog_pack. This solution is not
ideal, as it makes assumption about the behavior of module_alloc and
__vmalloc_node_range. However, it appears to be the easiest solution as
it doesn't require changes in module_alloc and vmalloc code.

Fixes: 57631054

 ("bpf: Introduce bpf_prog_pack allocator")
Signed-off-by: Song Liu <song@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220311201135.3573610-1-song@kernel.org

ef078600

Merge branch 'Make 2-byte access to bpf_sk_lookup->remote_port endian-agnostic' · 46e9244b

Alexei Starovoitov authored Mar 20, 2022



Jakub Sitnicki says:

====================

This patch set is a result of a discussion we had around the RFC patchset from
Ilya [1]. The fix for the narrow loads from the RFC series is still relevant,
but this series does not depend on it. Nor is it required to unbreak sk_lookup
tests on BE, if this series gets applied.

To summarize the takeaways from [1]:

 1) we want to make 2-byte load from ctx->remote_port portable across LE and BE,
 2) we keep the 4-byte load from ctx->remote_port as it is today - result varies
    on endianess of the platform.

[1] https://lore.kernel.org/bpf/20220222182559.2865596-2-iii@linux.ibm.com/

v1 -> v2:
- Remove needless check that 4-byte load is from &ctx->remote_port offset
  (Martin)

[v1]: https://lore.kernel.org/bpf/20220317165826.1099418-1-jakub@cloudflare.com/
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

46e9244b

selftests/bpf: Fix test for 4-byte load from remote_port on big-endian · ce523680

Jakub Sitnicki authored Mar 19, 2022

The context access converter rewrites the 4-byte load from
bpf_sk_lookup->remote_port to a 2-byte load from bpf_sk_lookup_kern
structure.

It means that we cannot treat the destination register contents as a 32-bit
value, or the code will not be portable across big- and little-endian
architectures.

This is exactly the same case as with 4-byte loads from bpf_sock->dst_port
so follow the approach outlined in [1] and treat the register contents as a
16-bit value in the test.

[1]: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com/

Fixes: 2ed0dc59

 ("selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220319183356.233666-4-jakub@cloudflare.com

ce523680

selftests/bpf: Fix u8 narrow load checks for bpf_sk_lookup remote_port · 3c69611b

Jakub Sitnicki authored Mar 19, 2022

In commit 9a69e2b3 ("bpf: Make remote_port field in struct
bpf_sk_lookup 16-bit wide") ->remote_port field changed from __u32 to
__be16.

However, narrow load tests which exercise 1-byte sized loads from
offsetof(struct bpf_sk_lookup, remote_port) were not adopted to reflect the
change.

As a result, on little-endian we continue testing loads from addresses:

 - (__u8 *)&ctx->remote_port + 3
 - (__u8 *)&ctx->remote_port + 4

which map to the zero padding following the remote_port field, and don't
break the tests because there is no observable change.

While on big-endian, we observe breakage because tests expect to see zeros
for values loaded from:

 - (__u8 *)&ctx->remote_port - 1
 - (__u8 *)&ctx->remote_port - 2

Above addresses map to ->remote_ip6 field, which precedes ->remote_port,
and are populated during the bpf_sk_lookup IPv6 tests.

Unsurprisingly, on s390x we observe:

  #136/38 sk_lookup/narrow access to ctx v4:OK
  #136/39 sk_lookup/narrow access to ctx v6:FAIL

Fix it by removing the checks for 1-byte loads from offsets outside of the
->remote_port field.

Fixes: 9a69e2b3

 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide")
Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220319183356.233666-3-jakub@cloudflare.com

3c69611b

bpf: Treat bpf_sk_lookup remote_port as a 2-byte field · 058ec4a7

Jakub Sitnicki authored Mar 19, 2022

In commit 9a69e2b3 ("bpf: Make remote_port field in struct
bpf_sk_lookup 16-bit wide") the remote_port field has been split up and
re-declared from u32 to be16.

However, the accompanying changes to the context access converter have not
been well thought through when it comes big-endian platforms.

Today 2-byte wide loads from offsetof(struct bpf_sk_lookup, remote_port)
are handled as narrow loads from a 4-byte wide field.

This by itself is not enough to create a problem, but when we combine

 1. 32-bit wide access to ->remote_port backed by a 16-wide wide load, with
 2. inherent difference between litte- and big-endian in how narrow loads
    need have to be handled (see bpf_ctx_narrow_access_offset),

we get inconsistent results for a 2-byte loads from &ctx->remote_port on LE
and BE architectures. This in turn makes BPF C code for the common case of
2-byte load from ctx->remote_port not portable.

To rectify it, inform the context access converter that remote_port is
2-byte wide field, and only 1-byte loads need to be treated as narrow
loads.

At the same time, we special-case the 4-byte load from &ctx->remote_port to
continue handling it the same way as do today, in order to keep the
existing BPF programs working.

Fixes: 9a69e2b3

 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220319183356.233666-2-jakub@cloudflare.com

058ec4a7

Merge branch 'Enable non-atomic allocations in local storage' · 30630e44

Alexei Starovoitov authored Mar 20, 2022



Joanne Koong says:

====================

From: Joanne Koong <joannelkoong@gmail.com>

Currently, local storage memory can only be allocated atomically
(GFP_ATOMIC). This restriction is too strict for sleepable bpf
programs.

In this patchset, sleepable programs can allocate memory in local
storage using GFP_KERNEL, while non-sleepable programs always default to
GFP_ATOMIC.

v3 <- v2:
* Add extra case to local_storage.c selftest to test associating multiple
elements with the local storage, which triggers a GFP_KERNEL allocation in
local_storage_update().
* Cast gfp_t to __s32 in verifier to fix the sparse warnings

v2 <- v1:
* Allocate the memory before/after the raw_spin_lock_irqsave, depending
on the gfp flags
* Rename mem_flags to gfp_flags
* Reword the comment "*mem_flags* is set by the bpf verifier" to
"*gfp_flags* is a hidden argument provided by the verifier"
* Add a sentence to the commit message about existing local storage
selftests covering both the GFP_ATOMIC and GFP_KERNEL paths in
bpf_local_storage_update.
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

30630e44

selftests/bpf: Test for associating multiple elements with the local storage · 0e790cbb

Joanne Koong authored Mar 17, 2022

This patch adds a few calls to the existing local storage selftest to
test that we can associate multiple elements with the local storage.

The sleepable program's call to bpf_sk_storage_get with sk_storage_map2
will lead to an allocation of a new selem under the GFP_KERNEL flag.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220318045553.3091807-3-joannekoong@fb.com

0e790cbb

bpf: Enable non-atomic allocations in local storage · b00fa38a

Joanne Koong authored Mar 17, 2022



Currently, local storage memory can only be allocated atomically
(GFP_ATOMIC). This restriction is too strict for sleepable bpf
programs.

In this patch, the verifier detects whether the program is sleepable,
and passes the corresponding GFP_KERNEL or GFP_ATOMIC flag as a
5th argument to bpf_task/sk/inode_storage_get. This flag will propagate
down to the local storage functions that allocate memory.

Please note that bpf_task/sk/inode_storage_update_elem functions are
invoked by userspace applications through syscalls. Preemption is
disabled before bpf_task/sk/inode_storage_update_elem is called, which
means they will always have to allocate memory atomically.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220318045553.3091807-2-joannekoong@fb.com

b00fa38a

libbpf: Avoid NULL deref when initializing map BTF info · a8fee962

Andrii Nakryiko authored Mar 19, 2022

If BPF object doesn't have an BTF info, don't attempt to search for BTF
types describing BPF map key or value layout.

Fixes: 262cfb74

 ("libbpf: Init btf_{key,value}_type_id on internal map open")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220320001911.3640917-1-andrii@kernel.org

a8fee962

Mar 20, 2022

bpf: Always raise reference in btf_get_module_btf · 9492450f

Kumar Kartikeya Dwivedi authored Mar 17, 2022



Align it with helpers like bpf_find_btf_id, so all functions returning
BTF in out parameter follow the same rule of raising reference
consistently, regardless of module or vmlinux BTF.

Adjust existing callers to handle the change accordinly.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220317115957.3193097-10-memxor@gmail.com

9492450f

Mar 19, 2022

bpf: Factor out fd returning from bpf_btf_find_by_name_kind · edc3ec09

Kumar Kartikeya Dwivedi authored Mar 17, 2022



In next few patches, we need a helper that searches all kernel BTFs
(vmlinux and module BTFs), and finds the type denoted by 'name' and
'kind'. Turns out bpf_btf_find_by_name_kind already does the same thing,
but it instead returns a BTF ID and optionally fd (if module BTF). This
is used for relocating ksyms in BPF loader code (bpftool gen skel -L).

We extract the core code out into a new helper bpf_find_btf_id, which
returns the BTF ID in the return value, and BTF pointer in an out
parameter. The reference for the returned BTF pointer is always raised,
hence user must either transfer it (e.g. to a fd), or release it after
use.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220317115957.3193097-2-memxor@gmail.com

edc3ec09

bpftool: Add BPF_TRACE_KPROBE_MULTI to attach type names table · 08063b4b

Andrii Nakryiko authored Mar 18, 2022



BPF_TRACE_KPROBE_MULTI is a new attach type name, add it to bpftool's
table. This fixes a currently failing CI bpftool check.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220318150106.2933343-1-andrii@kernel.org

08063b4b

Mar 18, 2022

Merge branch 'bpf-fix-sock-field-tests' · 63cc8e20

Daniel Borkmann authored Mar 18, 2022

Jakub Sitnicki says:

====================
I think we have reached a consensus [1] on how the test for the 4-byte load from
bpf_sock->dst_port and bpf_sk_lookup->remote_port should look, so here goes v3.

I will submit a separate set of patches for bpf_sk_lookup->remote_port tests.

This series has been tested on x86_64 and s390 on top of recent bpf-next -
ad13baf4

 ("selftests/bpf: Test subprog jit when toggle bpf_jit_harden
repeatedly").

  [1] https://lore.kernel.org/bpf/87k0cwxkzs.fsf@cloudflare.com/

v2 -> v3:
- Split what was previously patch 2 which was doing two things
- Use BPF_TCP_* constants (Martin)
- Treat the result of 4-byte load from dst_port as a 16-bit value (Martin)
- Typo fixup and some rewording in patch 4 description
v1 -> v2:
- Limit read_sk_dst_port only to client traffic (patch 2)
- Make read_sk_dst_port pass on litte- and big-endian (patch 3)

v1: https://lore.kernel.org/bpf/20220225184130.483208-1-jakub@cloudflare.com/
v2: https://lore.kernel.org/bpf/20220227202757.519015-1-jakub@cloudflare.com/
====================

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

63cc8e20

selftests/bpf: Fix test for 4-byte load from dst_port on big-endian · deb59400

Jakub Sitnicki authored Mar 17, 2022



The check for 4-byte load from dst_port offset into bpf_sock is failing on
big-endian architecture - s390. The bpf access converter rewrites the
4-byte load to a 2-byte load from sock_common at skc_dport offset, as shown
below.

  * s390 / llvm-objdump -S --no-show-raw-insn

  00000000000002a0 <sk_dst_port__load_word>:
        84:       r1 = *(u32 *)(r1 + 48)
        85:       w0 = 1
        86:       if w1 == 51966 goto +1 <LBB5_2>
        87:       w0 = 0
  00000000000002c0 <LBB5_2>:
        88:       exit

  * s390 / bpftool prog dump xlated

  _Bool sk_dst_port__load_word(struct bpf_sock * sk):
    35: (69) r1 = *(u16 *)(r1 +12)
    36: (bc) w1 = w1
    37: (b4) w0 = 1
    38: (16) if w1 == 0xcafe goto pc+1
    39: (b4) w0 = 0
    40: (95) exit

  * x86_64 / llvm-objdump -S --no-show-raw-insn

  00000000000002a0 <sk_dst_port__load_word>:
        84:       r1 = *(u32 *)(r1 + 48)
        85:       w0 = 1
        86:       if w1 == 65226 goto +1 <LBB5_2>
        87:       w0 = 0
  00000000000002c0 <LBB5_2>:
        88:       exit

  * x86_64 / bpftool prog dump xlated

  _Bool sk_dst_port__load_word(struct bpf_sock * sk):
    33: (69) r1 = *(u16 *)(r1 +12)
    34: (b4) w0 = 1
    35: (16) if w1 == 0xfeca goto pc+1
    36: (b4) w0 = 0
    37: (95) exit

This leads to surprises if we treat the destination register contents as a
32-bit value, ignoring the fact that in reality it contains a 16-bit value.

On little-endian the register contents reflect the bpf_sock struct
definition, where the lower 16-bits contain the port number:

	struct bpf_sock {
		...
		__be16 dst_port;	/* offset 48 */
		__u16 :16;
		...
	};

However, on big-endian the register contents suggest that field the layout
of bpf_sock struct is as so:

	struct bpf_sock {
		...
		__u16 :16;		/* offset 48 */
		__be16 dst_port;
		...
	};

Account for this quirky access conversion in the test case exercising the
4-byte load by treating the result as 16-bit wide.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com

deb59400

selftests/bpf: Use constants for socket states in sock_fields test · e06b5bbc

Jakub Sitnicki authored Mar 17, 2022



Replace magic numbers in BPF code with constants from bpf.h, so that they
don't require an explanation in the comments.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-4-jakub@cloudflare.com

e06b5bbc

selftests/bpf: Check dst_port only on the client socket · 2d2202ba

Jakub Sitnicki authored Mar 17, 2022

cgroup_skb/egress programs which sock_fields test installs process packets
flying in both directions, from the client to the server, and in reverse
direction.

Recently added dst_port check relies on the fact that destination
port (remote peer port) of the socket which sends the packet is known ahead
of time. This holds true only for the client socket, which connects to the
known server port.

Filter out any traffic that is not egressing from the client socket in the
BPF program that tests reading the dst_port.

Fixes: 8f50f16f

 ("selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-3-jakub@cloudflare.com

2d2202ba

selftests/bpf: Fix error reporting from sock_fields programs · a4c9fe0e

Jakub Sitnicki authored Mar 17, 2022

The helper macro that records an error in BPF programs that exercise sock
fields access has been inadvertently broken by adaptation work that
happened in commit b18c1f0a ("bpf: selftest: Adapt sock_fields test to
use skel and global variables").

BPF_NOEXIST flag cannot be used to update BPF_MAP_TYPE_ARRAY. The operation
always fails with -EEXIST, which in turn means the error never gets
recorded, and the checks for errors always pass.

Revert the change in update flags.

Fixes: b18c1f0a

 ("bpf: selftest: Adapt sock_fields test to use skel and global variables")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-2-jakub@cloudflare.com

a4c9fe0e

Merge branch 'Subskeleton support for BPF librariesThread-Topic: [PATCH bpf-next v4 0/5' · 60911970

Andrii Nakryiko authored Mar 17, 2022



Delyan Kratunov says:

====================

In the quest for ever more modularity, a new need has arisen - the ability to
access data associated with a BPF library from a corresponding userspace library.
The catch is that we don't want the userspace library to know about the structure of the
final BPF object that the BPF library is linked into.

In pursuit of this modularity, this patch series introduces *subskeletons.*
Subskeletons are similar in use and design to skeletons with a couple of differences:

1. The generated storage types do not rely on contiguous storage for the library's
variables because they may be interspersed randomly throughout the final BPF object's sections.

2. Subskeletons do not own objects and instead require a loaded bpf_object* to
be passed at runtime in order to be initialized. By extension, symbols are resolved at
runtime by parsing the final object's BTF.

3. Subskeletons allow access to all global variables, programs, and custom maps. They also expose
the internal maps *of the final object*. This allows bpf_var_skeleton objects to contain a bpf_map**
instead of a section name.

Changes since v3:
 - Re-add key/value type lookup for legacy user maps (fixing btf test)
 - Minor cleanups (missed sanitize_identifier call, error messages, formatting)

Changes since v2:
 - Reuse SEC_NAME strict mode flag
 - Init bpf_map->btf_value_type_id on open for internal maps *and* user BTF maps
 - Test custom section names (.data.foo) and overlapping kconfig externs between the final object and the library
 - Minor review comments in gen.c & libbpf.c

Changes since v1:
 - Introduced new strict mode knob for single-routine-in-.text compatibility behavior, which
   disproportionately affects library objects. bpftool works in 1.0 mode so subskeleton generation
   doesn't have to worry about this now.
 - Made bpf_map_btf_value_type_id available earlier and used it wherever applicable.
 - Refactoring in bpftool gen.c per review comments.
 - Subskels now use typeof() for array and func proto globals to avoid the need for runtime split btf.
 - Expanded the subskeleton test to include arrays, custom maps, extern maps, weak symbols, and kconfigs.
 - selftests/bpf/Makefile now generates a subskel.h for every skel.h it would make.

For reference, here is a shortened subskeleton header:

#ifndef __TEST_SUBSKELETON_LIB_SUBSKEL_H__
#define __TEST_SUBSKELETON_LIB_SUBSKEL_H__

struct test_subskeleton_lib {
	struct bpf_object *obj;
	struct bpf_object_subskeleton *subskel;
	struct {
		struct bpf_map *map2;
		struct bpf_map *map1;
		struct bpf_map *data;
		struct bpf_map *rodata;
		struct bpf_map *bss;
		struct bpf_map *kconfig;
	} maps;
	struct {
		struct bpf_program *lib_perf_handler;
	} progs;
	struct test_subskeleton_lib__data {
		int *var6;
		int *var2;
		int *var5;
	} data;
	struct test_subskeleton_lib__rodata {
		int *var1;
	} rodata;
	struct test_subskeleton_lib__bss {
		struct {
			int var3_1;
			__s64 var3_2;
		} *var3;
		int *libout1;
		typeof(int[4]) *var4;
		typeof(int (*)()) *fn_ptr;
	} bss;
	struct test_subskeleton_lib__kconfig {
		_Bool *CONFIG_BPF_SYSCALL;
	} kconfig;

static inline struct test_subskeleton_lib *
test_subskeleton_lib__open(const struct bpf_object *src)
{
	struct test_subskeleton_lib *obj;
	struct bpf_object_subskeleton *s;
	int err;

	...
	s = (struct bpf_object_subskeleton *)calloc(1, sizeof(*s));
	...

	s->var_cnt = 9;
	...

	s->vars[0].name = "var6";
	s->vars[0].map = &obj->maps.data;
	s->vars[0].addr = (void**) &obj->data.var6;
  ...

	/* maps */
	...

	/* programs */
	s->prog_cnt = 1;
	...

	err = bpf_object__open_subskeleton(s);
  ...
	return obj;
}
#endif /* __TEST_SUBSKELETON_LIB_SUBSKEL_H__ */
====================

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

60911970

selftests/bpf: Test subskeleton functionality · 3cccbaa0

Delyan Kratunov authored Mar 16, 2022

This patch changes the selftests/bpf Makefile to also generate
a subskel.h for every skel.h it would have normally generated.

Separately, it also introduces a new subskeleton test which tests
library objects, externs, weak symbols, kconfigs, and user maps.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1bd24956940bbbfe169bb34f7f87b11df52ef011.1647473511.git.delyank@fb.com

3cccbaa0

bpftool: Add support for subskeletons · 00389c58

Delyan Kratunov authored Mar 16, 2022



Subskeletons are headers which require an already loaded program to
operate.

For example, when a BPF library is linked into a larger BPF object file,
the library userspace needs a way to access its own global variables
without requiring knowledge about the larger program at build time.

As a result, subskeletons require a loaded bpf_object to open().
Further, they find their own symbols in the larger program by
walking BTF type data at run time.

At this time, programs, maps, and globals are supported through
non-owning pointers.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/ca8a48b4841c72d285ecce82371bef4a899756cb.1647473511.git.delyank@fb.com

00389c58

libbpf: Add subskeleton scaffolding · 430025e5

Delyan Kratunov authored Mar 16, 2022

In symmetry with bpf_object__open_skeleton(),
bpf_object__open_subskeleton() performs the actual walking and linking
of maps, progs, and globals described by bpf_*_skeleton objects.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/6942a46fbe20e7ebf970affcca307ba616985b15.1647473511.git.delyank@fb.com

430025e5

libbpf: Init btf_{key,value}_type_id on internal map open · 262cfb74

Delyan Kratunov authored Mar 16, 2022



For internal and user maps, look up the key and value btf
types on open() and not load(), so that `bpf_map_btf_value_type_id`
is usable in `bpftool gen`.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/78dbe4e457b4a05e098fc6c8f50014b680c86e4e.1647473511.git.delyank@fb.com

262cfb74

libbpf: .text routines are subprograms in strict mode · bc380eb9

Delyan Kratunov authored Mar 16, 2022

Currently, libbpf considers a single routine in .text to be a program. This
is particularly confusing when it comes to library objects - a single routine
meant to be used as an extern will instead be considered a bpf_program.

This patch hides this compatibility behavior behind the pre-existing
SEC_NAME strict mode flag.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/018de8d0d67c04bf436055270d35d394ba393505.1647473511.git.delyank@fb.com

bc380eb9

Merge branch 'bpf: Add kprobe multi link' · 5a5c11ee

Alexei Starovoitov authored Mar 17, 2022



Jiri Olsa says:

====================

hi,
this patchset adds new link type BPF_TRACE_KPROBE_MULTI that attaches
kprobe program through fprobe API [1] instroduced by Masami.

The fprobe API allows to attach probe on multiple functions at once very
fast, because it works on top of ftrace. On the other hand this limits
the probe point to the function entry or return.

With bpftrace support I see following attach speed:

  # perf stat --null -r 5 ./src/bpftrace -e 'kprobe:x* { } i:ms:1 { exit(); } '
  Attaching 2 probes...
  Attaching 3342 functions
  ...

  1.4960 +- 0.0285 seconds time elapsed  ( +-  1.91% )

v3 changes:
  - based on latest fprobe post from Masami [2]
  - add acks
  - add extra comment to kprobe_multi_link_handler wrt entry ip setup [Masami]
  - keep swap_words_64 static and swap values directly in
    bpf_kprobe_multi_cookie_swap [Andrii]
  - rearrange locking/migrate setup in kprobe_multi_link_prog_run [Andrii]
  - move uapi fields [Andrii]
  - add bpf_program__attach_kprobe_multi_opts function [Andrii]
  - many small test changes [Andrii]
  - added tests for bpf_program__attach_kprobe_multi_opts
  - make kallsyms_lookup_name check for empty string [Andrii]

v2 changes:
  - based on latest fprobe changes [1]
  - renaming the uapi interface to kprobe multi
  - adding support for sort_r to pass user pointer for swap functions
    and using that in cookie support to keep just single functions array
  - moving new link to kernel/trace/bpf_trace.c file
  - using single fprobe callback function for entry and exit
  - using kvzalloc, libbpf_ensure_mem functions
  - adding new k[ret]probe.multi sections instead of using current kprobe
  - used glob_match from test_progs.c, added '?' matching
  - move bpf_get_func_ip verifier inline change to seprate change
  - couple of other minor fixes

Also available at:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  bpf/kprobe_multi

thanks,
jirka

[1] https://lore.kernel.org/bpf/164458044634.586276.3261555265565111183.stgit@devnote2/
[2] https://lore.kernel.org/bpf/164735281449.1084943.12438881786173547153.stgit@devnote2/
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

5a5c11ee

selftests/bpf: Add cookie test for bpf_program__attach_kprobe_multi_opts · 318c812c

Jiri Olsa authored Mar 16, 2022



Adding bpf_cookie test for programs attached by
bpf_program__attach_kprobe_multi_opts API.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-14-jolsa@kernel.org

318c812c

selftests/bpf: Add attach test for bpf_program__attach_kprobe_multi_opts · 9271a0c7

Jiri Olsa authored Mar 16, 2022



Adding tests for bpf_program__attach_kprobe_multi_opts function,
that test attach with pattern, symbols and addrs.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-13-jolsa@kernel.org

9271a0c7

selftests/bpf: Add kprobe_multi bpf_cookie test · 2c6401c9

Jiri Olsa authored Mar 16, 2022



Adding bpf_cookie test for programs attached by kprobe_multi links.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-12-jolsa@kernel.org

2c6401c9

selftests/bpf: Add kprobe_multi attach test · f7a11eec

Jiri Olsa authored Mar 16, 2022



Adding kprobe_multi attach test that uses new fprobe interface to
attach kprobe program to multiple functions.

The test is attaching programs to bpf_fentry_test* functions and
uses single trampoline program bpf_prog_test_run to trigger
bpf_fentry_test* functions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-11-jolsa@kernel.org

f7a11eec

libbpf: Add bpf_program__attach_kprobe_multi_opts function · ddc6b049

Jiri Olsa authored Mar 16, 2022



Adding bpf_program__attach_kprobe_multi_opts function for attaching
kprobe program to multiple functions.

  struct bpf_link *
  bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
                                        const char *pattern,
                                        const struct bpf_kprobe_multi_opts *opts);

User can specify functions to attach with 'pattern' argument that
allows wildcards (*?' supported) or provide symbols or addresses
directly through opts argument. These 3 options are mutually
exclusive.

When using symbols or addresses, user can also provide cookie value
for each symbol/address that can be retrieved later in bpf program
with bpf_get_attach_cookie helper.

  struct bpf_kprobe_multi_opts {
          size_t sz;
          const char **syms;
          const unsigned long *addrs;
          const __u64 *cookies;
          size_t cnt;
          bool retprobe;
          size_t :0;
  };

Symbols, addresses and cookies are provided through opts object
(syms/addrs/cookies) as array pointers with specified count (cnt).

Each cookie value is paired with provided function address or symbol
with the same array index.

The program can be also attached as return probe if 'retprobe' is set.

For quick usage with NULL opts argument, like:

  bpf_program__attach_kprobe_multi_opts(prog, "ksys_*", NULL)

the 'prog' will be attached as kprobe to 'ksys_*' functions.

Also adding new program sections for automatic attachment:

  kprobe.multi/<symbol_pattern>
  kretprobe.multi/<symbol_pattern>

The symbol_pattern is used as 'pattern' argument in
bpf_program__attach_kprobe_multi_opts function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-10-jolsa@kernel.org

ddc6b049

libbpf: Add bpf_link_create support for multi kprobes · 5117c26e

Jiri Olsa authored Mar 16, 2022



Adding new kprobe_multi struct to bpf_link_create_opts object
to pass multiple kprobe data to link_create attr uapi.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-9-jolsa@kernel.org

5117c26e

libbpf: Add libbpf_kallsyms_parse function · 85153ac0

Jiri Olsa authored Mar 16, 2022



Move the kallsyms parsing in internal libbpf_kallsyms_parse
function, so it can be used from other places.

It will be used in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-8-jolsa@kernel.org

85153ac0

bpf: Add cookie support to programs attached with kprobe multi link · ca74823c

Jiri Olsa authored Mar 16, 2022



Adding support to call bpf_get_attach_cookie helper from
kprobe programs attached with kprobe multi link.

The cookie is provided by array of u64 values, where each
value is paired with provided function address or symbol
with the same array index.

When cookie array is provided it's sorted together with
addresses (check bpf_kprobe_multi_cookie_swap). This way
we can find cookie based on the address in
bpf_get_attach_cookie helper.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-7-jolsa@kernel.org

ca74823c

bpf: Add support to inline bpf_get_func_ip helper on x86 · 97ee4d20

Jiri Olsa authored Mar 16, 2022



Adding support to inline it on x86, because it's single
load instruction.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-6-jolsa@kernel.org

97ee4d20

bpf: Add bpf_get_func_ip kprobe helper for multi kprobe link · 42a57120

Jiri Olsa authored Mar 16, 2022



Adding support to call bpf_get_func_ip helper from kprobe
programs attached by multi kprobe link.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-5-jolsa@kernel.org

42a57120

Merge branch 'fprobe: Introduce fprobe function entry/exit probe' · 245d9496

Alexei Starovoitov authored Mar 17, 2022



Masami Hiramatsu says:

====================

Hi,

Here is the 12th version of fprobe. This version fixes a possible gcc-11 issue which
was reported as kretprobes on arm issue, and also I updated the fprobe document.

The previous version (v11) is here[1];

[1] https://lore.kernel.org/all/164701432038.268462.3329725152949938527.stgit@devnote2/T/#u

This series introduces the fprobe, the function entry/exit probe
with multiple probe point support for x86, arm64 and powerpc64le.
This also introduces the rethook for hooking function return as same as
the kretprobe does. This abstraction will help us to generalize the fgraph
tracer, because we can just switch to it from the rethook in fprobe,
depending on the kernel configuration.

The patch [1/12] is from Jiri's series[2].

[2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u

And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
if user wants to share the same code (or share a same resource) on the
fprobe and the kprobes.

I forcibly updated my kprobes/fprobe branch, you can pull this series
from:

 https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/fprobe

Thank you,
---

Jiri Olsa (1):
      ftrace: Add ftrace_set_filter_ips function
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>

245d9496

bpf: Add multi kprobe link · 0dcac272

Jiri Olsa authored Mar 16, 2022



Adding new link type BPF_LINK_TYPE_KPROBE_MULTI that attaches kprobe
program through fprobe API.

The fprobe API allows to attach probe on multiple functions at once
very fast, because it works on top of ftrace. On the other hand this
limits the probe point to the function entry or return.

The kprobe program gets the same pt_regs input ctx as when it's attached
through the perf API.

Adding new attach type BPF_TRACE_KPROBE_MULTI that allows attachment
kprobe to multiple function with new link.

User provides array of addresses or symbols with count to attach the
kprobe program to. The new link_create uapi interface looks like:

  struct {
          __u32           flags;
          __u32           cnt;
          __aligned_u64   syms;
          __aligned_u64   addrs;
  } kprobe_multi;

The flags field allows single BPF_TRACE_KPROBE_MULTI bit to create
return multi kprobe.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-4-jolsa@kernel.org

0dcac272

kallsyms: Skip the name search for empty string · aecf489f

Jiri Olsa authored Mar 16, 2022



When kallsyms_lookup_name is called with empty string,
it will do futile search for it through all the symbols.

Skipping the search for empty string.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-3-jolsa@kernel.org

aecf489f

lib/sort: Add priv pointer to swap function · a0019cd7

Jiri Olsa authored Mar 16, 2022



Adding support to have priv pointer in swap callback function.

Following the initial change on cmp callback functions [1]
and adding SWAP_WRAPPER macro to identify sort call of sort_r.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-2-jolsa@kernel.org

[1] 4333fb96 ("media: lib/sort.c: implement sort() variant taking context argument")

a0019cd7

fprobe: Add a selftest for fprobe · f4616fab

Masami Hiramatsu authored Mar 15, 2022



Add a KUnit based selftest for fprobe interface.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735295554.1084943.18347620679928750960.stgit@devnote2

f4616fab

docs: fprobe: Add fprobe description to ftrace-use.rst · aba09b44

Masami Hiramatsu authored Mar 15, 2022



Add a documentation of fprobe for the user who needs
this interface.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/164735294272.1084943.12372175959382037397.stgit@devnote2

aba09b44