  1. Mar 21, 2022
    • bpf: Select proper size for bpf_prog_pack · ef078600
      Song Liu authored
      Using HPAGE_PMD_SIZE as the size for bpf_prog_pack is not ideal in some
      cases. Specifically, for NUMA systems, __vmalloc_node_range requires
      PMD_SIZE * num_online_nodes() to allocate huge pages. Also, if the system
      does not support huge pages (i.e., with cmdline option nohugevmalloc), it
      is better to use PAGE_SIZE packs.
      
      Add logic to select proper size for bpf_prog_pack. This solution is not
      ideal, as it makes assumption about the behavior of module_alloc and
      __vmalloc_node_range. However, it appears to be the easiest solution as
      it doesn't require changes in module_alloc and vmalloc code.
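      The selection described above can be sketched in userspace as follows. The
      sizes and names (SKETCH_PAGE_SIZE, SKETCH_PMD_SIZE, select_pack_size) are
      illustrative assumptions, not kernel identifiers, and probe_is_huge stands in
      for "a trial allocation of PMD_SIZE * num_online_nodes() bytes succeeded and
      is backed by huge pages".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed sizes for illustration only: 4 KiB base pages, 2 MiB PMD pages. */
#define SKETCH_PAGE_SIZE ((size_t)4096)
#define SKETCH_PMD_SIZE  ((size_t)2 * 1024 * 1024)

/* Size of the trial allocation: one PMD-sized chunk per online NUMA node. */
static size_t probe_size(unsigned int online_nodes)
{
	assert(online_nodes > 0);
	return SKETCH_PMD_SIZE * online_nodes;
}

/*
 * Pick the pack size. probe_is_huge is a hypothetical input that models
 * whether the trial allocation ended up backed by huge pages.
 */
static size_t select_pack_size(unsigned int online_nodes, bool probe_is_huge)
{
	return probe_is_huge ? probe_size(online_nodes) : SKETCH_PAGE_SIZE;
}
```

      With 4 online nodes this yields 8 MiB packs when huge pages are available
      and falls back to page-sized packs otherwise (e.g. with nohugevmalloc).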
      
      Fixes: 57631054 ("bpf: Introduce bpf_prog_pack allocator")
      Signed-off-by: Song Liu <song@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220311201135.3573610-1-song@kernel.org
    • Merge branch 'Make 2-byte access to bpf_sk_lookup->remote_port endian-agnostic' · 46e9244b
      Alexei Starovoitov authored
      
      
      Jakub Sitnicki says:
      
      ====================
      
      This patch set is a result of a discussion we had around the RFC patchset from
      Ilya [1]. The fix for the narrow loads from the RFC series is still relevant,
      but this series does not depend on it. Nor is it required to unbreak sk_lookup
      tests on BE, if this series gets applied.
      
      To summarize the takeaways from [1]:
      
       1) we want to make 2-byte load from ctx->remote_port portable across LE and BE,
       2) we keep the 4-byte load from ctx->remote_port as it is today - the result
          varies with the endianness of the platform.
      
      [1] https://lore.kernel.org/bpf/20220222182559.2865596-2-iii@linux.ibm.com/
      
      v1 -> v2:
      - Remove needless check that 4-byte load is from &ctx->remote_port offset
        (Martin)
      
      [v1]: https://lore.kernel.org/bpf/20220317165826.1099418-1-jakub@cloudflare.com/
      ====================
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Fix test for 4-byte load from remote_port on big-endian · ce523680
      Jakub Sitnicki authored
      The context access converter rewrites the 4-byte load from
      bpf_sk_lookup->remote_port to a 2-byte load from bpf_sk_lookup_kern
      structure.
      
      It means that we cannot treat the destination register contents as a 32-bit
      value, or the code will not be portable across big- and little-endian
      architectures.
      
      This is exactly the same case as with 4-byte loads from bpf_sock->dst_port
      so follow the approach outlined in [1] and treat the register contents as a
      16-bit value in the test.
      
      [1]: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com/
      
      Fixes: 2ed0dc59 ("selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup")
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20220319183356.233666-4-jakub@cloudflare.com
    • selftests/bpf: Fix u8 narrow load checks for bpf_sk_lookup remote_port · 3c69611b
      Jakub Sitnicki authored
      In commit 9a69e2b3 ("bpf: Make remote_port field in struct
      bpf_sk_lookup 16-bit wide") ->remote_port field changed from __u32 to
      __be16.
      
      However, narrow load tests which exercise 1-byte sized loads from
      offsetof(struct bpf_sk_lookup, remote_port) were not adapted to reflect the
      change.
      
      As a result, on little-endian we continue testing loads from addresses:
      
       - (__u8 *)&ctx->remote_port + 3
       - (__u8 *)&ctx->remote_port + 4
      
      which map to the zero padding following the remote_port field, and don't
      break the tests because there is no observable change.
      
      While on big-endian, we observe breakage because tests expect to see zeros
      for values loaded from:
      
       - (__u8 *)&ctx->remote_port - 1
       - (__u8 *)&ctx->remote_port - 2
      
      Above addresses map to ->remote_ip6 field, which precedes ->remote_port,
      and are populated during the bpf_sk_lookup IPv6 tests.
      
      Unsurprisingly, on s390x we observe:
      
        #136/38 sk_lookup/narrow access to ctx v4:OK
        #136/39 sk_lookup/narrow access to ctx v6:FAIL
      
      Fix it by removing the checks for 1-byte loads from offsets outside of the
      ->remote_port field.
      
      Fixes: 9a69e2b3 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide")
      Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20220319183356.233666-3-jakub@cloudflare.com
    • bpf: Treat bpf_sk_lookup remote_port as a 2-byte field · 058ec4a7
      Jakub Sitnicki authored
      In commit 9a69e2b3 ("bpf: Make remote_port field in struct
      bpf_sk_lookup 16-bit wide") the remote_port field has been split up and
      re-declared from u32 to be16.
      
      However, the accompanying changes to the context access converter have not
      been well thought through when it comes to big-endian platforms.
      
      Today 2-byte wide loads from offsetof(struct bpf_sk_lookup, remote_port)
      are handled as narrow loads from a 4-byte wide field.
      
      This by itself is not enough to create a problem, but when we combine
      
       1. 32-bit wide access to ->remote_port backed by a 16-bit wide load, with
       2. inherent difference between little- and big-endian in how narrow loads
          have to be handled (see bpf_ctx_narrow_access_offset),

      we get inconsistent results for 2-byte loads from &ctx->remote_port on LE
      and BE architectures. This in turn makes BPF C code for the common case of
      a 2-byte load from ctx->remote_port not portable.
      
      To rectify it, inform the context access converter that remote_port is
      a 2-byte wide field, and only 1-byte loads need to be treated as narrow
      loads.

      At the same time, we special-case the 4-byte load from &ctx->remote_port to
      continue handling it the same way as we do today, in order to keep the
      existing BPF programs working.
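
      The endianness dependence can be illustrated with a userspace model of the
      offset adjustment applied to narrow loads. This is a sketch written from an
      understanding of what bpf_ctx_narrow_access_offset does, not a copy of it; the
      function name and the big_endian parameter are assumptions made so that both
      cases can be exercised on one machine (the kernel decides this at build time).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Byte offset, inside a field of size_default bytes, at which a narrower
 * load of `size` bytes has to read so that it sees the low-order bytes.
 */
static uint32_t narrow_access_offset(uint32_t off, uint32_t size,
				     uint32_t size_default, bool big_endian)
{
	uint32_t access_off = off & (size_default - 1);

	assert(access_off + size <= size_default);
	return big_endian ? size_default - (access_off + size) : access_off;
}
```

      Treating remote_port as a 4-byte field, a 2-byte load at offset 0 stays at
      offset 0 on little-endian but moves to offset 2 on big-endian, which is the
      inconsistency described above. Treating it as a 2-byte field gives offset 0
      on both, and only 1-byte loads are adjusted.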
      
      Fixes: 9a69e2b3 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide")
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20220319183356.233666-2-jakub@cloudflare.com
    • Merge branch 'Enable non-atomic allocations in local storage' · 30630e44
      Alexei Starovoitov authored
      
      
      Joanne Koong says:
      
      ====================
      
      From: Joanne Koong <joannelkoong@gmail.com>
      
      Currently, local storage memory can only be allocated atomically
      (GFP_ATOMIC). This restriction is too strict for sleepable bpf
      programs.
      
      In this patchset, sleepable programs can allocate memory in local
      storage using GFP_KERNEL, while non-sleepable programs always default to
      GFP_ATOMIC.
      
      v3 <- v2:
      * Add extra case to local_storage.c selftest to test associating multiple
      elements with the local storage, which triggers a GFP_KERNEL allocation in
      local_storage_update().
      * Cast gfp_t to __s32 in verifier to fix the sparse warnings
      
      v2 <- v1:
      * Allocate the memory before/after the raw_spin_lock_irqsave, depending
      on the gfp flags
      * Rename mem_flags to gfp_flags
      * Reword the comment "*mem_flags* is set by the bpf verifier" to
      "*gfp_flags* is a hidden argument provided by the verifier"
      * Add a sentence to the commit message about existing local storage
      selftests covering both the GFP_ATOMIC and GFP_KERNEL paths in
      bpf_local_storage_update.
      ====================
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Test for associating multiple elements with the local storage · 0e790cbb
      Joanne Koong authored
      
      
      This patch adds a few calls to the existing local storage selftest to
      test that we can associate multiple elements with the local storage.
      
      The sleepable program's call to bpf_sk_storage_get with sk_storage_map2
      will lead to an allocation of a new selem under the GFP_KERNEL flag.
      
      Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220318045553.3091807-3-joannekoong@fb.com
    • bpf: Enable non-atomic allocations in local storage · b00fa38a
      Joanne Koong authored
      
      
      Currently, local storage memory can only be allocated atomically
      (GFP_ATOMIC). This restriction is too strict for sleepable bpf
      programs.
      
      In this patch, the verifier detects whether the program is sleepable,
      and passes the corresponding GFP_KERNEL or GFP_ATOMIC flag as a
      5th argument to bpf_task/sk/inode_storage_get. This flag will propagate
      down to the local storage functions that allocate memory.
      
      Please note that bpf_task/sk/inode_storage_update_elem functions are
      invoked by userspace applications through syscalls. Preemption is
      disabled before bpf_task/sk/inode_storage_update_elem is called, which
      means they will always have to allocate memory atomically.
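
      The flag choice described above can be summarized in a small model. The enum
      names and values are placeholders invented for this sketch (the kernel's
      gfp_t bits differ), and the caller classification is a simplification.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder flags for the sketch; not the kernel's gfp_t values. */
enum sketch_gfp { SKETCH_GFP_ATOMIC, SKETCH_GFP_KERNEL };

/* Who is asking for local storage memory (hypothetical classification). */
enum sketch_caller {
	CALLER_BPF_HELPER,	/* bpf_task/sk/inode_storage_get in a BPF program */
	CALLER_SYSCALL_UPDATE,	/* *_storage_update_elem reached via a syscall */
};

static enum sketch_gfp pick_gfp(enum sketch_caller caller, bool prog_sleepable)
{
	/* The syscall path runs with preemption disabled: always atomic. */
	if (caller == CALLER_SYSCALL_UPDATE)
		return SKETCH_GFP_ATOMIC;
	/* For helpers, the verifier-provided flag follows program sleepability. */
	return prog_sleepable ? SKETCH_GFP_KERNEL : SKETCH_GFP_ATOMIC;
}

static void pick_gfp_example(void)
{
	assert(pick_gfp(CALLER_BPF_HELPER, true) == SKETCH_GFP_KERNEL);
	assert(pick_gfp(CALLER_BPF_HELPER, false) == SKETCH_GFP_ATOMIC);
}
```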
      
      Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: KP Singh <kpsingh@kernel.org>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20220318045553.3091807-2-joannekoong@fb.com
    • libbpf: Avoid NULL deref when initializing map BTF info · a8fee962
      Andrii Nakryiko authored
      If the BPF object doesn't have BTF info, don't attempt to search for BTF
      types describing BPF map key or value layout.
      
      Fixes: 262cfb74 ("libbpf: Init btf_{key,value}_type_id on internal map open")
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20220320001911.3640917-1-andrii@kernel.org
  2. Mar 20, 2022
  3. Mar 19, 2022
  4. Mar 18, 2022
    • Merge branch 'bpf-fix-sock-field-tests' · 63cc8e20
      Daniel Borkmann authored
      Jakub Sitnicki says:
      
      ====================
      I think we have reached a consensus [1] on how the test for the 4-byte load from
      bpf_sock->dst_port and bpf_sk_lookup->remote_port should look, so here goes v3.
      
      I will submit a separate set of patches for bpf_sk_lookup->remote_port tests.
      
      This series has been tested on x86_64 and s390 on top of recent bpf-next -
      ad13baf4 ("selftests/bpf: Test subprog jit when toggle bpf_jit_harden
      repeatedly").
      
        [1] https://lore.kernel.org/bpf/87k0cwxkzs.fsf@cloudflare.com/
      
      v2 -> v3:
      - Split what was previously patch 2 which was doing two things
      - Use BPF_TCP_* constants (Martin)
      - Treat the result of 4-byte load from dst_port as a 16-bit value (Martin)
      - Typo fixup and some rewording in patch 4 description
      v1 -> v2:
      - Limit read_sk_dst_port only to client traffic (patch 2)
      - Make read_sk_dst_port pass on little- and big-endian (patch 3)
      
      v1: https://lore.kernel.org/bpf/20220225184130.483208-1-jakub@cloudflare.com/
      v2: https://lore.kernel.org/bpf/20220227202757.519015-1-jakub@cloudflare.com/
      ====================
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • selftests/bpf: Fix test for 4-byte load from dst_port on big-endian · deb59400
      Jakub Sitnicki authored
      
      
      The check for 4-byte load from dst_port offset into bpf_sock is failing on
      big-endian architecture - s390. The bpf access converter rewrites the
      4-byte load to a 2-byte load from sock_common at skc_dport offset, as shown
      below.
      
        * s390 / llvm-objdump -S --no-show-raw-insn
      
        00000000000002a0 <sk_dst_port__load_word>:
              84:       r1 = *(u32 *)(r1 + 48)
              85:       w0 = 1
              86:       if w1 == 51966 goto +1 <LBB5_2>
              87:       w0 = 0
        00000000000002c0 <LBB5_2>:
              88:       exit
      
        * s390 / bpftool prog dump xlated
      
        _Bool sk_dst_port__load_word(struct bpf_sock * sk):
          35: (69) r1 = *(u16 *)(r1 +12)
          36: (bc) w1 = w1
          37: (b4) w0 = 1
          38: (16) if w1 == 0xcafe goto pc+1
          39: (b4) w0 = 0
          40: (95) exit
      
        * x86_64 / llvm-objdump -S --no-show-raw-insn
      
        00000000000002a0 <sk_dst_port__load_word>:
              84:       r1 = *(u32 *)(r1 + 48)
              85:       w0 = 1
              86:       if w1 == 65226 goto +1 <LBB5_2>
              87:       w0 = 0
        00000000000002c0 <LBB5_2>:
              88:       exit
      
        * x86_64 / bpftool prog dump xlated
      
        _Bool sk_dst_port__load_word(struct bpf_sock * sk):
          33: (69) r1 = *(u16 *)(r1 +12)
          34: (b4) w0 = 1
          35: (16) if w1 == 0xfeca goto pc+1
          36: (b4) w0 = 0
          37: (95) exit
      
      This leads to surprises if we treat the destination register contents as a
      32-bit value, ignoring the fact that in reality it contains a 16-bit value.
      
      On little-endian the register contents reflect the bpf_sock struct
      definition, where the lower 16-bits contain the port number:
      
      	struct bpf_sock {
      		...
      		__be16 dst_port;	/* offset 48 */
      		__u16 :16;
      		...
      	};
      
      However, on big-endian the register contents suggest that the field layout
      of the bpf_sock struct is as follows:
      
      	struct bpf_sock {
      		...
      		__u16 :16;		/* offset 48 */
      		__be16 dst_port;
      		...
      	};
      
      Account for this quirky access conversion in the test case exercising the
      4-byte load by treating the result as 16-bit wide.
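
      A userspace sketch of the idea, assuming the rewritten access behaves as the
      xlated dumps above show: a 2-byte load of the network-order port that is
      zero-extended into the register. The helper names are hypothetical and this
      is not the selftest code itself.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Models the rewritten access: the 4-byte load becomes a 2-byte load of the
 * network-order port, zero-extended into the destination register.
 */
static uint32_t load_word_as_converted(uint16_t dport_be)
{
	return (uint32_t)dport_be;
}

/* Portable check: look only at the low 16 bits, compare in network order. */
static bool dst_port_is(uint32_t word, uint16_t port_host_order)
{
	return (uint16_t)word == htons(port_host_order);
}

static void dst_port_example(void)
{
	uint32_t word = load_word_as_converted(htons(0xcafe));

	assert(dst_port_is(word, 0xcafe));
	assert(!dst_port_is(word, 0xbeef));
}
```

      The numeric value of the word is 0xfeca (65226) on little-endian and 0xcafe
      (51966) on big-endian, matching the constants in the dumps, while the 16-bit
      comparison holds on both.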
      
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com
    • selftests/bpf: Use constants for socket states in sock_fields test · e06b5bbc
      Jakub Sitnicki authored
      
      
      Replace magic numbers in BPF code with constants from bpf.h, so that they
      don't require an explanation in the comments.
      
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20220317113920.1068535-4-jakub@cloudflare.com
    • selftests/bpf: Check dst_port only on the client socket · 2d2202ba
      Jakub Sitnicki authored
      cgroup_skb/egress programs which sock_fields test installs process packets
      flying in both directions, from the client to the server, and in reverse
      direction.
      
      Recently added dst_port check relies on the fact that destination
      port (remote peer port) of the socket which sends the packet is known ahead
      of time. This holds true only for the client socket, which connects to the
      known server port.
      
      Filter out any traffic that is not egressing from the client socket in the
      BPF program that tests reading the dst_port.
      
      Fixes: 8f50f16f ("selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads")
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20220317113920.1068535-3-jakub@cloudflare.com
    • selftests/bpf: Fix error reporting from sock_fields programs · a4c9fe0e
      Jakub Sitnicki authored
      The helper macro that records an error in BPF programs that exercise sock
      fields access has been inadvertently broken by adaptation work that
      happened in commit b18c1f0a ("bpf: selftest: Adapt sock_fields test to
      use skel and global variables").
      
      BPF_NOEXIST flag cannot be used to update BPF_MAP_TYPE_ARRAY. The operation
      always fails with -EEXIST, which in turn means the error never gets
      recorded, and the checks for errors always pass.
      
      Revert the change in update flags.
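
      The failure mode can be reproduced with a small userspace model of an array
      map update. The names are invented for this sketch; the flag constants reuse
      the numeric values of the UAPI update flags, and the use of an ANY-style flag
      after the revert is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Same numeric values as the UAPI update flags, renamed to mark the model. */
#define SKETCH_BPF_ANY     0ULL
#define SKETCH_BPF_NOEXIST 1ULL

#define SKETCH_MAX_ENTRIES 4U

struct array_map_sketch {
	uint32_t values[SKETCH_MAX_ENTRIES];
};

static int array_update(struct array_map_sketch *map, uint32_t index,
			uint32_t value, uint64_t flags)
{
	if (index >= SKETCH_MAX_ENTRIES)
		return -E2BIG;
	if (flags & SKETCH_BPF_NOEXIST)
		return -EEXIST;	/* every slot of an array map already exists */
	map->values[index] = value;
	return 0;
}

static void array_update_example(void)
{
	struct array_map_sketch errors = { { 0 } };

	/* With the NOEXIST flag the error is silently dropped. */
	assert(array_update(&errors, 0, 1, SKETCH_BPF_NOEXIST) == -EEXIST);
	assert(errors.values[0] == 0);

	/* With an ANY-style flag (assumed here) the error is recorded. */
	assert(array_update(&errors, 0, 1, SKETCH_BPF_ANY) == 0);
	assert(errors.values[0] == 1);
}
```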
      
      Fixes: b18c1f0a ("bpf: selftest: Adapt sock_fields test to use skel and global variables")
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Link: https://lore.kernel.org/bpf/20220317113920.1068535-2-jakub@cloudflare.com
    • Merge branch 'Subskeleton support for BPF libraries' · 60911970
      Andrii Nakryiko authored
      
      
      Delyan Kratunov says:
      
      ====================
      
      In the quest for ever more modularity, a new need has arisen - the ability to
      access data associated with a BPF library from a corresponding userspace library.
      The catch is that we don't want the userspace library to know about the structure of the
      final BPF object that the BPF library is linked into.
      
      In pursuit of this modularity, this patch series introduces *subskeletons.*
      Subskeletons are similar in use and design to skeletons with a few differences:
      
      1. The generated storage types do not rely on contiguous storage for the library's
      variables because they may be interspersed randomly throughout the final BPF object's sections.
      
      2. Subskeletons do not own objects and instead require a loaded bpf_object* to
      be passed at runtime in order to be initialized. By extension, symbols are resolved at
      runtime by parsing the final object's BTF.
      
      3. Subskeletons allow access to all global variables, programs, and custom maps. They also expose
      the internal maps *of the final object*. This allows bpf_var_skeleton objects to contain a bpf_map**
      instead of a section name.
      
      Changes since v3:
       - Re-add key/value type lookup for legacy user maps (fixing btf test)
       - Minor cleanups (missed sanitize_identifier call, error messages, formatting)
      
      Changes since v2:
       - Reuse SEC_NAME strict mode flag
       - Init bpf_map->btf_value_type_id on open for internal maps *and* user BTF maps
       - Test custom section names (.data.foo) and overlapping kconfig externs between the final object and the library
       - Minor review comments in gen.c & libbpf.c
      
      Changes since v1:
       - Introduced new strict mode knob for single-routine-in-.text compatibility behavior, which
         disproportionately affects library objects. bpftool works in 1.0 mode so subskeleton generation
         doesn't have to worry about this now.
       - Made bpf_map_btf_value_type_id available earlier and used it wherever applicable.
       - Refactoring in bpftool gen.c per review comments.
       - Subskels now use typeof() for array and func proto globals to avoid the need for runtime split btf.
       - Expanded the subskeleton test to include arrays, custom maps, extern maps, weak symbols, and kconfigs.
       - selftests/bpf/Makefile now generates a subskel.h for every skel.h it would make.
      
      For reference, here is a shortened subskeleton header:
      
      #ifndef __TEST_SUBSKELETON_LIB_SUBSKEL_H__
      #define __TEST_SUBSKELETON_LIB_SUBSKEL_H__
      
      struct test_subskeleton_lib {
      	struct bpf_object *obj;
      	struct bpf_object_subskeleton *subskel;
      	struct {
      		struct bpf_map *map2;
      		struct bpf_map *map1;
      		struct bpf_map *data;
      		struct bpf_map *rodata;
      		struct bpf_map *bss;
      		struct bpf_map *kconfig;
      	} maps;
      	struct {
      		struct bpf_program *lib_perf_handler;
      	} progs;
      	struct test_subskeleton_lib__data {
      		int *var6;
      		int *var2;
      		int *var5;
      	} data;
      	struct test_subskeleton_lib__rodata {
      		int *var1;
      	} rodata;
      	struct test_subskeleton_lib__bss {
      		struct {
      			int var3_1;
      			__s64 var3_2;
      		} *var3;
      		int *libout1;
      		typeof(int[4]) *var4;
      		typeof(int (*)()) *fn_ptr;
      	} bss;
      	struct test_subskeleton_lib__kconfig {
      		_Bool *CONFIG_BPF_SYSCALL;
      	} kconfig;
      };
      
      static inline struct test_subskeleton_lib *
      test_subskeleton_lib__open(const struct bpf_object *src)
      {
      	struct test_subskeleton_lib *obj;
      	struct bpf_object_subskeleton *s;
      	int err;
      
      	...
      	s = (struct bpf_object_subskeleton *)calloc(1, sizeof(*s));
      	...
      
      	s->var_cnt = 9;
      	...
      
      	s->vars[0].name = "var6";
      	s->vars[0].map = &obj->maps.data;
      	s->vars[0].addr = (void**) &obj->data.var6;
        ...
      
      	/* maps */
      	...
      
      	/* programs */
      	s->prog_cnt = 1;
      	...
      
      	err = bpf_object__open_subskeleton(s);
        ...
      	return obj;
      }
      #endif /* __TEST_SUBSKELETON_LIB_SUBSKEL_H__ */
      ====================
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    • selftests/bpf: Test subskeleton functionality · 3cccbaa0
      Delyan Kratunov authored
      
      
      This patch changes the selftests/bpf Makefile to also generate
      a subskel.h for every skel.h it would have normally generated.
      
      Separately, it also introduces a new subskeleton test which tests
      library objects, externs, weak symbols, kconfigs, and user maps.
      
      Signed-off-by: Delyan Kratunov <delyank@fb.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/1bd24956940bbbfe169bb34f7f87b11df52ef011.1647473511.git.delyank@fb.com
    • bpftool: Add support for subskeletons · 00389c58
      Delyan Kratunov authored
      
      
      Subskeletons are headers which require an already loaded program to
      operate.
      
      For example, when a BPF library is linked into a larger BPF object file,
      the library userspace needs a way to access its own global variables
      without requiring knowledge about the larger program at build time.
      
      As a result, subskeletons require a loaded bpf_object to open().
      Further, they find their own symbols in the larger program by
      walking BTF type data at run time.
      
      At this time, programs, maps, and globals are supported through
      non-owning pointers.
      
      Signed-off-by: Delyan Kratunov <delyank@fb.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/ca8a48b4841c72d285ecce82371bef4a899756cb.1647473511.git.delyank@fb.com
    • libbpf: Add subskeleton scaffolding · 430025e5
      Delyan Kratunov authored
      
      
      In symmetry with bpf_object__open_skeleton(),
      bpf_object__open_subskeleton() performs the actual walking and linking
      of maps, progs, and globals described by bpf_*_skeleton objects.
      
      Signed-off-by: Delyan Kratunov <delyank@fb.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/6942a46fbe20e7ebf970affcca307ba616985b15.1647473511.git.delyank@fb.com
    • libbpf: Init btf_{key,value}_type_id on internal map open · 262cfb74
      Delyan Kratunov authored
      
      
      For internal and user maps, look up the key and value btf
      types on open() and not load(), so that `bpf_map_btf_value_type_id`
      is usable in `bpftool gen`.
      
      Signed-off-by: Delyan Kratunov <delyank@fb.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/78dbe4e457b4a05e098fc6c8f50014b680c86e4e.1647473511.git.delyank@fb.com
    • libbpf: .text routines are subprograms in strict mode · bc380eb9
      Delyan Kratunov authored
      
      
      Currently, libbpf considers a single routine in .text to be a program. This
      is particularly confusing when it comes to library objects - a single routine
      meant to be used as an extern will instead be considered a bpf_program.
      
      This patch hides this compatibility behavior behind the pre-existing
      SEC_NAME strict mode flag.
      
      Signed-off-by: Delyan Kratunov <delyank@fb.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/018de8d0d67c04bf436055270d35d394ba393505.1647473511.git.delyank@fb.com
    • Merge branch 'bpf: Add kprobe multi link' · 5a5c11ee
      Alexei Starovoitov authored
      
      
      Jiri Olsa says:
      
      ====================
      
      hi,
      this patchset adds new link type BPF_TRACE_KPROBE_MULTI that attaches
      kprobe program through fprobe API [1] introduced by Masami.
      
      The fprobe API allows attaching probes to multiple functions at once very
      quickly, because it works on top of ftrace. On the other hand, this limits
      the probe point to the function entry or return.

      With bpftrace support I see the following attach speed:
      
        # perf stat --null -r 5 ./src/bpftrace -e 'kprobe:x* { } i:ms:1 { exit(); } '
        Attaching 2 probes...
        Attaching 3342 functions
        ...
      
        1.4960 +- 0.0285 seconds time elapsed  ( +-  1.91% )
      
      v3 changes:
        - based on latest fprobe post from Masami [2]
        - add acks
        - add extra comment to kprobe_multi_link_handler wrt entry ip setup [Masami]
        - keep swap_words_64 static and swap values directly in
          bpf_kprobe_multi_cookie_swap [Andrii]
        - rearrange locking/migrate setup in kprobe_multi_link_prog_run [Andrii]
        - move uapi fields [Andrii]
        - add bpf_program__attach_kprobe_multi_opts function [Andrii]
        - many small test changes [Andrii]
        - added tests for bpf_program__attach_kprobe_multi_opts
        - make kallsyms_lookup_name check for empty string [Andrii]
      
      v2 changes:
        - based on latest fprobe changes [1]
        - renaming the uapi interface to kprobe multi
        - adding support for sort_r to pass user pointer for swap functions
          and using that in cookie support to keep just single functions array
        - moving new link to kernel/trace/bpf_trace.c file
        - using single fprobe callback function for entry and exit
        - using kvzalloc, libbpf_ensure_mem functions
        - adding new k[ret]probe.multi sections instead of using current kprobe
        - used glob_match from test_progs.c, added '?' matching
        - move bpf_get_func_ip verifier inline change to separate change
        - couple of other minor fixes
      
      Also available at:
        https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
        bpf/kprobe_multi
      
      thanks,
      jirka
      
      [1] https://lore.kernel.org/bpf/164458044634.586276.3261555265565111183.stgit@devnote2/
      [2] https://lore.kernel.org/bpf/164735281449.1084943.12438881786173547153.stgit@devnote2/
      ====================
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Add cookie test for bpf_program__attach_kprobe_multi_opts · 318c812c
      Jiri Olsa authored
      
      
      Adding bpf_cookie test for programs attached by
      bpf_program__attach_kprobe_multi_opts API.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-14-jolsa@kernel.org
    • selftests/bpf: Add attach test for bpf_program__attach_kprobe_multi_opts · 9271a0c7
      Jiri Olsa authored
      
      
      Adding tests for bpf_program__attach_kprobe_multi_opts function,
      that test attach with pattern, symbols and addrs.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-13-jolsa@kernel.org
    • selftests/bpf: Add kprobe_multi bpf_cookie test · 2c6401c9
      Jiri Olsa authored
      
      
      Adding bpf_cookie test for programs attached by kprobe_multi links.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-12-jolsa@kernel.org
    • selftests/bpf: Add kprobe_multi attach test · f7a11eec
      Jiri Olsa authored
      
      
      Adding kprobe_multi attach test that uses new fprobe interface to
      attach kprobe program to multiple functions.
      
      The test is attaching programs to bpf_fentry_test* functions and
      uses single trampoline program bpf_prog_test_run to trigger
      bpf_fentry_test* functions.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-11-jolsa@kernel.org
    • libbpf: Add bpf_program__attach_kprobe_multi_opts function · ddc6b049
      Jiri Olsa authored
      
      
      Adding bpf_program__attach_kprobe_multi_opts function for attaching
      kprobe program to multiple functions.
      
        struct bpf_link *
        bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
                                              const char *pattern,
                                              const struct bpf_kprobe_multi_opts *opts);
      
      The user can specify the functions to attach to with the 'pattern'
      argument, which allows wildcards ('*' and '?' are supported), or provide
      symbols or addresses directly through the opts argument. These 3 options
      are mutually exclusive.
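
      The wildcard semantics described above can be illustrated with a
      minimal sketch. This is not libbpf's implementation; the helper name
      glob_match_sketch is hypothetical, and it assumes '*' matches any run
      of characters (including none) while '?' matches exactly one character.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of libbpf): returns true when 'str'
 * matches 'pat'. Assumed semantics: '*' matches any run of characters,
 * including an empty one; '?' matches exactly one character.
 */
static bool glob_match_sketch(const char *pat, const char *str)
{
	const char *star = NULL;  /* last '*' seen in the pattern */
	const char *retry = NULL; /* where in str that '*' started matching */

	while (*str) {
		if (*pat == '*') {
			/* Remember the star and first try matching nothing. */
			star = pat++;
			retry = str;
		} else if (*pat == '?' || *pat == *str) {
			pat++;
			str++;
		} else if (star) {
			/* Backtrack: let the last '*' absorb one more char. */
			pat = star + 1;
			str = ++retry;
		} else {
			return false;
		}
	}

	/* Trailing stars can match the empty remainder. */
	while (*pat == '*')
		pat++;
	return *pat == '\0';
}
```

      With these assumptions, "ksys_*" matches "ksys_read" but not
      "sys_read", and "bpf_fentry_test?" matches "bpf_fentry_test1" but not
      "bpf_fentry_test10".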
      
      When using symbols or addresses, user can also provide cookie value
      for each symbol/address that can be retrieved later in bpf program
      with bpf_get_attach_cookie helper.
      
        struct bpf_kprobe_multi_opts {
                size_t sz;
                const char **syms;
                const unsigned long *addrs;
                const __u64 *cookies;
                size_t cnt;
                bool retprobe;
                size_t :0;
        };
      
      Symbols, addresses and cookies are provided through opts object
      (syms/addrs/cookies) as array pointers with specified count (cnt).
      
      Each cookie value is paired with provided function address or symbol
      with the same array index.
      
      The program can also be attached as a return probe if 'retprobe' is set.
      
      For quick usage with NULL opts argument, like:
      
        bpf_program__attach_kprobe_multi_opts(prog, "ksys_*", NULL)
      
      the 'prog' will be attached as kprobe to 'ksys_*' functions.
      
      Also adding new program sections for automatic attachment:
      
        kprobe.multi/<symbol_pattern>
        kretprobe.multi/<symbol_pattern>
      
      The symbol_pattern is used as 'pattern' argument in
      bpf_program__attach_kprobe_multi_opts function.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-10-jolsa@kernel.org
      ddc6b049
    • Jiri Olsa's avatar
      libbpf: Add bpf_link_create support for multi kprobes · 5117c26e
      Jiri Olsa authored
      
      
      Adding new kprobe_multi struct to bpf_link_create_opts object
      to pass multiple kprobe data to link_create attr uapi.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-9-jolsa@kernel.org
      5117c26e
    • Jiri Olsa's avatar
      libbpf: Add libbpf_kallsyms_parse function · 85153ac0
      Jiri Olsa authored
      
      
      Move the kallsyms parsing into the internal libbpf_kallsyms_parse
      function, so it can be used from other places.
      
      It will be used in following changes.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-8-jolsa@kernel.org
      85153ac0
    • Jiri Olsa's avatar
      bpf: Add cookie support to programs attached with kprobe multi link · ca74823c
      Jiri Olsa authored
      
      
      Adding support to call bpf_get_attach_cookie helper from
      kprobe programs attached with kprobe multi link.
      
      The cookie is provided by array of u64 values, where each
      value is paired with provided function address or symbol
      with the same array index.
      
      When cookie array is provided it's sorted together with
      addresses (check bpf_kprobe_multi_cookie_swap). This way
      we can find cookie based on the address in
      bpf_get_attach_cookie helper.
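
      The pairing and lookup described above can be sketched in plain C.
      This is an illustration only, not the kernel code: the function names
      are hypothetical, insertion sort stands in for the kernel's generic
      sort with a custom swap, and returning 0 for an unknown address is an
      assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sort addrs in ascending order and apply every move to cookies too,
 * so cookies[i] stays paired with addrs[i]. Insertion sort is used
 * only to keep the sketch short.
 */
static void sort_addrs_with_cookies(unsigned long *addrs, uint64_t *cookies,
				    size_t cnt)
{
	for (size_t i = 1; i < cnt; i++) {
		unsigned long a = addrs[i];
		uint64_t c = cookies[i];
		size_t j = i;

		while (j > 0 && addrs[j - 1] > a) {
			addrs[j] = addrs[j - 1];
			cookies[j] = cookies[j - 1];
			j--;
		}
		addrs[j] = a;
		cookies[j] = c;
	}
}

/*
 * Binary search for ip in the sorted addrs array and return the cookie
 * at the same index. Returns 0 when ip is not found (an assumption of
 * this sketch).
 */
static uint64_t cookie_for_ip(const unsigned long *addrs,
			      const uint64_t *cookies, size_t cnt,
			      unsigned long ip)
{
	size_t lo = 0, hi = cnt;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addrs[mid] == ip)
			return cookies[mid];
		if (addrs[mid] < ip)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}
```

      Because both arrays move together, the index found by the search in
      addrs is also the index of the matching cookie.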
      
      Suggested-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-7-jolsa@kernel.org
      ca74823c
    • Jiri Olsa's avatar
      bpf: Add support to inline bpf_get_func_ip helper on x86 · 97ee4d20
      Jiri Olsa authored
      
      
      Adding support to inline the bpf_get_func_ip helper on x86, because
      it's a single load instruction.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-6-jolsa@kernel.org
      97ee4d20
    • Jiri Olsa's avatar
      bpf: Add bpf_get_func_ip kprobe helper for multi kprobe link · 42a57120
      Jiri Olsa authored
      
      
      Adding support to call bpf_get_func_ip helper from kprobe
      programs attached by multi kprobe link.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-5-jolsa@kernel.org
      42a57120
    • Alexei Starovoitov's avatar
      Merge branch 'fprobe: Introduce fprobe function entry/exit probe' · 245d9496
      Alexei Starovoitov authored
      
      
      Masami Hiramatsu says:
      
      ====================
      
      Hi,
      
      Here is the 12th version of fprobe. This version fixes a possible gcc-11 issue,
      which was reported as a kretprobes issue on arm, and also updates the fprobe document.
      
      The previous version (v11) is here [1]:
      
      [1] https://lore.kernel.org/all/164701432038.268462.3329725152949938527.stgit@devnote2/T/#u
      
      This series introduces fprobe, the function entry/exit probe
      with multiple probe point support for x86, arm64 and powerpc64le.
      It also introduces rethook for hooking function return in the same way
      as kretprobe does. This abstraction will help us to generalize the fgraph
      tracer, because we can just switch to it from the rethook in fprobe,
      depending on the kernel configuration.
      
      The patch [1/12] is from Jiri's series[2].
      
      [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
      
      And the patch [9/10] adds the FPROBE_FL_KPROBE_SHARED flag for the case
      where the user wants to share the same code (or the same resource)
      between the fprobe and the kprobes.
      
      I forcibly updated my kprobes/fprobe branch, you can pull this series
      from:
      
       https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/fprobe
      
      Thank you,
      ---
      
      Jiri Olsa (1):
            ftrace: Add ftrace_set_filter_ips function
      ====================
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      245d9496
    • Jiri Olsa's avatar
      bpf: Add multi kprobe link · 0dcac272
      Jiri Olsa authored
      
      
      Adding new link type BPF_LINK_TYPE_KPROBE_MULTI that attaches kprobe
      program through fprobe API.
      
      The fprobe API allows attaching a probe to multiple functions at once
      very quickly, because it works on top of ftrace. On the other hand, this
      limits the probe point to the function entry or return.
      
      The kprobe program gets the same pt_regs input ctx as when it's attached
      through the perf API.
      
      Adding new attach type BPF_TRACE_KPROBE_MULTI that allows attaching a
      kprobe program to multiple functions with the new link.
      
      The user provides an array of addresses or symbols, with a count, to attach
      the kprobe program to. The new link_create uapi interface looks like:
      
        struct {
                __u32           flags;
                __u32           cnt;
                __aligned_u64   syms;
                __aligned_u64   addrs;
        } kprobe_multi;
      
      The flags field allows a single BPF_F_KPROBE_MULTI_RETURN bit to create
      a return multi kprobe.
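
      As a rough illustration of how user space might populate these fields,
      the sketch below uses a local mirror of the struct with fixed-width
      types. The struct name, the helper name and the validation rules
      (non-zero count, exactly one of syms or addrs) are assumptions made
      for the sketch, not the kernel's or libbpf's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the kprobe_multi fields, for illustration only. */
struct kprobe_multi_args_sketch {
	uint32_t flags;
	uint32_t cnt;
	uint64_t syms;  /* user pointer to an array of cnt symbol names */
	uint64_t addrs; /* user pointer to an array of cnt addresses */
};

/*
 * Hypothetical helper: fills 'out' from either a symbol array or an
 * address array. Returns 0 on success and -1 when the input does not
 * satisfy the assumed rules (cnt != 0, exactly one of syms/addrs set).
 */
static int fill_kprobe_multi_sketch(struct kprobe_multi_args_sketch *out,
				    const char **syms,
				    const unsigned long *addrs,
				    uint32_t cnt, uint32_t flags)
{
	if (cnt == 0)
		return -1;
	if ((syms != NULL) == (addrs != NULL))
		return -1;

	out->flags = flags;
	out->cnt = cnt;
	/* Pointers travel through the uapi as 64-bit integers. */
	out->syms = (uint64_t)(uintptr_t)syms;
	out->addrs = (uint64_t)(uintptr_t)addrs;
	return 0;
}
```

      Storing pointers as 64-bit integers keeps the layout identical for
      32-bit and 64-bit user space, which is why the uapi struct uses
      __aligned_u64 for both arrays.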
      
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-4-jolsa@kernel.org
      0dcac272
    • Jiri Olsa's avatar
      kallsyms: Skip the name search for empty string · aecf489f
      Jiri Olsa authored
      
      
      When kallsyms_lookup_name is called with an empty string,
      it does a futile search for it through all the symbols.
      
      Skip the search for an empty string.
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-3-jolsa@kernel.org
      aecf489f
    • Jiri Olsa's avatar
      lib/sort: Add priv pointer to swap function · a0019cd7
      Jiri Olsa authored
      
      
      Adding support for a priv pointer in the swap callback function.
      
      This follows the initial change to the cmp callback functions [1]
      and adds the SWAP_WRAPPER macro to identify the sort call of sort_r.
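
      A minimal user-space sketch of why a priv pointer in the swap callback
      is useful: the callback can reach a second array and keep it in step
      with the one being sorted. The routine sort_r_sketch and the context
      struct are hypothetical stand-ins, and selection sort is used only for
      brevity; this is not the lib/sort.c implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback shapes that both receive a caller-supplied priv pointer. */
typedef int (*cmp_r_fn)(const void *a, const void *b, const void *priv);
typedef void (*swap_r_fn)(void *a, void *b, int size, const void *priv);

/* Hypothetical stand-in for a sort_r-style routine (selection sort). */
static void sort_r_sketch(void *base, size_t num, size_t size,
			  cmp_r_fn cmp, swap_r_fn swap, const void *priv)
{
	char *b = base;

	for (size_t i = 0; i + 1 < num; i++) {
		size_t min = i;

		for (size_t j = i + 1; j < num; j++)
			if (cmp(b + j * size, b + min * size, priv) < 0)
				min = j;
		if (min != i)
			swap(b + i * size, b + min * size, (int)size, priv);
	}
}

/* Context handed to the callbacks through priv. */
struct pair_ctx {
	int *keys;          /* array being sorted */
	const char **names; /* parallel array kept in step with keys */
};

static int cmp_int(const void *a, const void *b, const void *priv)
{
	int x = *(const int *)a, y = *(const int *)b;

	(void)priv;
	return (x > y) - (x < y);
}

/* Swap two keys and the names stored at the same indexes. */
static void swap_with_names(void *a, void *b, int size, const void *priv)
{
	const struct pair_ctx *ctx = priv;
	int *ka = a, *kb = b;
	size_t ia = (size_t)(ka - ctx->keys);
	size_t ib = (size_t)(kb - ctx->keys);
	const char *tmp_name = ctx->names[ia];
	int tmp_key = *ka;

	(void)size;
	*ka = *kb;
	*kb = tmp_key;
	ctx->names[ia] = ctx->names[ib];
	ctx->names[ib] = tmp_name;
}
```

      Without priv, the swap callback would only see the two elements being
      exchanged and would have no way to locate the parallel array.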
      
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
      Link: https://lore.kernel.org/bpf/20220316122419.933957-2-jolsa@kernel.org
      
      [1] 4333fb96 ("media: lib/sort.c: implement sort() variant taking context argument")
      a0019cd7
    • Masami Hiramatsu's avatar
      fprobe: Add a selftest for fprobe · f4616fab
      Masami Hiramatsu authored
      
      
      Add a KUnit-based selftest for the fprobe interface.
      
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/164735295554.1084943.18347620679928750960.stgit@devnote2
      f4616fab
    • Masami Hiramatsu's avatar
      docs: fprobe: Add fprobe description to ftrace-use.rst · aba09b44
      Masami Hiramatsu authored
      
      
      Add fprobe documentation for users who need
      this interface.
      
      Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/164735294272.1084943.12372175959382037397.stgit@devnote2
      aba09b44