  1. Oct 26, 2021
    • bpftool: Switch to libbpf's hashmap for pinned paths of BPF objects · 8f184732
      Quentin Monnet authored
      
      
      In order to show pinned paths for BPF programs, maps, or links when
      listing them with the "-f" option, bpftool creates hash maps to store
      all relevant paths under the bpffs. So far, it would rely on the
      kernel implementation (from tools/include/linux/hashtable.h).
      
      We can make bpftool rely on libbpf's implementation instead. The
      motivation is to make bpftool less dependent on kernel headers, easing
      the path to a potential out-of-tree mirror, like the one libbpf has.
      
      This commit is the first step of the conversion: the hash maps for
      pinned paths for programs, maps, and links are converted to libbpf's
      hashmap.{c,h}. The other hash maps, used for the PIDs of processes
      holding references to BPF objects, are left unchanged for now. On the
      build side, this requires adding a dependency on a second libbpf-internal
      header, and making it a dependency of the bootstrap bpftool version as
      well. The rest of the changes are a rather straightforward conversion.
      
      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20211023205154.6710-4-quentin@isovalent.com
    • bpftool: Do not expose and init hash maps for pinned path in main.c · 46241271
      Quentin Monnet authored
      
      
      BPF programs, maps, and links can all be listed with their pinned paths
      by bpftool when the "-f" option is provided. To do so, bpftool builds
      hash maps containing all pinned paths for each kind of object.
      
      These three hash maps are always initialised in main.c, and exposed
      through main.h. There appears to be no particular reason to do so: we
      can just as well make them static to the files that need them (prog.c,
      map.c, and link.c, respectively), and initialise them only when we want
      to show objects and the "-f" switch is provided.
      
      This may prevent unnecessary memory allocations if the implementation of
      the hash maps were to change in the future.
      
      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20211023205154.6710-3-quentin@isovalent.com
    • bpftool: Remove Makefile dep. on $(LIBBPF) for $(LIBBPF_INTERNAL_HDRS) · 8b6c4624
      Quentin Monnet authored
      
      
      The dependency is only useful to make sure that the $(LIBBPF_HDRS_DIR)
      directory is created before we try to install the required libbpf
      internal header locally. Let's create this directory properly instead.
      
      This is in preparation for making $(LIBBPF_INTERNAL_HDRS) a dependency
      of the bootstrap bpftool version, in which case we want no dependency on
      $(LIBBPF).
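
A rough sketch of the Makefile shape being described; the header-install recipe and $(LIBBPF_SRC_DIR) are illustrative, and only the order-only prerequisite (after the `|`) is the point:

```make
# Create the directory explicitly, instead of relying on the $(LIBBPF)
# build to have created it as a side effect.
$(LIBBPF_HDRS_DIR):
	mkdir -p $@

# Order-only prerequisite: the directory must exist before the header is
# installed, but its timestamp never forces a re-install, and there is no
# dependency on $(LIBBPF) itself.
$(LIBBPF_HDRS_DIR)/%.h: $(LIBBPF_SRC_DIR)/%.h | $(LIBBPF_HDRS_DIR)
	cp $< $@
```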
      
      Signed-off-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Link: https://lore.kernel.org/bpf/20211023205154.6710-2-quentin@isovalent.com
    • Merge branch 'Parallelize verif_scale selftests' · 57c8d362
      Alexei Starovoitov authored
      
      
      Andrii Nakryiko says:
      
      ====================
      
      Reduce the amount of waiting when running test_progs in parallel mode
      (-j) by splitting the bpf_verif_scale selftests into multiple tests.
      Previously they were structured as a single test with multiple subtests,
      but subtests are not easily parallelizable with test_progs'
      infrastructure. In practice, each scale subtest is really an independent
      test with nothing shared across the subtests.
      
      This patch set changes how test_progs test discovery works. Now it is possible
      to define multiple tests within a single source code file. One of the patches
      also marks tc_redirect selftests as serial, because it's extremely harmful to
      the test system when run in parallel mode.
      ====================
      
      Acked-by: Yucong Sun <sunyucong@gmail.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Split out bpf_verif_scale selftests into multiple tests · 3762a39c
      Andrii Nakryiko authored
      
      
      Instead of using subtests in the bpf_verif_scale selftest, turn each
      scale subtest into its own test. Each subtest is completely independent
      and just reuses a bit of common test-running logic, so the conversion is
      trivial. For convenience, keep all BPF verifier scale tests in one
      file.
      
      This conversion shaves off a significant amount of time when running
      test_progs in parallel mode. E.g., just running scale tests (-t verif_scale):
      
      BEFORE
      ======
      Summary: 24/0 PASSED, 0 SKIPPED, 0 FAILED
      
      real    0m22.894s
      user    0m0.012s
      sys     0m22.797s
      
      AFTER
      =====
      Summary: 24/0 PASSED, 0 SKIPPED, 0 FAILED
      
      real    0m12.044s
      user    0m0.024s
      sys     0m27.869s
      
      A ten-second saving right there. test_progs -j is unfortunately not yet
      ready to be turned on by default, and some tests fail almost every
      time, but this is a good improvement nevertheless. Ignoring a few
      failures, here are sequential vs. parallel run times when running all
      tests now:
      
      SEQUENTIAL
      ==========
      Summary: 206/953 PASSED, 4 SKIPPED, 0 FAILED
      
      real    1m5.625s
      user    0m4.211s
      sys     0m31.650s
      
      PARALLEL
      ========
      Summary: 204/952 PASSED, 4 SKIPPED, 2 FAILED
      
      real    0m35.550s
      user    0m4.998s
      sys     0m39.890s
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211022223228.99920-5-andrii@kernel.org
    • selftests/bpf: Mark tc_redirect selftest as serial · 2c0f51ac
      Andrii Nakryiko authored
      
      
      The tc_redirect selftest seems to cause a lot of harm to the
      kprobe/tracepoint selftests. Yucong mentioned before that it does
      manipulate sysfs, which might be the reason. So let's mark it as serial,
      though ideally it would be less intrusive on the system under test.
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211022223228.99920-4-andrii@kernel.org
    • selftests/bpf: Support multiple tests per file · 8ea688e7
      Andrii Nakryiko authored
      
      
      Revamp how test discovery works for test_progs and allow multiple test
      entries per file. Any global void function with no arguments and a
      serial_test_ or test_ prefix is considered a test.
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211022223228.99920-3-andrii@kernel.org
    • selftests/bpf: Normalize selftest entry points · 6972dc3b
      Andrii Nakryiko authored
      
      
      Ensure that all test entry points are global void functions with no
      input arguments. Mark a few subtest entry points as static.
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20211022223228.99920-2-andrii@kernel.org
  2. Oct 23, 2021
  3. Oct 22, 2021
    • Merge branch 'libbpf: support custom .rodata.*/.data.* sections' · 29da17c4
      Alexei Starovoitov authored
      
      
      Andrii Nakryiko says:
      
      ====================
      
      This patch set refactors internals of libbpf to enable support for
      multiple custom .rodata.* and .data.* sections. Each such section is
      backed by its own BPF_MAP_TYPE_ARRAY, memory-mappable just like
      .rodata/.data. This is not extended to .bss because '.bss' is not a
      great name: it is generated by the compiler with a name that reflects
      completely irrelevant historical implementation details. Given that
      users have to annotate their variables with SEC(".data.my_sec")
      explicitly, standardizing on the .rodata.* and .data.* prefixes makes
      more sense and keeps things simpler.
      
      Additionally, this patch set makes it simpler to work with those special
      internal maps by allowing them to be looked up by their full ELF section
      name.
      
      Patch #1 is a preparatory patch that deprecates one libbpf API and moves
      custom logic into libbpf.c, where it's used. This code is later refactored
      with the rest of libbpf.c logic to support multiple data section maps.
      
      See individual patches for all the details.
      
      For new custom "dot maps", their full ELF section names are used as the
      names that are sent into the kernel. The object name isn't prepended as
      it is for .data/.rodata/.bss. The reason is that with longer custom
      names, there isn't much space left for the object name anyway. Also, if
      BTF is supported, btf_value_type_id points to the DATASEC BTF type,
      which contains the full original ELF name of the section, so tools like
      bpftool could use that to recover the full name. This patch set doesn't
      add that logic yet; it is left for follow-up patches.
      
      One interesting possibility opened up by these changes is that it's now
      possible to do:
      
          bpf_trace_printk("My fmt %s", sizeof("My fmt %s"), "blah");
      
      and it will work as expected. I haven't updated the libbpf-provided
      helpers in bpf_helpers.h for snprintf, seq_printf, and printk, because
      the `static const char ___fmt[] = fmt;` trick is still efficient and
      doesn't fill out the buffer at runtime (no copying). But we might
      consider updating them in the future, especially with the array check
      that Kumar proposed (see [0]).
      
        [0] https://lore.kernel.org/bpf/20211012041524.udytbr2xs5wid6x2@apollo.localdomain/
      
      v1->v2:
        - don't prepend object name for new dot maps;
        - add __read_mostly example in selftests (Daniel).
      ====================
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Switch to ".bss"/".rodata"/".data" lookups for internal maps · 4f2511e1
      Andrii Nakryiko authored
      
      
      Utilize libbpf's new ability to look up internal maps by their ELF
      section names. There is no need to guess or calculate the exact
      truncated prefix taken from the object name.
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-11-andrii@kernel.org
    • libbpf: Simplify look up by name of internal maps · 26071635
      Andrii Nakryiko authored
      
      
      The name assigned to internal maps (.rodata, .data, .bss, etc.) consists
      of a small prefix of the bpf_object's name and the ELF section name as
      a suffix. This makes it hard for users to "guess" the name to use when
      looking a map up by name with the bpf_object__find_map_by_name() API.
      
      One proposal was to drop the object name prefix from the map name and
      just use ".rodata", ".data", etc. as names. One downside called out was
      that when multiple BPF applications are active on the host, it would be
      hard to distinguish between multiple instances of .rodata and know which
      BPF object (app) they belong to. Having the first few characters, while
      quite limiting, still gives a bit of a clue, in general.
      
      Note, though, that btf_value_type_id for such global data maps (ARRAY)
      points to the DATASEC type, which encodes the full ELF name, so tools
      like bpftool can take advantage of this fact to "recover" the full
      original name of the map. This is also the reason why, for custom
      .data.* and .rodata.* maps, libbpf uses only their ELF names and doesn't
      prepend the object name at all.
      
      Another downside of such an approach is that it is not backwards
      compatible: besides breaking direct uses of the
      bpf_object__find_map_by_name() API, it would break any BPF skeleton
      generated using a bpftool compiled with an older libbpf version.
      
      Instead of causing all this pain, libbpf will still generate the map
      name using a combination of the object name and the ELF section name,
      but it will allow looking such maps up by their natural names, which
      correspond to their respective ELF section names. This means
      non-truncated ELF section names longer than 15 characters are now
      expected and supported.
      
      With such a setup, we get the best of both worlds: we leave a small clue
      about the BPF application that instantiated such maps, while making it
      easy for user apps to look such maps up at runtime. In this sense it
      closes the corresponding libbpf 1.0 issue ([0]).
      
      BPF skeletons will continue using full names for lookups.
      
        [0] Closes: https://github.com/libbpf/libbpf/issues/275
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-10-andrii@kernel.org
    • selftests/bpf: Demonstrate use of custom .rodata/.data sections · 30c5bd96
      Andrii Nakryiko authored
      
      
      Enhance existing selftests to demonstrate the use of custom
      .data/.rodata sections.
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-9-andrii@kernel.org
    • libbpf: Support multiple .rodata.* and .data.* BPF maps · aed65917
      Andrii Nakryiko authored
      
      
      Add support for having multiple .rodata and .data data sections ([0]).
      .rodata/.data are supported as usual, but now .rodata.<whatever> and
      .data.<whatever> are supported as well. Each such section gets its own
      backing BPF_MAP_TYPE_ARRAY, just like .rodata and .data.
      
      Multiple .bss maps are not supported: the whole '.bss' name is confusing
      and might be deprecated soon, and the user would need to specify a
      custom ELF section with the SEC() attribute anyway, so we might as well
      stick to just the .data.* and .rodata.* convention.
      
      The user-visible name of such a new map is going to be just its ELF
      section name.
      
        [0] https://github.com/libbpf/libbpf/issues/274
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-8-andrii@kernel.org
    • bpftool: Improve skeleton generation for data maps without DATASEC type · ef9356d3
      Andrii Nakryiko authored
      
      
      It can happen that some data sections (e.g., .rodata.cst16, containing
      compiler-populated string constants) won't have a corresponding BTF
      DATASEC type. Now that libbpf supports .rodata.* and .data.* sections,
      a situation like that would cause an invalid BPF skeleton to be
      generated that won't compile successfully, as some parts of the skeleton
      would assume memory-mapped struct definitions for each special data
      section.
      
      Fix this by generating empty struct definitions for such data sections.
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-7-andrii@kernel.org
    • bpftool: Support multiple .rodata/.data internal maps in skeleton · 8654b4d3
      Andrii Nakryiko authored
      
      
      Remove the assumption that there is only a single instance of each of
      the .rodata and .data internal maps. Nothing changes for the '.rodata'
      and '.data' maps, but a new '.rodata.something' map will get a
      'rodata_something' section in the BPF skeleton (as well as a struct
      bpf_map * field with the same name in the maps section).
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-6-andrii@kernel.org
    • libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps · 25bbbd7a
      Andrii Nakryiko authored
      
      
      Remove the internal libbpf assumption that there can be only one
      .rodata, .data, and .bss map per BPF object. To achieve that, extend
      and generalize the scheme that was used for keeping track of relocation
      ELF sections: now each ELF section has a temporary extra index that
      records the logical type of the ELF section (relocations, data,
      read-only data, BSS). Switch relocation handling to this scheme, as
      well as .rodata/.data/.bss handling.
      
      We don't yet allow multiple .rodata, .data, and .bss sections, but no
      libbpf internal code assumes any longer that there can be only one of
      each, referenced by a single index. The next patches will actually
      allow multiple .rodata and .data sections.
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-5-andrii@kernel.org
    • libbpf: Use Elf64-specific types explicitly for dealing with ELF · ad23b723
      Andrii Nakryiko authored
      
      
      Minimize the use of the class-agnostic gelf_xxx() APIs from libelf.
      These APIs require copying ELF data structures into local GElf_xxx
      structs and are more cumbersome. A BPF ELF file is defined to always be
      a 64-bit ELF object, even when intended to run on 32-bit host
      architectures, so there is no need to do class-agnostic conversions
      everywhere. The BPF static linker implementation within libbpf has been
      using Elf64-specific types since its initial implementation.
      
      Add two simple helpers, elf_sym_by_idx() and elf_rel_by_idx(), for more
      succinct direct access to ELF symbol and relocation records within the
      ELF data itself, and switch all GElf_xxx usage to the Elf64_xxx
      equivalents. The only remaining place within libbpf.c that still uses
      the gelf API is gelf_getclass(), as there doesn't seem to be a direct
      way to get the underlying ELF bitness.
      
      No functional changes intended.
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-4-andrii@kernel.org
    • libbpf: Extract ELF processing state into separate struct · 29a30ff5
      Andrii Nakryiko authored
      
      
      Name the currently anonymous internal struct that keeps ELF-related
      state for a bpf_object. Just a bit of clean-up, no functional changes.
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-3-andrii@kernel.org
    • libbpf: Deprecate btf__finalize_data() and move it into libbpf.c · b96c07f3
      Andrii Nakryiko authored
      
      
      There isn't a good use case where anyone but libbpf itself needs to call
      btf__finalize_data(). It was implemented for internal use, and it's not
      clear why it was made into a public API in the first place. To function,
      it requires live ELF data, which is stored inside the bpf_object for the
      duration of the opening phase only. But the only BTF that needs the
      bpf_object's ELF is that bpf_object's own BTF, which libbpf fixes up
      automatically during the bpf_object__open() operation anyway. There is
      no need for any additional fix-up, and no reasonable scenario in which
      it's useful and appropriate.
      
      Thus, btf__finalize_data() is just an API atavism and is better removed.
      So this patch marks it as deprecated immediately (v0.6+) and moves the
      code from btf.c into libbpf.c, where it's used in the context of the
      bpf_object opening phase. This code co-location makes the code
      structure more straightforward and allows removing the
      bpf_object__section_size() and bpf_object__variable_offset() internal
      helpers from libbpf_internal.h, making them static. Their naming is
      also adjusted so as not to create the false impression that they are
      some sort of method of bpf_object. They are internal helpers and are
      now named appropriately.
      
      This is part of libbpf 1.0 effort ([0]).
      
        [0] Closes: https://github.com/libbpf/libbpf/issues/276
      
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Song Liu <songliubraving@fb.com>
      Link: https://lore.kernel.org/bpf/20211021014404.2635234-2-andrii@kernel.org