Skip to content
  1. Nov 18, 2022
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Add 'release on unlock' logic for bpf_list_push_{front,back} · 534e86bc
      Kumar Kartikeya Dwivedi authored
      
      
      This commit implements the delayed release logic for bpf_list_push_front
      and bpf_list_push_back.
      
      Once a node has been added to the list, it's pointer changes to
      PTR_UNTRUSTED. However, it is only released once the lock protecting the
      list is unlocked. For such PTR_TO_BTF_ID | MEM_ALLOC with PTR_UNTRUSTED
      set but an active ref_obj_id, it is still permitted to read them as long
      as the lock is held. Writing to them is not allowed.
      
      This allows having read access to push items we no longer own until we
      release the lock guarding the list, allowing a little more flexibility
      when working with these APIs.
      
      Note that enabling write support has fairly tricky interactions with
      what happens inside the critical section. Just as an example, currently,
      bpf_obj_drop is not permitted, but if it were, being able to write to
      the PTR_UNTRUSTED pointer while the object gets released back to the
      memory allocator would violate safety properties we wish to guarantee
      (i.e. not crashing the kernel). The memory could be reused for a
      different type in the BPF program or even in the kernel as it gets
      eventually kfree'd.
      
      Not enabling bpf_obj_drop inside the critical section would appear to
      prevent all of the above, but that is more of an artifical limitation
      right now. Since the write support is tangled with how we handle
      potential aliasing of nodes inside the critical section that may or may
      not be part of the list anymore, it has been deferred to a future patch.
      
      Acked-by: default avatarDave Marchevsky <davemarchevsky@fb.com>
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-18-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      534e86bc
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Introduce single ownership BPF linked list API · 8cab76ec
      Kumar Kartikeya Dwivedi authored
      
      
      Add a linked list API for use in BPF programs, where it expects
      protection from the bpf_spin_lock in the same allocation as the
      bpf_list_head. For now, only one bpf_spin_lock can be present hence that
      is assumed to be the one protecting the bpf_list_head.
      
      The following functions are added to kick things off:
      
      // Add node to beginning of list
      void bpf_list_push_front(struct bpf_list_head *head, struct bpf_list_node *node);
      
      // Add node to end of list
      void bpf_list_push_back(struct bpf_list_head *head, struct bpf_list_node *node);
      
      // Remove node at beginning of list and return it
      struct bpf_list_node *bpf_list_pop_front(struct bpf_list_head *head);
      
      // Remove node at end of list and return it
      struct bpf_list_node *bpf_list_pop_back(struct bpf_list_head *head);
      
      The lock protecting the bpf_list_head needs to be taken for all
      operations. The verifier ensures that the lock that needs to be taken is
      always held, and only the correct lock is taken for these operations.
      These checks are made statically by relying on the reg->id preserved for
      registers pointing into regions having both bpf_spin_lock and the
      objects protected by it. The comment over check_reg_allocation_locked in
      this change describes the logic in detail.
      
      Note that bpf_list_push_front and bpf_list_push_back are meant to
      consume the object containing the node in the 1st argument, however that
      specific mechanism is intended to not release the ref_obj_id directly
      until the bpf_spin_unlock is called. In this commit, nothing is done,
      but the next commit will be introducing logic to handle this case, so it
      has been left as is for now.
      
      bpf_list_pop_front and bpf_list_pop_back delete the first or last item
      of the list respectively, and return pointer to the element at the
      list_node offset. The user can then use container_of style macro to get
      the actual entry type. The verifier however statically knows the actual
      type, so the safety properties are still preserved.
      
      With these additions, programs can now manage their own linked lists and
      store their objects in them.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-17-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      8cab76ec
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Permit NULL checking pointer with non-zero fixed offset · df57f38a
      Kumar Kartikeya Dwivedi authored
      
      
      Pointer increment on seeing PTR_MAYBE_NULL is already protected against,
      hence make an exception for PTR_TO_BTF_ID | MEM_ALLOC while still
      keeping the warning for other unintended cases that might creep in.
      
      bpf_list_pop_{front,_back} helpers planned to be introduced in next
      commit will return a MEM_ALLOC register with incremented offset pointing
      to bpf_list_node field. The user is supposed to then obtain the pointer
      to the entry using container_of after NULL checking it. The current
      restrictions trigger a warning when doing the NULL checking. Revisiting
      the reason, it is meant as an assertion which seems to actually work and
      catch the bad case.
      
      Hence, under no other circumstances can reg->off be non-zero for a
      register that has the PTR_MAYBE_NULL type flag set.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-16-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      df57f38a
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Introduce bpf_obj_drop · ac9f0605
      Kumar Kartikeya Dwivedi authored
      
      
      Introduce bpf_obj_drop, which is the kfunc used to free allocated
      objects (allocated using bpf_obj_new). Pairing with bpf_obj_new, it
      implicitly destructs the fields part of object automatically without
      user intervention.
      
      Just like the previous patch, btf_struct_meta that is needed to free up
      the special fields is passed as a hidden argument to the kfunc.
      
      For the user, a convenience macro hides over the kernel side kfunc which
      is named bpf_obj_drop_impl.
      
      Continuing the previous example:
      
      void prog(void) {
      	struct foo *f;
      
      	f = bpf_obj_new(typeof(*f));
      	if (!f)
      		return;
      	bpf_obj_drop(f);
      }
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-15-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      ac9f0605
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Introduce bpf_obj_new · 958cf2e2
      Kumar Kartikeya Dwivedi authored
      
      
      Introduce type safe memory allocator bpf_obj_new for BPF programs. The
      kernel side kfunc is named bpf_obj_new_impl, as passing hidden arguments
      to kfuncs still requires having them in prototype, unlike BPF helpers
      which always take 5 arguments and have them checked using bpf_func_proto
      in verifier, ignoring unset argument types.
      
      Introduce __ign suffix to ignore a specific kfunc argument during type
      checks, then use this to introduce support for passing type metadata to
      the bpf_obj_new_impl kfunc.
      
      The user passes BTF ID of the type it wants to allocates in program BTF,
      the verifier then rewrites the first argument as the size of this type,
      after performing some sanity checks (to ensure it exists and it is a
      struct type).
      
      The second argument is also fixed up and passed by the verifier. This is
      the btf_struct_meta for the type being allocated. It would be needed
      mostly for the offset array which is required for zero initializing
      special fields while leaving the rest of storage in unitialized state.
      
      It would also be needed in the next patch to perform proper destruction
      of the object's special fields.
      
      Under the hood, bpf_obj_new will call bpf_mem_alloc and bpf_mem_free,
      using the any context BPF memory allocator introduced recently. To this
      end, a global instance of the BPF memory allocator is initialized on
      boot to be used for this purpose. This 'bpf_global_ma' serves all
      allocations for bpf_obj_new. In the future, bpf_obj_new variants will
      allow specifying a custom allocator.
      
      Note that now that bpf_obj_new can be used to allocate objects that can
      be linked to BPF linked list (when future linked list helpers are
      available), we need to also free the elements using bpf_mem_free.
      However, since the draining of elements is done outside the
      bpf_spin_lock, we need to do migrate_disable around the call since
      bpf_list_head_free can be called from map free path where migration is
      enabled. Otherwise, when called from BPF programs migration is already
      disabled.
      
      A convenience macro is included in the bpf_experimental.h header to hide
      over the ugly details of the implementation, leading to user code
      looking similar to a language level extension which allocates and
      constructs fields of a user type.
      
      struct bar {
      	struct bpf_list_node node;
      };
      
      struct foo {
      	struct bpf_spin_lock lock;
      	struct bpf_list_head head __contains(bar, node);
      };
      
      void prog(void) {
      	struct foo *f;
      
      	f = bpf_obj_new(typeof(*f));
      	if (!f)
      		return;
      	...
      }
      
      A key piece of this story is still missing, i.e. the free function,
      which will come in the next patch.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-14-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      958cf2e2
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Support constant scalar arguments for kfuncs · a50388db
      Kumar Kartikeya Dwivedi authored
      
      
      Allow passing known constant scalars as arguments to kfuncs that do not
      represent a size parameter. We use mark_chain_precision for the constant
      scalar argument to mark it precise. This makes the search pruning
      optimization of verifier more conservative for such kfunc calls, and
      each non-distinct argument is considered unequivalent.
      
      We will use this support to then expose a bpf_obj_new function where it
      takes the local type ID of a type in program BTF, and returns a
      PTR_TO_BTF_ID | MEM_ALLOC to the local type, and allows programs to
      allocate their own objects.
      
      Each type ID resolves to a distinct type with a possibly distinct size,
      hence the type ID constant matters in terms of program safety and its
      precision needs to be checked between old and cur states inside regsafe.
      The use of mark_chain_precision enables this.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-13-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      a50388db
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Rewrite kfunc argument handling · 00b85860
      Kumar Kartikeya Dwivedi authored
      
      
      As we continue to add more features, argument types, kfunc flags, and
      different extensions to kfuncs, the code to verify the correctness of
      the kfunc prototype wrt the passed in registers has become ad-hoc and
      ugly to read. To make life easier, and make a very clear split between
      different stages of argument processing, move all the code into
      verifier.c and refactor into easier to read helpers and functions.
      
      This also makes sharing code within the verifier easier with kfunc
      argument processing. This will be more and more useful in later patches
      as we are now moving to implement very core BPF helpers as kfuncs, to
      keep them experimental before baking into UAPI.
      
      Remove all kfunc related bits now from btf_check_func_arg_match, as
      users have been converted away to refactored kfunc argument handling.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-12-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      00b85860
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Allow locking bpf_spin_lock in inner map values · b7ff9792
      Kumar Kartikeya Dwivedi authored
      
      
      There is no need to restrict users from locking bpf_spin_lock in map
      values of inner maps. Each inner map lookup gets a unique reg->id
      assigned to the returned PTR_TO_MAP_VALUE which will be preserved after
      the NULL check. Distinct lookups into different inner map get unique
      IDs, and distinct lookups into same inner map also get unique IDs.
      
      Hence, lift the restriction by removing the check return -ENOTSUPP in
      map_in_map.c. Later commits will add comprehensive test cases to ensure
      that invalid cases are rejected.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-11-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      b7ff9792
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Allow locking bpf_spin_lock global variables · d0d78c1d
      Kumar Kartikeya Dwivedi authored
      
      
      Global variables reside in maps accessible using direct_value_addr
      callbacks, so giving each load instruction's rewrite a unique reg->id
      disallows us from holding locks which are global.
      
      The reason for preserving reg->id as a unique value for registers that
      may point to spin lock is that two separate lookups are treated as two
      separate memory regions, and any possible aliasing is ignored for the
      purposes of spin lock correctness.
      
      This is not great especially for the global variable case, which are
      served from maps that have max_entries == 1, i.e. they always lead to
      map values pointing into the same map value.
      
      So refactor the active_spin_lock into a 'active_lock' structure which
      represents the lock identity, and instead of the reg->id, remember two
      fields, a pointer and the reg->id. The pointer will store reg->map_ptr
      or reg->btf. It's only necessary to distinguish for the id == 0 case of
      global variables, but always setting the pointer to a non-NULL value and
      using the pointer to check whether the lock is held simplifies code in
      the verifier.
      
      This is generic enough to allow it for global variables, map lookups,
      and allocated objects at the same time.
      
      Note that while whether a lock is held can be answered by just comparing
      active_lock.ptr to NULL, to determine whether the register is pointing
      to the same held lock requires comparing _both_ ptr and id.
      
      Finally, as a result of this refactoring, pseudo load instructions are
      not given a unique reg->id, as they are doing lookup for the same map
      value (max_entries is never greater than 1).
      
      Essentially, we consider that the tuple of (ptr, id) will always be
      unique for any kind of argument to bpf_spin_{lock,unlock}.
      
      Note that this can be extended in the future to also remember offset
      used for locking, so that we can introduce multiple bpf_spin_lock fields
      in the same allocation.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-10-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      d0d78c1d
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Allow locking bpf_spin_lock in allocated objects · 4e814da0
      Kumar Kartikeya Dwivedi authored
      
      
      Allow locking a bpf_spin_lock in an allocated object, in addition to
      already supported map value pointers. The handling is similar to that of
      map values, by just preserving the reg->id of PTR_TO_BTF_ID | MEM_ALLOC
      as well, and adjusting process_spin_lock to work with them and remember
      the id in verifier state.
      
      Refactor the existing process_spin_lock to work with PTR_TO_BTF_ID |
      MEM_ALLOC in addition to PTR_TO_MAP_VALUE. We need to update the
      reg_may_point_to_spin_lock which is used in mark_ptr_or_null_reg to
      preserve reg->id, that will be used in env->cur_state->active_spin_lock
      to remember the currently held spin lock.
      
      Also update the comment describing bpf_spin_lock implementation details
      to also talk about PTR_TO_BTF_ID | MEM_ALLOC type.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-9-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      4e814da0
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Verify ownership relationships for user BTF types · 865ce09a
      Kumar Kartikeya Dwivedi authored
      
      
      Ensure that there can be no ownership cycles among different types by
      way of having owning objects that can hold some other type as their
      element. For instance, a map value can only hold allocated objects, but
      these are allowed to have another bpf_list_head. To prevent unbounded
      recursion while freeing resources, elements of bpf_list_head in local
      kptrs can never have a bpf_list_head which are part of list in a map
      value. Later patches will verify this by having dedicated BTF selftests.
      
      Also, to make runtime destruction easier, once btf_struct_metas is fully
      populated, we can stash the metadata of the value type directly in the
      metadata of the list_head fields, as that allows easier access to the
      value type's layout to destruct it at runtime from the btf_field entry
      of the list head itself.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-8-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      865ce09a
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Recognize lock and list fields in allocated objects · 8ffa5cc1
      Kumar Kartikeya Dwivedi authored
      
      
      Allow specifying bpf_spin_lock, bpf_list_head, bpf_list_node fields in a
      allocated object.
      
      Also update btf_struct_access to reject direct access to these special
      fields.
      
      A bpf_list_head allows implementing map-in-map style use cases, where an
      allocated object with bpf_list_head is linked into a list in a map
      value. This would require embedding a bpf_list_node, support for which
      is also included. The bpf_spin_lock is used to protect the bpf_list_head
      and other data.
      
      While we strictly don't require to hold a bpf_spin_lock while touching
      the bpf_list_head in such objects, as when have access to it, we have
      complete ownership of the object, the locking constraint is still kept
      and may be conditionally lifted in the future.
      
      Note that the specification of such types can be done just like map
      values, e.g.:
      
      struct bar {
      	struct bpf_list_node node;
      };
      
      struct foo {
      	struct bpf_spin_lock lock;
      	struct bpf_list_head head __contains(bar, node);
      	struct bpf_list_node node;
      };
      
      struct map_value {
      	struct bpf_spin_lock lock;
      	struct bpf_list_head head __contains(foo, node);
      };
      
      To recognize such types in user BTF, we build a btf_struct_metas array
      of metadata items corresponding to each BTF ID. This is done once during
      the btf_parse stage to avoid having to do it each time during the
      verification process's requirement to inspect the metadata.
      
      Moreover, the computed metadata needs to be passed to some helpers in
      future patches which requires allocating them and storing them in the
      BTF that is pinned by the program itself, so that valid access can be
      assumed to such data during program runtime.
      
      A key thing to note is that once a btf_struct_meta is available for a
      type, both the btf_record and btf_field_offs should be available. It is
      critical that btf_field_offs is available in case special fields are
      present, as we extensively rely on special fields being zeroed out in
      map values and allocated objects in later patches. The code ensures that
      by bailing out in case of errors and ensuring both are available
      together. If the record is not available, the special fields won't be
      recognized, so not having both is also fine (in terms of being a
      verification error and not a runtime bug).
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-7-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      8ffa5cc1
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Introduce allocated objects support · 282de143
      Kumar Kartikeya Dwivedi authored
      
      
      Introduce support for representing pointers to objects allocated by the
      BPF program, i.e. PTR_TO_BTF_ID that point to a type in program BTF.
      This is indicated by the presence of MEM_ALLOC type flag in reg->type to
      avoid having to check btf_is_kernel when trying to match argument types
      in helpers.
      
      Whenever walking such types, any pointers being walked will always yield
      a SCALAR instead of pointer. In the future we might permit kptr inside
      such allocated objects (either kernel or program allocated), and it will
      then form a PTR_TO_BTF_ID of the respective type.
      
      For now, such allocated objects will always be referenced in verifier
      context, hence ref_obj_id == 0 for them is a bug. It is allowed to write
      to such objects, as long fields that are special are not touched
      (support for which will be added in subsequent patches). Note that once
      such a pointer is marked PTR_UNTRUSTED, it is no longer allowed to write
      to it.
      
      No PROBE_MEM handling is therefore done for loads into this type unless
      PTR_UNTRUSTED is part of the register type, since they can never be in
      an undefined state, and their lifetime will always be valid.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-6-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      282de143
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Populate field_offs for inner_map_meta · f73e601a
      Kumar Kartikeya Dwivedi authored
      
      
      Far too much code simply assumes that both btf_record and btf_field_offs
      are set to valid pointers together, or both are unset. They go together
      hand in hand as btf_record describes the special fields and
      btf_field_offs is compact representation for runtime copying/zeroing.
      
      It is very difficult to make this clear in the code when the only
      exception to this universal invariant is inner_map_meta which is used
      as reg->map_ptr in the verifier. This is simply a bug waiting to happen,
      as in verifier context we cannot easily distinguish if PTR_TO_MAP_VALUE
      is coming from an inner map, and if we ever end up using field_offs for
      any reason in the future, we will silently ignore the special fields for
      inner map case (as NULL is not an error but unset field_offs).
      
      Hence, simply copy field_offs from inner map together with btf_record.
      
      While at it, refactor code to unwind properly on errors with gotos.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-5-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      f73e601a
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Free inner_map_meta when btf_record_dup fails · d4899572
      Kumar Kartikeya Dwivedi authored
      Whenever btf_record_dup fails, we must free inner_map_meta that was
      allocated before.
      
      This fixes a memory leak (in case of errors) during inner map creation.
      
      Fixes: aa3496ac
      
       ("bpf: Refactor kptr_off_tab into btf_record")
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-4-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      d4899572
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Do btf_record_free outside map_free callback · d7f5ef65
      Kumar Kartikeya Dwivedi authored
      Since the commit being fixed, we now miss freeing btf_record for local
      storage maps which will have a btf_record populated in case they have
      bpf_spin_lock element.
      
      This was missed because I made the choice of offloading the job to free
      kptr_off_tab (now btf_record) to the map_free callback when adding
      support for kptrs.
      
      Revisiting the reason for this decision, there is the possibility that
      the btf_record gets used inside map_free callback (e.g. in case of maps
      embedding kptrs) to iterate over them and free them, hence doing it
      before the map_free callback would be leaking special field memory, and
      do invalid memory access. The btf_record keeps module references which
      is critical to ensure the dtor call made for referenced kptr is safe to
      do.
      
      If doing it after map_free callback, the map area is already freed, so
      we cannot access bpf_map structure anymore.
      
      To fix this and prevent such lapses in future, move bpf_map_free_record
      out of the map_free callback, and do it after map_free by remembering
      the btf_record pointer. There is no need to access bpf_map structure in
      that case, and we can avoid missing this case when support for new map
      types is added for other special fields.
      
      Since a btf_record and its btf_field_offs are used together, for
      consistency delay freeing of field_offs as well. While not a problem
      right now, a lot of code assumes that either both record and field_offs
      are set or none at once.
      
      Note that in case of map of maps (outer maps), inner_map_meta->record is
      only used during verification, not to free fields in map value, hence we
      simply keep the bpf_map_free_record call as is in bpf_map_meta_free and
      never touch map->inner_map_meta in bpf_map_free_deferred.
      
      Add a comment making note of these details.
      
      Fixes: db559117
      
       ("bpf: Consolidate spin_lock, timer management into btf_record")
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-3-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      d7f5ef65
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Fix early return in map_check_btf · c237bfa5
      Kumar Kartikeya Dwivedi authored
      Instead of returning directly with -EOPNOTSUPP for the timer case, we
      need to free the btf_record before returning to userspace.
      
      Fixes: db559117
      
       ("bpf: Consolidate spin_lock, timer management into btf_record")
      Reported-by: default avatarDan Carpenter <error27@gmail.com>
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221118015614.2013203-2-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      c237bfa5
    • Björn Töpel's avatar
      selftests/bpf: Pass target triple to get_sys_includes macro · 98b2afc8
      Björn Töpel authored
      
      
      When cross-compiling [1], the get_sys_includes make macro should use
      the target system include path, and not the build hosts system include
      path.
      
      Make clang honor the CROSS_COMPILE triple.
      
      [1] e.g. "ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- make"
      
      Signed-off-by: default avatarBjörn Töpel <bjorn@rivosinc.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Tested-by: default avatarAnders Roxell <anders.roxell@linaro.org>
      Link: https://lore.kernel.org/bpf/20221115182051.582962-2-bjorn@kernel.org
      98b2afc8
    • Björn Töpel's avatar
      selftests/bpf: Explicitly pass RESOLVE_BTFIDS to sub-make · c4525f05
      Björn Töpel authored
      
      
      When cross-compiling selftests/bpf, the resolve_btfids binary end up
      in a different directory, than the regular resolve_btfids
      builds. Populate RESOLVE_BTFIDS for sub-make, so it can find the
      binary.
      
      Signed-off-by: default avatarBjörn Töpel <bjorn@rivosinc.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Acked-by: default avatarJiri Olsa <jolsa@kernel.org>
      Link: https://lore.kernel.org/bpf/20221115182051.582962-1-bjorn@kernel.org
      c4525f05
    • Hou Tao's avatar
      bpf: Pass map file to .map_update_batch directly · 3af43ba4
      Hou Tao authored
      
      
      Currently bpf_map_do_batch() first invokes fdget(batch.map_fd) to get
      the target map file, then it invokes generic_map_update_batch() to do
      batch update. generic_map_update_batch() will get the target map file
      by using fdget(batch.map_fd) again and pass it to bpf_map_update_value().
      
      The problem is map file returned by the second fdget() may be NULL or a
      totally different file compared by map file in bpf_map_do_batch(). The
      reason is that the first fdget() only guarantees the liveness of struct
      file instead of file descriptor and the file description may be released
      by concurrent close() through pick_file().
      
      It doesn't incur any problem as for now, because maps with batch update
      support don't use map file in .map_fd_get_ptr() ops. But it is better to
      fix the potential access of an invalid map file.
      
      Using __bpf_map_get() again in generic_map_update_batch() can not fix
      the problem, because batch.map_fd may be closed and reopened, and the
      returned map file may be different with map file got in
      bpf_map_do_batch(), so just passing the map file directly to
      .map_update_batch() in bpf_map_do_batch().
      
      Signed-off-by: default avatarHou Tao <houtao1@huawei.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20221116075059.1551277-1-houtao@huaweicloud.com
      3af43ba4
  2. Nov 17, 2022
    • Daniel Müller's avatar
      bpf/docs: Include blank lines between bullet points in bpf_devel_QA.rst · 383f1a8d
      Daniel Müller authored
      Commit 26a9b433 ("bpf/docs: Document how to run CI without patch
      submission") caused a warning to be generated when compiling the
      documentation:
      
       > bpf_devel_QA.rst:55: WARNING: Unexpected indentation.
       > bpf_devel_QA.rst:56: WARNING: Block quote ends without a blank line
      
      This change fixes the problem by inserting the required blank lines.
      
      Fixes: 26a9b433
      
       ("bpf/docs: Document how to run CI without patch submission")
      Reported-by: default avatarAkira Yokosawa <akiyks@gmail.com>
      Signed-off-by: default avatarDaniel Müller <deso@posteo.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Reviewed-by: default avatarAkira Yokosawa <akiyks@gmail.com>
      Link: https://lore.kernel.org/bpf/20221116174358.2744613-1-deso@posteo.net
      383f1a8d
    • Wang Yufen's avatar
      selftests/bpf: fix memory leak of lsm_cgroup · c453e64c
      Wang Yufen authored
      kmemleak reports this issue:
      
      unreferenced object 0xffff88810b7835c0 (size 32):
        comm "test_progs", pid 270, jiffies 4294969007 (age 1621.315s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          03 00 00 00 03 00 00 00 0f 00 00 00 00 00 00 00  ................
        backtrace:
          [<00000000376cdeab>] kmalloc_trace+0x27/0x110
          [<000000003bcdb3b6>] selinux_sk_alloc_security+0x66/0x110
          [<000000003959008f>] security_sk_alloc+0x47/0x80
          [<00000000e7bc6668>] sk_prot_alloc+0xbd/0x1a0
          [<0000000002d6343a>] sk_alloc+0x3b/0x940
          [<000000009812a46d>] unix_create1+0x8f/0x3d0
          [<000000005ed0976b>] unix_create+0xa1/0x150
          [<0000000086a1d27f>] __sock_create+0x233/0x4a0
          [<00000000cffe3a73>] __sys_socket_create.part.0+0xaa/0x110
          [<0000000007c63f20>] __sys_socket+0x49/0xf0
          [<00000000b08753c8>] __x64_sys_socket+0x42/0x50
          [<00000000b56e26b3>] do_syscall_64+0x3b/0x90
          [<000000009b4871b8>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      The issue occurs in the following scenarios:
      
      unix_create1()
        sk_alloc()
          sk_prot_alloc()
            security_sk_alloc()
              call_int_hook()
                hlist_for_each_entry()
                  entry1->hook.sk_alloc_security
                  <-- selinux_sk_alloc_security() succeeded,
                  <-- sk->security alloced here.
                  entry2->hook.sk_alloc_security
                  <-- bpf_lsm_sk_alloc_security() failed
            goto out_free;
              ...    <-- the sk->security not freed, memleak
      
      The core problem is that the LSM is not yet fully stacked (work is
      actively going on in this space) which means that some LSM hooks do
      not support multiple LSMs at the same time. To fix, skip the
      "EPERM" test when it runs in the environments that already have
      non-bpf lsms installed
      
      Fixes: dca85aac
      
       ("selftests/bpf: lsm_cgroup functional test")
      Signed-off-by: default avatarWang Yufen <wangyufen@huawei.com>
      Cc: Stanislav Fomichev <sdf@google.com>
      Acked-by: default avatarStanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/r/1668482980-16163-1-git-send-email-wangyufen@huawei.com
      Signed-off-by: default avatarMartin KaFai Lau <martin.lau@kernel.org>
      c453e64c
  3. Nov 16, 2022
    • Eduard Zingerman's avatar
      selftests/bpf: allow unpriv bpf for selftests by default · 5b1d6408
      Eduard Zingerman authored
      
      
      Enable unprivileged bpf for selftests kernel by default.
      This forces CI to run test_verifier tests in both privileged
      and unprivileged modes.
      
      The test_verifier.c:do_test uses sysctl kernel.unprivileged_bpf_disabled
      to decide whether to run or to skip test cases in unprivileged mode.
      The CONFIG_BPF_UNPRIV_DEFAULT_OFF controls the default value of the
      kernel.unprivileged_bpf_disabled.
      
      Signed-off-by: default avatarEduard Zingerman <eddyz87@gmail.com>
      Link: https://lore.kernel.org/r/20221116015456.2461135-1-eddyz87@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      5b1d6408
    • Tiezhu Yang's avatar
      bpftool: Check argc first before "file" in do_batch() · df9c41e9
      Tiezhu Yang authored
      
      
      If the parameters for batch are more than 2, check argc first can
      return immediately, no need to use is_prefix() to check "file" with
      a little overhead and then check argc, it is better to check "file"
      only when the parameters for batch are 2.
      
      Signed-off-by: default avatarTiezhu Yang <yangtiezhu@loongson.cn>
      Acked-by: default avatarStanislav Fomichev <sdf@google.com>
      Reviewed-by: default avatarQuentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/r/1668517207-11822-1-git-send-email-yangtiezhu@loongson.cn
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      df9c41e9
    • Donald Hunter's avatar
      docs/bpf: Fix sample code in MAP_TYPE_ARRAY docs · e0eb6082
      Donald Hunter authored
      Remove mistaken & from code example in MAP_TYPE_ARRAY docs
      
      Fixes: 1cfa97b3
      
       ("bpf, docs: Document BPF_MAP_TYPE_ARRAY")
      Signed-off-by: default avatarDonald Hunter <donald.hunter@gmail.com>
      Link: https://lore.kernel.org/r/20221115095910.86407-1-donald.hunter@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      e0eb6082
    • Alexei Starovoitov's avatar
      Merge branch 'propagate nullness information for reg to reg comparisons' · 6373ef1c
      Alexei Starovoitov authored
      
      
      Eduard Zingerman says:
      
      ====================
      
      This patchset adds ability to propagates nullness information for
      branches of register to register equality compare instructions. The
      following rules are used:
       - suppose register A maybe null
       - suppose register B is not null
       - for JNE A, B, ... - A is not null in the false branch
       - for JEQ A, B, ... - A is not null in the true branch
      
      E.g. for program like below:
      
        r6 = skb->sk;
        r7 = sk_fullsock(r6);
        r0 = sk_fullsock(r6);
        if (r0 == 0) return 0;    (a)
        if (r0 != r7) return 0;   (b)
        *r7->type;                (c)
        return 0;
      
      It is safe to dereference r7 at point (c), because of (a) and (b).
      
      The utility of this change came up while working on BPF CLang backend
      issue [1]. Specifically, while debugging issue with selftest
      `test_sk_lookup.c`. This test has the following structure:
      
          int access_ctx_sk(struct bpf_sk_lookup *ctx __CTX__)
          {
              struct bpf_sock *sk1 = NULL, *sk2 = NULL;
              ...
              sk1 = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A);
              if (!sk1)           // (a)
                  goto out;
              ...
              if (ctx->sk != sk1) // (b)
                  goto out;
              ...
              if (ctx->sk->family != AF_INET ||     // (c)
                  ctx->sk->type != SOCK_STREAM ||
                  ctx->sk->state != BPF_TCP_LISTEN)
                  goto out;
                  ...
          }
      
      - at (a) `sk1` is checked to be not null;
      - at (b) `ctx->sk` is verified to be equal to `sk1`;
      - at (c) `ctx->sk` is accessed w/o nullness check.
      
      Currently Global Value Numbering pass considers expressions `sk1` and
      `ctx->sk` to be identical at point (c) and replaces `ctx->sk` with
      `sk1` (not expressions themselves but corresponding SSA values).
      Since `sk1` is known to be not null after (b) verifier allows
      execution of the program.
      
      However, such optimization is not guaranteed to happen. When it does
      not happen verifier reports an error.
      
      Changelog:
      v2 -> v3:
       - verifier tests are updated with correct error message for
         unprivileged mode (pointer comparisons are forbidden in
         unprivileged mode).
      
      v1 -> v2:
       - after investigation described in [2] as suggested by John, Daniel
         and Shung-Hsi, function `type_is_pointer` is removed, calls to this
         function are replaced by `__is_pointer_value(false, src_reg)`.
      
      RFC -> v1:
       - newly added if block in `check_cond_jmp_op` is moved down to keep
         `make_ptr_not_null_reg` actions together;
       - tests rewritten to have a single `r0 = 0; exit;` block.
      
      [1]   https://reviews.llvm.org/D131633#3722231
      [2]   https://lore.kernel.org/bpf/bad8be826d088e0d180232628160bf932006de89.camel@gmail.com/
      [RFC] https://lore.kernel.org/bpf/20220822094312.175448-1-eddyz87@gmail.com/
      [v1]  https://lore.kernel.org/bpf/20220826172915.1536914-1-eddyz87@gmail.com/
      [v2]  https://lore.kernel.org/bpf/20221106214921.117631-1-eddyz87@gmail.com/
      ====================
      
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      6373ef1c
    • Eduard Zingerman's avatar
      selftests/bpf: check nullness propagation for reg to reg comparisons · 4741c371
      Eduard Zingerman authored
      
      
      Verify that nullness information is porpagated in the branches of
      register to register JEQ and JNE operations.
      
      Signed-off-by: default avatarEduard Zingerman <eddyz87@gmail.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20221115224859.2452988-3-eddyz87@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      4741c371
    • Eduard Zingerman's avatar
      bpf: propagate nullness information for reg to reg comparisons · befae758
      Eduard Zingerman authored
      
      
      Propagate nullness information for branches of register to register
      equality compare instructions. The following rules are used:
      - suppose register A maybe null
      - suppose register B is not null
      - for JNE A, B, ... - A is not null in the false branch
      - for JEQ A, B, ... - A is not null in the true branch
      
      E.g. for program like below:
      
        r6 = skb->sk;
        r7 = sk_fullsock(r6);
        r0 = sk_fullsock(r6);
        if (r0 == 0) return 0;    (a)
        if (r0 != r7) return 0;   (b)
        *r7->type;                (c)
        return 0;
      
      It is safe to dereference r7 at point (c), because of (a) and (b).
      
      Signed-off-by: default avatarEduard Zingerman <eddyz87@gmail.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/r/20221115224859.2452988-2-eddyz87@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      befae758
    • Toke Høiland-Jørgensen's avatar
      bpf: Expand map key argument of bpf_redirect_map to u64 · 32637e33
      Toke Høiland-Jørgensen authored
      
      
      For queueing packets in XDP we want to add a new redirect map type with
      support for 64-bit indexes. To prepare fore this, expand the width of the
      'key' argument to the bpf_redirect_map() helper. Since BPF registers are
      always 64-bit, this should be safe to do after the fact.
      
      Acked-by: default avatarSong Liu <song@kernel.org>
      Reviewed-by: default avatarStanislav Fomichev <sdf@google.com>
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20221108140601.149971-3-toke@redhat.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      32637e33
    • Toke Høiland-Jørgensen's avatar
      dev: Move received_rps counter next to RPS members in softnet data · 14d898f3
      Toke Høiland-Jørgensen authored
      
      
      Move the received_rps counter value next to the other RPS-related members
      in softnet_data. This closes two four-byte holes in the structure, making
      room for another pointer in the first two cache lines without bumping the
      xmit struct to its own line.
      
      Acked-by: default avatarSong Liu <song@kernel.org>
      Reviewed-by: default avatarStanislav Fomichev <sdf@google.com>
      Signed-off-by: default avatarToke Høiland-Jørgensen <toke@redhat.com>
      Link: https://lore.kernel.org/r/20221108140601.149971-2-toke@redhat.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      14d898f3
  4. Nov 15, 2022
    • Daniel Müller's avatar
      bpf/docs: Document how to run CI without patch submission · 26a9b433
      Daniel Müller authored
      
      
      This change documents the process for running the BPF CI before
      submitting a patch to the upstream mailing list, similar to what happens
      if a patch is send to bpf@vger.kernel.org: it builds kernel and
      selftests and runs the latter on different architecture (but it notably
      does not cover stylistic checks such as cover letter verification).
      
      Running BPF CI this way can help achieve better test coverage ahead of
      patch submission than merely running locally (say, using
      tools/testing/selftests/bpf/vmtest.sh), as additional architectures may
      be covered as well.
      
      Signed-off-by: default avatarDaniel Müller <deso@posteo.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20221114211501.2068684-1-deso@posteo.net
      26a9b433
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Refactor btf_struct_access · 6728aea7
      Kumar Kartikeya Dwivedi authored
      
      
      Instead of having to pass multiple arguments that describe the register,
      pass the bpf_reg_state into the btf_struct_access callback. Currently,
      all call sites simply reuse the btf and btf_id of the reg they want to
      check the access of. The only exception to this pattern is the callsite
      in check_ptr_to_map_access, hence for that case create a dummy reg to
      simulate PTR_TO_BTF_ID access.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221114191547.1694267-8-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      6728aea7
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Rename MEM_ALLOC to MEM_RINGBUF · 894f2a8b
      Kumar Kartikeya Dwivedi authored
      
      
      Currently, verifier uses MEM_ALLOC type tag to specially tag memory
      returned from bpf_ringbuf_reserve helper. However, this is currently
      only used for this purpose and there is an implicit assumption that it
      only refers to ringbuf memory (e.g. the check for ARG_PTR_TO_ALLOC_MEM
      in check_func_arg_reg_off).
      
      Hence, rename MEM_ALLOC to MEM_RINGBUF to indicate this special
      relationship and instead open the use of MEM_ALLOC for more generic
      allocations made for user types.
      
      Also, since ARG_PTR_TO_ALLOC_MEM_OR_NULL is unused, simply drop it.
      
      Finally, update selftests using 'alloc_' verifier string to 'ringbuf_'.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221114191547.1694267-7-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      894f2a8b
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Rename RET_PTR_TO_ALLOC_MEM · 2de2669b
      Kumar Kartikeya Dwivedi authored
      
      
      Currently, the verifier has two return types, RET_PTR_TO_ALLOC_MEM, and
      RET_PTR_TO_ALLOC_MEM_OR_NULL, however the former is confusingly named to
      imply that it carries MEM_ALLOC, while only the latter does. This causes
      confusion during code review leading to conclusions like that the return
      value of RET_PTR_TO_DYNPTR_MEM_OR_NULL (which is RET_PTR_TO_ALLOC_MEM |
      PTR_MAYBE_NULL) may be consumable by bpf_ringbuf_{submit,commit}.
      
      Rename it to make it clear MEM_ALLOC needs to be tacked on top of
      RET_PTR_TO_MEM.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221114191547.1694267-6-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      2de2669b
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Support bpf_list_head in map values · f0c5941f
      Kumar Kartikeya Dwivedi authored
      
      
      Add the support on the map side to parse, recognize, verify, and build
      metadata table for a new special field of the type struct bpf_list_head.
      To parameterize the bpf_list_head for a certain value type and the
      list_node member it will accept in that value type, we use BTF
      declaration tags.
      
      The definition of bpf_list_head in a map value will be done as follows:
      
      struct foo {
      	struct bpf_list_node node;
      	int data;
      };
      
      struct map_value {
      	struct bpf_list_head head __contains(foo, node);
      };
      
      Then, the bpf_list_head only allows adding to the list 'head' using the
      bpf_list_node 'node' for the type struct foo.
      
      The 'contains' annotation is a BTF declaration tag composed of four
      parts, "contains:name:node" where the name is then used to look up the
      type in the map BTF, with its kind hardcoded to BTF_KIND_STRUCT during
      the lookup. The node defines name of the member in this type that has
      the type struct bpf_list_node, which is actually used for linking into
      the linked list. For now, 'kind' part is hardcoded as struct.
      
      This allows building intrusive linked lists in BPF, using container_of
      to obtain pointer to entry, while being completely type safe from the
      perspective of the verifier. The verifier knows exactly the type of the
      nodes, and knows that list helpers return that type at some fixed offset
      where the bpf_list_node member used for this list exists. The verifier
      also uses this information to disallow adding types that are not
      accepted by a certain list.
      
      For now, no elements can be added to such lists. Support for that is
      coming in future patches, hence draining and freeing items is done with
      a TODO that will be resolved in a future patch.
      
      Note that the bpf_list_head_free function moves the list out to a local
      variable under the lock and releases it, doing the actual draining of
      the list items outside the lock. While this helps with not holding the
      lock for too long pessimizing other concurrent list operations, it is
      also necessary for deadlock prevention: unless every function called in
      the critical section would be notrace, a fentry/fexit program could
      attach and call bpf_map_update_elem again on the map, leading to the
      same lock being acquired if the key matches and lead to a deadlock.
      While this requires some special effort on part of the BPF programmer to
      trigger and is highly unlikely to occur in practice, it is always better
      if we can avoid such a condition.
      
      While notrace would prevent this, doing the draining outside the lock
      has advantages of its own, hence it is used to also fix the deadlock
      related problem.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221114191547.1694267-5-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      f0c5941f
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Fix copy_map_value, zero_map_value · e5feed0f
      Kumar Kartikeya Dwivedi authored
      The current offset needs to also skip over the already copied region in
      addition to the size of the next field. This case manifests where there
      are gaps between adjacent special fields.
      
      It was observed that for a map value with size 48, having fields at:
      off:  0, 16, 32
      size: 4, 16, 16
      
      The current code does:
      
      memcpy(dst + 0, src + 0, 0)
      memcpy(dst + 4, src + 4, 12)
      memcpy(dst + 20, src + 20, 12)
      memcpy(dst + 36, src + 36, 12)
      
      With the fix, it is done correctly as:
      
      memcpy(dst + 0, src + 0, 0)
      memcpy(dst + 4, src + 4, 12)
      memcpy(dst + 32, src + 32, 0)
      memcpy(dst + 48, src + 48, 0)
      
      Fixes: 4d7d7f69
      
       ("bpf: Adapt copy_map_value for multiple offset case")
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221114191547.1694267-4-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      e5feed0f
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Remove BPF_MAP_OFF_ARR_MAX · 2d577252
      Kumar Kartikeya Dwivedi authored
      In f71b2f64
      
       ("bpf: Refactor map->off_arr handling"), map->off_arr
      was refactored to be btf_field_offs. The number of field offsets is
      equal to maximum possible fields limited by BTF_FIELDS_MAX. Hence, reuse
      BTF_FIELDS_MAX as spin_lock and timer no longer are to be handled
      specially for offset sorting, fix the comment, and remove incorrect
      WARN_ON as its rec->cnt can never exceed this value. The reason to keep
      separate constant was the it was always more 2 more than total kptrs.
      This is no longer the case.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221114191547.1694267-3-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      2d577252
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Remove local kptr references in documentation · 1f6d52f1
      Kumar Kartikeya Dwivedi authored
      
      
      We don't want to commit to a specific name for these. Simply call them
      allocated objects coming from bpf_obj_new, which is completely clear in
      itself.
      
      Signed-off-by: default avatarKumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20221114191547.1694267-2-memxor@gmail.com
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      1f6d52f1
    • Andrii Nakryiko's avatar
      Merge branch 'libbpf: Fixed various checkpatch issues' · de763fbb
      Andrii Nakryiko authored
      
      
      Kang Minchul says:
      
      ====================
      This patch series contains various checkpatch fixes
      in btf.c, libbpf.c, ringbuf.c.
      
      I know these are trivial but some issues are hard to ignore
      and I think these checkpatch issues are accumulating.
      
      v1 -> v2: changed cover letter message.
      ====================
      
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      de763fbb
    • Kang Minchul's avatar
      libbpf: checkpatch: Fixed code alignments in ringbuf.c · b486d19a
      Kang Minchul authored
      
      
      Fixed some checkpatch issues in ringbuf.c
      
      Signed-off-by: default avatarKang Minchul <tegongkang@gmail.com>
      Signed-off-by: default avatarAndrii Nakryiko <andrii@kernel.org>
      Acked-by: default avatarStanislav Fomichev <sdf@google.com>
      Link: https://lore.kernel.org/bpf/20221113190648.38556-4-tegongkang@gmail.com
      b486d19a