  1. Mar 08, 2023
  2. Mar 07, 2023
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 36e5e391
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2023-03-06
      
      We've added 85 non-merge commits during the last 13 day(s) which contain
      a total of 131 files changed, 7102 insertions(+), 1792 deletions(-).
      
      The main changes are:
      
       1) Add skb and XDP typed dynptrs which allow BPF programs more ergonomic
          and less brittle iteration through data and variable-sized accesses
          (a short sketch follows this list), from Joanne Koong.
      
      2) Bigger batch of BPF verifier improvements to prepare for upcoming BPF
         open-coded iterators allowing for less restrictive looping capabilities,
         from Andrii Nakryiko.
      
       3) Rework RCU enforcement in the verifier, add kptr_rcu and require BPF
          programs to NULL-check such pointers before passing them into kfuncs,
          from Alexei Starovoitov.
      
      4) Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in
         local storage maps, from Kumar Kartikeya Dwivedi.
      
      5) Add BPF verifier support for ST instructions in convert_ctx_access()
         which will help new -mcpu=v4 clang flag to start emitting them,
         from Eduard Zingerman.
      
      6) Make uprobe attachment Android APK aware by supporting attachment
         to functions inside ELF objects contained in APKs via function names,
         from Daniel Müller.
      
       7) Add a new BPF_F_TIMER_ABS flag for the bpf_timer_start() helper
          to start the timer with an absolute expiration value instead of a
          relative one, from Tero Kristo.
      
      8) Add a new kfunc bpf_cgroup_from_id() to look up cgroups via id,
         from Tejun Heo.
      
      9) Extend libbpf to support users manually attaching kprobes/uprobes
         in the legacy/perf/link mode, from Menglong Dong.
      
      10) Implement workarounds in the mips BPF JIT for DADDI/R4000,
         from Jiaxun Yang.
      
      11) Enable mixing bpf2bpf and tailcalls for the loongarch BPF JIT,
          from Hengqi Chen.
      
       12) Extend the BPF instruction set doc to describe the encoding of BPF
           instructions in terms of how bytes are stored under big/little endian,
           from Jose E. Marchesi.
      
      13) Follow-up to enable kfunc support for riscv BPF JIT, from Pu Lehui.
      
      14) Fix bpf_xdp_query() backwards compatibility on old kernels,
          from Yonghong Song.
      
      15) Fix BPF selftest cross compilation with CLANG_CROSS_FLAGS,
          from Florent Revest.
      
      16) Improve bpf_cpumask_ma to only allocate one bpf_mem_cache,
          from Hou Tao.
      
      17) Fix BPF verifier's check_subprogs to not unnecessarily mark
          a subprogram with has_tail_call, from Ilya Leoshkevich.
      
      18) Fix arm syscall regs spec in libbpf's bpf_tracing.h, from Puranjay Mohan.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
        selftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode
        selftests/bpf: Split test_attach_probe into multi subtests
        libbpf: Add support to set kprobe/uprobe attach mode
        tools/resolve_btfids: Add /libsubcmd to .gitignore
        bpf: add support for fixed-size memory pointer returns for kfuncs
        bpf: generalize dynptr_get_spi to be usable for iters
        bpf: mark PTR_TO_MEM as non-null register type
        bpf: move kfunc_call_arg_meta higher in the file
        bpf: ensure that r0 is marked scratched after any function call
        bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper
        bpf: clean up visit_insn()'s instruction processing
        selftests/bpf: adjust log_fixup's buffer size for proper truncation
        bpf: honor env->test_state_freq flag in is_state_visited()
        selftests/bpf: enhance align selftest's expected log matching
        bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER}
        bpf: improve stack slot state printing
        selftests/bpf: Disassembler tests for verifier.c:convert_ctx_access()
        selftests/bpf: test if pointer type is tracked for BPF_ST_MEM
        bpf: allow ctx writes using BPF_ST_MEM instruction
        bpf: Use separate RCU callbacks for freeing selem
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20230307004346.27578-1-daniel@iogearbox.net
      
      
       Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'libbpf: allow users to set kprobe/uprobe attach mode' · 8f4c92f0
      Andrii Nakryiko authored
      
      
      Menglong Dong says:
      
      ====================
      
      From: Menglong Dong <imagedong@tencent.com>
      
       By default, libbpf will attach the kprobe/uprobe BPF program in the
       latest mode supported by the kernel. In this series, the 1st patch adds
       support for letting users manually attach kprobes/uprobes in legacy,
       perf or link mode.

       In the 2nd patch, we split the 'attach_probe' test into multiple
       subtests, as Andrii suggested.

       In the 3rd patch, we add tests for attaching kprobes/uprobes in the
       different modes.
      
       Changes since v3:
       - rename eBPF to BPF in the doc
       - use OPTS_GET() to get the value of 'force_ioctl_attach'
       - error out when the attach mode is not supported
       - use test_attach_probe_manual__open_and_load() directly

       Changes since v2:
       - fix the typo in the 2nd patch

       Changes since v1:
       - some small changes in the 1st patch, as Andrii suggested
      - split 'attach_probe' into multi subtests
      ====================
      
       Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
    • selftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode · c7aec81b
      Menglong Dong authored
      
      
       Add tests for kprobe/uprobe attaching in default, legacy, perf and
       link modes. The tests pass:
      
      ./test_progs -t attach_probe
       #5/1     attach_probe/manual-default:OK
       #5/2     attach_probe/manual-legacy:OK
       #5/3     attach_probe/manual-perf:OK
       #5/4     attach_probe/manual-link:OK
       #5/5     attach_probe/auto:OK
       #5/6     attach_probe/kprobe-sleepable:OK
       #5/7     attach_probe/uprobe-lib:OK
       #5/8     attach_probe/uprobe-sleepable:OK
       #5/9     attach_probe/uprobe-ref_ctr:OK
       #5       attach_probe:OK
      Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED
      
       Signed-off-by: Menglong Dong <imagedong@tencent.com>
       Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
       Reviewed-by: Biao Jiang <benbjiang@tencent.com>
       Link: https://lore.kernel.org/bpf/20230306064833.7932-4-imagedong@tencent.com
    • selftests/bpf: Split test_attach_probe into multi subtests · 7391ec63
      Menglong Dong authored
      
      
       In order to adapt to older kernels, we now split the "attach_probe"
       test into multiple subtests:

         manual // manual attach tests for kprobe/uprobe
         auto // auto-attach tests for kprobe and uprobe
         kprobe-sleepable // kprobe sleepable test
         uprobe-lib // uprobe tests for library function by name
         uprobe-sleepable // uprobe sleepable test
         uprobe-ref_ctr // uprobe ref_ctr test

       As the sleepable kprobe needs the BPF_F_SLEEPABLE flag set before
       loading, we move it to a standalone skeleton file, so that a kernel
       without such support does not make the whole load fail.

       Therefore, we can enable only part of the subtests on older kernels.
      
       Signed-off-by: Menglong Dong <imagedong@tencent.com>
       Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
       Reviewed-by: Biao Jiang <benbjiang@tencent.com>
       Link: https://lore.kernel.org/bpf/20230306064833.7932-3-imagedong@tencent.com
    • libbpf: Add support to set kprobe/uprobe attach mode · f8b299bc
      Menglong Dong authored
      
      
       By default, libbpf will attach the kprobe/uprobe BPF program in the
       latest mode supported by the kernel. In this patch, we add support for
       letting users manually attach kprobes/uprobes in legacy or perf mode.

       There are 3 modes supported by the kernel to attach kprobe/uprobe:

         LEGACY: create perf event in legacy way and don't use bpf_link
         PERF: create perf event with perf_event_open() and don't use bpf_link
         LINK: create perf event with perf_event_open() and use bpf_link
      
       Signed-off-by: Menglong Dong <imagedong@tencent.com>
       Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
       Reviewed-by: Biao Jiang <benbjiang@tencent.com>
       Link: https://lore.kernel.org/bpf/20230113093427.1666466-1-imagedong@tencent.com/
       Link: https://lore.kernel.org/bpf/20230306064833.7932-2-imagedong@tencent.com
      
       Users can now manually choose the mode with
       bpf_program__attach_uprobe_opts()/bpf_program__attach_kprobe_opts().
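
       A minimal usage sketch, assuming the per-attach option and enum names
       added by this series (attach_mode / PROBE_ATTACH_MODE_*); check the
       libbpf.h shipped with this libbpf version for the exact spelling. The
       kprobe target function is just an arbitrary example:

         #include <bpf/libbpf.h>

         static struct bpf_link *attach_kprobe_legacy(struct bpf_program *prog)
         {
                 LIBBPF_OPTS(bpf_kprobe_opts, opts,
                             .attach_mode = PROBE_ATTACH_MODE_LEGACY, /* force legacy kprobe */
                 );

                 /* Returns NULL (with errno set) if the requested mode is not
                  * supported by the running kernel. */
                 return bpf_program__attach_kprobe_opts(prog, "do_sys_openat2", &opts);
         }
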
  3. Mar 06, 2023
  4. Mar 05, 2023
  5. Mar 04, 2023
    • Merge branch 'bpf: allow ctx writes using BPF_ST_MEM instruction' · 2564a031
      Alexei Starovoitov authored
      Eduard Zingerman says:
      
      ====================
      
      Changes v1 -> v2, suggested by Alexei:
      - Resolved conflict with recent commit:
        6fcd486b ("bpf: Refactor RCU enforcement in the verifier");
      - Variable `ctx_access` removed in function `convert_ctx_accesses()`;
      - Macro `BPF_COPY_STORE` renamed to `BPF_EMIT_STORE` and fixed to
        correctly extract original store instruction class from code.
      
      Original message follows:
      
      The function verifier.c:convert_ctx_access() applies some rewrites to BPF
      instructions that read from or write to the BPF program context.
      For example, the write instruction for the `struct bpf_sockopt::retval`
      field:
      
          *(u32 *)(r1 + offsetof(struct bpf_sockopt, retval)) = r2
      
      Is transformed to:
      
          *(u64 *)(r1 + offsetof(struct bpf_sockopt_kern, tmp_reg)) = r9
          r9 = *(u64 *)(r1 + offsetof(struct bpf_sockopt_kern, current_task))
          r9 = *(u64 *)(r9 + offsetof(struct task_struct, bpf_ctx))
          *(u32 *)(r9 + offsetof(struct bpf_cg_run_ctx, retval)) = r2
          r9 = *(u64 *)(r1 + offsetof(struct bpf_sockopt_kern, tmp_reg))
      
      Currently, the verifier only supports such transformations for LDX
      (memory-to-register read) and STX (register-to-memory write) instructions.
      Error is reported for ST instructions (immediate-to-memory write).
      This is fine because clang does not currently emit ST instructions.
      
       However, a new `-mcpu=v4` clang flag is planned, which would allow emitting
       ST instructions (discussed in [1]).
      
      This patch-set adjusts the verifier to support ST instructions in
      `verifier.c:convert_ctx_access()`.
      
      The patches #1 and #2 were previously shared as part of RFC [2]. The
      changes compared to that RFC are:
      - In patch #1, a bug in the handling of the
        `struct __sk_buff::queue_mapping` field was fixed.
      - Patch #3 is added, which is a set of disassembler-based test cases for
        context access rewrites. The test cases cover all fields for which the
        handling code is modified in patch #1.
      
      [1] Propose some new instructions for -mcpu=v4
          https://lore.kernel.org/bpf/4bfe98be-5333-1c7e-2f6d-42486c8ec039@meta.com/
      [2] RFC Support for BPF_ST instruction in LLVM C compiler
          https://lore.kernel.org/bpf/20221231163122.1360813-1-eddyz87@gmail.com/
      [3] v1
          https://lore.kernel.org/bpf/20230302225507.3413720-1-eddyz87@gmail.com/
      
      
      ====================
      
       Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: Disassembler tests for verifier.c:convert_ctx_access() · 71cf4d02
      Eduard Zingerman authored
      
      
      Function verifier.c:convert_ctx_access() applies some rewrites to BPF
      instructions that read or write BPF program context. This commit adds
      machinery to allow test cases that inspect BPF program after these
      rewrites are applied.
      
      An example of a test case:
      
        {
              // Shorthand for field offset and size specification
      	N(CGROUP_SOCKOPT, struct bpf_sockopt, retval),
      
              // Pattern generated for field read
      	.read  = "$dst = *(u64 *)($ctx + bpf_sockopt_kern::current_task);"
      		 "$dst = *(u64 *)($dst + task_struct::bpf_ctx);"
      		 "$dst = *(u32 *)($dst + bpf_cg_run_ctx::retval);",
      
              // Pattern generated for field write
      	.write = "*(u64 *)($ctx + bpf_sockopt_kern::tmp_reg) = r9;"
      		 "r9 = *(u64 *)($ctx + bpf_sockopt_kern::current_task);"
      		 "r9 = *(u64 *)(r9 + task_struct::bpf_ctx);"
      		 "*(u32 *)(r9 + bpf_cg_run_ctx::retval) = $src;"
      		 "r9 = *(u64 *)($ctx + bpf_sockopt_kern::tmp_reg);" ,
        },
      
      For each test case, up to three programs are created:
      - One that uses BPF_LDX_MEM to read the context field.
      - One that uses BPF_STX_MEM to write to the context field.
      - One that uses BPF_ST_MEM to write to the context field.
      
      The disassembly of each program is compared with the pattern specified
      in the test case.
      
       Kernel code for disassembly is reused (as it is in bpftool).
       To keep Makefile changes to a minimum, symbolic links to
       `kernel/bpf/disasm.c` and `kernel/bpf/disasm.h` are added.
      
       Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
       Link: https://lore.kernel.org/r/20230304011247.566040-4-eddyz87@gmail.com
       Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • selftests/bpf: test if pointer type is tracked for BPF_ST_MEM · 806f81cd
      Eduard Zingerman authored
      
      
       Check that the verifier tracks pointer types for BPF_ST_MEM instructions
       and reports an error if pointer types do not match for different
       execution branches.
      
       Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
       Link: https://lore.kernel.org/r/20230304011247.566040-3-eddyz87@gmail.com
       Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: allow ctx writes using BPF_ST_MEM instruction · 0d80a619
      Eduard Zingerman authored
      
      
      Lift verifier restriction to use BPF_ST_MEM instructions to write to
      context data structures. This requires the following changes:
       - verifier.c:do_check() for BPF_ST updated to:
         - no longer forbid writes to registers of type PTR_TO_CTX;
         - track dst_reg type in the env->insn_aux_data[...].ptr_type field
           (same way it is done for BPF_STX and BPF_LDX instructions).
        - verifier.c:convert_ctx_access() and various callbacks invoked by
          it are updated to handle the BPF_ST instruction alongside BPF_STX.
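
       For illustration (not part of the patch), this is the instruction shape
       the verifier now accepts; BPF_ST_MEM()/BPF_MOV64_IMM()/BPF_EXIT_INSN()
       are the insn-building macros from the kernel tree's linux/filter.h
       (kernel-internal, also under tools/include), and bpf_sockopt::retval is
       just an example context field:

         #include <stddef.h>
         #include <linux/bpf.h>
         /* BPF_ST_MEM() and friends come from the kernel tree's
          * tools/include/linux/filter.h, not the uapi header. */
         #include <linux/filter.h>

         struct bpf_insn insns[] = {
                 /* *(u32 *)(r1 + offsetof(struct bpf_sockopt, retval)) = 0,
                  * i.e. an immediate store (BPF_ST) straight into the ctx,
                  * which the verifier previously rejected. */
                 BPF_ST_MEM(BPF_W, BPF_REG_1, offsetof(struct bpf_sockopt, retval), 0),
                 BPF_MOV64_IMM(BPF_REG_0, 1),
                 BPF_EXIT_INSN(),
         };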
      
       Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
       Link: https://lore.kernel.org/r/20230304011247.566040-2-eddyz87@gmail.com
       Signed-off-by: Alexei Starovoitov <ast@kernel.org>
    • bpf: Use separate RCU callbacks for freeing selem · e768e3c5
      Kumar Kartikeya Dwivedi authored
      
      
      Martin suggested that instead of using a byte in the hole (which he has
      a use for in his future patch) in bpf_local_storage_elem, we can
      dispatch a different call_rcu callback based on whether we need to free
      special fields in bpf_local_storage_elem data. The free path, described
      in commit 9db44fdd ("bpf: Support kptrs in local storage maps"),
      only waits for call_rcu callbacks when there are special (kptrs, etc.)
      fields in the map value, hence it is necessary that we only access
      smap in this case.
      
       Therefore, dispatch different RCU callbacks based on whether the BPF map
       has a valid btf_record, and dereference and use smap's btf_record only
       when it is valid.
      
       Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
       Link: https://lore.kernel.org/r/20230303141542.300068-1-memxor@gmail.com
       Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • Merge branch 'bpf-kptr-rcu' · db55174d
      Daniel Borkmann authored
      
      
      Alexei Starovoitov says:
      
      ====================
      v4->v5:
      fix typos, add acks.
      
      v3->v4:
      - patch 3 got much cleaner after BPF_KPTR_RCU was removed as suggested by David.
      
       - make KF_RCU stronger and require that the bpf program checks for NULL
       before passing such pointers into kfuncs. The prog has to do that anyway
       to access fields and it aligns with the BTF_TYPE_SAFE_RCU allowlist.
      
      - New patch 6: refactor RCU enforcement in the verifier.
      The patches 2,3,6 are part of one feature.
      The 2 and 3 alone are incomplete, since RCU pointers are barely useful
      without bpf_rcu_read_lock/unlock in GCC compiled kernel.
      Even if GCC lands support for btf_type_tag today it will take time
      to mandate that version for kernel builds. Hence go with allow list
      approach. See patch 6 for details.
       This allows starting strict enforcement of TRUSTED | UNTRUSTED
      in one part of PTR_TO_BTF_ID accesses.
      One step closer to KF_TRUSTED_ARGS by default.
      
      v2->v3:
      - Instead of requiring bpf progs to tag fields with __kptr_rcu
      teach the verifier to infer RCU properties based on the type.
      BPF_KPTR_RCU becomes kernel internal type of struct btf_field.
      - Add patch 2 to tag cgroups and dfl_cgrp as trusted.
       That bug was spotted by BPF CI on clang compiled kernels,
      since patch 3 is doing:
      static bool in_rcu_cs(struct bpf_verifier_env *env)
      {
              return env->cur_state->active_rcu_lock || !env->prog->aux->sleepable;
      }
      which makes all non-sleepable programs behave like they have implicit
      rcu_read_lock around them. Which is the case in practice.
       It was fine on gcc compiled kernels where task->cgroup dereference was producing
       PTR_TO_BTF_ID, but on clang compiled kernels task->cgroup dereference was
       producing PTR_TO_BTF_ID | MEM_RCU | MAYBE_NULL, which is more correct,
       but selftests were failing. Patch 2 fixes this discrepancy.
       With a few more patches like patch 2 we can make KF_TRUSTED_ARGS the default
       for kfuncs and helpers.
      - Add comment in selftest patch 5 that it's verifier only check.
      
      v1->v2:
       Instead of aggressively allowing dereferenced kptr_rcu pointers into
       KF_TRUSTED_ARGS kfuncs, only allow them into KF_RCU funcs.
      The KF_RCU flag is a weaker version of KF_TRUSTED_ARGS. The kfuncs marked with
      KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier guarantees
      that the objects are valid and there is no use-after-free, but the pointers
       may be NULL and the pointee object's reference count could have reached
       zero, hence kfuncs must do a != NULL check and consider the refcnt==0
       case when accessing such arguments.
      No changes in patch 1.
      Patches 2,3,4 adjusted with above behavior.
      
      v1:
      The __kptr_ref turned out to be too limited, since any "trusted" pointer access
      requires bpf_kptr_xchg() which is impractical when the same pointer needs
      to be dereferenced by multiple cpus.
      The __kptr "untrusted" only access isn't very useful in practice.
      Rename __kptr to __kptr_untrusted with eventual goal to deprecate it,
      and rename __kptr_ref to __kptr, since that looks to be more common use of kptrs.
      Introduce __kptr_rcu that can be directly dereferenced and used similar
      to native kernel C code.
      Once bpf_cpumask and task_struct kfuncs are converted to observe RCU GP
      when refcnt goes to zero, both __kptr and __kptr_untrusted can be deprecated
      and __kptr_rcu can become the only __kptr tag.
      ====================
      
       Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
    • bpf: Refactor RCU enforcement in the verifier. · 6fcd486b
      Alexei Starovoitov authored
      
      
       bpf_rcu_read_lock/unlock() are only available in clang compiled kernels. Lack
       of such a key mechanism makes it impossible for sleepable bpf programs to use RCU
      pointers.
      
      Allow bpf_rcu_read_lock/unlock() in GCC compiled kernels (though GCC doesn't
      support btf_type_tag yet) and allowlist certain field dereferences in important
       data structures like task_struct, cgroup, socket that are used by sleepable
      programs either as RCU pointer or full trusted pointer (which is valid outside
      of RCU CS). Use BTF_TYPE_SAFE_RCU and BTF_TYPE_SAFE_TRUSTED macros for such
      tagging. They will be removed once GCC supports btf_type_tag.
      
      With that refactor check_ptr_to_btf_access(). Make it strict in enforcing
      PTR_TRUSTED and PTR_UNTRUSTED while deprecating old PTR_TO_BTF_ID without
      modifier flags. There is a chance that this strict enforcement might break
      existing programs (especially on GCC compiled kernels), but this cleanup has to
      start sooner than later. Note PTR_TO_CTX access still yields old deprecated
      PTR_TO_BTF_ID. Once it's converted to strict PTR_TRUSTED or PTR_UNTRUSTED the
      kfuncs and helpers will be able to default to KF_TRUSTED_ARGS. KF_RCU will
      remain as a weaker version of KF_TRUSTED_ARGS where obj refcnt could be 0.
      
      Adjust rcu_read_lock selftest to run on gcc and clang compiled kernels.
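
       A minimal sketch (not part of the patch) of what this enables: a
       sleepable program dereferencing an allowlisted RCU field under the
       bpf_rcu_read_lock()/bpf_rcu_read_unlock() kfuncs. The section name,
       syscall symbol and the use of real_parent follow the rcu_read_lock
       selftest style; treat the details as assumptions:

         #include <vmlinux.h>
         #include <bpf/bpf_helpers.h>
         #include <bpf/bpf_tracing.h>

         void bpf_rcu_read_lock(void) __ksym;
         void bpf_rcu_read_unlock(void) __ksym;

         SEC("fentry.s/__x64_sys_getpgid")   /* sleepable fentry, x86-64 syscall name */
         int BPF_PROG(sleepable_rcu_sketch)
         {
                 struct task_struct *task = bpf_get_current_task_btf();
                 struct task_struct *parent;
                 int ppid = -1;

                 bpf_rcu_read_lock();
                 parent = task->real_parent;   /* MEM_RCU: must be NULL-checked */
                 if (parent)
                         ppid = parent->pid;   /* scalar read inside the RCU section */
                 bpf_rcu_read_unlock();

                 bpf_printk("parent pid %d", ppid);
                 return 0;
         }

         char _license[] SEC("license") = "GPL";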
      
       Signed-off-by: Alexei Starovoitov <ast@kernel.org>
       Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
       Acked-by: David Vernet <void@manifault.com>
       Link: https://lore.kernel.org/bpf/20230303041446.3630-7-alexei.starovoitov@gmail.com
    • selftests/bpf: Tweak cgroup kfunc test. · 0047d834
      Alexei Starovoitov authored
      
      
      Adjust cgroup kfunc test to dereference RCU protected cgroup pointer
      as PTR_TRUSTED and pass into KF_TRUSTED_ARGS kfunc.
      
       Signed-off-by: Alexei Starovoitov <ast@kernel.org>
       Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
       Acked-by: David Vernet <void@manifault.com>
       Link: https://lore.kernel.org/bpf/20230303041446.3630-6-alexei.starovoitov@gmail.com
    • bpf: Introduce kptr_rcu. · 20c09d92
      Alexei Starovoitov authored
      
      
       The lifetime of certain kernel structures like 'struct cgroup' is protected by RCU.
       Hence it's safe to dereference them directly from __kptr tagged pointers in bpf maps.
       The resulting pointer is MEM_RCU and can be passed to kfuncs that expect KF_RCU.
       Dereference of other kptr-s returns PTR_UNTRUSTED.
      
      For example:
      struct map_value {
         struct cgroup __kptr *cgrp;
      };
      
      SEC("tp_btf/cgroup_mkdir")
      int BPF_PROG(test_cgrp_get_ancestors, struct cgroup *cgrp_arg, const char *path)
      {
        struct cgroup *cg, *cg2;
      
        cg = bpf_cgroup_acquire(cgrp_arg); // cg is PTR_TRUSTED and ref_obj_id > 0
        bpf_kptr_xchg(&v->cgrp, cg);
      
        cg2 = v->cgrp; // This is new feature introduced by this patch.
        // cg2 is PTR_MAYBE_NULL | MEM_RCU.
        // When cg2 != NULL, it's a valid cgroup, but its percpu_ref could be zero
      
        if (cg2)
          bpf_cgroup_ancestor(cg2, level); // safe to do.
      }
      
       Signed-off-by: Alexei Starovoitov <ast@kernel.org>
       Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
       Acked-by: Tejun Heo <tj@kernel.org>
       Acked-by: David Vernet <void@manifault.com>
       Link: https://lore.kernel.org/bpf/20230303041446.3630-4-alexei.starovoitov@gmail.com
    • bpf: Mark cgroups and dfl_cgrp fields as trusted. · 8d093b4e
      Alexei Starovoitov authored
      
      
      bpf programs sometimes do:
      bpf_cgrp_storage_get(&map, task->cgroups->dfl_cgrp, ...);
       It is safe to do, because the cgroups->dfl_cgrp pointer is set during init and
       never changes. The task->cgroups is also never NULL. It is also set during init
       and will change when the task switches cgroups. For any trusted task pointer,
      dereference of cgroups and dfl_cgrp should yield trusted pointers. The verifier
      wasn't aware of this. Hence in gcc compiled kernels task->cgroups dereference
      was producing PTR_TO_BTF_ID without modifiers while in clang compiled kernels
      the verifier recognizes __rcu tag in cgroups field and produces
      PTR_TO_BTF_ID | MEM_RCU | MAYBE_NULL.
      Tag cgroups and dfl_cgrp as trusted to equalize clang and gcc behavior.
       When GCC supports btf_type_tag, such tagging will be done directly in the type.
      
       Signed-off-by: Alexei Starovoitov <ast@kernel.org>
       Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
       Acked-by: David Vernet <void@manifault.com>
       Acked-by: Tejun Heo <tj@kernel.org>
       Link: https://lore.kernel.org/bpf/20230303041446.3630-3-alexei.starovoitov@gmail.com
    • bpf: Rename __kptr_ref -> __kptr and __kptr -> __kptr_untrusted. · 03b77e17
      Alexei Starovoitov authored
      
      
       __kptr was meant to store PTR_UNTRUSTED kernel pointers inside bpf maps.
      The concept felt useful, but didn't get much traction,
      since bpf_rdonly_cast() was added soon after and bpf programs received
      a simpler way to access PTR_UNTRUSTED kernel pointers
      without going through restrictive __kptr usage.
      
      Rename __kptr_ref -> __kptr and __kptr -> __kptr_untrusted to indicate
       their intended usage.
      The main goal of __kptr_untrusted was to read/write such pointers
      directly while bpf_kptr_xchg was a mechanism to access refcnted
      kernel pointers. The next patch will allow RCU protected __kptr access
      with direct read. At that point __kptr_untrusted will be deprecated.
      
       Signed-off-by: Alexei Starovoitov <ast@kernel.org>
       Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
       Acked-by: David Vernet <void@manifault.com>
       Link: https://lore.kernel.org/bpf/20230303041446.3630-2-alexei.starovoitov@gmail.com
  6. Mar 03, 2023