  1. Mar 27, 2021
    • bpf: Support bpf program calling kernel function · e6ac2450
      Martin KaFai Lau authored
      
      
      This patch adds support to BPF verifier to allow bpf program calling
      kernel function directly.
      
      The use case included in this set is to allow bpf-tcp-cc to directly
      call some tcp-cc helper functions (e.g. "tcp_cong_avoid_ai()").  Those
      functions have already been used by some kernel tcp-cc implementations.
      
      This set will also allow the bpf-tcp-cc program to directly call the
      kernel tcp-cc implementation.  For example, a bpf_dctcp may only want to
      implement its own dctcp_cwnd_event() and reuse other dctcp_*() directly
      from the kernel tcp_dctcp.c instead of reimplementing (or
      copy-and-pasting) them.
      
      The tcp-cc kernel functions mentioned above will be white listed
      for the struct_ops bpf-tcp-cc programs to use in a later patch.
      The white listed functions are not bound to a fixed ABI contract.
      Those functions have already been used by the existing kernel tcp-cc.
      If any of them has changed, both in-tree and out-of-tree kernel tcp-cc
      implementations have to be changed.  The same goes for the struct_ops
      bpf-tcp-cc programs which have to be adjusted accordingly.
      
      This patch is to make the required changes in the bpf verifier.
      
      The first change is in btf.c, which adds a case in
      "btf_check_func_arg_match()".  When the passed-in
      "btf->kernel_btf == true", the verifier regs' states are matched
      against a kernel function.  This handles the PTR_TO_BTF_ID reg.  It
      also maps PTR_TO_SOCK_COMMON, PTR_TO_SOCKET, and PTR_TO_TCP_SOCK to
      their kernel btf_ids.
      
      In the later libbpf patch, the insn calling a kernel function will
      look like:
      
      insn->code == (BPF_JMP | BPF_CALL)
      insn->src_reg == BPF_PSEUDO_KFUNC_CALL /* <- new in this patch */
      insn->imm == func_btf_id /* btf_id of the running kernel */
      
      [ For the future calling function-in-kernel-module support, an array
        of module btf_fds can be passed at the load time and insn->off
        can be used to index into this array. ]
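
      As an illustration of the encoding above, the check for a kernel
      function call insn can be sketched as follows.  This is a minimal
      userspace sketch, not the verifier code: "struct bpf_insn" is mirrored
      locally so it builds without kernel headers, and the helper name
      "insn_is_kfunc_call()" is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Opcode values as found in the BPF uapi headers. */
#define BPF_JMP			0x05
#define BPF_CALL		0x80
#define BPF_PSEUDO_CALL		1	/* bpf-to-bpf subprogram call */
#define BPF_PSEUDO_KFUNC_CALL	2	/* kernel function call */

/* Local mirror of the uapi struct bpf_insn (assumption: same layout). */
struct bpf_insn {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register / pseudo call kind */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate, here the func btf_id */
};

/* Hypothetical helper: true if the insn calls a kernel function. */
static bool insn_is_kfunc_call(const struct bpf_insn *insn)
{
	return insn->code == (BPF_JMP | BPF_CALL) &&
	       insn->src_reg == BPF_PSEUDO_KFUNC_CALL;
}
```

      A regular helper call uses the same insn->code with src_reg == 0, so
      only the src_reg value tells the two kinds of call apart.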
      
      At an early stage, the verifier will collect all kernel
      function calls into "struct bpf_kfunc_desc".  Those
      descriptors are stored in "prog->aux->kfunc_tab" and will
      be available to the JIT.  Since this "add" operation is similar
      to the current "add_subprog()" and looking for the same insn->code,
      they are done together in the new "add_subprog_and_kfunc()".
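
      One plausible way to keep such descriptors is a small table keyed by
      the function's btf_id.  The sketch below is an assumption for
      illustration only: the struct names, the fixed cap, and the
      sort-plus-bsearch lookup are not taken from this patch.

```c
#include <stdint.h>
#include <stdlib.h>

#define MAX_KFUNC_DESCS 256	/* assumed cap for this sketch */

/* Hypothetical descriptor: one entry per distinct kernel function. */
struct kfunc_desc {
	uint32_t func_id;	/* btf_id taken from insn->imm */
};

struct kfunc_tab {
	struct kfunc_desc descs[MAX_KFUNC_DESCS];
	uint32_t nr_descs;
};

static int desc_cmp(const void *a, const void *b)
{
	const struct kfunc_desc *d0 = a, *d1 = b;

	return (d0->func_id > d1->func_id) - (d0->func_id < d1->func_id);
}

/* Look up a descriptor by btf_id; returns NULL if it was never added. */
static const struct kfunc_desc *find_kfunc_desc(const struct kfunc_tab *tab,
						 uint32_t func_id)
{
	struct kfunc_desc key = { .func_id = func_id };

	return bsearch(&key, tab->descs, tab->nr_descs,
		       sizeof(tab->descs[0]), desc_cmp);
}

/* Add a descriptor once per function; keep the table sorted for bsearch. */
static int add_kfunc_desc(struct kfunc_tab *tab, uint32_t func_id)
{
	if (find_kfunc_desc(tab, func_id))
		return 0;	/* already collected */
	if (tab->nr_descs == MAX_KFUNC_DESCS)
		return -1;	/* table full */

	tab->descs[tab->nr_descs++].func_id = func_id;
	qsort(tab->descs, tab->nr_descs, sizeof(tab->descs[0]), desc_cmp);
	return 0;
}
```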
      
      In the "do_check()" stage, the new "check_kfunc_call()" is added
      to verify the kernel function call instruction:
      1. Ensure the kernel function can be used by a particular BPF_PROG_TYPE.
         A new bpf_verifier_ops "check_kfunc_call" is added to do that.
         The bpf-tcp-cc struct_ops program will implement this function in
         a later patch.
      2. Call "btf_check_kfunc_args_match()" to ensure the regs can be
         used as the args of a kernel function.
      3. Mark the regs' type, subreg_def, and zext_dst.
      
      At the later do_misc_fixups() stage, the new fixup_kfunc_call()
      will replace the insn->imm with the function address (relative
      to __bpf_call_base).  If needed, the jit can find the btf_func_model
      by calling the new bpf_jit_find_kfunc_model(prog, insn).
      With the imm set to the function address, "bpftool prog dump xlated"
      will be able to display the kernel function calls the same way as
      it displays other bpf helper calls.
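
      The relative encoding can be illustrated with plain integer
      arithmetic.  This is a sketch under stated assumptions: addresses are
      modeled as 64-bit integers, "call_base" stands in for the address of
      __bpf_call_base, both helper names are hypothetical, and the range
      check reflects only that insn->imm is a signed 32-bit field.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Compute the value to store in insn->imm: the distance from call_base
 * to the function.  Fails if the distance does not fit in 32 bits.
 */
static bool kfunc_call_imm(uint64_t func_addr, uint64_t call_base,
			   int32_t *imm)
{
	int64_t delta = (int64_t)(func_addr - call_base);

	if (delta < INT32_MIN || delta > INT32_MAX)
		return false;

	*imm = (int32_t)delta;
	return true;
}

/* Recover the absolute target from the stored imm, as a JIT would. */
static uint64_t kfunc_call_target(uint64_t call_base, int32_t imm)
{
	return call_base + (uint64_t)(int64_t)imm;
}
```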
      
      A gpl_compatible program is required to call a kernel function.
      
      This feature currently requires JIT.
      
      The verifier selftests are adjusted because of the changes in
      the verbose log in add_subprog_and_kfunc().
      
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20210325015142.1544736-1-kafai@fb.com
    • bpf: Refactor btf_check_func_arg_match · 34747c41
      Martin KaFai Lau authored
      
      
      This patch moved the subprog specific logic from
      btf_check_func_arg_match() to the new btf_check_subprog_arg_match().
      The core logic is left in btf_check_func_arg_match() which
      will be reused later to check the kernel function call.
      
      The "if (!btf_type_is_ptr(t))" is checked first to improve the
      indentation which will be useful for a later patch.
      
      Some of the "btf_kind_str[]" usages are replaced with the shortcut
      "btf_type_str(t)".
      
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20210325015136.1544504-1-kafai@fb.com
    • bpf: Simplify freeing logic in linfo and jited_linfo · e16301fb
      Martin KaFai Lau authored
      
      
      This patch simplifies the linfo freeing logic by combining
      "bpf_prog_free_jited_linfo()" and "bpf_prog_free_unused_jited_linfo()"
      into the new "bpf_prog_jit_attempt_done()".
      It is prep work for the kernel function call support.  In a later
      patch, freeing the kernel function call descriptors will also
      be done in the "bpf_prog_jit_attempt_done()".
      
      "bpf_prog_free_linfo()" is removed since it is only called by
      "__bpf_prog_put_noref()".  kvfree() is called directly
      instead.
      
      It also takes this chance to s/kcalloc/kvcalloc/ for the jited_linfo
      allocation.
      
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Link: https://lore.kernel.org/bpf/20210325015130.1544323-1-kafai@fb.com
    • libbpf: Preserve empty DATASEC BTFs during static linking · 36e79851
      Andrii Nakryiko authored
      Ensure that BPF static linker preserves all DATASEC BTF types, even if some of
      them might not have any variable information at all. This may happen if the
      compiler promotes local initialized variable contents into .rodata section and
      there are no global or static functions in the program.
      
      For example,
      
        $ cat t.c
        struct t { char a; char b; char c; };
        void bar(struct t*);
        void find() {
           struct t tmp = {1, 2, 3};
           bar(&tmp);
        }
      
        $ clang -target bpf -O2 -g -S t.c
               .long   104                             # BTF_KIND_DATASEC(id = 8)
               .long   251658240                       # 0xf000000
               .long   0
      
               .ascii  ".rodata"                       # string offset=104
      
        $ clang -target bpf -O2 -g -c t.c
        $ readelf -S t.o | grep data
           [ 4] .rodata           PROGBITS         0000000000000000  00000090
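
      The second ".long" in the assembly above is the BTF type's "info"
      word.  Decoding it shows why this DATASEC is "empty": the kind is
      DATASEC and the variable count (vlen) is zero.  The helper names below
      are hypothetical; the bit layout (vlen in bits 0-15, kind in bits
      24-28) and the DATASEC kind value of 15 follow the BTF format.

```c
#include <stdint.h>

#define BTF_KIND_DATASEC 15

/* Kind is stored in bits 24-28 of the info word. */
static uint32_t btf_info_kind(uint32_t info)
{
	return (info >> 24) & 0x1f;
}

/* Number of members (here: variables in the section), bits 0-15. */
static uint32_t btf_info_vlen(uint32_t info)
{
	return info & 0xffff;
}
```

      For the value 251658240 (0xf000000), btf_info_kind() yields 15 and
      btf_info_vlen() yields 0.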
      
      Fixes: 8fd27bf6 ("libbpf: Add BPF static linker BTF and BTF.ext support")
      Reported-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20210326043036.3081011-1-andrii@kernel.org
    • bpf: struct sock is declared twice in bpf_sk_storage header · fcb8d0d7
      Wan Jiabing authored
      
      
      struct sock has been declared twice, therefore remove the duplicate.
      
      Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20210325070602.858024-1-wanjiabing@vivo.com
  2. Mar 26, 2021
  3. Mar 25, 2021
    • RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server · 3408be14
      Potnuri Bharat Teja authored
      Not setting the ipv6 bit while destroying ipv6 listening servers may
      result in potential fatal adapter errors due to lookup engine memory hash
      errors. Therefore always set ipv6 field while destroying ipv6 listening
      servers.
      
      Fixes: 830662f6 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address")
      Link: https://lore.kernel.org/r/20210324190453.8171-1-bharat@chelsio.com
      
      
      Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
    • arm64: kernel: disable CNP on Carmel · 20109a85
      Rich Wiley authored
      
      
      On NVIDIA Carmel cores, CNP behaves differently than it does on standard
      ARM cores. On Carmel, if two cores have CNP enabled and share an L2 TLB
      entry created by core0 for a specific ASID, a non-shareable TLBI from
      core1 may still see the shared entry. On standard ARM cores, that TLBI
      will invalidate the shared entry as well.
      
      This causes issues with patchsets that attempt to do local TLBIs based
      on cpumasks instead of broadcast TLBIs. Avoid these issues by disabling
      CNP support for NVIDIA Carmel cores.
      
      Signed-off-by: Rich Wiley <rwiley@nvidia.com>
      Link: https://lore.kernel.org/r/20210324002809.30271-1-rwiley@nvidia.com
      
      
      [will: Fix pre-existing whitespace issue]
      Signed-off-by: Will Deacon <will@kernel.org>
    • arm64/process.c: fix Wmissing-prototypes build warnings · baa96377
      Maninder Singh authored
      Fix GCC warnings reported when building with "-Wmissing-prototypes":
      
        arch/arm64/kernel/process.c:261:6: warning: no previous prototype for '__show_regs' [-Wmissing-prototypes]
            261 | void __show_regs(struct pt_regs *regs)
                |      ^~~~~~~~~~~
        arch/arm64/kernel/process.c:307:6: warning: no previous prototype for '__show_regs_alloc_free' [-Wmissing-prototypes]
            307 | void __show_regs_alloc_free(struct pt_regs *regs)
                |      ^~~~~~~~~~~~~~~~~~~~~~
        arch/arm64/kernel/process.c:365:5: warning: no previous prototype for 'arch_dup_task_struct' [-Wmissing-prototypes]
            365 | int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
                |     ^~~~~~~~~~~~~~~~~~~~
        arch/arm64/kernel/process.c:546:41: warning: no previous prototype for '__switch_to' [-Wmissing-prototypes]
            546 | __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
                |                                         ^~~~~~~~~~~
        arch/arm64/kernel/process.c:710:25: warning: no previous prototype for 'arm64_preempt_schedule_irq' [-Wmissing-prototypes]
            710 | asmlinkage void __sched arm64_preempt_schedule_irq(void)
                |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~
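
      As a generic illustration of this warning class (not the change made
      by this patch), a non-static function triggers the warning when no
      prototype is visible before its definition.  The usual remedies are a
      prototype in a shared header or marking the function static.  All
      names below are hypothetical.

```c
/*
 * This declaration would normally live in a header included by both the
 * defining file and its callers.  With it in scope, the definition
 * below no longer triggers -Wmissing-prototypes.
 */
int example_add(int a, int b);

int example_add(int a, int b)
{
	return a + b;
}

/* A file-local function needs no prototype: "static" silences the warning. */
static int example_twice(int x)
{
	return example_add(x, x);
}
```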
      
      Link: https://lore.kernel.org/lkml/202103192250.AennsfXM-lkp@intel.com
      
      
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Maninder Singh <maninder1.s@samsung.com>
      Link: https://lore.kernel.org/r/1616568899-986-1-git-send-email-maninder1.s@samsung.com
      
      
      Signed-off-by: Will Deacon <will@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · e1381380
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Various fixes, all over:
      
         1) Fix overflow in ptp_qoriq_adjfine(), from Yangbo Lu.
      
         2) Always store the rx queue mapping in veth, from Maciej
            Fijalkowski.
      
         3) Don't allow vmlinux btf in map_create, from Alexei Starovoitov.
      
         4) Fix memory leak in octeontx2-af from Colin Ian King.
      
         5) Use kvalloc in bpf x86 JIT for storing jit'd addresses, from
            Yonghong Song.
      
         6) Fix tx ptp stats in mlx5, from Aya Levin.
      
         7) Check correct ip version in tun decap, from Roi Dayan.
      
         8) Fix rate calculation in mlx5 E-Switch code, from Parav Pandit.
      
         9) Work item memory leak in mlx5, from Shay Drory.
      
        10) Fix ip6ip6 tunnel crash with bpf, from Daniel Borkmann.
      
        11) Lack of preemption awareness in macvlan, from Eric Dumazet.
      
        12) Fix data race in pxa168_eth, from Pavel Andrianov.
      
        13) Range validate stab in red_check_params(), from Eric Dumazet.
      
        14) Inherit vlan filtering setting properly in b53 driver, from
            Florian Fainelli.
      
        15) Fix rtnl locking in igc driver, from Sasha Neftin.
      
        16) Pause handling fixes in igc driver, from Muhammad Husaini
            Zulkifli.
      
        17) Missing rtnl locking in e1000_reset_task, from Vitaly Lifshits.
      
        18) Use after free in qlcnic, from Lv Yunlong.
      
        19) Fix crash in fritzpci mISDN, from Tong Zhang.
      
        20) Premature rx buffer reuse in igb, from Li RongQing.
      
        21) Missing termination of ipa driver message handler arrays, from
            Alex Elder.
      
        22) Fix race between "x25_close" and "x25_xmit"/"x25_rx" in hdlc_x25
            driver, from Xie He.
      
        23) Use after free in c_can_pci_remove(), from Tong Zhang.
      
        24) Uninitialized variable use in nl80211, from Jarod Wilson.
      
        25) Off by one size calc in bpf verifier, from Piotr Krysiuk.
      
        26) Use delayed work instead of deferrable for flowtable GC, from
            Yinjun Zhang.
      
        27) Fix infinite loop in NPC unmap of octeontx2 driver, from
            Hariprasad Kelam.
      
        28) Fix being unable to change MTU of dwmac-sun8i devices due to lack
            of fifo sizes, from Corentin Labbe.
      
        29) DMA use after free in r8169 with WoL, from Heiner Kallweit.
      
        30) Mismatched prototypes in isdn-capi, from Arnd Bergmann.
      
        31) Fix psample UAPI breakage, from Ido Schimmel"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (171 commits)
        psample: Fix user API breakage
        math: Export mul_u64_u64_div_u64
        ch_ktls: fix enum-conversion warning
        octeontx2-af: Fix memory leak of object buf
        ptp_qoriq: fix overflow in ptp_qoriq_adjfine() u64 calcalation
        net: bridge: don't notify switchdev for local FDB addresses
        net/sched: act_ct: clear post_ct if doing ct_clear
        net: dsa: don't assign an error value to tag_ops
        isdn: capi: fix mismatched prototypes
        net/mlx5: SF, do not use ecpu bit for vhca state processing
        net/mlx5e: Fix division by 0 in mlx5e_select_queue
        net/mlx5e: Fix error path for ethtool set-priv-flag
        net/mlx5e: Offload tuple rewrite for non-CT flows
        net/mlx5e: Allow to match on MPLS parameters only for MPLS over UDP
        net/mlx5: Add back multicast stats for uplink representor
        net: ipconfig: ic_dev can be NULL in ic_close_devs
        MAINTAINERS: Combine "QLOGIC QLGE 10Gb ETHERNET DRIVER" sections into one
        docs: networking: Fix a typo
        r8169: fix DMA being used after buffer free if WoL is enabled
        net: ipa: fix init header command validation
        ...