  1. Sep 01, 2023
    • net: use sk_forward_alloc_get() in sk_get_meminfo() · 66d58f04
      Eric Dumazet authored
      inet_sk_diag_fill() has been changed to use sk_forward_alloc_get(),
      but sk_get_meminfo() was forgotten.
      
      Fixes: 292e6077 ("net: introduce sk_forward_alloc_get()")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/handshake: fix null-ptr-deref in handshake_nl_done_doit() · 82ba0ff7
      Eric Dumazet authored
      We should not call trace_handshake_cmd_done_err() if socket lookup has failed.
      
      Also we should call trace_handshake_cmd_done_err() before releasing the file,
      otherwise dereferencing sock->sk can return garbage.
      
      This also reverts 7afc6d0a ("net/handshake: Fix uninitialized local variable")
      
      Unable to handle kernel paging request at virtual address dfff800000000003
      KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f]
      Mem abort info:
      ESR = 0x0000000096000005
      EC = 0x25: DABT (current EL), IL = 32 bits
      SET = 0, FnV = 0
      EA = 0, S1PTW = 0
      FSC = 0x05: level 1 translation fault
      Data abort info:
      ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
      CM = 0, WnR = 0, TnD = 0, TagAccess = 0
      GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
      [dfff800000000003] address between user and kernel address ranges
      Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
      Modules linked in:
      CPU: 1 PID: 5986 Comm: syz-executor292 Not tainted 6.5.0-rc7-syzkaller-gfe4469582053 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
      pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : handshake_nl_done_doit+0x198/0x9c8 net/handshake/netlink.c:193
      lr : handshake_nl_done_doit+0x180/0x9c8
      sp : ffff800096e37180
      x29: ffff800096e37200 x28: 1ffff00012dc6e34 x27: dfff800000000000
      x26: ffff800096e373d0 x25: 0000000000000000 x24: 00000000ffffffa8
      x23: ffff800096e373f0 x22: 1ffff00012dc6e38 x21: 0000000000000000
      x20: ffff800096e371c0 x19: 0000000000000018 x18: 0000000000000000
      x17: 0000000000000000 x16: ffff800080516cc4 x15: 0000000000000001
      x14: 1fffe0001b14aa3b x13: 0000000000000000 x12: 0000000000000000
      x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000003
      x8 : 0000000000000003 x7 : ffff800080afe47c x6 : 0000000000000000
      x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff800080a88078
      x2 : 0000000000000001 x1 : 00000000ffffffa8 x0 : 0000000000000000
      Call trace:
      handshake_nl_done_doit+0x198/0x9c8 net/handshake/netlink.c:193
      genl_family_rcv_msg_doit net/netlink/genetlink.c:970 [inline]
      genl_family_rcv_msg net/netlink/genetlink.c:1050 [inline]
      genl_rcv_msg+0x96c/0xc50 net/netlink/genetlink.c:1067
      netlink_rcv_skb+0x214/0x3c4 net/netlink/af_netlink.c:2549
      genl_rcv+0x38/0x50 net/netlink/genetlink.c:1078
      netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
      netlink_unicast+0x660/0x8d4 net/netlink/af_netlink.c:1365
      netlink_sendmsg+0x834/0xb18 net/netlink/af_netlink.c:1914
      sock_sendmsg_nosec net/socket.c:725 [inline]
      sock_sendmsg net/socket.c:748 [inline]
      ____sys_sendmsg+0x56c/0x840 net/socket.c:2494
      ___sys_sendmsg net/socket.c:2548 [inline]
      __sys_sendmsg+0x26c/0x33c net/socket.c:2577
      __do_sys_sendmsg net/socket.c:2586 [inline]
      __se_sys_sendmsg net/socket.c:2584 [inline]
      __arm64_sys_sendmsg+0x80/0x94 net/socket.c:2584
      __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline]
      invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51
      el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136
      do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155
      el0_svc+0x58/0x16c arch/arm64/kernel/entry-common.c:678
      el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696
      el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591
      Code: 12800108 b90043e8 910062b3 d343fe68 (387b6908)
      
      Fixes: 3b3009ea ("net/handshake: Create a NETLINK service for handling handshake requests")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Chuck Lever <chuck.lever@oracle.com>
      Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · ddaa935d
      Jakub Kicinski authored
      
      
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2023-08-31
      
      We've added 15 non-merge commits during the last 3 day(s) which contain
      a total of 17 files changed, 468 insertions(+), 97 deletions(-).
      
      The main changes are:
      
      1) BPF selftest fixes: one flake and one related to clang18 testing,
         from Yonghong Song.
      
      2) Fix a d_path BPF selftest failure after fast-forward from Linus'
         tree, from Jiri Olsa.
      
      3) Fix a preempt_rt splat in sockmap when using raw_spin_lock_t,
         from John Fastabend.
      
      4) Fix a xsk_diag_fill use-after-free race during socket cleanup,
         from Magnus Karlsson.
      
      5) Fix xsk_build_skb to address a buggy dereference of an ERR_PTR(),
         from Tirthendu Sarkar.
      
      6) Fix a bpftool build warning when compiled with -Wtype-limits,
         from Yafang Shao.
      
      7) Several misc fixes and cleanups in standardization docs,
         from David Vernet.
      
      8) Fix BPF selftest install to consider no_alu32/cpuv4/bpf-gcc flavors,
         from Björn Töpel.
      
      9) Annotate a data race in bpf_long_memcpy for KCSAN, from Daniel Borkmann.
      
      10) Extend documentation with a description for CO-RE relocations,
          from Eduard Zingerman.
      
      11) Fix several invalid escape sequence warnings in bpf_doc.py script,
          from Vishal Chourasia.
      
      12) Fix the instruction set doc wrt offset of BPF-to-BPF call,
          from Will Hawkins.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        selftests/bpf: Include build flavors for install target
        bpf: Annotate bpf_long_memcpy with data_race
        selftests/bpf: Fix d_path test
        bpf, docs: Fix invalid escape sequence warnings in bpf_doc.py
        xsk: Fix xsk_diag use-after-free error during socket cleanup
        bpf, docs: s/eBPF/BPF in standards documents
        bpf, docs: Add abi.rst document to standardization subdirectory
        bpf, docs: Move linux-notes.rst to root bpf docs tree
        bpf, sockmap: Fix preempt_rt splat when using raw_spin_lock_t
        docs/bpf: Add description for CO-RE relocations
        bpf, docs: Correct source of offset for program-local call
        selftests/bpf: Fix flaky cgroup_iter_sleepable subtest
        xsk: Fix xsk_build_skb() error: 'skb' dereferencing possible ERR_PTR()
        bpftool: Fix build warnings with -Wtype-limits
        bpf: Prevent inlining of bpf_fentry_test7()
      ====================
      
      Link: https://lore.kernel.org/r/20230831210019.14417-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests/bpf: Include build flavors for install target · be8e754c
      Björn Töpel authored
      
      
      When using the "install" target, or targets depending on it, e.g. "gen_tar",
      the BPF machine flavors weren't included.
      
      A command like:
        | make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- O=/workspace/kbuild \
        |    HOSTCC=gcc FORMAT= SKIP_TARGETS="arm64 ia64 powerpc sparc64 x86 sgx" \
        |    -C tools/testing/selftests gen_tar
      would not include bpf/no_alu32, bpf/cpuv4, or bpf/bpf-gcc.
      
      Include the BPF machine flavors for "install" make target.
      
      Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230831162954.111485-1-bjorn@kernel.org
    • bpf: Annotate bpf_long_memcpy with data_race · 6a86b5b5
      Daniel Borkmann authored
      
      
      syzbot reported a data race splat between two processes trying to
      update the same BPF map value via syscall on different CPUs:
      
        BUG: KCSAN: data-race in bpf_percpu_array_update / bpf_percpu_array_update
      
        write to 0xffffe8fffe7425d8 of 8 bytes by task 8257 on cpu 1:
         bpf_long_memcpy include/linux/bpf.h:428 [inline]
         bpf_obj_memcpy include/linux/bpf.h:441 [inline]
         copy_map_value_long include/linux/bpf.h:464 [inline]
         bpf_percpu_array_update+0x3bb/0x500 kernel/bpf/arraymap.c:380
         bpf_map_update_value+0x190/0x370 kernel/bpf/syscall.c:175
         generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1749
         bpf_map_do_batch+0x2df/0x3d0 kernel/bpf/syscall.c:4648
         __sys_bpf+0x28a/0x780
         __do_sys_bpf kernel/bpf/syscall.c:5241 [inline]
         __se_sys_bpf kernel/bpf/syscall.c:5239 [inline]
         __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5239
         do_syscall_x64 arch/x86/entry/common.c:50 [inline]
         do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
         entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
        write to 0xffffe8fffe7425d8 of 8 bytes by task 8268 on cpu 0:
         bpf_long_memcpy include/linux/bpf.h:428 [inline]
         bpf_obj_memcpy include/linux/bpf.h:441 [inline]
         copy_map_value_long include/linux/bpf.h:464 [inline]
         bpf_percpu_array_update+0x3bb/0x500 kernel/bpf/arraymap.c:380
         bpf_map_update_value+0x190/0x370 kernel/bpf/syscall.c:175
         generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1749
         bpf_map_do_batch+0x2df/0x3d0 kernel/bpf/syscall.c:4648
         __sys_bpf+0x28a/0x780
         __do_sys_bpf kernel/bpf/syscall.c:5241 [inline]
         __se_sys_bpf kernel/bpf/syscall.c:5239 [inline]
         __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5239
         do_syscall_x64 arch/x86/entry/common.c:50 [inline]
         do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
         entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
        value changed: 0x0000000000000000 -> 0xfffffff000002788
      
      The bpf_long_memcpy is used with 8-byte aligned pointers, power-of-8 size
      and forced to use long read/writes to try to atomically copy long counters.
      It is best-effort only and no barriers are here since it _will_ race with
      concurrent updates from BPF programs. The bpf_long_memcpy() is called from
      bpf(2) syscall. Marco suggested that the best way to make this known to
      KCSAN would be to use data_race() annotation.
      
      Reported-by: syzbot+97522333291430dd277f@syzkaller.appspotmail.com
      Suggested-by: Marco Elver <elver@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Marco Elver <elver@google.com>
      Link: https://lore.kernel.org/bpf/000000000000d87a7f06040c970c@google.com
      Link: https://lore.kernel.org/bpf/57628f7a15e20d502247c3b55fceb1cb2b31f266.1693342186.git.daniel@iogearbox.net
  2. Aug 31, 2023
    • selftests/bpf: Fix d_path test · d11ae1b1
      Jiri Olsa authored
      Recent commit [1] broke d_path test, because now filp_close is not called
      directly from sys_close, but eventually later when the file is finally
      released.
      
      As suggested by Hou Tao we don't need to re-hook the bpf program, but just
      instead we can use sys_close_range to trigger filp_close synchronously.
      
        [1] 021a160a ("fs: use __fput_sync in close(2)")
      
      Suggested-by: Hou Tao <houtao@huaweicloud.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230831141103.359810-1-jolsa@kernel.org
    • bpf, docs: Fix invalid escape sequence warnings in bpf_doc.py · 121fd33b
      Vishal Chourasia authored
      
      
      The script bpf_doc.py generates multiple SyntaxWarnings related to invalid
      escape sequences when executed with Python 3.12. These warnings do not appear
      in Python 3.10 and 3.11 and do not affect the kernel build, which completes
      successfully.
      
      This patch resolves these SyntaxWarnings by converting the relevant string
      literals to raw strings or by escaping backslashes. This ensures that
      backslashes are interpreted as literal characters, eliminating the warnings.
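
      The two fixes described above can be illustrated with a short sketch.
      The pattern and sample text below are hypothetical and are not taken
      from bpf_doc.py; the point is only that an escaped-backslash literal
      and a raw-string literal yield the same pattern.

```python
import re

# Hypothetical pattern (not copied from bpf_doc.py): a literal '*',
# some whitespace, then a word.
escaped = re.compile('\\*\\s+(\\w+)')   # every backslash escaped by hand
raw = re.compile(r'\*\s+(\w+)')         # same pattern written as a raw string


def first_word(pattern, text):
    """Return the word captured by the pattern, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(first_word(raw, " * Description"))
```

      Either spelling keeps the backslash as a literal character, so the
      regular expression engine receives the sequence it expects.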
      
      Reported-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Signed-off-by: Vishal Chourasia <vishalc@linux.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Quentin Monnet <quentin@isovalent.com>
      Link: https://lore.kernel.org/bpf/20230829074931.25112048-1-vishalc@linux.ibm.com
    • xsk: Fix xsk_diag use-after-free error during socket cleanup · 3e019d8a
      Magnus Karlsson authored
      Fix a use-after-free error that is possible if the xsk_diag interface
      is used after the socket has been unbound from the device. This can
      happen either due to the socket being closed or the device
      disappearing. In the early days of AF_XDP, the way we tested that a
      socket was not bound to a device was to simply check if the netdevice
      pointer in the xsk socket structure was NULL. Later, a better system
      was introduced by having an explicit state variable in the xsk socket
      struct. For example, the state of a socket that is on the way to being
      closed and has been unbound from the device is XSK_UNBOUND.
      
      The commit in the Fixes tag below deleted the old way of signalling
      that a socket is unbound, setting dev to NULL. This in the belief that
      all code using the old way had been exterminated. That was
      unfortunately not true as the xsk diagnostics code was still using the
      old way and thus does not work as intended when a socket is going
      down. Fix this by introducing a test against the state variable. If
      the socket is in the state XSK_UNBOUND, simply abort the diagnostic's
      netlink operation.
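
      The difference between the two tests can be modelled as follows. This
      is a toy sketch under stated assumptions, not the kernel code: ToySock,
      both diag_fill_* helpers and the BOUND member are hypothetical; only
      the XSK_UNBOUND state is named in the text above.

```python
from enum import Enum


class XskState(Enum):
    # Only XSK_UNBOUND is named in the commit message; BOUND is a
    # hypothetical placeholder meaning "still attached to a device".
    BOUND = 1
    UNBOUND = 2


class ToySock:
    """Hypothetical stand-in for the xsk socket structure."""

    def __init__(self, ifindex):
        self.dev = {"ifindex": ifindex}
        self.state = XskState.BOUND

    def unbind(self):
        # dev is deliberately left untouched: after the commit named in
        # the Fixes tag, only the state variable records the unbind.
        self.state = XskState.UNBOUND


def diag_fill_old(sock):
    """Old-style test: trusts the dev pointer alone."""
    if sock.dev is None:
        return None
    return {"ifindex": sock.dev["ifindex"]}  # may touch a stale device


def diag_fill_new(sock):
    """State-based test: abort the dump for an unbound socket."""
    if sock.state is XskState.UNBOUND:
        return None
    return {"ifindex": sock.dev["ifindex"]}
```

      In this model the old test still reports a device after unbind(),
      while the state-based test aborts.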
      
      Fixes: 18b1ab7a ("xsk: Fix race at socket teardown")
      Reported-by: syzbot+822d1359297e2694f873@syzkaller.appspotmail.com
      Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: syzbot+822d1359297e2694f873@syzkaller.appspotmail.com
      Tested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Link: https://lore.kernel.org/bpf/20230831100119.17408-1-magnus.karlsson@gmail.com
    • net: fib: avoid warn splat in flow dissector · 8aae7625
      Florian Westphal authored
      New skbs allocated via nf_send_reset() have skb->dev == NULL.
      
      fib*_rules_early_flow_dissect helpers already have a 'struct net'
      argument but it's not passed down to the flow dissector core, which
      will then WARN as it can't derive a net namespace to use:
      
       WARNING: CPU: 0 PID: 0 at net/core/flow_dissector.c:1016 __skb_flow_dissect+0xa91/0x1cd0
       [..]
        ip_route_me_harder+0x143/0x330
        nf_send_reset+0x17c/0x2d0 [nf_reject_ipv4]
        nft_reject_inet_eval+0xa9/0xf2 [nft_reject_inet]
        nft_do_chain+0x198/0x5d0 [nf_tables]
        nft_do_chain_inet+0xa4/0x110 [nf_tables]
        nf_hook_slow+0x41/0xc0
        ip_local_deliver+0xce/0x110
        ..
      
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: David Ahern <dsahern@kernel.org>
      Cc: Ido Schimmel <idosch@nvidia.com>
      Fixes: 812fa71f ("netfilter: Dissect flow after packet mangling")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=217826
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20230830110043.30497-1-fw@strlen.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: read sk->sk_family once in sk_mc_loop() · a3e0fdf7
      Eric Dumazet authored
      syzbot is playing with IPV6_ADDRFORM quite a lot these days,
      and managed to hit the WARN_ON_ONCE(1) in sk_mc_loop()
      
      We have many more similar issues to fix.
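
      The hazard named in the subject line can be modelled in a few lines.
      This is a toy sketch, not the kernel patch: FlippingSock and both
      helper functions are hypothetical, and the property that changes value
      between reads only imitates a concurrent IPV6_ADDRFORM conversion.

```python
AF_INET, AF_INET6 = 2, 10


class FlippingSock:
    """Toy socket whose family changes after the first read."""

    def __init__(self):
        self._reads = 0

    @property
    def family(self):
        self._reads += 1
        return AF_INET6 if self._reads == 1 else AF_INET


def mc_loop_double_read(sk):
    # Two separate reads: both tests can miss if the value changes
    # between them.
    if sk.family == AF_INET:
        return "ipv4"
    if sk.family == AF_INET6:
        return "ipv6"
    return "unexpected"  # stands in for the WARN_ON_ONCE(1) path


def mc_loop_single_read(sk):
    family = sk.family  # one snapshot; every test sees the same value
    if family == AF_INET:
        return "ipv4"
    if family == AF_INET6:
        return "ipv6"
    return "unexpected"
```

      With a single snapshot the function always takes one of the two
      expected branches, whichever value it happened to observe.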
      
      WARNING: CPU: 1 PID: 1593 at net/core/sock.c:782 sk_mc_loop+0x165/0x260
      Modules linked in:
      CPU: 1 PID: 1593 Comm: kworker/1:3 Not tainted 6.1.40-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
      Workqueue: events_power_efficient gc_worker
      RIP: 0010:sk_mc_loop+0x165/0x260 net/core/sock.c:782
      Code: 34 1b fd 49 81 c7 18 05 00 00 4c 89 f8 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 ff e8 25 36 6d fd 4d 8b 37 eb 13 e8 db 33 1b fd <0f> 0b b3 01 eb 34 e8 d0 33 1b fd 45 31 f6 49 83 c6 38 4c 89 f0 48
      RSP: 0018:ffffc90000388530 EFLAGS: 00010246
      RAX: ffffffff846d9b55 RBX: 0000000000000011 RCX: ffff88814f884980
      RDX: 0000000000000102 RSI: ffffffff87ae5160 RDI: 0000000000000011
      RBP: ffffc90000388550 R08: 0000000000000003 R09: ffffffff846d9a65
      R10: 0000000000000002 R11: ffff88814f884980 R12: dffffc0000000000
      R13: ffff88810dbee000 R14: 0000000000000010 R15: ffff888150084000
      FS: 0000000000000000(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000180 CR3: 000000014ee5b000 CR4: 00000000003506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <IRQ>
      [<ffffffff8507734f>] ip6_finish_output2+0x33f/0x1ae0 net/ipv6/ip6_output.c:83
      [<ffffffff85062766>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline]
      [<ffffffff85062766>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211
      [<ffffffff85061f8c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
      [<ffffffff85061f8c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232
      [<ffffffff852071cf>] dst_output include/net/dst.h:444 [inline]
      [<ffffffff852071cf>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161
      [<ffffffff83618fb4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline]
      [<ffffffff83618fb4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
      [<ffffffff83618fb4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
      [<ffffffff83618fb4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
      [<ffffffff8361ddd9>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
      [<ffffffff84763fc0>] netdev_start_xmit include/linux/netdevice.h:4925 [inline]
      [<ffffffff84763fc0>] xmit_one net/core/dev.c:3644 [inline]
      [<ffffffff84763fc0>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
      [<ffffffff8494c650>] sch_direct_xmit+0x2a0/0x9c0 net/sched/sch_generic.c:342
      [<ffffffff8494d883>] qdisc_restart net/sched/sch_generic.c:407 [inline]
      [<ffffffff8494d883>] __qdisc_run+0xb13/0x1e70 net/sched/sch_generic.c:415
      [<ffffffff8478c426>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
      [<ffffffff84796eac>] net_tx_action+0x7ac/0x940 net/core/dev.c:5247
      [<ffffffff858002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:599
      [<ffffffff814c3fe8>] invoke_softirq kernel/softirq.c:430 [inline]
      [<ffffffff814c3fe8>] __irq_exit_rcu+0xc8/0x170 kernel/softirq.c:683
      [<ffffffff814c3f09>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:695
      
      Fixes: 7ad6848c ("ip: fix mc_loop checks for tunnels with multicast outer addresses")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230830101244.1146934-1-edumazet@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • ipv4: annotate data-races around fi->fib_dead · fce92af1
      Eric Dumazet authored
      syzbot complained about a data-race in fib_table_lookup() [1]
      
      Add appropriate annotations to document it.
      
      [1]
      BUG: KCSAN: data-race in fib_release_info / fib_table_lookup
      
      write to 0xffff888150f31744 of 1 bytes by task 1189 on cpu 0:
      fib_release_info+0x3a0/0x460 net/ipv4/fib_semantics.c:281
      fib_table_delete+0x8d2/0x900 net/ipv4/fib_trie.c:1777
      fib_magic+0x1c1/0x1f0 net/ipv4/fib_frontend.c:1106
      fib_del_ifaddr+0x8cf/0xa60 net/ipv4/fib_frontend.c:1317
      fib_inetaddr_event+0x77/0x200 net/ipv4/fib_frontend.c:1448
      notifier_call_chain kernel/notifier.c:93 [inline]
      blocking_notifier_call_chain+0x90/0x200 kernel/notifier.c:388
      __inet_del_ifa+0x4df/0x800 net/ipv4/devinet.c:432
      inet_del_ifa net/ipv4/devinet.c:469 [inline]
      inetdev_destroy net/ipv4/devinet.c:322 [inline]
      inetdev_event+0x553/0xaf0 net/ipv4/devinet.c:1606
      notifier_call_chain kernel/notifier.c:93 [inline]
      raw_notifier_call_chain+0x6b/0x1c0 kernel/notifier.c:461
      call_netdevice_notifiers_info net/core/dev.c:1962 [inline]
      call_netdevice_notifiers_mtu+0xd2/0x130 net/core/dev.c:2037
      dev_set_mtu_ext+0x30b/0x3e0 net/core/dev.c:8673
      do_setlink+0x5be/0x2430 net/core/rtnetlink.c:2837
      rtnl_setlink+0x255/0x300 net/core/rtnetlink.c:3177
      rtnetlink_rcv_msg+0x807/0x8c0 net/core/rtnetlink.c:6445
      netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2549
      rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6463
      netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
      netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
      netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1914
      sock_sendmsg_nosec net/socket.c:725 [inline]
      sock_sendmsg net/socket.c:748 [inline]
      sock_write_iter+0x1aa/0x230 net/socket.c:1129
      do_iter_write+0x4b4/0x7b0 fs/read_write.c:860
      vfs_writev+0x1a8/0x320 fs/read_write.c:933
      do_writev+0xf8/0x220 fs/read_write.c:976
      __do_sys_writev fs/read_write.c:1049 [inline]
      __se_sys_writev fs/read_write.c:1046 [inline]
      __x64_sys_writev+0x45/0x50 fs/read_write.c:1046
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff888150f31744 of 1 bytes by task 21839 on cpu 1:
      fib_table_lookup+0x2bf/0xd50 net/ipv4/fib_trie.c:1585
      fib_lookup include/net/ip_fib.h:383 [inline]
      ip_route_output_key_hash_rcu+0x38c/0x12c0 net/ipv4/route.c:2751
      ip_route_output_key_hash net/ipv4/route.c:2641 [inline]
      __ip_route_output_key include/net/route.h:134 [inline]
      ip_route_output_flow+0xa6/0x150 net/ipv4/route.c:2869
      send4+0x1e7/0x500 drivers/net/wireguard/socket.c:61
      wg_socket_send_skb_to_peer+0x94/0x130 drivers/net/wireguard/socket.c:175
      wg_socket_send_buffer_to_peer+0xd6/0x100 drivers/net/wireguard/socket.c:200
      wg_packet_send_handshake_initiation drivers/net/wireguard/send.c:40 [inline]
      wg_packet_handshake_send_worker+0x10c/0x150 drivers/net/wireguard/send.c:51
      process_one_work+0x434/0x860 kernel/workqueue.c:2600
      worker_thread+0x5f2/0xa10 kernel/workqueue.c:2751
      kthread+0x1d7/0x210 kernel/kthread.c:389
      ret_from_fork+0x2e/0x40 arch/x86/kernel/process.c:145
      ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      value changed: 0x00 -> 0x01
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 21839 Comm: kworker/u4:18 Tainted: G W 6.5.0-syzkaller #0
      
      Fixes: dccd9ecc ("ipv4: Do not use dead fib_info entries.")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20230830095520.1046984-1-edumazet@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • sctp: annotate data-races around sk->sk_wmem_queued · dc9511dd
      Eric Dumazet authored
      sk->sk_wmem_queued can be read locklessly from sctp_poll()
      
      Use sk_wmem_queued_add() when the field is changed,
      and add READ_ONCE() annotations in sctp_writeable()
      and sctp_assocs_seq_show()
      
      syzbot reported:
      
      BUG: KCSAN: data-race in sctp_poll / sctp_wfree
      
      read-write to 0xffff888149d77810 of 4 bytes by interrupt on cpu 0:
      sctp_wfree+0x170/0x4a0 net/sctp/socket.c:9147
      skb_release_head_state+0xb7/0x1a0 net/core/skbuff.c:988
      skb_release_all net/core/skbuff.c:1000 [inline]
      __kfree_skb+0x16/0x140 net/core/skbuff.c:1016
      consume_skb+0x57/0x180 net/core/skbuff.c:1232
      sctp_chunk_destroy net/sctp/sm_make_chunk.c:1503 [inline]
      sctp_chunk_put+0xcd/0x130 net/sctp/sm_make_chunk.c:1530
      sctp_datamsg_put+0x29a/0x300 net/sctp/chunk.c:128
      sctp_chunk_free+0x34/0x50 net/sctp/sm_make_chunk.c:1515
      sctp_outq_sack+0xafa/0xd70 net/sctp/outqueue.c:1381
      sctp_cmd_process_sack net/sctp/sm_sideeffect.c:834 [inline]
      sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1366 [inline]
      sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
      sctp_do_sm+0x12c7/0x31b0 net/sctp/sm_sideeffect.c:1169
      sctp_assoc_bh_rcv+0x2b2/0x430 net/sctp/associola.c:1051
      sctp_inq_push+0x108/0x120 net/sctp/inqueue.c:80
      sctp_rcv+0x116e/0x1340 net/sctp/input.c:243
      sctp6_rcv+0x25/0x40 net/sctp/ipv6.c:1120
      ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437
      ip6_input_finish net/ipv6/ip6_input.c:482 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491
      dst_input include/net/dst.h:468 [inline]
      ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309
      __netif_receive_skb_one_core net/core/dev.c:5452 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566
      process_backlog+0x21f/0x380 net/core/dev.c:5894
      __napi_poll+0x60/0x3b0 net/core/dev.c:6460
      napi_poll net/core/dev.c:6527 [inline]
      net_rx_action+0x32b/0x750 net/core/dev.c:6660
      __do_softirq+0xc1/0x265 kernel/softirq.c:553
      run_ksoftirqd+0x17/0x20 kernel/softirq.c:921
      smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164
      kthread+0x1d7/0x210 kernel/kthread.c:389
      ret_from_fork+0x2e/0x40 arch/x86/kernel/process.c:145
      ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      read to 0xffff888149d77810 of 4 bytes by task 17828 on cpu 1:
      sctp_writeable net/sctp/socket.c:9304 [inline]
      sctp_poll+0x265/0x410 net/sctp/socket.c:8671
      sock_poll+0x253/0x270 net/socket.c:1374
      vfs_poll include/linux/poll.h:88 [inline]
      do_pollfd fs/select.c:873 [inline]
      do_poll fs/select.c:921 [inline]
      do_sys_poll+0x636/0xc00 fs/select.c:1015
      __do_sys_ppoll fs/select.c:1121 [inline]
      __se_sys_ppoll+0x1af/0x1f0 fs/select.c:1101
      __x64_sys_ppoll+0x67/0x80 fs/select.c:1101
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x00019e80 -> 0x0000cc80
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 17828 Comm: syz-executor.1 Not tainted 6.5.0-rc7-syzkaller-00185-g28f20a19294d #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: Xin Long <lucien.xin@gmail.com>
      Link: https://lore.kernel.org/r/20230830094519.950007-1-edumazet@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/sched: fq_pie: avoid stalls in fq_pie_timer() · 8c21ab1b
      Eric Dumazet authored
      When setting a high number of flows (limit being 65536),
      fq_pie_timer() is currently using too much time as syzbot reported.
      
      Add logic to yield the cpu every 2048 flows (less than 150 usec
      on debug kernels).
      It should also help by not blocking qdisc fast paths for too long.
      Worst case (65536 flows) would need 31 jiffies for a complete scan.
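
      The worst-case figure above follows from simple batching arithmetic.
      The sketch below assumes (this is not stated in the commit text) that
      the timer is re-armed one jiffy later after each batch of 2048 flows;
      the function names are illustrative only.

```python
def scan_batches(flows_cnt, batch=2048):
    """Yield (start, end) index ranges, one range per timer invocation."""
    start = 0
    while start < flows_cnt:
        end = min(start + batch, flows_cnt)
        yield (start, end)
        start = end


def extra_jiffies(flows_cnt, batch=2048):
    """Jiffies spent between batches, assuming one jiffy per re-arm."""
    batches = sum(1 for _ in scan_batches(flows_cnt, batch))
    return max(batches - 1, 0)


if __name__ == "__main__":
    # 65536 / 2048 = 32 batches, hence 31 re-arms between them.
    print(extra_jiffies(65536))
```

      Under that assumption a full scan of 65536 flows takes 32 timer runs,
      which matches the 31 jiffies quoted in the message.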
      
      Relevant extract from syzbot report:
      
      rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 2663 jiffies s: 873 root: 0x1/.
      rcu: blocking rcu_node structures (internal RCU debug):
      Sending NMI from CPU 1 to CPUs 0:
      NMI backtrace for cpu 0
      CPU: 0 PID: 5177 Comm: syz-executor273 Not tainted 6.5.0-syzkaller-00453-g727dbda16b83 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
      RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
      RIP: 0010:write_comp_data+0x21/0x90 kernel/kcov.c:236
      Code: 2e 0f 1f 84 00 00 00 00 00 65 8b 05 01 b2 7d 7e 49 89 f1 89 c6 49 89 d2 81 e6 00 01 00 00 49 89 f8 65 48 8b 14 25 80 b9 03 00 <a9> 00 01 ff 00 74 0e 85 f6 74 59 8b 82 04 16 00 00 85 c0 74 4f 8b
      RSP: 0018:ffffc90000007bb8 EFLAGS: 00000206
      RAX: 0000000000000101 RBX: ffffc9000dc0d140 RCX: ffffffff885893b0
      RDX: ffff88807c075940 RSI: 0000000000000100 RDI: 0000000000000001
      RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000dc0d178
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000555555d54380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f6b442f6130 CR3: 000000006fe1c000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <NMI>
       </NMI>
       <IRQ>
       pie_calculate_probability+0x480/0x850 net/sched/sch_pie.c:415
       fq_pie_timer+0x1da/0x4f0 net/sched/sch_fq_pie.c:387
       call_timer_fn+0x1a0/0x580 kernel/time/timer.c:1700
      
      Fixes: ec97ecf1 ("net: sched: add Flow Queue PIE packet scheduler")
      Link: https://lore.kernel.org/lkml/00000000000017ad3f06040bf394@google.com/
      Reported-by: <syzbot+e46fbd5289363464bc13@syzkaller.appspotmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
      Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20230829123541.3745013-1-edumazet@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      8c21ab1b
    • Jakub Kicinski's avatar
      Merge tag 'nf-23-08-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 4e60de1e
      Jakub Kicinski authored
      
      
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Fix mangling of TCP options with non-linear skbuff, from Xiao Liang.
      
      2) OOB read in xt_sctp due to missing sanitization of array length field.
         From Wander Lairson Costa.
      
      3) OOB read in xt_u32 due to missing sanitization of array length field.
         Also from Wander Lairson Costa.
      
      All of the above have been broken for several releases.
      
      4) Missing audit log for set element reset command, from Phil Sutter.
      
      5) Missing audit log for rule reset command, also from Phil.
      
      This audit log support is missing in 6.5.
      
      * tag 'nf-23-08-31' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: Audit log rule reset
        netfilter: nf_tables: Audit log setelem reset
        netfilter: xt_u32: validate user space input
        netfilter: xt_sctp: validate the flag_info count
        netfilter: nft_exthdr: Fix non-linear header modification
      ====================
      
      Link: https://lore.kernel.org/r/20230830235935.465690-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4e60de1e
    • Donald Hunter's avatar
      doc/netlink: Fix missing classic_netlink doc reference · ee940b57
      Donald Hunter authored
      Add missing cross-reference label for classic_netlink.
      
      Fixes: 2db8abf0 ("doc/netlink: Document the netlink-raw schema extensions")
      Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
      Link: https://lore.kernel.org/r/20230829085539.36354-1-donald.hunter@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ee940b57
    • Oliver Neukum's avatar
      NFC: nxp: add NXP1002 · 8b72d2a1
      Oliver Neukum authored
      
      
      It is backwards compatible
      
      Signed-off-by: Oliver Neukum <oneukum@suse.com>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Link: https://lore.kernel.org/r/20230829084717.961-1-oneukum@suse.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8b72d2a1
    • Russell King (Oracle)'s avatar
      net: stmmac: failure to probe without MAC interface specified · b5947239
      Russell King (Oracle) authored
      Alexander Stein reports that commit a014c355 ("net: stmmac: clarify
      difference between "interface" and "phy_interface"") caused breakage,
      because plat->mac_interface will never be negative. Fix this by using
      the "rc" temporary variable in stmmac_probe_config_dt().

      Reported-by: Alexander Stein <alexander.stein@ew.tq-group.com>
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Tested-by: Alexander Stein <alexander.stein@ew.tq-group.com>
      Link: https://lore.kernel.org/r/E1qayn0-006Q8J-GE@rmk-PC.armlinux.org.uk
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b5947239
    • Phil Sutter's avatar
      netfilter: nf_tables: Audit log rule reset · ea078ae9
      Phil Sutter authored
      Resetting rules' stateful data happens outside of the transaction logic,
      so 'get' and 'dump' handlers have to emit audit log entries themselves.
      
      Fixes: 8daa8fde ("netfilter: nf_tables: Introduce NFT_MSG_GETRULE_RESET")
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      ea078ae9
    • Phil Sutter's avatar
      netfilter: nf_tables: Audit log setelem reset · 7e9be112
      Phil Sutter authored
      Since set element reset is not integrated into nf_tables' transaction
      logic, an explicit log call is needed, similar to NFT_MSG_GETOBJ_RESET
      handling.
      
      For the sake of simplicity, catchall element reset will always generate
      a dedicated log entry. This relieves nf_tables_dump_set() from having to
      adjust the logged element count depending on whether a catchall element
      was found or not.
      
      Fixes: 079cd633 ("netfilter: nf_tables: Introduce NFT_MSG_GETSETELEM_RESET")
      Signed-off-by: Phil Sutter <phil@nwl.cc>
      Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      7e9be112
  3. Aug 30, 2023
    • Wander Lairson Costa's avatar
      netfilter: xt_u32: validate user space input · 69c5d284
      Wander Lairson Costa authored
      The xt_u32 module doesn't validate the fields in the xt_u32 structure.
      An attacker may take advantage of this to trigger an OOB read by setting
      the size fields to a value beyond the array boundaries.
      
      Add a checkentry function to validate the structure.
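
      The kind of validation described above can be sketched as follows. This
      is a simplified user-space illustration, not the xt_u32 code: the struct
      layouts, SKETCH_MAXSIZE and sketch_checkentry() are hypothetical stand-ins
      for a rule that carries user-supplied counts next to fixed-size arrays.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAXSIZE 10 /* hypothetical array capacity */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct sketch_test {
	uint32_t location[SKETCH_MAXSIZE + 1];
	uint32_t value[SKETCH_MAXSIZE + 1];
	uint8_t nnums;   /* user-supplied count of used location[] slots */
	uint8_t nvalues; /* user-supplied count of used value[] slots */
};

struct sketch_rule {
	struct sketch_test tests[SKETCH_MAXSIZE + 1];
	uint8_t ntests;  /* user-supplied count of used tests[] slots */
};

/*
 * Reject a rule whose counts would make a later loop index past the end
 * of one of the fixed-size arrays. Returns 0 if the rule is acceptable.
 */
static int sketch_checkentry(const struct sketch_rule *rule)
{
	size_t i;

	assert(rule != NULL);
	if (rule->ntests > ARRAY_SIZE(rule->tests))
		return -EINVAL;

	for (i = 0; i < rule->ntests; i++) {
		const struct sketch_test *t = &rule->tests[i];

		if (t->nnums > ARRAY_SIZE(t->location) ||
		    t->nvalues > ARRAY_SIZE(t->value))
			return -EINVAL;
	}
	return 0;
}
```

      Checking once when the rule is installed means the per-packet match path
      can trust the counts without re-validating them.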
      
      This was originally reported by the ZDI project (ZDI-CAN-18408).
      
      Fixes: 1b50b8a3 ("[NETFILTER]: Add u32 match")
      Cc: stable@vger.kernel.org
      Signed-off-by: Wander Lairson Costa <wander@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      69c5d284
    • Wander Lairson Costa's avatar
      netfilter: xt_sctp: validate the flag_info count · e9947649
      Wander Lairson Costa authored
      sctp_mt_check doesn't validate the flag_count field. An attacker can
      take advantage of that to trigger an OOB read and leak memory
      information.
      
      Add the field validation in the checkentry function.
      
      Fixes: 2e4e6a17 ("[NETFILTER] x_tables: Abstraction layer for {ip,ip6,arp}_tables")
      Cc: stable@vger.kernel.org
      Reported-by: Lucas Leong <wmliang@infosec.exchange>
      Signed-off-by: Wander Lairson Costa <wander@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      e9947649
    • Xiao Liang's avatar
      netfilter: nft_exthdr: Fix non-linear header modification · 28427f36
      Xiao Liang authored
      Fix skb_ensure_writable() size. Don't use nft_tcp_header_pointer() to
      make it explicit that pointers point to the packet (not local buffer).
      
      Fixes: 99d1712b ("netfilter: exthdr: tcp option set support")
      Fixes: 7890cbea ("netfilter: exthdr: add support for tcp option removal")
      Cc: stable@vger.kernel.org
      Signed-off-by: Xiao Liang <shaw.leon@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      28427f36
    • David Vernet's avatar
      bpf, docs: s/eBPF/BPF in standards documents · 7d35eb1a
      David Vernet authored
      
      
      There isn't really anything other than just "BPF" at this point,
      so referring to it as "eBPF" in our standards document just causes
      unnecessary confusion. Let's just be consistent and use "BPF".
      
      Suggested-by: Will Hawkins <hawkinsw@obs.cr>
      Signed-off-by: David Vernet <void@manifault.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230828155948.123405-4-void@manifault.com
      7d35eb1a
    • David Vernet's avatar
      bpf, docs: Add abi.rst document to standardization subdirectory · deb88407
      David Vernet authored
      
      
      As specified in the IETF BPF charter, the BPF working group has plans to
      add one or more informational documents that recommend conventions and
      guidelines for producing portable BPF program binaries. The
      instruction-set.rst document currently contains a "Registers and calling
      convention" subsection which dictates a calling convention that belongs
      in an ABI document, rather than an instruction set document. Let's move
      it to a new abi.rst document so we can clean it up. The abi.rst document
      will of course be significantly changed and expanded upon over time. For
      now, it's really just a placeholder which will contain ABI-specific
      language that doesn't belong in other documents.
      
      Signed-off-by: David Vernet <void@manifault.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230828155948.123405-3-void@manifault.com
      deb88407
    • David Vernet's avatar
      bpf, docs: Move linux-notes.rst to root bpf docs tree · aee1720e
      David Vernet authored
      In commit 4d496be9 ("bpf,docs: Create new standardization
      subdirectory"), I added a standardization/ directory to the BPF
      documentation, which will contain the docs that will be standardized
      as part of the effort with the IETF.
      
      I included linux-notes.rst in that directory, but I shouldn't have. It
      doesn't contain anything that will be standardized. Let's move it back
      to Documentation/bpf.
      
      Signed-off-by: David Vernet <void@manifault.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230828155948.123405-2-void@manifault.com
      aee1720e
    • Heng Guo's avatar
      net: ipv4, ipv6: fix IPSTATS_MIB_OUTOCTETS increment duplicated · e4da8c78
      Heng Guo authored
      commit edf391ff ("snmp: add missing counters for RFC 4293") had
      already added OutOctets for RFC 4293. In commit 2d8dbb04 ("snmp: fix
      OutOctets counter to include forwarded datagrams"), OutOctets was
      counted again, but not removed from ip_output().
      
      According to RFC 4293 "3.2.3. IP Statistics Tables",
      ipIfStatsOutTransmits is not equal to ipIfStatsOutForwDatagrams. So
      "IPSTATS_MIB_OUTOCTETS must be incremented when incrementing" is not
      accurate. And IPSTATS_MIB_OUTOCTETS should be counted after fragmentation.

      This patch reverts commit 2d8dbb04 ("snmp: fix OutOctets counter to
      include forwarded datagrams") and moves IPSTATS_MIB_OUTOCTETS to
      ip_finish_output2 for ipv4.

      Reviewed-by: Filip Pudak <filip.pudak@windriver.com>
      Signed-off-by: Heng Guo <heng.guo@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e4da8c78
    • John Fastabend's avatar
      bpf, sockmap: Fix preempt_rt splat when using raw_spin_lock_t · 35d2b7ff
      John Fastabend authored
      
      
      Sockmap and sockhash maps are a collection of psocks that are
      objects representing a socket plus a set of metadata needed
      to manage the BPF programs associated with the socket. These
      maps use the stab->lock to protect from concurrent operations
      on the maps, e.g. trying to insert two objects into the array
      at the same time in the same slot. Additionally, a sockhash map
      has a bucket lock to protect iteration and insert/delete into
      the hash entry.
      
      Each psock has a psock->link which is a linked list of all the
      maps that a psock is attached to. This allows a psock (socket)
      to be included in multiple sockmap and sockhash maps. This
      linked list is protected by the psock->link_lock.
      
      They _must_ be nested correctly to avoid deadlock:
      
        lock(stab->lock)
          : do BPF map operations and psock insert/delete
          lock(psock->link_lock)
             : add map to psock linked list of maps
          unlock(psock->link_lock)
        unlock(stab->lock)
      
      For non PREEMPT_RT kernels both raw_spin_lock_t and spin_lock_t
      are guaranteed to not sleep. But, with PREEMPT_RT kernels the
      spin_lock_t variants may sleep. In the current code we have
      many patterns like this:
      
         rcu_critical_section:
            raw_spin_lock(stab->lock)
               spin_lock(psock->link_lock) <- may sleep ouch
               spin_unlock(psock->link_lock)
            raw_spin_unlock(stab->lock)
         rcu_critical_section
      
      Nesting spin_lock() inside a raw_spin_lock() violates locking
      rules for PREEMPT_RT kernels. And additionally we do alloc(GFP_ATOMIC)
      inside the stab->lock, but those might sleep on PREEMPT_RT kernels.
      The result is splats like this:
      
      ./test_progs -t sockmap_basic
      [   33.344330] bpf_testmod: loading out-of-tree module taints kernel.
      [   33.441933]
      [   33.442089] =============================
      [   33.442421] [ BUG: Invalid wait context ]
      [   33.442763] 6.5.0-rc5-01731-gec0ded2e0282 #4958 Tainted: G           O
      [   33.443320] -----------------------------
      [   33.443624] test_progs/2073 is trying to lock:
      [   33.443960] ffff888102a1c290 (&psock->link_lock){....}-{3:3}, at: sock_map_update_common+0x2c2/0x3d0
      [   33.444636] other info that might help us debug this:
      [   33.444991] context-{5:5}
      [   33.445183] 3 locks held by test_progs/2073:
      [   33.445498]  #0: ffff88811a208d30 (sk_lock-AF_INET){+.+.}-{0:0}, at: sock_map_update_elem_sys+0xff/0x330
      [   33.446159]  #1: ffffffff842539e0 (rcu_read_lock){....}-{1:3}, at: sock_map_update_elem_sys+0xf5/0x330
      [   33.446809]  #2: ffff88810d687240 (&stab->lock){+...}-{2:2}, at: sock_map_update_common+0x177/0x3d0
      [   33.447445] stack backtrace:
      [   33.447655] CPU: 10 PID
      
      To fix this, observe that we can't readily remove the allocations (for that
      we would need to use/create something similar to bpf_map_alloc). So
      convert raw_spin_lock_t to spin_lock_t. We note that sock_map_update
      that would trigger the allocate and potential sleep is only allowed
      through sys_bpf ops and via sock_ops which precludes hw interrupts
      and low level atomic sections in RT preempt kernel. On non RT
      preempt kernel there are no changes here and spin locks sections
      and alloc(GFP_ATOMIC) are still not sleepable.
      
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230830053517.166611-1-john.fastabend@gmail.com
      35d2b7ff
    • Eduard Zingerman's avatar
      docs/bpf: Add description for CO-RE relocations · be4033d3
      Eduard Zingerman authored
      
      
      Add a section on CO-RE relocations to llvm_relo.rst. Describe relevant .BTF.ext
      structure, `enum bpf_core_relo_kind` and `struct bpf_core_relo` in some detail.
      
      Description is based on doc-strings from:
      
        - include/uapi/linux/bpf.h:struct bpf_core_relo
        - tools/lib/bpf/relo_core.c:__bpf_core_types_match()
      
      Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Yonghong Song <yonghong.song@linux.dev>
      Link: https://lore.kernel.org/bpf/20230826222912.2560865-2-eddyz87@gmail.com
      be4033d3
    • Will Hawkins's avatar
      bpf, docs: Correct source of offset for program-local call · 2d71a90f
      Will Hawkins authored
      
      
      The offset to use when calculating the target of a program-local call is
      in the instruction's imm field, not its offset field.
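
      A small sketch of the target computation this fix describes. The struct
      and sketch_call_target() are hypothetical simplifications of an encoded
      instruction, and the sketch assumes the imm value is relative to the
      instruction that follows the call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical view of one 64-bit encoded instruction. */
struct sketch_insn {
	uint8_t opcode;
	uint8_t regs;   /* dst_reg and src_reg packed together */
	int16_t offset; /* NOT the source of the call target */
	int32_t imm;    /* relative position of the callee, in instructions */
};

/*
 * Index of the callee for the program-local call at index pc.
 * Assumption of this sketch: imm counts from the instruction after the call.
 */
static int64_t sketch_call_target(const struct sketch_insn *prog, uint32_t pc)
{
	assert(prog != NULL);
	return (int64_t)pc + 1 + prog[pc].imm;
}
```

      Reading the offset field instead would yield the wrong target for such a
      call, which is the documentation error being corrected here.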
      
      Signed-off-by: Will Hawkins <hawkinsw@obs.cr>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Eduard Zingerman <eddyz87@gmail.com>
      Acked-by: David Vernet <void@manifault.com>
      Link: https://lore.kernel.org/bpf/20230826053258.1860167-1-hawkinsw@obs.cr
      2d71a90f
    • Yonghong Song's avatar
      selftests/bpf: Fix flaky cgroup_iter_sleepable subtest · 5439cfa7
      Yonghong Song authored
      Occasionally, with './test_progs -j' on my vm, I will hit the
      following failure:
      
        test_cgrp_local_storage:PASS:join_cgroup /cgrp_local_storage 0 nsec
        test_cgroup_iter_sleepable:PASS:skel_open 0 nsec
        test_cgroup_iter_sleepable:PASS:skel_load 0 nsec
        test_cgroup_iter_sleepable:PASS:attach_iter 0 nsec
        test_cgroup_iter_sleepable:PASS:iter_create 0 nsec
        test_cgroup_iter_sleepable:FAIL:cgroup_id unexpected cgroup_id: actual 1 != expected 2812
        #48/5    cgrp_local_storage/cgroup_iter_sleepable:FAIL
        #48      cgrp_local_storage:FAIL
      
      Finally, I decided to do some investigation since the test is introduced
      by myself. It turns out the reason is due to cgroup_fd with value 0.
      In cgroup_iter, a cgroup_fd of value 0 means the root cgroup.
      
              /* from cgroup_iter.c */
              if (fd)
                      cgrp = cgroup_v1v2_get_from_fd(fd);
              else if (id)
                      cgrp = cgroup_get_from_id(id);
              else /* walk the entire hierarchy by default. */
                      cgrp = cgroup_get_from_path("/");
      
      That is why we got cgroup_id 1 instead of expected 2812.
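
      The selection logic quoted above can be modelled in isolation to show why
      an fd of 0 silently falls through to the root cgroup. The enum and
      sketch_pick_target() are hypothetical names used only for this
      illustration; no kernel API is called.

```c
#include <assert.h>

enum sketch_target {
	SKETCH_FROM_FD,   /* cgroup looked up through the file descriptor */
	SKETCH_FROM_ID,   /* cgroup looked up through its id */
	SKETCH_ROOT,      /* default: walk from the root cgroup */
};

/*
 * Mirror the if / else-if / else chain above: a zero fd is treated as
 * "no fd given", so it can never select the fd branch.
 */
static enum sketch_target sketch_pick_target(int fd, unsigned long long id)
{
	assert(fd >= 0);
	if (fd)
		return SKETCH_FROM_FD;
	else if (id)
		return SKETCH_FROM_ID;
	else
		return SKETCH_ROOT;
}
```

      A valid descriptor that happens to be 0 is therefore indistinguishable
      from an unset one, which is how the test ended up iterating the root
      cgroup.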
      
      Why did we get a cgroup_fd of 0? Nobody should really touch 'stdin' (fd 0) in
      test_progs. I traced 'close' syscall with stack trace and found the root
      cause, which is a bug in bpf_obj_pinning.c. Basically, the code closed
      fd 0 although it should not. Fixing the bug in bpf_obj_pinning.c also
      resolved the above cgroup_iter_sleepable subtest failure.
      
      Fixes: 3b22f98e ("selftests/bpf: Add path_fd-based BPF_OBJ_PIN and BPF_OBJ_GET tests")
      Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20230827150551.1743497-1-yonghong.song@linux.dev
      5439cfa7
    • Tirthendu Sarkar's avatar
      xsk: Fix xsk_build_skb() error: 'skb' dereferencing possible ERR_PTR() · 9d0a67b9
      Tirthendu Sarkar authored
      Currently, xsk_build_skb() is a function that builds skb in two possible
      ways and then ends with common error handling.
      
      We can distinguish four possible error paths and handling in xsk_build_skb():
      
       1. sock_alloc_send_skb fails: Retry (skb is NULL).
       2. skb_store_bits fails : Free skb and retry.
       3. MAX_SKB_FRAGS exceeded: Free skb, cleanup and drop packet.
       4. alloc_page fails for frag: Retry page allocation w/o freeing skb
      
      1] and 3] can happen in xsk_build_skb_zerocopy(), which is one of the
      two code paths responsible for building skb. Common error path in
      xsk_build_skb() assumes that in case errno != -EAGAIN, skb is a valid
      pointer, which is wrong as kernel test robot reports that in
      xsk_build_skb_zerocopy() other errno values are returned for skb being
      NULL.
      
      To fix this, set -EOVERFLOW as error when MAX_SKB_FRAGS is exceeded
      and packet needs to be dropped in both xsk_build_skb() and
      xsk_build_skb_zerocopy() and use this to distinguish against all other
      error cases. Also, add explicit kfree_skb() for 3] so that handling
      of 1], 2], and 3] becomes identical where allocation needs to be retried.
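
      The idea of using one dedicated errno to separate "drop" from "retry" can
      be sketched in user space. Everything here is hypothetical scaffolding:
      the sketch_* helpers imitate the kernel's ERR_PTR() convention, and
      sketch_build() stands in for a builder that frees its own buffer before
      reporting an error, so the caller never dereferences an error pointer.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO 4095

/* User-space imitations of the ERR_PTR() / IS_ERR() / PTR_ERR() idiom. */
static void *sketch_err_ptr(long err) { return (void *)(intptr_t)err; }
static int sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}
static long sketch_ptr_err(const void *p)
{
	assert(sketch_is_err(p));
	return (long)(intptr_t)p;
}

struct sketch_pkt { unsigned int nr_frags; };

/* Build a packet, or return an encoded errno without leaking the buffer. */
static struct sketch_pkt *sketch_build(unsigned int nr_frags,
				       unsigned int max_frags, int alloc_fails)
{
	struct sketch_pkt *pkt;

	if (alloc_fails)
		return sketch_err_ptr(-EAGAIN);
	pkt = malloc(sizeof(*pkt));
	if (!pkt)
		return sketch_err_ptr(-EAGAIN);
	if (nr_frags > max_frags) {
		free(pkt);                          /* explicit free before reporting */
		return sketch_err_ptr(-EOVERFLOW);  /* dedicated "drop" errno */
	}
	pkt->nr_frags = nr_frags;
	return pkt;
}

enum sketch_action { SKETCH_SEND, SKETCH_RETRY, SKETCH_DROP };

/* Common error path: decide from the errno alone, never touch the packet. */
static enum sketch_action sketch_classify(const struct sketch_pkt *pkt)
{
	if (!sketch_is_err(pkt))
		return SKETCH_SEND;
	return sketch_ptr_err(pkt) == -EOVERFLOW ? SKETCH_DROP : SKETCH_RETRY;
}
```

      Because the decision depends only on the encoded error value, the common
      path no longer needs to assume anything about whether a buffer exists.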
      
      Fixes: cf24f5a5 ("xsk: add support for AF_XDP multi-buffer on Tx path")
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
      Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Closes: https://lore.kernel.org/r/202307210434.OjgqFcbB-lkp@intel.com
      Link: https://lore.kernel.org/bpf/20230823144713.2231808-1-tirthendu.sarkar@intel.com
      9d0a67b9
    • Yafang Shao's avatar
      bpftool: Fix build warnings with -Wtype-limits · 6a8faf10
      Yafang Shao authored
      Quentin reported build warnings when building bpftool:
      
          link.c: In function ‘perf_config_hw_cache_str’:
          link.c:86:18: warning: comparison of unsigned expression in ‘>= 0’ is always true [-Wtype-limits]
             86 |         if ((id) >= 0 && (id) < ARRAY_SIZE(array))      \
                |                  ^~
          link.c:320:20: note: in expansion of macro ‘perf_event_name’
            320 |         hw_cache = perf_event_name(evsel__hw_cache, config & 0xff);
                |                    ^~~~~~~~~~~~~~~
          [... more of the same for the other calls to perf_event_name ...]
      
      He also pointed out the reason and the solution:
      
        We're always passing unsigned, so it should be safe to drop the check on
        (id) >= 0.
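
      A reduced illustration of the suggested change. The table and the
      sketch_* names are hypothetical; the point is only that a lookup macro
      fed with unsigned values needs the upper-bound check and nothing else.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical name table standing in for the arrays used by the macro. */
static const char *const sketch_names[] = { "alpha", "beta", "gamma" };

static_assert(ARRAY_SIZE(sketch_names) == 3, "table size used in the example");

/*
 * Bounds-checked lookup. There is no "(id) >= 0" test: for an unsigned id
 * that comparison is always true, which is what -Wtype-limits warns about.
 */
#define sketch_event_name(array, id) \
	(((id) < ARRAY_SIZE(array)) ? (array)[id] : NULL)

static const char *sketch_lookup(unsigned int id)
{
	return sketch_event_name(sketch_names, id);
}
```

      An out-of-range id still yields NULL, so dropping the lower-bound check
      does not change behaviour for unsigned callers.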
      
      Fixes: 62b57e3d ("bpftool: Add perf event names")
      Reported-by: Quentin Monnet <quentin@isovalent.com>
      Suggested-by: Quentin Monnet <quentin@isovalent.com>
      Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Quentin Monnet <quentin@isovalent.com>
      Closes: https://lore.kernel.org/bpf/a35d9a2d-54a0-49ec-9ed1-8fcf1369d3cc@isovalent.com
      Link: https://lore.kernel.org/bpf/20230830030325.3786-1-laoar.shao@gmail.com
      6a8faf10
    • Yonghong Song's avatar
      bpf: Prevent inlining of bpf_fentry_test7() · 32337c0a
      Yonghong Song authored
      
      
      With latest clang18, I hit test_progs failures for the following test:
      
        #13/2    bpf_cookie/multi_kprobe_link_api:FAIL
        #13/3    bpf_cookie/multi_kprobe_attach_api:FAIL
        #13      bpf_cookie:FAIL
        #75      fentry_fexit:FAIL
        #76/1    fentry_test/fentry:FAIL
        #76      fentry_test:FAIL
        #80/1    fexit_test/fexit:FAIL
        #80      fexit_test:FAIL
        #110/1   kprobe_multi_test/skel_api:FAIL
        #110/2   kprobe_multi_test/link_api_addrs:FAIL
        #110/3   kprobe_multi_test/link_api_syms:FAIL
        #110/4   kprobe_multi_test/attach_api_pattern:FAIL
        #110/5   kprobe_multi_test/attach_api_addrs:FAIL
        #110/6   kprobe_multi_test/attach_api_syms:FAIL
        #110     kprobe_multi_test:FAIL
      
      For example, for #13/2, the error messages are:
      
        [...]
        kprobe_multi_test_run:FAIL:kprobe_test7_result unexpected kprobe_test7_result: actual 0 != expected 1
        [...]
        kprobe_multi_test_run:FAIL:kretprobe_test7_result unexpected kretprobe_test7_result: actual 0 != expected 1
      
      clang17 does not have this issue.
      
      Further investigation shows that kernel func bpf_fentry_test7(), used in
      the above tests, is inlined by the compiler although it is marked as
      noinline.
      
        int noinline bpf_fentry_test7(struct bpf_fentry_test_t *arg)
        {
              return (long)arg;
        }
      
      It is known that for simple functions like the above (e.g. just returning
      a constant or an input argument), the clang compiler may still do inlining
      for a noinline function. Adding 'asm volatile ("")' in the beginning of the
      bpf_fentry_test7() can prevent inlining.
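
      A user-space sketch of the workaround, assuming a GCC- or clang-compatible
      compiler. sketch_test7() and struct sketch_arg are hypothetical stand-ins
      for the kernel function and its argument type; the return type is widened
      to long here to keep the example free of truncation.

```c
#include <assert.h>

struct sketch_arg { int a; };

/*
 * The empty asm statement is opaque to the optimizer, so the body is no
 * longer a trivial "return the argument" that the compiler can fold into
 * its callers despite the noinline attribute.
 */
static __attribute__((noinline)) long sketch_test7(struct sketch_arg *arg)
{
	__asm__ __volatile__("");
	return (long)arg;
}

/* Behaviour is unchanged: the function still returns its argument. */
static void sketch_selfcheck(void)
{
	struct sketch_arg arg = { 0 };

	assert(sketch_test7(&arg) == (long)&arg);
}
```

      The statement emits no instructions of its own; it only keeps a separate
      function body in place so that there is still a symbol for probes to
      attach to.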
      
      Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: Eduard Zingerman <eddyz87@gmail.com>
      Link: https://lore.kernel.org/bpf/20230826200843.2210074-1-yonghong.song@linux.dev
      32337c0a
    • Linus Torvalds's avatar
      Merge tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · bd6c11bc
      Linus Torvalds authored
      Pull networking updates from Paolo Abeni:
       "Core:
      
         - Increase size limits for to-be-sent skb frag allocations. This
           allows tun, tap devices and packet sockets to better cope with
           large write operations
      
         - Store netdevs in an xarray, to simplify iterating over netdevs
      
         - Refactor nexthop selection for multipath routes
      
         - Improve sched class lifetime handling
      
         - Add backup nexthop ID support for bridge
      
         - Implement drop reasons support in openvswitch
      
         - Several data races annotations and fixes
      
         - Constify the sk parameter of routing functions
      
         - Prepend kernel version to netconsole message
      
        Protocols:
      
         - Implement support for TCP probing the peer being under memory
           pressure
      
         - Remove hard coded limitation on IPv6 specific info placement inside
           the socket struct
      
         - Get rid of sysctl_tcp_adv_win_scale and use an auto-estimated per
           socket scaling factor
      
         - Scaling-up the IPv6 expired route GC via a separated list of
           expiring routes
      
         - In-kernel support for the TLS alert protocol
      
         - Better support for UDP reuseport with connected sockets
      
         - Add NEXT-C-SID support for SRv6 End.X behavior, reducing the SR
           header size
      
         - Get rid of additional ancillary per MPTCP connection struct socket
      
         - Implement support for BPF-based MPTCP packet schedulers
      
         - Format MPTCP subtests selftests results in TAP
      
         - Several new SMC 2.1 features including unique experimental options,
           max connections per lgr negotiation, max links per lgr negotiation
      
        BPF:
      
         - Multi-buffer support in AF_XDP
      
         - Add multi uprobe BPF links for attaching multiple uprobes and usdt
           probes, which is significantly faster and saves extra fds
      
         - Implement an fd-based tc BPF attach API (TCX) and BPF link support
           on top of it
      
         - Add SO_REUSEPORT support for TC bpf_sk_assign
      
         - Support new instructions from cpu v4 to simplify the generated code
           and feature completeness, for x86, arm64, riscv64
      
         - Support defragmenting IPv(4|6) packets in BPF
      
         - Teach verifier actual bounds of bpf_get_smp_processor_id() and fix
           perf+libbpf issue related to custom section handling
      
         - Introduce bpf map element count and enable it for all program types
      
         - Add a BPF hook in sys_socket() to change the protocol ID from
           IPPROTO_TCP to IPPROTO_MPTCP to cover migration for legacy
      
         - Introduce bpf_mem_cache_free_rcu() and fix OOM under stress
      
         - Add uprobe support for the bpf_get_func_ip helper
      
         - Check skb ownership against full socket
      
         - Support for up to 12 arguments in BPF trampoline
      
         - Extend link_info for kprobe_multi and perf_event links
      
        Netfilter:
      
         - Speed-up process exit by aborting ruleset validation if a fatal
           signal is pending
      
         - Allow NLA_POLICY_MASK to be used with BE16/BE32 types
      
        Driver API:
      
         - Page pool optimizations, to improve data locality and cache usage
      
         - Introduce ndo_hwtstamp_get() and ndo_hwtstamp_set() to avoid the
           need for raw ioctl() handling in drivers
      
         - Simplify genetlink dump operations (doit/dumpit) providing them the
           common information already populated in struct genl_info
      
         - Extend and use the yaml devlink specs to [re]generate the split ops
      
         - Introduce devlink selective dumps, to allow SF filtering based
           on handle and other attributes
      
         - Add yaml netlink spec for netlink-raw families, allow route, link
           and address related queries via the ynl tool
      
         - Remove phylink legacy mode support
      
         - Support offload LED blinking to phy
      
         - Add devlink port function attributes for IPsec
      
        New hardware / drivers:
      
         - Ethernet:
            - Broadcom ASP 2.0 (72165) ethernet controller
            - MediaTek MT7988 SoC
            - Texas Instruments AM654 SoC
            - Texas Instruments IEP driver
            - Atheros qca8081 phy
            - Marvell 88Q2110 phy
            - NXP TJA1120 phy
      
         - WiFi:
            - MediaTek mt7981 support
      
         - Can:
            - Kvaser SmartFusion2 PCI Express devices
            - Allwinner T113 controllers
            - Texas Instruments tcan4552/4553 chips
      
         - Bluetooth:
            - Intel Gale Peak
            - Qualcomm WCN3988 and WCN7850
            - NXP AW693 and IW624
            - Mediatek MT2925
      
        Drivers:
      
         - Ethernet NICs:
            - nVidia/Mellanox:
               - mlx5:
                  - support UDP encapsulation in packet offload mode
                  - IPsec packet offload support in eswitch mode
                  - improve aRFS observability by adding new set of counters
                  - extends MACsec offload support to cover RoCE traffic
                  - dynamic completion EQs
               - mlx4:
                  - convert to use auxiliary bus instead of custom interface
                    logic
            - Intel
               - ice:
                  - implement switchdev bridge offload, even for LAG
                    interfaces
                  - implement SRIOV support for LAG interfaces
               - igc:
                  - add support for multiple in-flight TX timestamps
            - Broadcom:
               - bnxt:
                  - use the unified RX page pool buffers for XDP and non-XDP
                  - use the NAPI skb allocation cache
            - OcteonTX2:
               - support Round Robin scheduling HTB offload
               - TC flower offload support for SPI field
            - Freescale:
               - add XDP_TX feature support
            - AMD:
               - ionic: add support for PCI FLR event
               - sfc:
                  - basic conntrack offload
                  - introduce eth, ipv4 and ipv6 pedit offloads
            - ST Microelectronics:
               - stmmac: maximize PTP timestamping resolution
      
         - Virtual NICs:
            - Microsoft vNIC:
               - batch ringing RX queue doorbell on receiving packets
               - add page pool for RX buffers
            - Virtio vNIC:
               - add per queue interrupt coalescing support
            - Google vNIC:
               - add queue-page-list mode support
      
         - Ethernet high-speed switches:
            - nVidia/Mellanox (mlxsw):
               - add port range matching tc-flower offload
               - permit enslavement to netdevices with uppers
      
         - Ethernet embedded switches:
            - Marvell (mv88e6xxx):
               - convert to phylink_pcs
            - Renesas:
               - r8A779fx: add speed change support
               - rzn1: enable VLAN support
      
         - Ethernet PHYs:
            - convert mv88e6xxx to phylink_pcs
      
         - WiFi:
            - Qualcomm Wi-Fi 7 (ath12k):
               - Extremely High Throughput (EHT) PHY support
            - RealTek (rtl8xxxu):
               - enable AP mode for: RTL8192FU, RTL8710BU (RTL8188GU),
                 RTL8192EU and RTL8723BU
            - RealTek (rtw89):
               - Introduce Time Averaged SAR (TAS) support
      
         - Connector:
            - support for event filtering"
      
      * tag 'net-next-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1806 commits)
        net: ethernet: mtk_wed: minor change in wed_{tx,rx}info_show
        net: ethernet: mtk_wed: add some more info in wed_txinfo_show handler
        net: stmmac: clarify difference between "interface" and "phy_interface"
        r8152: add vendor/device ID pair for D-Link DUB-E250
        devlink: move devlink_notify_register/unregister() to dev.c
        devlink: move small_ops definition into netlink.c
        devlink: move tracepoint definitions into core.c
        devlink: push linecard related code into separate file
        devlink: push rate related code into separate file
        devlink: push trap related code into separate file
        devlink: use tracepoint_enabled() helper
        devlink: push region related code into separate file
        devlink: push param related code into separate file
        devlink: push resource related code into separate file
        devlink: push dpipe related code into separate file
        devlink: move and rename devlink_dpipe_send_and_alloc_skb() helper
        devlink: push shared buffer related code into separate file
        devlink: push port related code into separate file
        devlink: push object register/unregister notifications into separate helpers
        inet: fix IP_TRANSPARENT error handling
        ...
      bd6c11bc
    • Merge tag 'v6.6-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 68cf0176
      Linus Torvalds authored
      Pull crypto updates from Herbert Xu:
       "API:
         - Move crypto engine callback from tfm ctx into algorithm object
         - Fix atomic sleep bug in crypto_destroy_instance
         - Move lib/mpi into lib/crypto
      
        Algorithms:
         - Add chacha20 and poly1305 implementation for powerpc p10
      
        Drivers:
         - Add AES skcipher and aead support to starfive
         - Add Dynamic Boost Control support to ccp
         - Add support for STM32P13 platform to stm32"
      
      * tag 'v6.6-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (149 commits)
        Revert "dt-bindings: crypto: qcom,prng: Add SM8450"
        crypto: chelsio - Remove unused declarations
        X.509: if signature is unsupported skip validation
        crypto: qat - fix crypto capability detection for 4xxx
        crypto: drivers - Explicitly include correct DT includes
        crypto: engine - Remove crypto_engine_ctx
        crypto: zynqmp - Use new crypto_engine_op interface
        crypto: virtio - Use new crypto_engine_op interface
        crypto: stm32 - Use new crypto_engine_op interface
        crypto: jh7110 - Use new crypto_engine_op interface
        crypto: rk3288 - Use new crypto_engine_op interface
        crypto: omap - Use new crypto_engine_op interface
        crypto: keembay - Use new crypto_engine_op interface
        crypto: sl3516 - Use new crypto_engine_op interface
        crypto: caam - Use new crypto_engine_op interface
        crypto: aspeed - Remove non-standard sha512 algorithms
        crypto: aspeed - Use new crypto_engine_op interface
        crypto: amlogic - Use new crypto_engine_op interface
        crypto: sun8i-ss - Use new crypto_engine_op interface
        crypto: sun8i-ce - Use new crypto_engine_op interface
        ...
    • Merge tag 'gpio-updates-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux · f97e18a3
      Linus Torvalds authored
      Pull gpio updates from Bartosz Golaszewski:
       "We have a lot of code refactoring using common helpers and ended up
        removing more lines than we're adding this release cycle.
      
        Nothing really stands out, just small updates all over the place.
      
        Core GPIOLIB updates:
         - wake-up poll() in user-space on device unbind
         - improve fwnode usage
         - interrupt domain handling improvements
         - correctly handle the ngpios property in gpio-mmio
      
        Driver cleanups:
         - remove unneeded calls to platform_set_drvdata() all around the
           place
         - remove unneeded of_match_ptr() expansions whenever a driver depends
           on CONFIG_OF
         - remove redundant calls to dev_err_probe() from gpio-omap and
           gpio-davinci
      
        Driver improvements:
         - use autopointers and guards from cleanup.h in gpio-sim
         - shrink code in gpio-sim using some common helpers
         - convert the idio family of drivers to using gpio-regmap
         - convert gpio-ws16c48 to using gpio-regmap
         - use devres to simplify code in gpio-pisosr and gpio-mxc
         - update gpio-sifive: support IRQ wake, improve interrupt handling,
           allow building as module
         - make gpio-ge and gpio-bcm-kona OF-independent (plus some minor
           tweaks)
         - add support for new models in gpio-pca953x and gpio-ds4520
         - add runtime PM support to gpio-mxc
         - fix a build warning in gpio-mxs
         - add support for adding pin ranges to gpio-mlxbf3
         - add counter/timer support to gpio-104-dio-48e
         - switch to dynamic GPIO base allocation in gpio-vf610
         - minor oneliners here and there
      
        Device-tree bindings updates:
         - enable the gpio-line-names property in snps,dw-apb and STMPE GPIO
         - document new models in fsl-imx-gpio, ds4520 and pca95xx
         - convert the bindings for brcm,kona-gpio to YAML"
      
      * tag 'gpio-updates-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: (94 commits)
        gpio: pca953x: add support for TCA9538
        dt-bindings: gpio: pca95xx: document new tca9538 chip
        gpio: pca953x: Use i2c_get_match_data()
        gpio: mlxbf3: use capital "OR" for multiple licenses in SPDX
        gpio: pcf857x: Extend match data support for OF tables
        gpio: vf610: switch to dynamic allocat GPIO base
        gpiolib: provide and use gpiod_line_state_notify()
        gpio: cdev: wake up lineevent poll() on device unbind
        gpio: cdev: wake up linereq poll() on device unbind
        gpio: cdev: wake up chardev poll() on device unbind
        gpiolib: add a second blocking notifier to struct gpio_device
        gpio: cdev: open-code to_gpio_chardev_data()
        gpiolib: rename the gpio_device notifier
        gpio: mlxbf3: Support add_pin_ranges()
        gpio: mxc: Use helper function devm_clk_get_optional_enabled()
        gpio: pca9570: fix kerneldoc
        gpio: sim: simplify code with cleanup helpers
        gpio: sim: replace memmove() + strstrip() with skip_spaces() + strim()
        gpio: sim: simplify gpio_sim_device_config_live_store()
        gpio: mxc: release the parent IRQ in runtime suspend
        ...
    • Merge tag 'hwmon-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging · 41e97d7a
      Linus Torvalds authored
      Pull hwmon updates from Guenter Roeck:
       "New drivers:
      
         - Renesas HS3001
      
        Chip support added to existing drivers:
      
         - pmbus/mp2975 driver now supports MP2971 and MP2973
      
        Functional improvements:
      
         - Additional voltage and temperature sensor support for
           NCT6798/NCT6799 in nct6775 driver
      
         - it87 driver now detects AMDTSI sensor type
      
         - dimmtemp now supports more than 32 DIMMs
      
        Driver removals:
      
         - smm665 driver removed as unsupportable and long since obsolete
      
        .. and minor fixes, cleanups, and simplifications in several drivers"
      
      * tag 'hwmon-for-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (57 commits)
        hwmon: (tmp513) Simplify probe()
        hwmon: (tmp513) Fix the channel number in tmp51x_is_visible()
        hwmon: (mlxreg-fan) Extend number of supported fans
        hwmon: (sis5595) Do PCI error checks on own line
        hwmon: (vt8231) Do PCI error checks on own line
        hwmon: (via686a) Do PCI error checks on own line
        hwmon: pmbus: Fix -EIO seen on pli1209
        hwmon: pmbus: Drop unnecessary clear fault page
        hwmon: pmbus: Reduce clear fault page invocations
        hwmon: (nsa320-hwmon) Remove redundant of_match_ptr()
        hwmon: (pmbus/ucd9200) fix Wvoid-pointer-to-enum-cast warning
        hwmon: (pmbus/ucd9000) fix Wvoid-pointer-to-enum-cast warning
        hwmon: (pmbus/tps53679) fix Wvoid-pointer-to-enum-cast warning
        hwmon: (pmbus/ibm-cffps) fix Wvoid-pointer-to-enum-cast warning
        hwmon: (tmp513) fix Wvoid-pointer-to-enum-cast warning
        hwmon: (max6697) fix Wvoid-pointer-to-enum-cast warning
        hwmon: (max20730) fix Wvoid-pointer-to-enum-cast warning
        hwmon: (lm90) fix Wvoid-pointer-to-enum-cast warning
        hwmon: (lm85) fix Wvoid-pointer-to-enum-cast warning
        hwmon: (lm75) fix Wvoid-pointer-to-enum-cast warning
        ...
    • Merge tag 'mmc-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 995cda62
      Linus Torvalds authored
      Pull MMC updates from Ulf Hansson:
       "MMC core:
         - Convert drivers to use the ->remove_new() callback
         - Propagate the removable attribute for the card's device
      
        MMC host:
         - Convert drivers to use the ->remove_new() callback
         - atmel-mci: Convert to gpio descriptors and cleanup the code
         - davinci: Make SDIO irq truly optional
         - renesas_sdhi: Register irqs before registering controller
         - sdhci: Simplify the sdhci_pltfm_* interface a bit
         - sdhci-esdhc-imx: Improve support for the 1.8V errata
         - sdhci-of-at91: Add support for the microchip sam9x7 variant
         - sdhci-of-dwcmshc: Add support for runtime PM
         - sdhci-pci-o2micro: Add support for the new Bayhub GG8 variant
         - sdhci-sprd: Add support for SD high-speed mode tuning
         - uniphier-sd: Register irqs before registering controller"
      
      * tag 'mmc-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (108 commits)
        mmc: atmel-mci: Move card detect gpio polarity quirk to gpiolib
        mmc: atmel-mci: move atmel MCI header file
        mmc: atmel-mci: Convert to gpio descriptors
        mmc: sdhci-sprd: Add SD HS mode online tuning
        mmc: core: Add host specific tuning support for SD HS mode
        mmc: sdhci-of-dwcmshc: Add runtime PM operations
        mmc: sdhci-of-dwcmshc: Add error handling in dwcmshc_resume
        mmc: sdhci-esdhc-imx: improve ESDHC_FLAG_ERR010450
        mmc: sdhci-pltfm: Rename sdhci_pltfm_register()
        mmc: sdhci-pltfm: Remove sdhci_pltfm_unregister()
        mmc: sdhci-st: Use sdhci_pltfm_remove()
        mmc: sdhci-pxav2: Use sdhci_pltfm_remove()
        mmc: sdhci-of-sparx5: Use sdhci_pltfm_remove()
        mmc: sdhci-of-hlwd: Use sdhci_pltfm_remove()
        mmc: sdhci-of-esdhc: Use sdhci_pltfm_remove()
        mmc: sdhci-of-at91: Use sdhci_pltfm_remove()
        mmc: sdhci-of-arasan: Use sdhci_pltfm_remove()
        mmc: sdhci-iproc: Use sdhci_pltfm_remove()
        mmc: sdhci_f_sdh30: Use sdhci_pltfm_remove()
        mmc: sdhci-dove: Use sdhci_pltfm_remove()
        ...
    • Merge tag 'spi-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 3b6bf5b1
      Linus Torvalds authored
      Pull spi updates from Mark Brown:
       "There's been quite a lot of generic activity here, but more
        administrative than features. We also have a bunch of new drivers,
        including one that's part of a MFD so we pulled in the core parts of
        that:
      
         - Lots of work from both Yang Yingliang and Andy Shevchenko on moving
           to host/device/controller based terminology for devices.
      
         - QuadSPI SPI support for Allwinner sun6i.
      
         - New device support for Cirrus Logic CS42L43, Loongson, Qualcomm
           GENI QuPv3 and StarFive JH7110 QSPI"
      
      * tag 'spi-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (151 commits)
        spi: at91-usart: Use PTR_ERR_OR_ZERO() to simplify code
        spi: spi-sn-f-ospi: switch to use modern name
        spi: sifive: switch to use modern name
        spi: sh: switch to use modern name
        spi: sh-sci: switch to use modern name
        spi: sh-msiof: switch to use modern name
        spi: sh-hspi: switch to use modern name
        spi: sc18is602: switch to use modern name
        spi: s3c64xx: switch to use modern name
        spi: rzv2m-csi: switch to use devm_spi_alloc_host()
        spi: rspi: switch to use spi_alloc_host()
        spi: rockchip: switch to use modern name
        spi: rockchip-sfc: switch to use modern name
        spi: realtek-rtl: switch to use devm_spi_alloc_host()
        spi: rb4xx: switch to use modern name
        spi: qup: switch to use modern name
        spi: spi-qcom-qspi: switch to use modern name
        spi: pxa2xx: switch to use modern name
        spi: ppc4xx: switch to use modern name
        spi: spl022: switch to use modern name
        ...
    • Merge tag 'regulator-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · 65234f96
      Linus Torvalds authored
      Pull regulator updates from Mark Brown:
       "Other than new device support and some minor fixes this has been a
        really quiet release, the only notable things are the new drivers.
      
        There's a couple of MFDs among the new devices so the generic parts
        are pulled in:
      
         - Support for Analog Devices MAX77831/57/59, Awinic AW37503, Qualcomm
           PMX75 and RFGEN, RealTek RT5733, RichTek RTQ2208 and Texas
           Instruments TPS65086"
      
      * tag 'regulator-v6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (68 commits)
        regulator: userspace-consumer: Drop event support for this cycle
        regulator: aw37503: Switch back to use struct i2c_driver's .probe()
        dt-bindings: regulator: qcom,rpmh-regulator: allow i, j, l, m & n as RPMh resource name suffix
        regulator: dt-bindings: Add Awinic AW37503
        regulator: aw37503: add regulator driver for Awinic AW37503
        regulator: tps65086: Select dedicated regulator config for chip variant
        mfd: tps65086: Read DEVICE ID register 1 from device
        regulator: raa215300: Update help description
        regulator: raa215300: Add missing blank space
        regulator: raa215300: Change rate from 32000->32768
        regulator: db8500-prcmu: Remove unused declaration power_state_active_is_enabled()
        regulator: raa215300: Add const definition
        regulator: raa215300: Fix resource leak in case of error
        regulator: rtq2208: Switch back to use struct i2c_driver's .probe()
        regulator: lp872x: Fix Wvoid-pointer-to-enum-cast warning
        regulator: max77857: Fix Wvoid-pointer-to-enum-cast warning
        regulator: ltc3589: Fix Wvoid-pointer-to-enum-cast warning
        regulator: qcom_rpm-regulator: Use devm_kmemdup to replace devm_kmalloc + memcpy
        regulator: tps6286x-regulator: Remove redundant of_match_ptr() macros
        regulator: pfuze100-regulator: Remove redundant of_match_ptr() macro
        ...