  1. Oct 12, 2023
    • af_packet: Fix fortified memcpy() without flex array. · e2bca487
      Kuniyuki Iwashima authored
      Sergei Trofimovich reported a regression [0] caused by commit a0ade840
      ("af_packet: Fix warning of fortified memcpy() in packet_getname().").
      
      It introduced a flex array sll_addr_flex in struct sockaddr_ll as a
      union-ed member with sll_addr to work around the fortified memcpy() check.
      
      However, a userspace program uses a struct that embeds struct sockaddr_ll in
      the middle, where a flexible array member is not allowed.
      
        include/linux/if_packet.h:24:17: error: flexible array member 'sockaddr_ll::<unnamed union>::<unnamed struct>::sll_addr_flex' not at end of 'struct packet_info_t'
           24 |                 __DECLARE_FLEX_ARRAY(unsigned char, sll_addr_flex);
              |                 ^~~~~~~~~~~~~~~~~~~~
      
      To fix the regression, let's go back to the first attempt [1] telling
      memcpy() the actual size of the array.
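A minimal userspace sketch of the idea of "telling memcpy() the actual size", under stated assumptions: `demo_sockaddr_ll`, `demo_storage` and `fill_addr` are hypothetical names invented for illustration and are not the kernel's code. The destination pointer is derived from the whole storage buffer, so a bounds-checking memcpy() would see the full buffer size rather than the 8-byte member, and no flexible array member is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct sockaddr_ll: fixed 8-byte address,
 * no flexible array member, so it can sit in the middle of another struct. */
struct demo_sockaddr_ll {
    unsigned short sll_family;
    unsigned char  sll_halen;
    unsigned char  sll_addr[8];
};

/* Hypothetical stand-in for the larger buffer the address really lives in. */
struct demo_storage {
    unsigned char data[128];
};

/* Copy addr_len bytes to the position of sll_addr inside the storage buffer.
 * The pointer is computed from the buffer base, not from the 8-byte member. */
static void fill_addr(struct demo_storage *st, const unsigned char *addr,
                      size_t addr_len)
{
    size_t off = offsetof(struct demo_sockaddr_ll, sll_addr);

    assert(off + addr_len <= sizeof(st->data));
    memcpy(st->data + off, addr, addr_len);
}
```

With this layout a 20-byte hardware address can be copied without overrunning a declared array, because the declared destination is the 128-byte buffer.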
      
      Reported-by: Sergei Trofimovich <slyich@gmail.com>
      Closes: https://github.com/NixOS/nixpkgs/pull/252587#issuecomment-1741733002 [0]
      Link: https://lore.kernel.org/netdev/20230720004410.87588-3-kuniyu@amazon.com/ [1]
      Fixes: a0ade840 ("af_packet: Fix warning of fortified memcpy() in packet_getname().")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20231009153151.75688-1-kuniyu@amazon.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: tcp: fix crashes trying to free half-baked MTU probes · 71c299c7
      Jakub Kicinski authored
      tcp_stream_alloc_skb() initializes the skb to use tcp_tsorted_anchor
      which is a union with the destructor. We need to clean that
      TCP-iness up before freeing.
      
      Fixes: 73601329 ("tcp: let tcp_mtu_probe() build headless packets")
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20231010173651.3990234-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'ieee802154-for-net-2023-10-10' of... · 8bcfc9de
      Jakub Kicinski authored
      
      Merge tag 'ieee802154-for-net-2023-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan
      
      Stefan Schmidt says:
      
      ====================
      pull-request: ieee802154 for net 2023-10-10
      
      Just one small fix this time around.
      
      Dinghao Liu fixed a potential use-after-free in the ca8210 driver probe
      function.
      
      * tag 'ieee802154-for-net-2023-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan:
        ieee802154: ca8210: Fix a potential UAF in ca8210_probe
      ====================
      
      Link: https://lore.kernel.org/r/20231010200943.82225-1-stefan@datenfreihafen.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. Oct 11, 2023
    • net/smc: Fix pos miscalculation in statistics · a950a592
      Nils Hoppmann authored
      SMC_STAT_PAYLOAD_SUB(_smc_stats, _tech, key, _len, _rc) will calculate
      wrong bucket positions for payloads of exactly 4096 bytes and
      (1 << (m + 12)) bytes, with m == SMC_BUF_MAX - 1.
      
      Intended bucket distribution:
      Assume l == size of payload, m == SMC_BUF_MAX - 1.
      
      Bucket 0                : 0 < l <= 2^13
      Bucket n, 1 <= n <= m-1 : 2^(n+12) < l <= 2^(n+13)
      Bucket m                : l > 2^(m+12)
      
      Current solution:
      _pos = fls64((l) >> 13)
      [...]
      _pos = (_pos < m) ? ((l == 1 << (_pos + 12)) ? _pos - 1 : _pos) : m
      
      For l == 4096, _pos == -1, but should be _pos == 0.
      For l == (1 << (m + 12)), _pos == m, but should be _pos == m - 1.
      
      To avoid special treatment of these corner cases, the calculation is
      adjusted. The new solution first subtracts one from the length and then
      calculates the correct bucket by shifting accordingly,
      i.e. _pos = fls64((l - 1) >> 13), l > 0.
      This not only fixes the issues named above, but also makes the whole
      bucket assignment easier to follow.
      
      The same is done for SMC_STAT_RMB_SIZE_SUB(_smc_stats, _tech, k, _len),
      where the calculation of the bucket position is similar to the one
      named above.
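The two corner cases can be checked with a small userspace sketch. It assumes a stand-in `demo_fls64()` for the kernel's fls64() (1-based index of the highest set bit, 0 for a zero argument); `bucket_old()` and `bucket_new()` are hypothetical helper names, and the clamp to the last bucket in `bucket_new()` is inferred from the intended bucket table rather than copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for fls64(): 1-based index of the highest set bit,
 * 0 when x == 0. */
static int demo_fls64(uint64_t x)
{
    int pos = 0;

    while (x) {
        pos++;
        x >>= 1;
    }
    return pos;
}

/* Old calculation as quoted in the commit message; m is SMC_BUF_MAX - 1. */
static int bucket_old(uint64_t l, int m)
{
    int pos = demo_fls64(l >> 13);

    return (pos < m) ? ((l == (1ULL << (pos + 12))) ? pos - 1 : pos) : m;
}

/* New calculation: subtract one first, then shift; clamp to bucket m. */
static int bucket_new(uint64_t l, int m)
{
    int pos;

    assert(l > 0);
    pos = demo_fls64((l - 1) >> 13);
    return (pos < m) ? pos : m;
}
```

For an illustrative m of 8, the old formula yields -1 for a 4096-byte payload and 8 for a payload of exactly 1 << 20 bytes, while the new formula yields 0 and 7.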
      
      Fixes: e0e4b8fa ("net/smc: Add SMC statistics support")
      Suggested-by: Halil Pasic <pasic@linux.ibm.com>
      Signed-off-by: Nils Hoppmann <niho@linux.ibm.com>
      Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
      Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
      Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: flower: avoid rmmod nfp crash issues · 14690995
      Yanguo Li authored
      When there are CT table entries, and you rmmod nfp, the following
      events can happen:
      
      task1:
          nfp_net_pci_remove
                ↓
          nfp_flower_stop->(asynchronous)tcf_ct_flow_table_cleanup_work(3)
                ↓
          nfp_zone_table_entry_destroy(1)
      
      task2:
          nfp_fl_ct_handle_nft_flow(2)
      
      When the execution order is (1)->(2)->(3), it will crash. Therefore, in
      the function nfp_fl_ct_del_flow, nf_flow_table_offload_del_cb needs to
      be executed synchronously.
      
      At the same time, in order to solve the deadlock problem and the problem
      of rtnl_lock sometimes failing, replace rtnl_lock with the private
      nfp_fl_lock.
      
      Fixes: 7cc93d88 ("nfp: flower-ct: remove callback delete deadlock")
      Cc: stable@vger.kernel.org
      Signed-off-by: Yanguo Li <yanguo.li@corigine.com>
      Signed-off-by: Louis Peens <louis.peens@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: usb: dm9601: fix uninitialized variable use in dm9601_mdio_read · 8f8abb86
      Javier Carrasco authored
      
      
      syzbot has found an uninit-value bug triggered by the dm9601 driver [1].
      
      This error happens because the variable res is not updated if the call
      to dm_read_shared_word returns an error. In this particular case -EPROTO
      was returned and res stayed uninitialized.
      
      This can be avoided by checking the return value of dm_read_shared_word
      and propagating the error if the read operation failed.
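The pattern described above can be sketched in plain C. This is an illustration only: `demo_read_word()` is a hypothetical stand-in for dm_read_shared_word(), `demo_mdio_read()` is a hypothetical caller, and the value 0x1234 is made up.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for dm_read_shared_word(): on failure it returns a
 * negative errno and leaves *value untouched. */
static int demo_read_word(int fail, uint16_t *value)
{
    assert(value);
    if (fail)
        return -EPROTO;
    *value = 0x1234;
    return 0;
}

/* Check the return value and propagate the error instead of using res. */
static int demo_mdio_read(int fail)
{
    uint16_t res;
    int err;

    err = demo_read_word(fail, &res);
    if (err < 0)
        return err;     /* res was never written, so it must not be read */

    return res;
}
```

Without the `err < 0` check, the failing path would return whatever happened to be in `res`, which is the uninit-value use reported by syzbot.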
      
      [1] https://syzkaller.appspot.com/bug?extid=1f53a30781af65d2c955
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
      Reported-and-tested-by: <syzbot+1f53a30781af65d2c955@syzkaller.appspotmail.com>
      Acked-by: Peter Korsgaard <peter@korsgaard.com>
      Fixes: d0374f4f ("USB: Davicom DM9601 usbnet driver")
      Link: https://lore.kernel.org/r/20231009-topic-dm9601_uninit_mdio_read-v2-1-f2fe39739b6c@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · ad98426a
      Jakub Kicinski authored
      
      
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2023-10-11
      
      We've added 14 non-merge commits during the last 5 day(s) which contain
      a total of 12 files changed, 398 insertions(+), 104 deletions(-).
      
      The main changes are:
      
      1) Fix s390 JIT backchain issues in the trampoline code generation which
         previously clobbered the caller's backchain, from Ilya Leoshkevich.
      
      2) Fix zero-size allocation warning in xsk sockets when the configured
         ring size was close to SIZE_MAX, from Andrew Kanner.
      
      3) Fixes for bpf_mprog API that were found when implementing support
         in the ebpf-go library along with selftests, from Daniel Borkmann
         and Lorenz Bauer.
      
      4) Fix riscv JIT to properly sign-extend the return register in programs.
         This fixes various test_progs selftests on riscv, from Björn Töpel.
      
      5) Fix verifier log for async callback return values where the allowed
         range was displayed incorrectly, from David Vernet.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        s390/bpf: Fix unwinding past the trampoline
        s390/bpf: Fix clobbering the caller's backchain in the trampoline
        selftests/bpf: Add testcase for async callback return value failure
        bpf: Fix verifier log for async callback return values
        xdp: Fix zero-size allocation warning in xskq_create()
        riscv, bpf: Track both a0 (RISC-V ABI) and a5 (BPF) return values
        riscv, bpf: Sign-extend return values
        selftests/bpf: Make seen_tc* variable tests more robust
        selftests/bpf: Test query on empty mprog and pass revision into attach
        selftests/bpf: Adapt assert_mprog_count to always expect 0 count
        selftests/bpf: Test bpf_mprog query API via libbpf and raw syscall
        bpf: Refuse unused attributes in bpf_prog_{attach,detach}
        bpf: Handle bpf_mprog_query with NULL entry
        bpf: Fix BPF_PROG_QUERY last field check
      ====================
      
      Link: https://lore.kernel.org/r/20231010223610.3984-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ethtool: Fix mod state of verbose no_mask bitset · 108a36d0
      Kory Maincent authored
      A bitset without mask in a _SET request means we want exactly the bits in
      the bitset to be set. This works correctly for compact format but when
      verbose format is parsed, ethnl_update_bitset32_verbose() only sets the
      bits present in the request bitset but does not clear the rest. Commit
      66991703 fixed this by clearing the whole target bitmap before
      iterating. That fix, however, broke the behavior of the mod variable:
      because the bitmap is always cleared first, the old value always
      differs from the new value.
      
      Fix it by adding a new temporary variable that saves the state of the old
      bitmap.
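A minimal sketch of the save-then-compare approach, assuming a single 32-bit bitmap and a hypothetical helper `demo_update_bitset()`; the real ethnl_update_bitset32_verbose() parses netlink attributes and is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Apply a verbose no-mask request: the result must contain exactly the
 * requested bits. Returns true only if the bitmap actually changed. */
static bool demo_update_bitset(uint32_t *bitmap, const unsigned *bits,
                               unsigned nbits)
{
    uint32_t saved = *bitmap;   /* temporary copy of the old state */
    unsigned i;

    *bitmap = 0;                /* no mask: clear everything first */
    for (i = 0; i < nbits; i++) {
        assert(bits[i] < 32);
        *bitmap |= UINT32_C(1) << bits[i];
    }

    /* Compare against the saved copy, not the just-cleared bitmap. */
    return *bitmap != saved;
}
```

Comparing each bit against the already cleared bitmap would report a modification whenever any bit is requested; comparing against the saved copy reports one only when the final state differs.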
      
      Fixes: 66991703 ("ethtool: fix application of verbose no_mask bitset")
      Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20231009133645.44503-1-kory.maincent@bootlin.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'linux-can-fixes-for-6.6-20231009' of... · b52acd02
      Jakub Kicinski authored
      
      Merge tag 'linux-can-fixes-for-6.6-20231009' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2023-10-09
      
      Lukas Magel's patch for the CAN ISO-TP protocol fixes the TX state
      detection and wait behavior.
      
      John Watts contributes a patch to only show the sun4i_can Kconfig
      option on ARCH_SUNXI.
      
      A patch by Miquel Raynal fixes the soft-reset workaround for Renesas
      SoCs in the sja1000 driver.
      
      Markus Schneider-Pargmann's patch for the tcan4x5x m_can glue driver
      fixes the id2 register for the tcan4553.
      
      2 patches by Haibo Chen fix the flexcan stop mode for the imx93 SoC.
      
      * tag 'linux-can-fixes-for-6.6-20231009' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
        can: tcan4x5x: Fix id2_register for tcan4553
        can: flexcan: remove the auto stop mode for IMX93
        can: sja1000: Always restart the Tx queue after an overrun
        arm64: dts: imx93: add the Flex-CAN stop mode by GPR
        can: sun4i_can: Only show Kconfig if ARCH_SUNXI is set
        can: isotp: isotp_sendmsg(): fix TX state detection and wait behavior
      ====================
      
      Link: https://lore.kernel.org/r/20231009085256.693378-1-mkl@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: nfc: fix races in nfc_llcp_sock_get() and nfc_llcp_sock_get_sn() · 31c07dff
      Eric Dumazet authored
      Sili Luo reported a race in nfc_llcp_sock_get(), leading to UAF.
      
      Getting a reference on the socket found in a lookup while
      holding a lock should happen before releasing the lock.
      
      nfc_llcp_sock_get_sn() has a similar problem.
      
      Finally nfc_llcp_recv_snl() needs to make sure the socket
      found by nfc_llcp_sock_from_sn() does not disappear.
      
      Fixes: 8f50020e ("NFC: LLCP late binding")
      Reported-by: Sili Luo <rootlab@huawei.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Link: https://lore.kernel.org/r/20231009123110.3735515-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • mctp: perform route lookups under a RCU read-side lock · 5093bbfc
      Jeremy Kerr authored
      
      
      Our current route lookups (mctp_route_lookup and mctp_route_lookup_null)
      traverse the net's route list without the RCU read lock held. This means
      the route lookup is subject to preemption, resulting in a potential
      grace period expiry, and so an eventual kfree() while we still have the
      route pointer.
      
      Add the proper read-side critical section locks around the route
      lookups, preventing preemption and a possible parallel kfree.
      
      The remaining net->mctp.routes accesses are already under a
      rcu_read_lock, or protected by the RTNL for updates.
      
      Based on an analysis from Sili Luo <rootlab@huawei.com>, where
      introducing a delay in the route lookup could cause a UAF on
      simultaneous sendmsg() and route deletion.
      
      Reported-by: Sili Luo <rootlab@huawei.com>
      Fixes: 889b7da2 ("mctp: Add initial routing framework")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/29c4b0e67dc1bf3571df3982de87df90cae9b631.1696837310.git.jk@codeconstruct.com.au
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: skbuff: fix kernel-doc typos · 8527ca77
      Randy Dunlap authored
      
      
      Correct punctuation and drop an extraneous word.
      
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20231008214121.25940-1-rdunlap@infradead.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • s390/bpf: Fix unwinding past the trampoline · 5356ba1f
      Ilya Leoshkevich authored
      When functions called by the trampoline panic, the backtrace that is
      printed stops at the trampoline, because the trampoline does not store
      its caller's frame address (backchain) on stack; it also stores the
      return address at a wrong location.
      
      Store both the same way as is already done for the regular eBPF programs.
      
      Fixes: 528eb2cb ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
      Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20231010203512.385819-3-iii@linux.ibm.com
    • s390/bpf: Fix clobbering the caller's backchain in the trampoline · ce10fc06
      Ilya Leoshkevich authored
      One of the first things that s390x kernel functions do is store the
      caller's frame address (backchain) on the stack. This makes unwinding
      possible. The backchain is always stored at frame offset 152, which is
      inside the 160-byte stack area that the functions allocate for their
      callees. The callees must preserve the backchain; the remaining 152
      bytes they may use as they please.
      
      Currently the trampoline uses all 160 bytes, clobbering the backchain.
      This causes kernel panics when using __builtin_return_address() in
      functions called by the trampoline.
      
      Fix by reducing the usage of the caller-reserved stack area by 8 bytes
      in the trampoline.
      
      Fixes: 528eb2cb ("s390/bpf: Implement arch_prepare_bpf_trampoline()")
      Reported-by: Song Liu <song@kernel.org>
      Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20231010203512.385819-2-iii@linux.ibm.com
  3. Oct 10, 2023
  4. Oct 09, 2023
    • xdp: Fix zero-size allocation warning in xskq_create() · a12bbb3c
      Andrew Kanner authored
      Syzkaller reported the following issue:
      
        ------------[ cut here ]------------
        WARNING: CPU: 0 PID: 2807 at mm/vmalloc.c:3247 __vmalloc_node_range (mm/vmalloc.c:3361)
        Modules linked in:
        CPU: 0 PID: 2807 Comm: repro Not tainted 6.6.0-rc2+ #12
        Hardware name: Generic DT based system
        unwind_backtrace from show_stack (arch/arm/kernel/traps.c:258)
        show_stack from dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
        dump_stack_lvl from __warn (kernel/panic.c:633 kernel/panic.c:680)
        __warn from warn_slowpath_fmt (./include/linux/context_tracking.h:153 kernel/panic.c:700)
        warn_slowpath_fmt from __vmalloc_node_range (mm/vmalloc.c:3361 (discriminator 3))
        __vmalloc_node_range from vmalloc_user (mm/vmalloc.c:3478)
        vmalloc_user from xskq_create (net/xdp/xsk_queue.c:40)
        xskq_create from xsk_setsockopt (net/xdp/xsk.c:953 net/xdp/xsk.c:1286)
        xsk_setsockopt from __sys_setsockopt (net/socket.c:2308)
        __sys_setsockopt from ret_fast_syscall (arch/arm/kernel/entry-common.S:68)
      
      xskq_get_ring_size() uses the struct_size() macro to safely calculate the
      size of struct xsk_queue and q->nentries of desc members. But the
      syzkaller repro was able to set q->nentries, with the value initially
      taken from copy_from_sockptr(), high enough for struct_size() to return
      SIZE_MAX. The subsequent PAGE_ALIGN(size) then overflows the size_t
      value and sets it to 0. This triggers WARN_ON_ONCE in
      vmalloc_user() -> __vmalloc_node_range().
      
      The issue is reproducible on a 32-bit arm kernel.
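The overflow chain can be reproduced in userspace with a small sketch. It assumes a 4096-byte page; `demo_struct_size()` only mimics the saturating behavior of struct_size(), and `demo_page_align()` and `demo_ring_size_ok()` are hypothetical helpers. The guard shown is one possible check, not necessarily the one used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE ((size_t)4096)

/* header + n * elem, saturating at SIZE_MAX on overflow, which is how
 * struct_size() signals that the result does not fit. */
static size_t demo_struct_size(size_t header, size_t elem, size_t n)
{
    assert(elem > 0);
    if (n > (SIZE_MAX - header) / elem)
        return SIZE_MAX;
    return header + n * elem;
}

/* Round up to a page boundary; wraps around to 0 for values near SIZE_MAX. */
static size_t demo_page_align(size_t size)
{
    return (size + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/* One possible guard: reject the saturated size before aligning it. */
static int demo_ring_size_ok(size_t size)
{
    return size != SIZE_MAX;
}
```

A saturated size of SIZE_MAX aligns to 0, which is the zero-size allocation request that triggers the warning; checking for SIZE_MAX before aligning avoids it.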
      
      Fixes: 9f78bf33 ("xsk: support use vaddr as ring")
      Reported-by: <syzbot+fae676d3cf469331fc89@syzkaller.appspotmail.com>
      Closes: https://lore.kernel.org/all/000000000000c84b4705fb31741e@google.com/T/
      Reported-by: <syzbot+b132693e925cbbd89e26@syzkaller.appspotmail.com>
      Closes: https://lore.kernel.org/all/000000000000e20df20606ebab4f@google.com/T/
      Signed-off-by: Andrew Kanner <andrew.kanner@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Tested-by: <syzbot+fae676d3cf469331fc89@syzkaller.appspotmail.com>
      Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Link: https://syzkaller.appspot.com/bug?extid=fae676d3cf469331fc89
      Link: https://lore.kernel.org/bpf/20231007075148.1759-1-andrew.kanner@gmail.com
    • riscv, bpf: Track both a0 (RISC-V ABI) and a5 (BPF) return values · 7112cd26
      Björn Töpel authored
      The RISC-V BPF JIT uses a5 for BPF return values, which are zero-extended,
      whereas the RISC-V ABI uses a0, which is sign-extended. In other words,
      a5 and a0 can differ, and are used in different contexts.
      
      The BPF trampoline is used for both BPF programs and regular kernel
      functions.
      
      Make sure that the RISC-V BPF trampoline saves and restores both a0
      and a5.
      
      Fixes: 49b5e77a ("riscv, bpf: Add bpf trampoline support for RV64")
      Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20231004120706.52848-3-bjorn@kernel.org
    • riscv, bpf: Sign-extend return values · 2f1b0d3d
      Björn Töpel authored
      The RISC-V architecture does not expose sub-registers, and holds all
      32-bit values in a sign-extended format [1] [2]:
      
        | The compiler and calling convention maintain an invariant that all
        | 32-bit values are held in a sign-extended format in 64-bit
        | registers. Even 32-bit unsigned integers extend bit 31 into bits
        | 63 through 32. Consequently, conversion between unsigned and
        | signed 32-bit integers is a no-op, as is conversion from a signed
        | 32-bit integer to a signed 64-bit integer.
      
      BPF, on the other hand, exposes sub-registers and uses
      zero-extension (similar to arm64/x86).
      
      This has led to some subtle bugs, where a BPF JITted program has not
      sign-extended the a0 register (return value in RISC-V land) before passing
      the return value up to the kernel, e.g.:
      
        | int from_bpf(void);
        |
        | long foo(void)
        | {
        |    return from_bpf();
        | }
      
      Here, a0 would be 0xffff_ffff, instead of the expected
      0xffff_ffff_ffff_ffff.
      
      Internally, the RISC-V JIT uses a5 as a dedicated register for BPF
      return values.
      
      Keep a5 zero-extended, but explicitly sign-extend a0 (which is used
      outside BPF land). Now that a0 (RISC-V ABI) and a5 (BPF ABI) differ,
      a0 is only moved to a5 for non-BPF native calls (BPF_PSEUDO_CALL).
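The difference between the two conventions can be shown with plain integer conversions. This is an illustrative sketch with hypothetical helper names (`demo_zext32`, `demo_sext32`), not JIT code; the conversion to int32_t relies on the usual two's-complement behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extension, as BPF does for 32-bit sub-registers (kept in a5). */
static uint64_t demo_zext32(uint32_t v)
{
    return (uint64_t)v;
}

/* Sign-extension, as the RISC-V calling convention expects in a0. */
static uint64_t demo_sext32(uint32_t v)
{
    uint64_t r = (uint64_t)(int64_t)(int32_t)v;

    /* Invariant from the quoted text: bits 63..32 replicate bit 31. */
    assert((r >> 32) == ((v >> 31) ? UINT64_C(0xffffffff) : 0));
    return r;
}
```

For a 32-bit return value of -1, zero-extension gives 0x0000_0000_ffff_ffff while sign-extension gives 0xffff_ffff_ffff_ffff, which is the mismatch described for from_bpf().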
      
      Fixes: 2353ecc6 ("bpf, riscv: add BPF JIT for RV64G")
      Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-056b6ff-2023-10-02/unpriv-isa-asciidoc.pdf # [1]
      Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/download/draft-20230929-e5c800e661a53efe3c2678d71a306323b60eb13b/riscv-abi.pdf # [2]
      Link: https://lore.kernel.org/bpf/20231004120706.52848-2-bjorn@kernel.org
    • ice: block default rule setting on LAG interface · 776fe199
      Michal Swiatkowski authored
      When one of the LAG interfaces is in switchdev mode, the default rule
      can't be set.
      
      The interface on which switchdev is running has ice_set_rx_mode() blocked
      to avoid adding the default rule (and other rules). The other interfaces
      (without switchdev running but connected via bond with the interface that
      runs switchdev) can't follow the same scheme, because rx filtering needs
      to be disabled when failover happens. The notification for the bridge to
      set promisc mode seems like a good place to do that.
      
      Fixes: bb52f42a ("ice: Add driver support for firmware changes for LAG")
      Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  5. Oct 08, 2023
  6. Oct 07, 2023
    • selftests/bpf: Make seen_tc* variable tests more robust · 37345b85
      Daniel Borkmann authored
      
      
      Martin reported that on his local dev machine the test_tc_chain_mixed() fails as
      "test_tc_chain_mixed:FAIL:seen_tc5 unexpected seen_tc5: actual 1 != expected 0"
      and others occasionally, too.
      
      However, when running in a more isolated setup (qemu in particular), it works fine
      for him. The reason is that there is a small race-window where seen_tc* could turn
      into true for various test cases when there is background traffic, e.g. after the
      asserts they often get reset. In such a case, when a subsequent detach takes place,
      unrelated background traffic could have already flipped the bool to true beforehand.
      
      Add a small helper tc_skel_reset_all_seen() to reset all bools before we do the ping
      test. At this point, everything is set up as expected and therefore no race can occur.
      All tc_{opts,links} tests continue to pass after this change.
      
      Reported-by: Martin KaFai Lau <martin.lau@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20231006220655.1653-7-daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • selftests/bpf: Test query on empty mprog and pass revision into attach · 685446b0
      Daniel Borkmann authored
      
      
      Add a new test case to query on an empty bpf_mprog and pass the revision
      directly into expected_revision for attachment to assert that this does
      succeed.
      
        ./test_progs -t tc_opts
        [    1.406778] tsc: Refined TSC clocksource calibration: 3407.990 MHz
        [    1.408863] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcaf6eb0, max_idle_ns: 440795321766 ns
        [    1.412419] clocksource: Switched to clocksource tsc
        [    1.428671] bpf_testmod: loading out-of-tree module taints kernel.
        [    1.430260] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
        #252     tc_opts_after:OK
        #253     tc_opts_append:OK
        #254     tc_opts_basic:OK
        #255     tc_opts_before:OK
        #256     tc_opts_chain_classic:OK
        #257     tc_opts_chain_mixed:OK
        #258     tc_opts_delete_empty:OK
        #259     tc_opts_demixed:OK
        #260     tc_opts_detach:OK
        #261     tc_opts_detach_after:OK
        #262     tc_opts_detach_before:OK
        #263     tc_opts_dev_cleanup:OK
        #264     tc_opts_invalid:OK
        #265     tc_opts_max:OK
        #266     tc_opts_mixed:OK
        #267     tc_opts_prepend:OK
        #268     tc_opts_query:OK
        #269     tc_opts_query_attach:OK     <--- (new test)
        #270     tc_opts_replace:OK
        #271     tc_opts_revision:OK
        Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20231006220655.1653-6-daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • selftests/bpf: Adapt assert_mprog_count to always expect 0 count · b7736826
      Daniel Borkmann authored
      
      
      Simplify __assert_mprog_count() to remove the -ENOENT corner case, as
      bpf_prog_query() now returns 0 when no bpf_mprog is attached. This also
      allows converting a few test cases from the raw __assert_mprog_count()
      over to the plain assert_mprog_count() helper.
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20231006220655.1653-5-daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • selftests/bpf: Test bpf_mprog query API via libbpf and raw syscall · f9b08790
      Daniel Borkmann authored
      
      
      Add a new test case which performs a double query of the bpf_mprog through
      the libbpf API, but also via the raw bpf(2) syscall. This tests gathering
      first the count and then, in a subsequent probe, the full information with
      the program array, without clearing the passed structs in between.
      
        # ./vmtest.sh -- ./test_progs -t tc_opts
        [...]
        ./test_progs -t tc_opts
        [    1.398818] tsc: Refined TSC clocksource calibration: 3407.999 MHz
        [    1.400263] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd336761, max_idle_ns: 440795243819 ns
        [    1.402734] clocksource: Switched to clocksource tsc
        [    1.426639] bpf_testmod: loading out-of-tree module taints kernel.
        [    1.428112] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
        #252     tc_opts_after:OK
        #253     tc_opts_append:OK
        #254     tc_opts_basic:OK
        #255     tc_opts_before:OK
        #256     tc_opts_chain_classic:OK
        #257     tc_opts_chain_mixed:OK
        #258     tc_opts_delete_empty:OK
        #259     tc_opts_demixed:OK
        #260     tc_opts_detach:OK
        #261     tc_opts_detach_after:OK
        #262     tc_opts_detach_before:OK
        #263     tc_opts_dev_cleanup:OK
        #264     tc_opts_invalid:OK
        #265     tc_opts_max:OK
        #266     tc_opts_mixed:OK
        #267     tc_opts_prepend:OK
        #268     tc_opts_query:OK            <--- (new test)
        #269     tc_opts_replace:OK
        #270     tc_opts_revision:OK
        Summary: 19/0 PASSED, 0 SKIPPED, 0 FAILED
      
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20231006220655.1653-4-daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpf: Refuse unused attributes in bpf_prog_{attach,detach} · ba62d611
      Lorenz Bauer authored
      The recently added tcx attachment extended the BPF UAPI for attaching and
      detaching by a couple of fields. Those fields are currently only supported
      for tcx; other types like cgroups and flow dissector silently ignore the
      new fields except for the new flags.
      
      This is problematic once we extend bpf_mprog to older attachment types, since
      it's hard to figure out whether the syscall really was successful if the
      kernel silently ignores non-zero values.
      
      Explicitly reject non-zero fields relevant to bpf_mprog for attachment types
      which don't use the latter yet.
      
      Fixes: e420bed0 ("bpf: Add fd-based tcx multi-prog infra with link support")
      Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
      Co-developed-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20231006220655.1653-3-daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpf: Handle bpf_mprog_query with NULL entry · edfa9af0
      Daniel Borkmann authored
      
      
      Improve consistency for bpf_mprog_query() API and let the latter also handle
      a NULL entry as can be the case for tcx. Instead of returning -ENOENT, we
      copy a count of 0 and revision of 1 to user space, so that this can be fed
      into a subsequent bpf_mprog_attach() call as expected_revision. A BPF
      selftest as part of this series has been added to assert this case.
      
      Suggested-by: Lorenz Bauer <lmb@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20231006220655.1653-2-daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • bpf: Fix BPF_PROG_QUERY last field check · a4fe7838
      Daniel Borkmann authored
      While working on the ebpf-go [0] library integration for bpf_mprog and tcx,
      Lorenz noticed that two subsequent BPF_PROG_QUERY requests currently fail. A
      typical workflow is to first gather the bpf_mprog count without passing program/
      link arrays, followed by the second request which contains the actual array
      pointers.
      
      The initial call populates count and revision fields. The second call gets
      rejected due to a BPF_PROG_QUERY_LAST_FIELD bug which should point to
      query.revision instead of query.link_attach_flags since the former is really
      the last member.
      
      It was not noticed in libbpf as bpf_prog_query_opts() always calls bpf(2) with
      an on-stack bpf_attr that is memset() each time (and therefore query.revision
      was reset to zero).
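Why the second request is rejected can be illustrated with a simplified "last field" check. Everything here is a hypothetical model: `demo_query_attr` is a cut-down stand-in for the query part of union bpf_attr, and `demo_check_attr()` only imitates the idea that all bytes after the declared last field must be zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified query attribute; the real layout has more fields. */
struct demo_query_attr {
    uint32_t count;
    uint32_t link_attach_flags;
    uint32_t revision;      /* really the last member of the query */
    uint32_t unused[4];     /* space used by other commands; must stay zero */
};

/* Offset of the first byte after the given field. */
#define DEMO_END_OF(field) \
    (offsetof(struct demo_query_attr, field) + \
     sizeof(((struct demo_query_attr *)0)->field))

/* Return nonzero if any byte at or after offset `end` is set, meaning the
 * request would be rejected. */
static int demo_check_attr(const struct demo_query_attr *attr, size_t end)
{
    const unsigned char *p = (const unsigned char *)attr;
    size_t i;

    assert(end <= sizeof(*attr));
    for (i = end; i < sizeof(*attr); i++)
        if (p[i])
            return 1;
    return 0;
}
```

With the last field set to link_attach_flags, a request that reuses a struct whose revision was populated by the first call fails the check; with the last field set to revision, the same request passes.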
      
        [0] https://ebpf-go.dev
      
      Fixes: e420bed0 ("bpf: Add fd-based tcx multi-prog infra with link support")
      Reported-by: Lorenz Bauer <lmb@isovalent.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/r/20231006220655.1653-1-daniel@iogearbox.net
      Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
    • Merge branch 'ravb-fix-use-after-free-issues' · a2e52554
      Jakub Kicinski authored
      
      
      Yoshihiro Shimoda says:
      
      ====================
      ravb: Fix use-after-free issues
      
      This patch series fixes use-after-free issues in ravb_remove().
      The original patch was written by Zheng Wang [1]. I added patch 1/2
      after finding another issue in ravb_remove().
      
      [1]
      https://lore.kernel.org/netdev/20230725030026.1664873-1-zyytlz.wz@163.com/
      
      v1: https://lore.kernel.org/all/20231004091253.4194205-1-yoshihiro.shimoda.uh@renesas.com/
      ====================
      
      Link: https://lore.kernel.org/r/20231005011201.14368-1-yoshihiro.shimoda.uh@renesas.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>