Skip to content
  1. Sep 19, 2023
    • Petr Tesarik's avatar
      sh: boards: Fix CEU buffer size passed to dma_declare_coherent_memory() · 2348a375
      Petr Tesarik authored
      [ Upstream commit fb60211f ]
      
      In all these cases, the last argument to dma_declare_coherent_memory() is
      the buffer end address, but the expected value should be the size of the
      reserved region.
      
      Fixes: 39fb9930 ("media: arch: sh: ap325rxa: Use new renesas-ceu camera driver")
      Fixes: c2f9b05f ("media: arch: sh: ecovec: Use new renesas-ceu camera driver")
      Fixes: f3590dc3 ("media: arch: sh: kfr2r09: Use new renesas-ceu camera driver")
      Fixes: 186c446f ("media: arch: sh: migor: Use new renesas-ceu camera driver")
      Fixes: 1a3c230b
      
       ("media: arch: sh: ms7724se: Use new renesas-ceu camera driver")
      Signed-off-by: default avatarPetr Tesarik <petr.tesarik.ext@huawei.com>
      Reviewed-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: default avatarJacopo Mondi <jacopo.mondi@ideasonboard.com>
      Reviewed-by: default avatarJohn Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Reviewed-by: default avatarLaurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
      Link: https://lore.kernel.org/r/20230724120742.2187-1-petrtesarik@huaweicloud.com
      
      
      Signed-off-by: default avatarJohn Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2348a375
    • Jie Wang's avatar
      net: hns3: remove GSO partial feature bit · f5160dc1
      Jie Wang authored
      [ Upstream commit 60326634 ]
      
      HNS3 NIC does not support GSO partial packets segmentation. Actually tunnel
      packets for example NvGRE packets segment offload and checksum offload is
      already supported. There is no need to keep gso partial feature bit. So
      this patch removes it.
      
      Fixes: 76ad4f0e
      
       ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC")
      Signed-off-by: default avatarJie Wang <wangjie125@huawei.com>
      Signed-off-by: default avatarJijie Shao <shaojijie@huawei.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f5160dc1
    • Yisen Zhuang's avatar
      net: hns3: fix the port information display when sfp is absent · 6d548b7c
      Yisen Zhuang authored
      [ Upstream commit 674d9591 ]
      
      When sfp is absent or unidentified, the port type should be
      displayed as PORT_OTHERS, rather than PORT_FIBRE.
      
      Fixes: 88d10bd6
      
       ("net: hns3: add support for multiple media type")
      Signed-off-by: default avatarYisen Zhuang <yisen.zhuang@huawei.com>
      Signed-off-by: default avatarJijie Shao <shaojijie@huawei.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6d548b7c
    • Jijie Shao's avatar
      net: hns3: fix invalid mutex between tc qdisc and dcb ets command issue · cc3c67e0
      Jijie Shao authored
      [ Upstream commit fa556494 ]
      
      We hope that tc qdisc and dcb ets commands can not be used crosswise.
      If we want to use any of the commands to configure tc,
      We must use the other command to clear the existing configuration.
      
      However, when we configure a single tc with tc qdisc,
      we can still configure it with dcb ets.
      Because we use mqprio_active as the tag of tc qdisc configuration,
      but with dcb ets, we do not check mqprio_active.
      
      This patch fix this issue by check mqprio_active before
      executing the dcb ets command. and add dcb_ets_active to
      replace HCLGE_FLAG_DCB_ENABLE and HCLGE_FLAG_MQPRIO_ENABLE
      at the hclge layer,
      
      Fixes: cacde272
      
       ("net: hns3: Add hclge_dcb module for the support of DCB feature")
      Signed-off-by: default avatarJijie Shao <shaojijie@huawei.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      cc3c67e0
    • Hao Chen's avatar
      net: hns3: fix debugfs concurrency issue between kfree buffer and read · 2c9643fa
      Hao Chen authored
      [ Upstream commit c295160b ]
      
      Now in hns3_dbg_uninit(), there may be concurrency between
      kfree buffer and read, it may result in memory error.
      
      Moving debugfs_remove_recursive() in front of kfree buffer to ensure
      they don't happen at the same time.
      
      Fixes: 5e69ea7e
      
       ("net: hns3: refactor the debugfs process")
      Signed-off-by: default avatarHao Chen <chenhao418@huawei.com>
      Signed-off-by: default avatarJijie Shao <shaojijie@huawei.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      2c9643fa
    • Hao Chen's avatar
      net: hns3: fix byte order conversion issue in hclge_dbg_fd_tcam_read() · 8bfa87cf
      Hao Chen authored
      [ Upstream commit efccf655 ]
      
      req1->tcam_data is defined as "u8 tcam_data[8]", and we convert it as
      (u32 *) without considerring byte order conversion,
      it may result in printing wrong data for tcam_data.
      
      Convert tcam_data to (__le32 *) first to fix it.
      
      Fixes: b5a0b70d
      
       ("net: hns3: refactor dump fd tcam of debugfs")
      Signed-off-by: default avatarHao Chen <chenhao418@huawei.com>
      Signed-off-by: default avatarJijie Shao <shaojijie@huawei.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      8bfa87cf
    • Wander Lairson Costa's avatar
      netfilter: nfnetlink_osf: avoid OOB read · 19280e8d
      Wander Lairson Costa authored
      [ Upstream commit f4f8a780 ]
      
      The opt_num field is controlled by user mode and is not currently
      validated inside the kernel. An attacker can take advantage of this to
      trigger an OOB read and potentially leak information.
      
      BUG: KASAN: slab-out-of-bounds in nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
      Read of size 2 at addr ffff88804bc64272 by task poc/6431
      
      CPU: 1 PID: 6431 Comm: poc Not tainted 6.0.0-rc4 #1
      Call Trace:
       nf_osf_match_one+0xbed/0xd10 net/netfilter/nfnetlink_osf.c:88
       nf_osf_find+0x186/0x2f0 net/netfilter/nfnetlink_osf.c:281
       nft_osf_eval+0x37f/0x590 net/netfilter/nft_osf.c:47
       expr_call_ops_eval net/netfilter/nf_tables_core.c:214
       nft_do_chain+0x2b0/0x1490 net/netfilter/nf_tables_core.c:264
       nft_do_chain_ipv4+0x17c/0x1f0 net/netfilter/nft_chain_filter.c:23
       [..]
      
      Also add validation to genre, subtype and version fields.
      
      Fixes: 11eeef41
      
       ("netfilter: passive OS fingerprint xtables match")
      Reported-by: default avatarLucas Leong <wmliang@infosec.exchange>
      Signed-off-by: default avatarWander Lairson Costa <wander@redhat.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      19280e8d
    • Florian Westphal's avatar
      netfilter: nftables: exthdr: fix 4-byte stack OOB write · 1ad7b189
      Florian Westphal authored
      [ Upstream commit fd94d9da ]
      
      If priv->len is a multiple of 4, then dst[len / 4] can write past
      the destination array which leads to stack corruption.
      
      This construct is necessary to clean the remainder of the register
      in case ->len is NOT a multiple of the register size, so make it
      conditional just like nft_payload.c does.
      
      The bug was added in 4.1 cycle and then copied/inherited when
      tcp/sctp and ip option support was added.
      
      Bug reported by Zero Day Initiative project (ZDI-CAN-21950,
      ZDI-CAN-21951, ZDI-CAN-21961).
      
      Fixes: 49499c3e ("netfilter: nf_tables: switch registers to 32 bit addressing")
      Fixes: 935b7f64 ("netfilter: nft_exthdr: add TCP option matching")
      Fixes: 133dc203 ("netfilter: nft_exthdr: Support SCTP chunks")
      Fixes: dbb5281a
      
       ("netfilter: nf_tables: add support for matching IPv4 options")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1ad7b189
    • Vladimir Oltean's avatar
      net: dsa: sja1105: complete tc-cbs offload support on SJA1110 · 347f7651
      Vladimir Oltean authored
      [ Upstream commit 180a7419 ]
      
      The blamed commit left this delta behind:
      
        struct sja1105_cbs_entry {
       -	u64 port;
       -	u64 prio;
       +	u64 port; /* Not used for SJA1110 */
       +	u64 prio; /* Not used for SJA1110 */
        	u64 credit_hi;
        	u64 credit_lo;
        	u64 send_slope;
        	u64 idle_slope;
        };
      
      but did not actually implement tc-cbs offload fully for the new switch.
      The offload is accepted, but it doesn't work.
      
      The difference compared to earlier switch generations is that now, the
      table of CBS shapers is sparse, because there are many more shapers, so
      the mapping between a {port, prio} and a table index is static, rather
      than requiring us to store the port and prio into the sja1105_cbs_entry.
      
      So, the problem is that the code programs the CBS shaper parameters at a
      dynamic table index which is incorrect.
      
      All that needs to be done for SJA1110 CBS shapers to work is to bypass
      the logic which allocates shapers in a dense manner, as for SJA1105, and
      use the fixed mapping instead.
      
      Fixes: 3e77e59b
      
       ("net: dsa: sja1105: add support for the SJA1110 switch family")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      347f7651
    • Vladimir Oltean's avatar
      net: dsa: sja1105: fix -ENOSPC when replacing the same tc-cbs too many times · cb4494cf
      Vladimir Oltean authored
      [ Upstream commit 894cafc5 ]
      
      After running command [2] too many times in a row:
      
      [1] $ tc qdisc add dev sw2p0 root handle 1: mqprio num_tc 8 \
      	map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
      [2] $ tc qdisc replace dev sw2p0 parent 1:1 cbs offload 1 \
      	idleslope 120000 sendslope -880000 locredit -1320 hicredit 180
      
      (aka more than priv->info->num_cbs_shapers times)
      
      we start seeing the following error message:
      
      Error: Specified device failed to setup cbs hardware offload.
      
      This comes from the fact that ndo_setup_tc(TC_SETUP_QDISC_CBS) presents
      the same API for the qdisc create and replace cases, and the sja1105
      driver fails to distinguish between the 2. Thus, it always thinks that
      it must allocate the same shaper for a {port, queue} pair, when it may
      instead have to replace an existing one.
      
      Fixes: 4d752508
      
       ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      cb4494cf
    • Vladimir Oltean's avatar
      net: dsa: sja1105: fix bandwidth discrepancy between tc-cbs software and offload · 77b850b8
      Vladimir Oltean authored
      [ Upstream commit 954ad9bf ]
      
      More careful measurement of the tc-cbs bandwidth shows that the stream
      bandwidth (effectively idleslope) increases, there is a larger and
      larger discrepancy between the rate limit obtained by the software
      Qdisc, and the rate limit obtained by its offloaded counterpart.
      
      The discrepancy becomes so large, that e.g. at an idleslope of 40000
      (40Mbps), the offloaded cbs does not actually rate limit anything, and
      traffic will pass at line rate through a 100 Mbps port.
      
      The reason for the discrepancy is that the hardware documentation I've
      been following is incorrect. UM11040.pdf (for SJA1105P/Q/R/S) states
      about IDLE_SLOPE that it is "the rate (in unit of bytes/sec) at which
      the credit counter is increased".
      
      Cross-checking with UM10944.pdf (for SJA1105E/T) and UM11107.pdf
      (for SJA1110), the wording is different: "This field specifies the
      value, in bytes per second times link speed, by which the credit counter
      is increased".
      
      So there's an extra scaling for link speed that the driver is currently
      not accounting for, and apparently (empirically), that link speed is
      expressed in Kbps.
      
      I've pondered whether to pollute the sja1105_mac_link_up()
      implementation with CBS shaper reprogramming, but I don't think it is
      worth it. IMO, the UAPI exposed by tc-cbs requires user space to
      recalculate the sendslope anyway, since the formula for that depends on
      port_transmit_rate (see man tc-cbs), which is not an invariant from tc's
      perspective.
      
      So we use the offload->sendslope and offload->idleslope to deduce the
      original port_transmit_rate from the CBS formula, and use that value to
      scale the offload->sendslope and offload->idleslope to values that the
      hardware understands.
      
      Some numerical data points:
      
       40Mbps stream, max interfering frame size 1500, port speed 100M
       ---------------------------------------------------------------
      
       tc-cbs parameters:
       idleslope 40000 sendslope -60000 locredit -900 hicredit 600
      
       which result in hardware values:
      
       Before (doesn't work)           After (works)
       credit_hi    600                600
       credit_lo    900                900
       send_slope   7500000            75
       idle_slope   5000000            50
      
       40Mbps stream, max interfering frame size 1500, port speed 1G
       -------------------------------------------------------------
      
       tc-cbs parameters:
       idleslope 40000 sendslope -960000 locredit -1440 hicredit 60
      
       which result in hardware values:
      
       Before (doesn't work)           After (works)
       credit_hi    60                 60
       credit_lo    1440               1440
       send_slope   120000000          120
       idle_slope   5000000            5
      
       5.12Mbps stream, max interfering frame size 1522, port speed 100M
       -----------------------------------------------------------------
      
       tc-cbs parameters:
       idleslope 5120 sendslope -94880 locredit -1444 hicredit 77
      
       which result in hardware values:
      
       Before (doesn't work)           After (works)
       credit_hi    77                 77
       credit_lo    1444               1444
       send_slope   11860000           118
       idle_slope   640000             6
      
      Tested on SJA1105T, SJA1105S and SJA1110A, at 1Gbps and 100Mbps.
      
      Fixes: 4d752508
      
       ("net: dsa: sja1105: offload the Credit-Based Shaper qdisc")
      Reported-by: default avatarYanan Yang <yanan.yang@nxp.com>
      Signed-off-by: default avatarVladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      77b850b8
    • Eric Dumazet's avatar
      ip_tunnels: use DEV_STATS_INC() · d11109c0
      Eric Dumazet authored
      [ Upstream commit 9b271eba ]
      
      syzbot/KCSAN reported data-races in iptunnel_xmit_stats() [1]
      
      This can run from multiple cpus without mutual exclusion.
      
      Adopt SMP safe DEV_STATS_INC() to update dev->stats fields.
      
      [1]
      BUG: KCSAN: data-race in iptunnel_xmit / iptunnel_xmit
      
      read-write to 0xffff8881353df170 of 8 bytes by task 30263 on cpu 1:
      iptunnel_xmit_stats include/net/ip_tunnels.h:493 [inline]
      iptunnel_xmit+0x432/0x4a0 net/ipv4/ip_tunnel_core.c:87
      ip_tunnel_xmit+0x1477/0x1750 net/ipv4/ip_tunnel.c:831
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:662
      __netdev_start_xmit include/linux/netdevice.h:4889 [inline]
      netdev_start_xmit include/linux/netdevice.h:4903 [inline]
      xmit_one net/core/dev.c:3544 [inline]
      dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3560
      __dev_queue_xmit+0xeee/0x1de0 net/core/dev.c:4340
      dev_queue_xmit include/linux/netdevice.h:3082 [inline]
      __bpf_tx_skb net/core/filter.c:2129 [inline]
      __bpf_redirect_no_mac net/core/filter.c:2159 [inline]
      __bpf_redirect+0x723/0x9c0 net/core/filter.c:2182
      ____bpf_clone_redirect net/core/filter.c:2453 [inline]
      bpf_clone_redirect+0x16c/0x1d0 net/core/filter.c:2425
      ___bpf_prog_run+0xd7d/0x41e0 kernel/bpf/core.c:1954
      __bpf_prog_run512+0x74/0xa0 kernel/bpf/core.c:2195
      bpf_dispatcher_nop_func include/linux/bpf.h:1181 [inline]
      __bpf_prog_run include/linux/filter.h:609 [inline]
      bpf_prog_run include/linux/filter.h:616 [inline]
      bpf_test_run+0x15d/0x3d0 net/bpf/test_run.c:423
      bpf_prog_test_run_skb+0x77b/0xa00 net/bpf/test_run.c:1045
      bpf_prog_test_run+0x265/0x3d0 kernel/bpf/syscall.c:3996
      __sys_bpf+0x3af/0x780 kernel/bpf/syscall.c:5353
      __do_sys_bpf kernel/bpf/syscall.c:5439 [inline]
      __se_sys_bpf kernel/bpf/syscall.c:5437 [inline]
      __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5437
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read-write to 0xffff8881353df170 of 8 bytes by task 30249 on cpu 0:
      iptunnel_xmit_stats include/net/ip_tunnels.h:493 [inline]
      iptunnel_xmit+0x432/0x4a0 net/ipv4/ip_tunnel_core.c:87
      ip_tunnel_xmit+0x1477/0x1750 net/ipv4/ip_tunnel.c:831
      __gre_xmit net/ipv4/ip_gre.c:469 [inline]
      ipgre_xmit+0x516/0x570 net/ipv4/ip_gre.c:662
      __netdev_start_xmit include/linux/netdevice.h:4889 [inline]
      netdev_start_xmit include/linux/netdevice.h:4903 [inline]
      xmit_one net/core/dev.c:3544 [inline]
      dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3560
      __dev_queue_xmit+0xeee/0x1de0 net/core/dev.c:4340
      dev_queue_xmit include/linux/netdevice.h:3082 [inline]
      __bpf_tx_skb net/core/filter.c:2129 [inline]
      __bpf_redirect_no_mac net/core/filter.c:2159 [inline]
      __bpf_redirect+0x723/0x9c0 net/core/filter.c:2182
      ____bpf_clone_redirect net/core/filter.c:2453 [inline]
      bpf_clone_redirect+0x16c/0x1d0 net/core/filter.c:2425
      ___bpf_prog_run+0xd7d/0x41e0 kernel/bpf/core.c:1954
      __bpf_prog_run512+0x74/0xa0 kernel/bpf/core.c:2195
      bpf_dispatcher_nop_func include/linux/bpf.h:1181 [inline]
      __bpf_prog_run include/linux/filter.h:609 [inline]
      bpf_prog_run include/linux/filter.h:616 [inline]
      bpf_test_run+0x15d/0x3d0 net/bpf/test_run.c:423
      bpf_prog_test_run_skb+0x77b/0xa00 net/bpf/test_run.c:1045
      bpf_prog_test_run+0x265/0x3d0 kernel/bpf/syscall.c:3996
      __sys_bpf+0x3af/0x780 kernel/bpf/syscall.c:5353
      __do_sys_bpf kernel/bpf/syscall.c:5439 [inline]
      __se_sys_bpf kernel/bpf/syscall.c:5437 [inline]
      __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5437
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x0000000000018830 -> 0x0000000000018831
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 30249 Comm: syz-executor.4 Not tainted 6.5.0-syzkaller-11704-g3f86ed6ec0b3 #0
      
      Fixes: 039f5062
      
       ("ip_tunnel: Move stats update to iptunnel_xmit()")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d11109c0
    • Ariel Marcovitch's avatar
      idr: fix param name in idr_alloc_cyclic() doc · fcfb5842
      Ariel Marcovitch authored
      [ Upstream commit 2a15de80 ]
      
      The relevant parameter is 'start' and not 'nextid'
      
      Fixes: 460488c5
      
       ("idr: Remove idr_alloc_ext")
      Signed-off-by: default avatarAriel Marcovitch <arielmarcovitch@gmail.com>
      Signed-off-by: default avatarMatthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      fcfb5842
    • Andy Shevchenko's avatar
      s390/zcrypt: don't leak memory if dev_set_name() fails · 131cd74a
      Andy Shevchenko authored
      [ Upstream commit 6252f47b ]
      
      When dev_set_name() fails, zcdn_create() doesn't free the newly
      allocated resources. Do it.
      
      Fixes: 00fab235
      
       ("s390/zcrypt: multiple zcrypt device nodes support")
      Signed-off-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Link: https://lore.kernel.org/r/20230831110000.24279-1-andriy.shevchenko@linux.intel.com
      
      
      Signed-off-by: default avatarHarald Freudenberger <freude@linux.ibm.com>
      Signed-off-by: default avatarHeiko Carstens <hca@linux.ibm.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      131cd74a
    • Olga Zaborska's avatar
      igb: Change IGB_MIN to allow set rx/tx value between 64 and 80 · 12de76fd
      Olga Zaborska authored
      [ Upstream commit 6319685b ]
      
      Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
      value between 64 and 80. All igb devices can use as low as 64 descriptors.
      This change will unify igb with other drivers.
      Based on commit 7b1be198 ("e1000e: lower ring minimum size to 64")
      
      Fixes: 9d5c8243
      
       ("igb: PCI-Express 82575 Gigabit Ethernet driver")
      Signed-off-by: default avatarOlga Zaborska <olga.zaborska@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      12de76fd
    • Olga Zaborska's avatar
      igbvf: Change IGBVF_MIN to allow set rx/tx value between 64 and 80 · 7c2f90b1
      Olga Zaborska authored
      [ Upstream commit 83607175 ]
      
      Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
      value between 64 and 80. All igbvf devices can use as low as 64 descriptors.
      This change will unify igbvf with other drivers.
      Based on commit 7b1be198 ("e1000e: lower ring minimum size to 64")
      
      Fixes: d4e0fe01
      
       ("igbvf: add new driver to support 82576 virtual functions")
      Signed-off-by: default avatarOlga Zaborska <olga.zaborska@intel.com>
      Tested-by: default avatarRafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      7c2f90b1
    • Olga Zaborska's avatar
      igc: Change IGC_MIN to allow set rx/tx value between 64 and 80 · f4c5640d
      Olga Zaborska authored
      [ Upstream commit 5aa48279 ]
      
      Change the minimum value of RX/TX descriptors to 64 to enable setting the rx/tx
      value between 64 and 80. All igc devices can use as low as 64 descriptors.
      This change will unify igc with other drivers.
      Based on commit 7b1be198 ("e1000e: lower ring minimum size to 64")
      
      Fixes: 0507ef8a
      
       ("igc: Add transmit and receive fastpath and interrupt handlers")
      Signed-off-by: default avatarOlga Zaborska <olga.zaborska@intel.com>
      Tested-by: default avatarNaama Meir <naamax.meir@linux.intel.com>
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      f4c5640d
    • Geetha sowjanya's avatar
      octeontx2-af: Fix truncation of smq in CN10K NIX AQ enqueue mbox handler · 9210b3dd
      Geetha sowjanya authored
      [ Upstream commit 29fe7a1b ]
      
      The smq value used in the CN10K NIX AQ instruction enqueue mailbox
      handler was truncated to 9-bit value from 10-bit value because of
      typecasting the CN10K mbox request structure to the CN9K structure.
      Though this hasn't caused any problems when programming the NIX SQ
      context to the HW because the context structure is the same size.
      However, this causes a problem when accessing the structure parameters.
      This patch reads the right smq value for each platform.
      
      Fixes: 30077d21
      
       ("octeontx2-af: cn10k: Update NIX/NPA context structure")
      Signed-off-by: default avatarGeetha sowjanya <gakula@marvell.com>
      Signed-off-by: default avatarSunil Kovvuri Goutham <sgoutham@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9210b3dd
    • Shigeru Yoshida's avatar
      kcm: Destroy mutex in kcm_exit_net() · 1840f08c
      Shigeru Yoshida authored
      [ Upstream commit 6ad40b36 ]
      
      kcm_exit_net() should call mutex_destroy() on knet->mutex. This is especially
      needed if CONFIG_DEBUG_MUTEXES is enabled.
      
      Fixes: ab7ac4eb
      
       ("kcm: Kernel Connection Multiplexor module")
      Signed-off-by: default avatarShigeru Yoshida <syoshida@redhat.com>
      Link: https://lore.kernel.org/r/20230902170708.1727999-1-syoshida@redhat.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      1840f08c
    • valis's avatar
      net: sched: sch_qfq: Fix UAF in qfq_dequeue() · 6ea277b2
      valis authored
      [ Upstream commit 8fc134fe ]
      
      When the plug qdisc is used as a class of the qfq qdisc it could trigger a
      UAF. This issue can be reproduced with following commands:
      
        tc qdisc add dev lo root handle 1: qfq
        tc class add dev lo parent 1: classid 1:1 qfq weight 1 maxpkt 512
        tc qdisc add dev lo parent 1:1 handle 2: plug
        tc filter add dev lo parent 1: basic classid 1:1
        ping -c1 127.0.0.1
      
      and boom:
      
      [  285.353793] BUG: KASAN: slab-use-after-free in qfq_dequeue+0xa7/0x7f0
      [  285.354910] Read of size 4 at addr ffff8880bad312a8 by task ping/144
      [  285.355903]
      [  285.356165] CPU: 1 PID: 144 Comm: ping Not tainted 6.5.0-rc3+ #4
      [  285.357112] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
      [  285.358376] Call Trace:
      [  285.358773]  <IRQ>
      [  285.359109]  dump_stack_lvl+0x44/0x60
      [  285.359708]  print_address_description.constprop.0+0x2c/0x3c0
      [  285.360611]  kasan_report+0x10c/0x120
      [  285.361195]  ? qfq_dequeue+0xa7/0x7f0
      [  285.361780]  qfq_dequeue+0xa7/0x7f0
      [  285.362342]  __qdisc_run+0xf1/0x970
      [  285.362903]  net_tx_action+0x28e/0x460
      [  285.363502]  __do_softirq+0x11b/0x3de
      [  285.364097]  do_softirq.part.0+0x72/0x90
      [  285.364721]  </IRQ>
      [  285.365072]  <TASK>
      [  285.365422]  __local_bh_enable_ip+0x77/0x90
      [  285.366079]  __dev_queue_xmit+0x95f/0x1550
      [  285.366732]  ? __pfx_csum_and_copy_from_iter+0x10/0x10
      [  285.367526]  ? __pfx___dev_queue_xmit+0x10/0x10
      [  285.368259]  ? __build_skb_around+0x129/0x190
      [  285.368960]  ? ip_generic_getfrag+0x12c/0x170
      [  285.369653]  ? __pfx_ip_generic_getfrag+0x10/0x10
      [  285.370390]  ? csum_partial+0x8/0x20
      [  285.370961]  ? raw_getfrag+0xe5/0x140
      [  285.371559]  ip_finish_output2+0x539/0xa40
      [  285.372222]  ? __pfx_ip_finish_output2+0x10/0x10
      [  285.372954]  ip_output+0x113/0x1e0
      [  285.373512]  ? __pfx_ip_output+0x10/0x10
      [  285.374130]  ? icmp_out_count+0x49/0x60
      [  285.374739]  ? __pfx_ip_finish_output+0x10/0x10
      [  285.375457]  ip_push_pending_frames+0xf3/0x100
      [  285.376173]  raw_sendmsg+0xef5/0x12d0
      [  285.376760]  ? do_syscall_64+0x40/0x90
      [  285.377359]  ? __static_call_text_end+0x136578/0x136578
      [  285.378173]  ? do_syscall_64+0x40/0x90
      [  285.378772]  ? kasan_enable_current+0x11/0x20
      [  285.379469]  ? __pfx_raw_sendmsg+0x10/0x10
      [  285.380137]  ? __sock_create+0x13e/0x270
      [  285.380673]  ? __sys_socket+0xf3/0x180
      [  285.381174]  ? __x64_sys_socket+0x3d/0x50
      [  285.381725]  ? entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      [  285.382425]  ? __rcu_read_unlock+0x48/0x70
      [  285.382975]  ? ip4_datagram_release_cb+0xd8/0x380
      [  285.383608]  ? __pfx_ip4_datagram_release_cb+0x10/0x10
      [  285.384295]  ? preempt_count_sub+0x14/0xc0
      [  285.384844]  ? __list_del_entry_valid+0x76/0x140
      [  285.385467]  ? _raw_spin_lock_bh+0x87/0xe0
      [  285.386014]  ? __pfx__raw_spin_lock_bh+0x10/0x10
      [  285.386645]  ? release_sock+0xa0/0xd0
      [  285.387148]  ? preempt_count_sub+0x14/0xc0
      [  285.387712]  ? freeze_secondary_cpus+0x348/0x3c0
      [  285.388341]  ? aa_sk_perm+0x177/0x390
      [  285.388856]  ? __pfx_aa_sk_perm+0x10/0x10
      [  285.389441]  ? check_stack_object+0x22/0x70
      [  285.390032]  ? inet_send_prepare+0x2f/0x120
      [  285.390603]  ? __pfx_inet_sendmsg+0x10/0x10
      [  285.391172]  sock_sendmsg+0xcc/0xe0
      [  285.391667]  __sys_sendto+0x190/0x230
      [  285.392168]  ? __pfx___sys_sendto+0x10/0x10
      [  285.392727]  ? kvm_clock_get_cycles+0x14/0x30
      [  285.393328]  ? set_normalized_timespec64+0x57/0x70
      [  285.393980]  ? _raw_spin_unlock_irq+0x1b/0x40
      [  285.394578]  ? __x64_sys_clock_gettime+0x11c/0x160
      [  285.395225]  ? __pfx___x64_sys_clock_gettime+0x10/0x10
      [  285.395908]  ? _copy_to_user+0x3e/0x60
      [  285.396432]  ? exit_to_user_mode_prepare+0x1a/0x120
      [  285.397086]  ? syscall_exit_to_user_mode+0x22/0x50
      [  285.397734]  ? do_syscall_64+0x71/0x90
      [  285.398258]  __x64_sys_sendto+0x74/0x90
      [  285.398786]  do_syscall_64+0x64/0x90
      [  285.399273]  ? exit_to_user_mode_prepare+0x1a/0x120
      [  285.399949]  ? syscall_exit_to_user_mode+0x22/0x50
      [  285.400605]  ? do_syscall_64+0x71/0x90
      [  285.401124]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      [  285.401807] RIP: 0033:0x495726
      [  285.402233] Code: ff ff ff f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 11 b8 2c 00 00 00 0f 09
      [  285.404683] RSP: 002b:00007ffcc25fb618 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [  285.405677] RAX: ffffffffffffffda RBX: 0000000000000040 RCX: 0000000000495726
      [  285.406628] RDX: 0000000000000040 RSI: 0000000002518750 RDI: 0000000000000000
      [  285.407565] RBP: 00000000005205ef R08: 00000000005f8838 R09: 000000000000001c
      [  285.408523] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000002517634
      [  285.409460] R13: 00007ffcc25fb6f0 R14: 0000000000000003 R15: 0000000000000000
      [  285.410403]  </TASK>
      [  285.410704]
      [  285.410929] Allocated by task 144:
      [  285.411402]  kasan_save_stack+0x1e/0x40
      [  285.411926]  kasan_set_track+0x21/0x30
      [  285.412442]  __kasan_slab_alloc+0x55/0x70
      [  285.412973]  kmem_cache_alloc_node+0x187/0x3d0
      [  285.413567]  __alloc_skb+0x1b4/0x230
      [  285.414060]  __ip_append_data+0x17f7/0x1b60
      [  285.414633]  ip_append_data+0x97/0xf0
      [  285.415144]  raw_sendmsg+0x5a8/0x12d0
      [  285.415640]  sock_sendmsg+0xcc/0xe0
      [  285.416117]  __sys_sendto+0x190/0x230
      [  285.416626]  __x64_sys_sendto+0x74/0x90
      [  285.417145]  do_syscall_64+0x64/0x90
      [  285.417624]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      [  285.418306]
      [  285.418531] Freed by task 144:
      [  285.418960]  kasan_save_stack+0x1e/0x40
      [  285.419469]  kasan_set_track+0x21/0x30
      [  285.419988]  kasan_save_free_info+0x27/0x40
      [  285.420556]  ____kasan_slab_free+0x109/0x1a0
      [  285.421146]  kmem_cache_free+0x1c2/0x450
      [  285.421680]  __netif_receive_skb_core+0x2ce/0x1870
      [  285.422333]  __netif_receive_skb_one_core+0x97/0x140
      [  285.423003]  process_backlog+0x100/0x2f0
      [  285.423537]  __napi_poll+0x5c/0x2d0
      [  285.424023]  net_rx_action+0x2be/0x560
      [  285.424510]  __do_softirq+0x11b/0x3de
      [  285.425034]
      [  285.425254] The buggy address belongs to the object at ffff8880bad31280
      [  285.425254]  which belongs to the cache skbuff_head_cache of size 224
      [  285.426993] The buggy address is located 40 bytes inside of
      [  285.426993]  freed 224-byte region [ffff8880bad31280, ffff8880bad31360)
      [  285.428572]
      [  285.428798] The buggy address belongs to the physical page:
      [  285.429540] page:00000000f4b77674 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0xbad31
      [  285.430758] flags: 0x100000000000200(slab|node=0|zone=1)
      [  285.431447] page_type: 0xffffffff()
      [  285.431934] raw: 0100000000000200 ffff88810094a8c0 dead000000000122 0000000000000000
      [  285.432757] raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
      [  285.433562] page dumped because: kasan: bad access detected
      [  285.434144]
      [  285.434320] Memory state around the buggy address:
      [  285.434828]  ffff8880bad31180: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  285.435580]  ffff8880bad31200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  285.436264] >ffff8880bad31280: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [  285.436777]                                   ^
      [  285.437106]  ffff8880bad31300: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
      [  285.437616]  ffff8880bad31380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [  285.438126] ==================================================================
      [  285.438662] Disabling lock debugging due to kernel taint
      
      Fix this by:
      1. Changing sch_plug's .peek handler to qdisc_peek_dequeued(), a
      function compatible with non-work-conserving qdiscs
      2. Checking the return value of qdisc_dequeue_peeked() in sch_qfq.
      
      Fixes: 462dbc91
      
       ("pkt_sched: QFQ Plus: fair-queueing service at DRR cost")
      Reported-by: default avatarvalis <sec@valis.email>
      Signed-off-by: default avatarvalis <sec@valis.email>
      Signed-off-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20230901162237.11525-1-jhs@mojatatu.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      6ea277b2
    • Kuniyuki Iwashima's avatar
      af_unix: Fix data race around sk->sk_err. · 3868de7c
      Kuniyuki Iwashima authored
      [ Upstream commit b1928129 ]
      
      As with sk->sk_shutdown shown in the previous patch, sk->sk_err can be
      read locklessly by unix_dgram_sendmsg().
      
      Let's use READ_ONCE() for sk_err as well.
      
      Note that the writer side is marked by commit cc04410a ("af_unix:
      annotate lockless accesses to sk->sk_err").
      
      Fixes: 1da177e4
      
       ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      3868de7c
    • Kuniyuki Iwashima's avatar
      af_unix: Fix data-races around sk->sk_shutdown. · d9545666
      Kuniyuki Iwashima authored
      [ Upstream commit afe8764f ]
      
      sk->sk_shutdown is changed under unix_state_lock(sk), but
      unix_dgram_sendmsg() calls two functions to read sk_shutdown locklessly.
      
        sock_alloc_send_pskb
        `- sock_wait_for_wmem
      
      Let's use READ_ONCE() there.
      
      Note that the writer side was marked by commit e1d09c2c ("af_unix:
      Fix data races around sk->sk_shutdown.").
      
      BUG: KCSAN: data-race in sock_alloc_send_pskb / unix_release_sock
      
      write (marked) to 0xffff8880069af12c of 1 bytes by task 1 on cpu 1:
       unix_release_sock+0x75c/0x910 net/unix/af_unix.c:631
       unix_release+0x59/0x80 net/unix/af_unix.c:1053
       __sock_release+0x7d/0x170 net/socket.c:654
       sock_close+0x19/0x30 net/socket.c:1386
       __fput+0x2a3/0x680 fs/file_table.c:384
       ____fput+0x15/0x20 fs/file_table.c:412
       task_work_run+0x116/0x1a0 kernel/task_work.c:179
       resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
       exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
       __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
       syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
       do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      
      read to 0xffff8880069af12c of 1 bytes by task 28650 on cpu 0:
       sock_alloc_send_pskb+0xd2/0x620 net/core/sock.c:2767
       unix_dgram_sendmsg+0x2f8/0x14f0 net/unix/af_unix.c:1944
       unix_seqpacket_sendmsg net/unix/af_unix.c:2308 [inline]
       unix_seqpacket_sendmsg+0xba/0x130 net/unix/af_unix.c:2292
       sock_sendmsg_nosec net/socket.c:725 [inline]
       sock_sendmsg+0x148/0x160 net/socket.c:748
       ____sys_sendmsg+0x4e4/0x610 net/socket.c:2494
       ___sys_sendmsg+0xc6/0x140 net/socket.c:2548
       __sys_sendmsg+0x94/0x140 net/socket.c:2577
       __do_sys_sendmsg net/socket.c:2586 [inline]
       __se_sys_sendmsg net/socket.c:2584 [inline]
       __x64_sys_sendmsg+0x45/0x50 net/socket.c:2584
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      
      value changed: 0x00 -> 0x03
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 28650 Comm: systemd-coredum Not tainted 6.4.0-11989-g6843306689af #6
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      
      Fixes: 1da177e4
      
       ("Linux-2.6.12-rc2")
      Reported-by: default avatarsyzkaller <syzkaller@googlegroups.com>
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d9545666
    • Kuniyuki Iwashima's avatar
      af_unix: Fix data-race around unix_tot_inflight. · e5edc6e4
      Kuniyuki Iwashima authored
      [ Upstream commit ade32bd8 ]
      
      unix_tot_inflight is changed under spin_lock(unix_gc_lock), but
      unix_release_sock() reads it locklessly.
      
      Let's use READ_ONCE() for unix_tot_inflight.
      
      Note that the writer side was marked by commit 9d6d7f1c ("af_unix:
      annote lockless accesses to unix_tot_inflight & gc_in_progress")
      
      BUG: KCSAN: data-race in unix_inflight / unix_release_sock
      
      write (marked) to 0xffffffff871852b8 of 4 bytes by task 123 on cpu 1:
       unix_inflight+0x130/0x180 net/unix/scm.c:64
       unix_attach_fds+0x137/0x1b0 net/unix/scm.c:123
       unix_scm_to_skb net/unix/af_unix.c:1832 [inline]
       unix_dgram_sendmsg+0x46a/0x14f0 net/unix/af_unix.c:1955
       sock_sendmsg_nosec net/socket.c:724 [inline]
       sock_sendmsg+0x148/0x160 net/socket.c:747
       ____sys_sendmsg+0x4e4/0x610 net/socket.c:2493
       ___sys_sendmsg+0xc6/0x140 net/socket.c:2547
       __sys_sendmsg+0x94/0x140 net/socket.c:2576
       __do_sys_sendmsg net/socket.c:2585 [inline]
       __se_sys_sendmsg net/socket.c:2583 [inline]
       __x64_sys_sendmsg+0x45/0x50 net/socket.c:2583
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      read to 0xffffffff871852b8 of 4 bytes by task 4891 on cpu 0:
       unix_release_sock+0x608/0x910 net/unix/af_unix.c:671
       unix_release+0x59/0x80 net/unix/af_unix.c:1058
       __sock_release+0x7d/0x170 net/socket.c:653
       sock_close+0x19/0x30 net/socket.c:1385
       __fput+0x179/0x5e0 fs/file_table.c:321
       ____fput+0x15/0x20 fs/file_table.c:349
       task_work_run+0x116/0x1a0 kernel/task_work.c:179
       resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
       exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
       __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
       syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
       do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      value changed: 0x00000000 -> 0x00000001
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 4891 Comm: systemd-coredum Not tainted 6.4.0-rc5-01219-gfa0e21fa4443 #5
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      
      Fixes: 9305cfa4
      
       ("[AF_UNIX]: Make unix_tot_inflight counter non-atomic")
      Reported-by: default avatarsyzkaller <syzkaller@googlegroups.com>
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e5edc6e4
    • Kuniyuki Iwashima's avatar
      af_unix: Fix data-races around user->unix_inflight. · 9151ed4b
      Kuniyuki Iwashima authored
      [ Upstream commit 0bc36c06 ]
      
      user->unix_inflight is changed under spin_lock(unix_gc_lock),
      but too_many_unix_fds() reads it locklessly.
      
      Let's annotate the write/read accesses to user->unix_inflight.
      
      BUG: KCSAN: data-race in unix_attach_fds / unix_inflight
      
      write to 0xffffffff8546f2d0 of 8 bytes by task 44798 on cpu 1:
       unix_inflight+0x157/0x180 net/unix/scm.c:66
       unix_attach_fds+0x147/0x1e0 net/unix/scm.c:123
       unix_scm_to_skb net/unix/af_unix.c:1827 [inline]
       unix_dgram_sendmsg+0x46a/0x14f0 net/unix/af_unix.c:1950
       unix_seqpacket_sendmsg net/unix/af_unix.c:2308 [inline]
       unix_seqpacket_sendmsg+0xba/0x130 net/unix/af_unix.c:2292
       sock_sendmsg_nosec net/socket.c:725 [inline]
       sock_sendmsg+0x148/0x160 net/socket.c:748
       ____sys_sendmsg+0x4e4/0x610 net/socket.c:2494
       ___sys_sendmsg+0xc6/0x140 net/socket.c:2548
       __sys_sendmsg+0x94/0x140 net/socket.c:2577
       __do_sys_sendmsg net/socket.c:2586 [inline]
       __se_sys_sendmsg net/socket.c:2584 [inline]
       __x64_sys_sendmsg+0x45/0x50 net/socket.c:2584
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      
      read to 0xffffffff8546f2d0 of 8 bytes by task 44814 on cpu 0:
       too_many_unix_fds net/unix/scm.c:101 [inline]
       unix_attach_fds+0x54/0x1e0 net/unix/scm.c:110
       unix_scm_to_skb net/unix/af_unix.c:1827 [inline]
       unix_dgram_sendmsg+0x46a/0x14f0 net/unix/af_unix.c:1950
       unix_seqpacket_sendmsg net/unix/af_unix.c:2308 [inline]
       unix_seqpacket_sendmsg+0xba/0x130 net/unix/af_unix.c:2292
       sock_sendmsg_nosec net/socket.c:725 [inline]
       sock_sendmsg+0x148/0x160 net/socket.c:748
       ____sys_sendmsg+0x4e4/0x610 net/socket.c:2494
       ___sys_sendmsg+0xc6/0x140 net/socket.c:2548
       __sys_sendmsg+0x94/0x140 net/socket.c:2577
       __do_sys_sendmsg net/socket.c:2586 [inline]
       __se_sys_sendmsg net/socket.c:2584 [inline]
       __x64_sys_sendmsg+0x45/0x50 net/socket.c:2584
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x6e/0xd8
      
      value changed: 0x000000000000000c -> 0x000000000000000d
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 44814 Comm: systemd-coredum Not tainted 6.4.0-11989-g6843306689af #6
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      
      Fixes: 712f4aad
      
       ("unix: properly account for FDs passed over unix sockets")
      Reported-by: default avatarsyzkaller <syzkaller@googlegroups.com>
      Signed-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Acked-by: default avatarWilly Tarreau <w@1wt.eu>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      9151ed4b
    • Oleksij Rempel's avatar
      net: phy: micrel: Correct bit assignments for phy_device flags · 907fbed6
      Oleksij Rempel authored
      [ Upstream commit 719c5e37 ]
      
      Previously, the defines for phy_device flags in the Micrel driver were
      ambiguous in their representation. They were intended to be bit masks
      but were mistakenly defined as bit positions. This led to the following
      issues:
      
      - MICREL_KSZ8_P1_ERRATA, designated for KSZ88xx switches, overlapped
        with MICREL_PHY_FXEN and MICREL_PHY_50MHZ_CLK.
      - Due to this overlap, the code path for MICREL_PHY_FXEN, tailored for
        the KSZ8041 PHY, was not executed for KSZ88xx PHYs.
      - Similarly, the code associated with MICREL_PHY_50MHZ_CLK wasn't
        triggered for KSZ88xx.
      
      To rectify this, all three flags have now been explicitly converted to
      use the `BIT()` macro, ensuring they are defined as bit masks and
      preventing potential overlaps in the future.
      
      Fixes: 49011e0c
      
       ("net: phy: micrel: ksz886x/ksz8081: add cabletest support")
      Signed-off-by: default avatarOleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: default avatarRussell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      907fbed6
    • Alex Henrie's avatar
      net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddr · 5d2d3f23
      Alex Henrie authored
      [ Upstream commit f31867d0 ]
      
      The existing code incorrectly casted a negative value (the result of a
      subtraction) to an unsigned value without checking. For example, if
      /proc/sys/net/ipv6/conf/*/temp_prefered_lft was set to 1, the preferred
      lifetime would jump to 4 billion seconds. On my machine and network the
      shortest lifetime that avoided underflow was 3 seconds.
      
      Fixes: 76506a98
      
       ("IPv6: fix DESYNC_FACTOR")
      Signed-off-by: default avatarAlex Henrie <alexhenrie24@gmail.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5d2d3f23
    • Liang Chen's avatar
      veth: Fixing transmit return status for dropped packets · 77dd55f5
      Liang Chen authored
      [ Upstream commit 151e887d ]
      
      The veth_xmit function returns NETDEV_TX_OK even when packets are dropped.
      This behavior leads to incorrect calculations of statistics counts, as
      well as things like txq->trans_start updates.
      
      Fixes: e314dbdc
      
       ("[NET]: Virtual ethernet device driver.")
      Signed-off-by: default avatarLiang Chen <liangchen.linux@gmail.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      77dd55f5
    • Corinna Vinschen's avatar
      igb: disable virtualization features on 82580 · 56603b2c
      Corinna Vinschen authored
      [ Upstream commit fa09bc40 ]
      
      Disable virtualization features on 82580 just as on i210/i211.
      This avoids that virt functions are acidentally called on 82850.
      
      Fixes: 55cac248
      
       ("igb: Add full support for 82580 devices")
      Signed-off-by: default avatarCorinna Vinschen <vinschen@redhat.com>
      Reviewed-by: default avatarSimon Horman <horms@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      56603b2c
    • Sriram Yagnaraman's avatar
      ipv4: ignore dst hint for multipath routes · 149bc783
      Sriram Yagnaraman authored
      [ Upstream commit 6ac66cb0 ]
      
      Route hints when the nexthop is part of a multipath group causes packets
      in the same receive batch to be sent to the same nexthop irrespective of
      the multipath hash of the packet. So, do not extract route hint for
      packets whose destination is part of a multipath group.
      
      A new SKB flag IPSKB_MULTIPATH is introduced for this purpose, set the
      flag when route is looked up in ip_mkroute_input() and use it in
      ip_extract_route_hint() to check for the existence of the flag.
      
      Fixes: 02b24941
      
       ("ipv4: use dst hint for ipv4 list receive")
      Signed-off-by: default avatarSriram Yagnaraman <sriram.yagnaraman@est.tech>
      Reviewed-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      149bc783
    • Sean Christopherson's avatar
      drm/i915/gvt: Drop unused helper intel_vgpu_reset_gtt() · e18b4949
      Sean Christopherson authored
      [ Upstream commit a90c367e ]
      
      Drop intel_vgpu_reset_gtt() as it no longer has any callers.  In addition
      to eliminating dead code, this eliminates the last possible scenario where
      __kvmgt_protect_table_find() can be reached without holding vgpu_lock.
      Requiring vgpu_lock to be held when calling __kvmgt_protect_table_find()
      will allow a protecting the gfn hash with vgpu_lock without too much fuss.
      
      No functional change intended.
      
      Fixes: ba25d977
      
       ("drm/i915/gvt: Do not destroy ppgtt_mm during vGPU D3->D0.")
      Reviewed-by: default avatarYan Zhao <yan.y.zhao@intel.com>
      Tested-by: default avatarYongwei Ma <yongwei.ma@intel.com>
      Reviewed-by: default avatarZhi Wang <zhi.a.wang@intel.com>
      Link: https://lore.kernel.org/r/20230729013535.1070024-11-seanjc@google.com
      
      
      Signed-off-by: default avatarSean Christopherson <seanjc@google.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e18b4949
    • Magnus Karlsson's avatar
      xsk: Fix xsk_diag use-after-free error during socket cleanup · 5979985f
      Magnus Karlsson authored
      [ Upstream commit 3e019d8a ]
      
      Fix a use-after-free error that is possible if the xsk_diag interface
      is used after the socket has been unbound from the device. This can
      happen either due to the socket being closed or the device
      disappearing. In the early days of AF_XDP, the way we tested that a
      socket was not bound to a device was to simply check if the netdevice
      pointer in the xsk socket structure was NULL. Later, a better system
      was introduced by having an explicit state variable in the xsk socket
      struct. For example, the state of a socket that is on the way to being
      closed and has been unbound from the device is XSK_UNBOUND.
      
      The commit in the Fixes tag below deleted the old way of signalling
      that a socket is unbound, setting dev to NULL. This in the belief that
      all code using the old way had been exterminated. That was
      unfortunately not true as the xsk diagnostics code was still using the
      old way and thus does not work as intended when a socket is going
      down. Fix this by introducing a test against the state variable. If
      the socket is in the state XSK_UNBOUND, simply abort the diagnostic's
      netlink operation.
      
      Fixes: 18b1ab7a
      
       ("xsk: Fix race at socket teardown")
      Reported-by: default avatar <syzbot+822d1359297e2694f873@syzkaller.appspotmail.com>
      Signed-off-by: default avatarMagnus Karlsson <magnus.karlsson@intel.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Tested-by: default avatar <syzbot+822d1359297e2694f873@syzkaller.appspotmail.com>
      Tested-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Reviewed-by: default avatarMaciej Fijalkowski <maciej.fijalkowski@intel.com>
      Link: https://lore.kernel.org/bpf/20230831100119.17408-1-magnus.karlsson@gmail.com
      
      
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5979985f
    • Florian Westphal's avatar
      net: fib: avoid warn splat in flow dissector · 49acc5c5
      Florian Westphal authored
      [ Upstream commit 8aae7625 ]
      
      New skbs allocated via nf_send_reset() have skb->dev == NULL.
      
      fib*_rules_early_flow_dissect helpers already have a 'struct net'
      argument but its not passed down to the flow dissector core, which
      will then WARN as it can't derive a net namespace to use:
      
       WARNING: CPU: 0 PID: 0 at net/core/flow_dissector.c:1016 __skb_flow_dissect+0xa91/0x1cd0
       [..]
        ip_route_me_harder+0x143/0x330
        nf_send_reset+0x17c/0x2d0 [nf_reject_ipv4]
        nft_reject_inet_eval+0xa9/0xf2 [nft_reject_inet]
        nft_do_chain+0x198/0x5d0 [nf_tables]
        nft_do_chain_inet+0xa4/0x110 [nf_tables]
        nf_hook_slow+0x41/0xc0
        ip_local_deliver+0xce/0x110
        ..
      
      Cc: Stanislav Fomichev <sdf@google.com>
      Cc: David Ahern <dsahern@kernel.org>
      Cc: Ido Schimmel <idosch@nvidia.com>
      Fixes: 812fa71f ("netfilter: Dissect flow after packet mangling")
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=217826
      
      
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Reviewed-by: default avatarIdo Schimmel <idosch@nvidia.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20230830110043.30497-1-fw@strlen.de
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      49acc5c5
    • Eric Dumazet's avatar
      net: read sk->sk_family once in sk_mc_loop() · ed4e0adf
      Eric Dumazet authored
      [ Upstream commit a3e0fdf7 ]
      
      syzbot is playing with IPV6_ADDRFORM quite a lot these days,
      and managed to hit the WARN_ON_ONCE(1) in sk_mc_loop()
      
      We have many more similar issues to fix.
      
      WARNING: CPU: 1 PID: 1593 at net/core/sock.c:782 sk_mc_loop+0x165/0x260
      Modules linked in:
      CPU: 1 PID: 1593 Comm: kworker/1:3 Not tainted 6.1.40-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
      Workqueue: events_power_efficient gc_worker
      RIP: 0010:sk_mc_loop+0x165/0x260 net/core/sock.c:782
      Code: 34 1b fd 49 81 c7 18 05 00 00 4c 89 f8 48 c1 e8 03 42 80 3c 20 00 74 08 4c 89 ff e8 25 36 6d fd 4d 8b 37 eb 13 e8 db 33 1b fd <0f> 0b b3 01 eb 34 e8 d0 33 1b fd 45 31 f6 49 83 c6 38 4c 89 f0 48
      RSP: 0018:ffffc90000388530 EFLAGS: 00010246
      RAX: ffffffff846d9b55 RBX: 0000000000000011 RCX: ffff88814f884980
      RDX: 0000000000000102 RSI: ffffffff87ae5160 RDI: 0000000000000011
      RBP: ffffc90000388550 R08: 0000000000000003 R09: ffffffff846d9a65
      R10: 0000000000000002 R11: ffff88814f884980 R12: dffffc0000000000
      R13: ffff88810dbee000 R14: 0000000000000010 R15: ffff888150084000
      FS: 0000000000000000(0000) GS:ffff8881f6b00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020000180 CR3: 000000014ee5b000 CR4: 00000000003506e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <IRQ>
      [<ffffffff8507734f>] ip6_finish_output2+0x33f/0x1ae0 net/ipv6/ip6_output.c:83
      [<ffffffff85062766>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline]
      [<ffffffff85062766>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211
      [<ffffffff85061f8c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline]
      [<ffffffff85061f8c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232
      [<ffffffff852071cf>] dst_output include/net/dst.h:444 [inline]
      [<ffffffff852071cf>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161
      [<ffffffff83618fb4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline]
      [<ffffffff83618fb4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline]
      [<ffffffff83618fb4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline]
      [<ffffffff83618fb4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677
      [<ffffffff8361ddd9>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229
      [<ffffffff84763fc0>] netdev_start_xmit include/linux/netdevice.h:4925 [inline]
      [<ffffffff84763fc0>] xmit_one net/core/dev.c:3644 [inline]
      [<ffffffff84763fc0>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660
      [<ffffffff8494c650>] sch_direct_xmit+0x2a0/0x9c0 net/sched/sch_generic.c:342
      [<ffffffff8494d883>] qdisc_restart net/sched/sch_generic.c:407 [inline]
      [<ffffffff8494d883>] __qdisc_run+0xb13/0x1e70 net/sched/sch_generic.c:415
      [<ffffffff8478c426>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
      [<ffffffff84796eac>] net_tx_action+0x7ac/0x940 net/core/dev.c:5247
      [<ffffffff858002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:599
      [<ffffffff814c3fe8>] invoke_softirq kernel/softirq.c:430 [inline]
      [<ffffffff814c3fe8>] __irq_exit_rcu+0xc8/0x170 kernel/softirq.c:683
      [<ffffffff814c3f09>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:695
      
      Fixes: 7ad6848c
      
       ("ip: fix mc_loop checks for tunnels with multicast outer addresses")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230830101244.1146934-1-edumazet@google.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      ed4e0adf
    • Eric Dumazet's avatar
      ipv4: annotate data-races around fi->fib_dead · e0b483a0
      Eric Dumazet authored
      [ Upstream commit fce92af1 ]
      
      syzbot complained about a data-race in fib_table_lookup() [1]
      
      Add appropriate annotations to document it.
      
      [1]
      BUG: KCSAN: data-race in fib_release_info / fib_table_lookup
      
      write to 0xffff888150f31744 of 1 bytes by task 1189 on cpu 0:
      fib_release_info+0x3a0/0x460 net/ipv4/fib_semantics.c:281
      fib_table_delete+0x8d2/0x900 net/ipv4/fib_trie.c:1777
      fib_magic+0x1c1/0x1f0 net/ipv4/fib_frontend.c:1106
      fib_del_ifaddr+0x8cf/0xa60 net/ipv4/fib_frontend.c:1317
      fib_inetaddr_event+0x77/0x200 net/ipv4/fib_frontend.c:1448
      notifier_call_chain kernel/notifier.c:93 [inline]
      blocking_notifier_call_chain+0x90/0x200 kernel/notifier.c:388
      __inet_del_ifa+0x4df/0x800 net/ipv4/devinet.c:432
      inet_del_ifa net/ipv4/devinet.c:469 [inline]
      inetdev_destroy net/ipv4/devinet.c:322 [inline]
      inetdev_event+0x553/0xaf0 net/ipv4/devinet.c:1606
      notifier_call_chain kernel/notifier.c:93 [inline]
      raw_notifier_call_chain+0x6b/0x1c0 kernel/notifier.c:461
      call_netdevice_notifiers_info net/core/dev.c:1962 [inline]
      call_netdevice_notifiers_mtu+0xd2/0x130 net/core/dev.c:2037
      dev_set_mtu_ext+0x30b/0x3e0 net/core/dev.c:8673
      do_setlink+0x5be/0x2430 net/core/rtnetlink.c:2837
      rtnl_setlink+0x255/0x300 net/core/rtnetlink.c:3177
      rtnetlink_rcv_msg+0x807/0x8c0 net/core/rtnetlink.c:6445
      netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2549
      rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6463
      netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
      netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
      netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1914
      sock_sendmsg_nosec net/socket.c:725 [inline]
      sock_sendmsg net/socket.c:748 [inline]
      sock_write_iter+0x1aa/0x230 net/socket.c:1129
      do_iter_write+0x4b4/0x7b0 fs/read_write.c:860
      vfs_writev+0x1a8/0x320 fs/read_write.c:933
      do_writev+0xf8/0x220 fs/read_write.c:976
      __do_sys_writev fs/read_write.c:1049 [inline]
      __se_sys_writev fs/read_write.c:1046 [inline]
      __x64_sys_writev+0x45/0x50 fs/read_write.c:1046
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff888150f31744 of 1 bytes by task 21839 on cpu 1:
      fib_table_lookup+0x2bf/0xd50 net/ipv4/fib_trie.c:1585
      fib_lookup include/net/ip_fib.h:383 [inline]
      ip_route_output_key_hash_rcu+0x38c/0x12c0 net/ipv4/route.c:2751
      ip_route_output_key_hash net/ipv4/route.c:2641 [inline]
      __ip_route_output_key include/net/route.h:134 [inline]
      ip_route_output_flow+0xa6/0x150 net/ipv4/route.c:2869
      send4+0x1e7/0x500 drivers/net/wireguard/socket.c:61
      wg_socket_send_skb_to_peer+0x94/0x130 drivers/net/wireguard/socket.c:175
      wg_socket_send_buffer_to_peer+0xd6/0x100 drivers/net/wireguard/socket.c:200
      wg_packet_send_handshake_initiation drivers/net/wireguard/send.c:40 [inline]
      wg_packet_handshake_send_worker+0x10c/0x150 drivers/net/wireguard/send.c:51
      process_one_work+0x434/0x860 kernel/workqueue.c:2600
      worker_thread+0x5f2/0xa10 kernel/workqueue.c:2751
      kthread+0x1d7/0x210 kernel/kthread.c:389
      ret_from_fork+0x2e/0x40 arch/x86/kernel/process.c:145
      ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      value changed: 0x00 -> 0x01
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 21839 Comm: kworker/u4:18 Tainted: G W 6.5.0-syzkaller #0
      
      Fixes: dccd9ecc
      
       ("ipv4: Do not use dead fib_info entries.")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20230830095520.1046984-1-edumazet@google.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      e0b483a0
    • Eric Dumazet's avatar
      sctp: annotate data-races around sk->sk_wmem_queued · 74df0319
      Eric Dumazet authored
      [ Upstream commit dc9511dd ]
      
      sk->sk_wmem_queued can be read locklessly from sctp_poll()
      
      Use sk_wmem_queued_add() when the field is changed,
      and add READ_ONCE() annotations in sctp_writeable()
      and sctp_assocs_seq_show()
      
      syzbot reported:
      
      BUG: KCSAN: data-race in sctp_poll / sctp_wfree
      
      read-write to 0xffff888149d77810 of 4 bytes by interrupt on cpu 0:
      sctp_wfree+0x170/0x4a0 net/sctp/socket.c:9147
      skb_release_head_state+0xb7/0x1a0 net/core/skbuff.c:988
      skb_release_all net/core/skbuff.c:1000 [inline]
      __kfree_skb+0x16/0x140 net/core/skbuff.c:1016
      consume_skb+0x57/0x180 net/core/skbuff.c:1232
      sctp_chunk_destroy net/sctp/sm_make_chunk.c:1503 [inline]
      sctp_chunk_put+0xcd/0x130 net/sctp/sm_make_chunk.c:1530
      sctp_datamsg_put+0x29a/0x300 net/sctp/chunk.c:128
      sctp_chunk_free+0x34/0x50 net/sctp/sm_make_chunk.c:1515
      sctp_outq_sack+0xafa/0xd70 net/sctp/outqueue.c:1381
      sctp_cmd_process_sack net/sctp/sm_sideeffect.c:834 [inline]
      sctp_cmd_interpreter net/sctp/sm_sideeffect.c:1366 [inline]
      sctp_side_effects net/sctp/sm_sideeffect.c:1198 [inline]
      sctp_do_sm+0x12c7/0x31b0 net/sctp/sm_sideeffect.c:1169
      sctp_assoc_bh_rcv+0x2b2/0x430 net/sctp/associola.c:1051
      sctp_inq_push+0x108/0x120 net/sctp/inqueue.c:80
      sctp_rcv+0x116e/0x1340 net/sctp/input.c:243
      sctp6_rcv+0x25/0x40 net/sctp/ipv6.c:1120
      ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437
      ip6_input_finish net/ipv6/ip6_input.c:482 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491
      dst_input include/net/dst.h:468 [inline]
      ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309
      __netif_receive_skb_one_core net/core/dev.c:5452 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566
      process_backlog+0x21f/0x380 net/core/dev.c:5894
      __napi_poll+0x60/0x3b0 net/core/dev.c:6460
      napi_poll net/core/dev.c:6527 [inline]
      net_rx_action+0x32b/0x750 net/core/dev.c:6660
      __do_softirq+0xc1/0x265 kernel/softirq.c:553
      run_ksoftirqd+0x17/0x20 kernel/softirq.c:921
      smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164
      kthread+0x1d7/0x210 kernel/kthread.c:389
      ret_from_fork+0x2e/0x40 arch/x86/kernel/process.c:145
      ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
      
      read to 0xffff888149d77810 of 4 bytes by task 17828 on cpu 1:
      sctp_writeable net/sctp/socket.c:9304 [inline]
      sctp_poll+0x265/0x410 net/sctp/socket.c:8671
      sock_poll+0x253/0x270 net/socket.c:1374
      vfs_poll include/linux/poll.h:88 [inline]
      do_pollfd fs/select.c:873 [inline]
      do_poll fs/select.c:921 [inline]
      do_sys_poll+0x636/0xc00 fs/select.c:1015
      __do_sys_ppoll fs/select.c:1121 [inline]
      __se_sys_ppoll+0x1af/0x1f0 fs/select.c:1101
      __x64_sys_ppoll+0x67/0x80 fs/select.c:1101
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x00019e80 -> 0x0000cc80
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 17828 Comm: syz-executor.1 Not tainted 6.5.0-rc7-syzkaller-00185-g28f20a19294d #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
      
      Fixes: 1da177e4
      
       ("Linux-2.6.12-rc2")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Acked-by: default avatarXin Long <lucien.xin@gmail.com>
      Link: https://lore.kernel.org/r/20230830094519.950007-1-edumazet@google.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      74df0319
    • Eric Dumazet's avatar
      net/sched: fq_pie: avoid stalls in fq_pie_timer() · 973a4c30
      Eric Dumazet authored
      [ Upstream commit 8c21ab1b ]
      
      When setting a high number of flows (limit being 65536),
      fq_pie_timer() is currently using too much time as syzbot reported.
      
      Add logic to yield the cpu every 2048 flows (less than 150 usec
      on debug kernels).
      It should also help by not blocking qdisc fast paths for too long.
      Worst case (65536 flows) would need 31 jiffies for a complete scan.
      
      Relevant extract from syzbot report:
      
      rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 0-.... } 2663 jiffies s: 873 root: 0x1/.
      rcu: blocking rcu_node structures (internal RCU debug):
      Sending NMI from CPU 1 to CPUs 0:
      NMI backtrace for cpu 0
      CPU: 0 PID: 5177 Comm: syz-executor273 Not tainted 6.5.0-syzkaller-00453-g727dbda16b83 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
      RIP: 0010:check_kcov_mode kernel/kcov.c:173 [inline]
      RIP: 0010:write_comp_data+0x21/0x90 kernel/kcov.c:236
      Code: 2e 0f 1f 84 00 00 00 00 00 65 8b 05 01 b2 7d 7e 49 89 f1 89 c6 49 89 d2 81 e6 00 01 00 00 49 89 f8 65 48 8b 14 25 80 b9 03 00 <a9> 00 01 ff 00 74 0e 85 f6 74 59 8b 82 04 16 00 00 85 c0 74 4f 8b
      RSP: 0018:ffffc90000007bb8 EFLAGS: 00000206
      RAX: 0000000000000101 RBX: ffffc9000dc0d140 RCX: ffffffff885893b0
      RDX: ffff88807c075940 RSI: 0000000000000100 RDI: 0000000000000001
      RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffffc9000dc0d178
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000555555d54380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007f6b442f6130 CR3: 000000006fe1c000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       <NMI>
       </NMI>
       <IRQ>
       pie_calculate_probability+0x480/0x850 net/sched/sch_pie.c:415
       fq_pie_timer+0x1da/0x4f0 net/sched/sch_fq_pie.c:387
       call_timer_fn+0x1a0/0x580 kernel/time/timer.c:1700
      
      Fixes: ec97ecf1 ("net: sched: add Flow Queue PIE packet scheduler")
      Link: https://lore.kernel.org/lkml/00000000000017ad3f06040bf394@google.com/
      
      
      Reported-by: default avatar <syzbot+e46fbd5289363464bc13@syzkaller.appspotmail.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarMichal Kubiak <michal.kubiak@intel.com>
      Reviewed-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Link: https://lore.kernel.org/r/20230829123541.3745013-1-edumazet@google.com
      
      
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      973a4c30
    • Vladimir Zapolskiy's avatar
      pwm: lpc32xx: Remove handling of PWM channels · 5e22217c
      Vladimir Zapolskiy authored
      [ Upstream commit 4aae44f6
      
       ]
      
      Because LPC32xx PWM controllers have only a single output which is
      registered as the only PWM device/channel per controller, it is known in
      advance that pwm->hwpwm value is always 0. On basis of this fact
      simplify the code by removing operations with pwm->hwpwm, there is no
      controls which require channel number as input.
      
      Even though I wasn't aware at the time when I forward ported that patch,
      this fixes a null pointer dereference as lpc32xx->chip.pwms is NULL
      before devm_pwmchip_add() is called.
      
      Reported-by: default avatarDan Carpenter <dan.carpenter@linaro.org>
      Signed-off-by: default avatarVladimir Zapolskiy <vz@mleia.com>
      Signed-off-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Fixes: 3d2813fb
      
       ("pwm: lpc32xx: Don't modify HW state in .probe() after the PWM chip was registered")
      Signed-off-by: default avatarThierry Reding <thierry.reding@gmail.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      5e22217c
    • Raag Jadav's avatar
      watchdog: intel-mid_wdt: add MODULE_ALIAS() to allow auto-load · 67615226
      Raag Jadav authored
      [ Upstream commit cf38e769 ]
      
      When built with CONFIG_INTEL_MID_WATCHDOG=m, currently the driver
      needs to be loaded manually, for the lack of module alias.
      This causes unintended resets in cases where watchdog timer is
      set-up by bootloader and the driver is not explicitly loaded.
      Add MODULE_ALIAS() to load the driver automatically at boot and
      avoid this issue.
      
      Fixes: 87a1ef80
      
       ("watchdog: add Intel MID watchdog driver support")
      Signed-off-by: default avatarRaag Jadav <raag.jadav@intel.com>
      Reviewed-by: default avatarAndy Shevchenko <andriy.shevchenko@linux.intel.com>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Link: https://lore.kernel.org/r/20230811120220.31578-1-raag.jadav@intel.com
      
      
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarWim Van Sebroeck <wim@linux-watchdog.org>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      67615226
    • Arnaldo Carvalho de Melo's avatar
      perf top: Don't pass an ERR_PTR() directly to perf_session__delete() · d6aa2be1
      Arnaldo Carvalho de Melo authored
      [ Upstream commit ef23cb59 ]
      
      While debugging a segfault on 'perf lock contention' without an
      available perf.data file I noticed that it was basically calling:
      
      	perf_session__delete(ERR_PTR(-1))
      
      Resulting in:
      
        (gdb) run lock contention
        Starting program: /root/bin/perf lock contention
        [Thread debugging using libthread_db enabled]
        Using host libthread_db library "/lib64/libthread_db.so.1".
        failed to open perf.data: No such file or directory  (try 'perf record' first)
        Initializing perf session failed
      
        Program received signal SIGSEGV, Segmentation fault.
        0x00000000005e7515 in auxtrace__free (session=0xffffffffffffffff) at util/auxtrace.c:2858
        2858		if (!session->auxtrace)
        (gdb) p session
        $1 = (struct perf_session *) 0xffffffffffffffff
        (gdb) bt
        #0  0x00000000005e7515 in auxtrace__free (session=0xffffffffffffffff) at util/auxtrace.c:2858
        #1  0x000000000057bb4d in perf_session__delete (session=0xffffffffffffffff) at util/session.c:300
        #2  0x000000000047c421 in __cmd_contention (argc=0, argv=0x7fffffffe200) at builtin-lock.c:2161
        #3  0x000000000047dc95 in cmd_lock (argc=0, argv=0x7fffffffe200) at builtin-lock.c:2604
        #4  0x0000000000501466 in run_builtin (p=0xe597a8 <commands+552>, argc=2, argv=0x7fffffffe200) at perf.c:322
        #5  0x00000000005016d5 in handle_internal_command (argc=2, argv=0x7fffffffe200) at perf.c:375
        #6  0x0000000000501824 in run_argv (argcp=0x7fffffffe02c, argv=0x7fffffffe020) at perf.c:419
        #7  0x0000000000501b11 in main (argc=2, argv=0x7fffffffe200) at perf.c:535
        (gdb)
      
      So just set it to NULL after using PTR_ERR(session) to decode the error
      as perf_session__delete(NULL) is supported.
      
      The same problem was found in 'perf top' after an audit of all
      perf_session__new() failure handling.
      
      Fixes: 6ef81c55 ("perf session: Return error code for perf_session__new() function on failure")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: Mamatha Inamdar <mamatha4@linux.vnet.ibm.com>
      Cc: Mukesh Ojha <mojha@codeaurora.org>
      Cc: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
      Cc: Shawn Landden <shawn@git.icu>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com>
      Link: https://lore.kernel.org/lkml/ZN4Q2rxxsL08A8rd@kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      d6aa2be1
    • Kajol Jain's avatar
      perf vendor events: Drop some of the JSON/events for power10 platform · 79bd17c9
      Kajol Jain authored
      [ Upstream commit e104df97 ]
      
      Drop some of the JSON/events for power10 platform due to counter
      data mismatch.
      
      Fixes: 32daa5d7
      
       ("perf vendor events: Initial JSON/events list for power10 platform")
      Signed-off-by: default avatarKajol Jain <kjain@linux.ibm.com>
      Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Cc: Disha Goel <disgoel@linux.ibm.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: linuxppc-dev@lists.ozlabs.org
      Link: https://lore.kernel.org/r/20230814112803.1508296-2-kjain@linux.ibm.com
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarSasha Levin <sashal@kernel.org>
      79bd17c9