  1. Aug 23, 2023
    • arm64: dts: rockchip: Disable HS400 for eMMC on ROCK Pi 4 · 52d3607d
      Christopher Obbard authored
      [ Upstream commit cee57275 ]
      
      Some eMMC modules on ROCK Pi 4 SBCs are unstable when running in HS400
      mode. This results in block errors after a while or after a "heavy"
      operation utilising the eMMC (e.g. resizing a filesystem). An example
      of these errors is as follows:
      
          [  289.171014] mmc1: running CQE recovery
          [  290.048972] mmc1: running CQE recovery
          [  290.054834] mmc1: running CQE recovery
          [  290.060817] mmc1: running CQE recovery
          [  290.061337] blk_update_request: I/O error, dev mmcblk1, sector 1411072 op 0x1:(WRITE) flags 0x800 phys_seg 36 prio class 0
          [  290.061370] EXT4-fs warning (device mmcblk1p1): ext4_end_bio:348: I/O error 10 writing to inode 29547 starting block 176466)
          [  290.061484] Buffer I/O error on device mmcblk1p1, logical block 172288
          [  290.061531] Buffer I/O error on device mmcblk1p1, logical block 172289
          [  290.061551] Buffer I/O error on device mmcblk1p1, logical block 172290
          [  290.061574] Buffer I/O error on device mmcblk1p1, logical block 172291
          [  290.061592] Buffer I/O error on device mmcblk1p1, logical block 172292
          [  290.061615] Buffer I/O error on device mmcblk1p1, logical block 172293
          [  290.061632] Buffer I/O error on device mmcblk1p1, logical block 172294
          [  290.061654] Buffer I/O error on device mmcblk1p1, logical block 172295
          [  290.061673] Buffer I/O error on device mmcblk1p1, logical block 172296
          [  290.061695] Buffer I/O error on device mmcblk1p1, logical block 172297
      
      Disabling the Command Queue seems to stop the CQE recovery from running,
      but doesn't seem to improve the I/O errors. Until this can be investigated
      further, disable HS400 mode on the ROCK Pi 4 SBCs to at least stop I/O
      errors from occurring.
      
      While we are here, set the eMMC maximum clock frequency to 150 MHz to
      follow the ROCK 4C+.
      
      Fixes: 1b5715c6 ("arm64: dts: rockchip: add ROCK Pi 4 DTS support")
      Signed-off-by: Christopher Obbard <chris.obbard@collabora.com>
      Tested-By: Folker Schwesinger <dev@folker-schwesinger.de>
      Link: https://lore.kernel.org/r/20230705144255.115299-2-chris.obbard@collabora.com
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: dts: qcom: qrb5165-rb5: fix thermal zone conflict · 9657a754
      Dmitry Baryshkov authored
      [ Upstream commit 798f1df8 ]
      
      The commit 3a786086 ("arm64: dts: qcom: Add missing "-thermal"
      suffix for thermal zones") renamed the thermal zone in the pm8150l.dtsi
      file to comply with the schema. However this resulted in a clash with
      the RB5 board file, which already contained the pm8150l-thermal zone for
      the on-board sensor. This resulted in the board file definition
      overriding the thermal zone defined in the PMIC include file (and thus
      the on-die PMIC temp alarm was not probing at all).
      
      Rename the thermal zone in qcom/qrb5165-rb5.dts to remove this override.
      
      Fixes: 3a786086 ("arm64: dts: qcom: Add missing "-thermal" suffix for thermal zones")
      Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
      Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
      Link: https://lore.kernel.org/r/20230613131224.666668-1-dmitry.baryshkov@linaro.org
      Signed-off-by: Bjorn Andersson <andersson@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bus: ti-sysc: Flush posted write on enable before reset · fae3868b
      Tony Lindgren authored
      [ Upstream commit 34539b44 ]
      
      The am335x devices started producing boot errors when resetting the musb
      module because of subtle timing changes:
      
      Unhandled fault: external abort on non-linefetch (0x1008)
      ...
      sysc_poll_reset_sysconfig from sysc_reset+0x109/0x12
      sysc_reset from sysc_probe+0xa99/0xeb0
      ...
      
      The fix is to flush posted write after enable before reset during
      probe. Note that some devices also need to specify the delay after enable
      with ti,sysc-delay-us, but this is not needed for musb on am335x based on
      my tests.
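
The "flush posted write" idea above can be sketched in plain C as a write followed by a read-back of the same register. This is only an illustration of the general pattern, not the code from the patch; the helper name `reg_write_flush` is hypothetical, and in the kernel the accesses would go through the usual MMIO accessors rather than a bare pointer.

```c
#include <stdint.h>

/*
 * Hypothetical helper: write a register, then read it back so the
 * caller only continues once the write has reached the device.
 */
static inline uint32_t reg_write_flush(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;   /* on real hardware this write may be posted (buffered) */
	return *reg;  /* reading the same register forces it to complete */
}
```

On ordinary memory the read-back has no ordering effect; it only matters for device registers behind an interconnect that posts writes.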
      
      Reported-by: kernelci.org bot <bot@kernelci.org>
      Closes: https://storage.kernelci.org/next/master/next-20230614/arm/multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y/gcc-10/lab-cip/baseline-beaglebone-black.html
      Fixes: 596e7955 ("bus: ti-sysc: Add support for software reset")
      Signed-off-by: Tony Lindgren <tony@atomide.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ice: Block switchdev mode when ADQ is active and vice versa · 1c82d1b7
      Marcin Szycik authored
      [ Upstream commit 43d00e10 ]
      
      ADQ and switchdev are not supported simultaneously. Enabling both at the
      same time can result in a NULL pointer dereference.
      
      To prevent this, check if ADQ is active when changing devlink mode to
      switchdev mode, and check if switchdev is active when enabling ADQ.
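
The two checks described above amount to a mutual-exclusion guard. A minimal sketch follows, assuming a hypothetical `struct nic_state` and helper names; the real driver state, function names, and error code are not shown in this message, and `-EOPNOTSUPP` is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two mutually exclusive features. */
struct nic_state {
	bool adq_active;
	bool switchdev_active;
};

/* Refuse to enter switchdev mode while ADQ is active. */
int nic_enable_switchdev(struct nic_state *s)
{
	if (s->adq_active)
		return -EOPNOTSUPP;	/* assumed error code */
	s->switchdev_active = true;
	return 0;
}

/* Refuse to enable ADQ while switchdev mode is active. */
int nic_enable_adq(struct nic_state *s)
{
	if (s->switchdev_active)
		return -EOPNOTSUPP;	/* assumed error code */
	s->adq_active = true;
	return 0;
}
```

Whichever feature is enabled first wins; the second request fails instead of leaving the device in a state where both are active.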
      
      Fixes: fbc7b27a ("ice: enable ndo_setup_tc support for mqprio_qdisc")
      Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20230816193405.1307580-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • qede: fix firmware halt over suspend and resume · fbc7b1da
      Manish Chopra authored
      [ Upstream commit 2eb9625a ]
      
      While performing certain power-off sequences, PCI drivers are
      called to suspend and resume their underlying devices through
      the PCI PM (power management) interface. However, this NIC hardware
      does not support PCI PM suspend/resume operations, so system-wide
      suspend/resume leaves the MFW (management firmware) in a bad state,
      which causes various follow-up errors in the driver when it
      communicates with the device/firmware afterwards.
      
      To fix this, the driver implements a PCI PM suspend handler that
      explicitly reports the operation as unsupported to the PCI subsystem,
      thus preventing the system from entering suspended/standby mode.
      
      Without this fix, the device/firmware does not recover unless the
      system is power cycled.
      
      Fixes: 2950219d ("qede: Add basic network device support")
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Alok Prasad <palok@marvell.com>
      Reviewed-by: John Meneghini <jmeneghi@redhat.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20230816150711.59035-1-manishc@marvell.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: do not allow gso_size to be set to GSO_BY_FRAGS · 2e03a92b
      Eric Dumazet authored
      [ Upstream commit b616be6b ]
      
      One missing check in virtio_net_hdr_to_skb() allowed
      syzbot to crash kernels again [1]
      
      Do not allow gso_size to be set to GSO_BY_FRAGS (0xffff),
      because this magic value is used by the kernel.
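
The missing check can be illustrated with a small standalone validator. This is a sketch only: the function name `check_user_gso_size` and the `-EINVAL` return value are assumptions, and the real check lives inside virtio_net_hdr_to_skb() alongside other validation that is not reproduced here.

```c
#include <errno.h>
#include <stdint.h>

/* Magic value reserved for kernel-internal use (0xffff per the text above). */
#define GSO_BY_FRAGS 0xFFFF

/* Reject a user-supplied gso_size that collides with the magic value. */
int check_user_gso_size(uint16_t gso_size)
{
	if (gso_size == GSO_BY_FRAGS)
		return -EINVAL;	/* assumed error code */
	return 0;
}
```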
      
      [1]
      general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
      KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
      CPU: 0 PID: 5039 Comm: syz-executor401 Not tainted 6.5.0-rc5-next-20230809-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023
      RIP: 0010:skb_segment+0x1a52/0x3ef0 net/core/skbuff.c:4500
      Code: 00 00 00 e9 ab eb ff ff e8 6b 96 5d f9 48 8b 84 24 00 01 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e ea 21 00 00 48 8b 84 24 00 01
      RSP: 0018:ffffc90003d3f1c8 EFLAGS: 00010202
      RAX: dffffc0000000000 RBX: 000000000001fffe RCX: 0000000000000000
      RDX: 000000000000000e RSI: ffffffff882a3115 RDI: 0000000000000070
      RBP: ffffc90003d3f378 R08: 0000000000000005 R09: 000000000000ffff
      R10: 000000000000ffff R11: 5ee4a93e456187d6 R12: 000000000001ffc6
      R13: dffffc0000000000 R14: 0000000000000008 R15: 000000000000ffff
      FS: 00005555563f2380(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000020020000 CR3: 000000001626d000 CR4: 00000000003506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      <TASK>
      udp6_ufo_fragment+0x9d2/0xd50 net/ipv6/udp_offload.c:109
      ipv6_gso_segment+0x5c4/0x17b0 net/ipv6/ip6_offload.c:120
      skb_mac_gso_segment+0x292/0x610 net/core/gso.c:53
      __skb_gso_segment+0x339/0x710 net/core/gso.c:124
      skb_gso_segment include/net/gso.h:83 [inline]
      validate_xmit_skb+0x3a5/0xf10 net/core/dev.c:3625
      __dev_queue_xmit+0x8f0/0x3d60 net/core/dev.c:4329
      dev_queue_xmit include/linux/netdevice.h:3082 [inline]
      packet_xmit+0x257/0x380 net/packet/af_packet.c:276
      packet_snd net/packet/af_packet.c:3087 [inline]
      packet_sendmsg+0x24c7/0x5570 net/packet/af_packet.c:3119
      sock_sendmsg_nosec net/socket.c:727 [inline]
      sock_sendmsg+0xd9/0x180 net/socket.c:750
      ____sys_sendmsg+0x6ac/0x940 net/socket.c:2496
      ___sys_sendmsg+0x135/0x1d0 net/socket.c:2550
      __sys_sendmsg+0x117/0x1e0 net/socket.c:2579
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      RIP: 0033:0x7ff27cdb34d9
      
      Fixes: 3953c46c ("sk_buff: allow segmenting based on frag sizes")
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Cc: "Michael S. Tsirkin" <mst@redhat.com>
      Cc: Jason Wang <jasowang@redhat.com>
      Reviewed-by: Willem de Bruijn <willemb@google.com>
      Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Link: https://lore.kernel.org/r/20230816142158.1779798-1-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • sock: Fix misuse of sk_under_memory_pressure() · 06b8f06f
      Abel Wu authored
      [ Upstream commit 2d0c88e8 ]
      
      The status of global socket memory pressure is updated when:
      
        a) __sk_mem_raise_allocated():
      
      	enter: sk_memory_allocated(sk) >  sysctl_mem[1]
      	leave: sk_memory_allocated(sk) <= sysctl_mem[0]
      
        b) __sk_mem_reduce_allocated():
      
      	leave: sk_under_memory_pressure(sk) &&
      		sk_memory_allocated(sk) < sysctl_mem[0]
      
      So the conditions for leaving global pressure are inconsistent, which
      may lead to the situation where one pressured net-memcg prevents the
      global pressure from being cleared when there is in fact no global
      pressure, so the global constraints unexpectedly remain in effect
      on the other sockets.
      
      This patch fixes this by ignoring the net-memcg's pressure when
      deciding whether to leave global memory pressure.
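
The intended post-fix decision in case b) can be modelled in a few lines. This is a simplified sketch with a hypothetical `struct sock_model`; the field names stand in for the kernel's accounting (`mem_low` plays the role of sysctl_mem[0]) and do not match the real structures.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the state involved in the decision. */
struct sock_model {
	long allocated;        /* stands in for sk_memory_allocated(sk) */
	long mem_low;          /* stands in for sysctl_mem[0] */
	bool global_pressure;  /* global socket memory pressure flag */
	bool memcg_pressure;   /* pressure reported by the socket's net-memcg */
};

/*
 * Decide whether to leave global memory pressure when memory is released.
 * memcg_pressure is deliberately not consulted: only the global state
 * takes part in a decision about global pressure.
 */
bool should_leave_global_pressure(const struct sock_model *sk)
{
	return sk->global_pressure && sk->allocated < sk->mem_low;
}
```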
      
      Fixes: e1aab161 ("socket: initial cgroup code.")
      Signed-off-by: Abel Wu <wuyun.abel@bytedance.com>
      Acked-by: Shakeel Butt <shakeelb@google.com>
      Link: https://lore.kernel.org/r/20230816091226.1542-1-wuyun.abel@bytedance.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • sfc: don't unregister flow_indr if it was never registered · 3d820924
      Edward Cree authored
      [ Upstream commit fa165e19 ]
      
      In efx_init_tc(), move the setting of efx->tc->up after the
       flow_indr_dev_register() call, so that if it fails, efx_fini_tc()
       won't call flow_indr_dev_unregister().
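
The ordering described above, setting the "up" flag only after registration has succeeded, can be sketched with stand-in functions. All names below (`tc_model`, `tc_init`, `tc_fini`, the `fake_*` helpers) are hypothetical; they only mimic the roles of efx_init_tc(), efx_fini_tc() and the flow_indr registration calls.

```c
#include <stdbool.h>

struct tc_model {
	bool up;	/* set only once registration has succeeded */
};

static int unregister_calls;	/* counts calls to the fake unregister */

static int fake_register_ok(void)   { return 0; }
static int fake_register_fail(void) { return -1; }
static void fake_unregister(void)   { unregister_calls++; }

int tc_init(struct tc_model *tc, int (*do_register)(void))
{
	int rc = do_register();

	if (rc)
		return rc;	/* tc->up stays false on failure */
	tc->up = true;
	return 0;
}

void tc_fini(struct tc_model *tc)
{
	if (!tc->up)
		return;		/* never registered, so nothing to undo */
	fake_unregister();
	tc->up = false;
}
```

With the flag set before the registration call instead, a failed `tc_init` followed by `tc_fini` would unregister something that was never registered.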
      
      Fixes: 5b2e12d5 ("sfc: bind indirect blocks for TC offload on EF100")
      Suggested-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
      Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
      Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
      Link: https://lore.kernel.org/r/a81284d7013aba74005277bd81104e4cfbea3f6f.1692114888.git.ecree.xilinx@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: dsa: mv88e6xxx: Wait for EEPROM done before HW reset · df83af3b
      Alfred Lee authored
      [ Upstream commit 23d775f1 ]
      
      If the switch is reset during active EEPROM transactions, as in
      just after an SoC reset after power up, the I2C bus transaction
      may be cut short, leaving the EEPROM internal I2C state machine
      in the wrong state. When the switch is reset again, the bad
      state machine state may result in data being read from the wrong
      memory location, causing the switch to enter an unexpected mode
      and rendering it inoperable.
      
      Fixes: a3dcb3e7 ("net: dsa: mv88e6xxx: Wait for EEPROM done after HW reset")
      Signed-off-by: Alfred Lee <l00g33k@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20230815001323.24739-1-l00g33k@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i40e: fix misleading debug logs · 74092431
      Andrii Staikov authored
      [ Upstream commit 2f2beb88 ]
      
      Change "write" into the actual "read" word.
      Change parameters description.
      
      Fixes: 7073f46e ("i40e: Add AQ commands for NVM Update for X722")
      Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
      Signed-off-by: Andrii Staikov <andrii.staikov@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • iavf: fix FDIR rule fields masks validation · ea749b5e
      Piotr Gardocki authored
      [ Upstream commit 751969e5 ]
      
      Return an error if a field's mask is neither full nor empty. When a mask
      is only partial, the field is not used for rule programming, but it
      gives the wrong impression that it is. Fix by returning an error on any
      partial mask to make it clear that partial masks are not supported.
      The ip_ver assignment is moved earlier in the code to allow using it in
      iavf_validate_fdir_fltr_masks.
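
The "full or empty" rule can be shown on a single 32-bit field. This is a minimal sketch with a hypothetical helper name; the driver validates several fields of different widths, which is not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* A mask is acceptable only if no bits or all bits are set. */
bool fdir_mask_supported(uint32_t mask)
{
	return mask == 0 || mask == UINT32_MAX;
}
```

For example, a /24-style mask such as 0xFFFFFF00 is partial and would be rejected rather than silently ignored.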
      
      Fixes: 527691bf ("iavf: Support IPv4 Flow Director filters")
      Fixes: e90cbc25 ("iavf: Support IPv6 Flow Director filters")
      Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: openvswitch: reject negative ifindex · c965a583
      Jakub Kicinski authored
      [ Upstream commit a552bfa1 ]
      
      Recent changes in net-next (commit 759ab1ed ("net: store netdevs
      in an xarray")) refactored the handling of pre-assigned ifindexes
      and let syzbot surface a latent problem in ovs. ovs does not validate
      ifindex, making it possible to create netdev ports with negative
      ifindex values. It's easy to repro with YNL:
      
      $ ./cli.py --spec netlink/specs/ovs_datapath.yaml \
               --do new \
      	 --json '{"upcall-pid": 1, "name":"my-dp"}'
      $ ./cli.py --spec netlink/specs/ovs_vport.yaml \
      	 --do new \
      	 --json '{"upcall-pid": "00000001", "name": "some-port0", "dp-ifindex":3,"ifindex":4294901760,"type":2}'
      
      $ ip link show
      -65536: some-port0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
          link/ether 7a:48:21:ad:0b:fb brd ff:ff:ff:ff:ff:ff
      ...
      
      Validate the inputs. Now the second command correctly returns:
      
      $ ./cli.py --spec netlink/specs/ovs_vport.yaml \
      	 --do new \
      	 --json '{"upcall-pid": "00000001", "name": "some-port0", "dp-ifindex":3,"ifindex":4294901760,"type":2}'
      
      lib.ynl.NlError: Netlink error: Numerical result out of range
      nl_len = 108 (92) nl_flags = 0x300 nl_type = 2
      	error: -34	extack: {'msg': 'integer out of range', 'unknown': [[type:4 len:36] b'\x0c\x00\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x0c\x00\x03\x00\xff\xff\xff\x7f\x00\x00\x00\x00\x08\x00\x01\x00\x08\x00\x00\x00'], 'bad-attr': '.ifindex'}
      
      Accept 0 since it used to be silently ignored.
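
To see why the value from the reproducer shows up as -65536, and what the added validation rejects, here is a small userspace sketch. The helper names are hypothetical, and the range check is written as ordinary C rather than as the netlink policy the kernel uses; `-ERANGE` matches the "error: -34" shown above.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the 32-bit attribute payload as a signed value. */
int32_t attr_as_s32(uint32_t raw)
{
	int32_t v;

	memcpy(&v, &raw, sizeof(v));
	return v;
}

/* Accept 0 and positive values; reject negative ones as out of range. */
int validate_ifindex(int32_t ifindex)
{
	if (ifindex < 0)
		return -ERANGE;
	return 0;
}
```

Here `attr_as_s32(4294901760u)` yields -65536, the ifindex that `ip link show` printed before the fix.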
      
      Fixes: 54c4ef34 ("openvswitch: allow specifying ifindex of new interfaces")
      Reported-by: <syzbot+7456b5dcf65111553320@syzkaller.appspotmail.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Reviewed-by: Aaron Conole <aconole@redhat.com>
      Link: https://lore.kernel.org/r/20230814203840.2908710-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • team: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves · d5e4c0e7
      Ziyang Xuan authored
      [ Upstream commit dafcbce0 ]
      
      Similar to commit 01f4fd27 ("bonding: Fix incorrect deletion of
      ETH_P_8021AD protocol vid from slaves"), we can trigger BUG_ON(!vlan_info)
      in unregister_vlan_dev() with the following testcase:
      
        # ip netns add ns1
        # ip netns exec ns1 ip link add team1 type team
        # ip netns exec ns1 ip link add team_slave type veth peer veth2
        # ip netns exec ns1 ip link set team_slave master team1
        # ip netns exec ns1 ip link add link team_slave name team_slave.10 type vlan id 10 protocol 802.1ad
        # ip netns exec ns1 ip link add link team1 name team1.10 type vlan id 10 protocol 802.1ad
        # ip netns exec ns1 ip link set team_slave nomaster
        # ip netns del ns1
      
      Add S-VLAN tag related features support to team driver. So the team driver
      will always propagate the VLAN info to its slaves.
      
      Fixes: 8ad227ff ("net: vlan: add 802.1ad support")
      Suggested-by: Ido Schimmel <idosch@idosch.org>
      Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Link: https://lore.kernel.org/r/20230814032301.2804971-1-william.xuanziyang@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: phy: broadcom: stub c45 read/write for 54810 · 85bd0af9
      Justin Chen authored
      [ Upstream commit 096516d0 ]
      
      The 54810 does not support c45. The mmd_phy_indirect accesses return
      arbitrary values, leading to odd behavior like saying it supports EEE
      when it doesn't. We also see that reading/writing these non-existent
      MMD registers leads to PHY instability in some cases.
      
      Fixes: b14995ac ("net: phy: broadcom: Add BCM54810 PHY entry")
      Signed-off-by: Justin Chen <justin.chen@broadcom.com>
      Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Link: https://lore.kernel.org/r/1691901708-28650-1-git-send-email-justin.chen@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • netfilter: nft_dynset: disallow object maps · 7148bca6
      Pablo Neira Ayuso authored
      [ Upstream commit 23185c6a ]
      
      Do not allow inserting elements from the datapath into object maps.
      
      Fixes: 8aeff920 ("netfilter: nf_tables: add stateful object reference to set elements")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ipvs: fix racy memcpy in proc_do_sync_threshold · 7f8a160d
      Sishuai Gong authored
      [ Upstream commit 5310760a ]
      
      When two threads run proc_do_sync_threshold() in parallel,
      data races could happen between the two memcpy():
      
      Thread-1			Thread-2
      memcpy(val, valp, sizeof(val));
      				memcpy(valp, val, sizeof(val));
      
      This race might mess up the (struct ctl_table *) table->data,
      so we add a mutex lock to serialize them.
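
The serialization described above can be sketched in userspace with a pthread mutex standing in for the kernel mutex. The names `threshold`, `threshold_read` and `threshold_write` are hypothetical, and the sysctl parsing and value checks done by the real handler are left out.

```c
#include <pthread.h>
#include <string.h>

/* One lock protects every copy to or from the shared pair of values. */
static pthread_mutex_t threshold_lock = PTHREAD_MUTEX_INITIALIZER;
static int threshold[2] = { 1, 10 };	/* arbitrary initial values */

/* Take a consistent snapshot of both values. */
void threshold_read(int out[2])
{
	pthread_mutex_lock(&threshold_lock);
	memcpy(out, threshold, sizeof(threshold));
	pthread_mutex_unlock(&threshold_lock);
}

/* Publish both values together so readers never see a mixed pair. */
void threshold_write(const int in[2])
{
	pthread_mutex_lock(&threshold_lock);
	memcpy(threshold, in, sizeof(threshold));
	pthread_mutex_unlock(&threshold_lock);
}
```

Because both memcpy() calls run under the same lock, a reader can no longer observe one element from the old pair and one from the new.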
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Link: https://lore.kernel.org/netdev/B6988E90-0A1E-4B85-BF26-2DAF6D482433@gmail.com/
      Signed-off-by: Sishuai Gong <sishuai.system@gmail.com>
      Acked-by: Simon Horman <horms@kernel.org>
      Acked-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • netfilter: nf_tables: deactivate catchall elements in next generation · 00ea7eb1
      Florian Westphal authored
      [ Upstream commit 90e5b346 ]
      
      When flushing, individual set elements are disabled in the next
      generation via the ->flush callback.
      
      Catchall elements are not disabled.  This is incorrect and may lead to
      double-deactivations of catchall elements which then results in memory
      leaks:
      
      WARNING: CPU: 1 PID: 3300 at include/net/netfilter/nf_tables.h:1172 nft_map_deactivate+0x549/0x730
      CPU: 1 PID: 3300 Comm: nft Not tainted 6.5.0-rc5+ #60
      RIP: 0010:nft_map_deactivate+0x549/0x730
       [..]
       ? nft_map_deactivate+0x549/0x730
       nf_tables_delset+0xb66/0xeb0
      
      (the warn is due to nft_use_dec() detecting underflow).
      
      Fixes: aaa31047 ("netfilter: nftables: add catch-all set element support")
      Reported-by: lonial con <kongln9170@gmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • netfilter: nf_tables: fix false-positive lockdep splat · a800fcd8
      Florian Westphal authored
      [ Upstream commit b9f052dc ]
      
      ->abort invocation may cause splat on debug kernels:
      
      WARNING: suspicious RCU usage
      net/netfilter/nft_set_pipapo.c:1697 suspicious rcu_dereference_check() usage!
      [..]
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by nft/133554: [..] (nft_net->commit_mutex){+.+.}-{3:3}, at: nf_tables_valid_genid
      [..]
       lockdep_rcu_suspicious+0x1ad/0x260
       nft_pipapo_abort+0x145/0x180
       __nf_tables_abort+0x5359/0x63d0
       nf_tables_abort+0x24/0x40
       nfnetlink_rcv+0x1a0a/0x22c0
       netlink_unicast+0x73c/0x900
       netlink_sendmsg+0x7f0/0xc20
       ____sys_sendmsg+0x48d/0x760
      
      Transaction mutex is held, so parallel updates are not possible.
      Switch to _protected and check mutex is held for lockdep enabled builds.
      
      Fixes: 212ed75d ("netfilter: nf_tables: integrate pipapo into commit protocol")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • octeon_ep: cancel tx_timeout_task later in remove sequence · 75c724e2
      Michal Schmidt authored
      [ Upstream commit 28458c80 ]
      
      tx_timeout_task is canceled too early when removing the driver. Nothing
      prevents .ndo_tx_timeout from triggering and queuing the work again.
      
      Better cancel it after the netdev is unregistered.
      It's harmless for octep_tx_timeout_task to run in the window between the
      unregistration and cancelation, because it checks netif_running.
      
      Fixes: 862cd659 ("octeon_ep: Add driver framework and device initialization")
      Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
      Link: https://lore.kernel.org/r/20230810150114.107765-3-mschmidt@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: macb: In ZynqMP resume always configure PS GTR for non-wakeup source · 58a54bad
      Radhey Shyam Pandey authored
      [ Upstream commit 6c461e39 ]
      
      On a Zynq UltraScale+ MPSoC Ubuntu platform, when systemctl issues
      suspend, the network manager brings down the interface and the system
      goes into suspend. On wakeup, it enables the interface again.
      
      This leads to a xilinx-psgtr "PLL lock timeout" on interface bringup,
      because the power management controller powers down the entire FPD
      (including SERDES) if none of the FPD devices are in use, and the
      SERDES is not initialized on resume.
      
      $ sudo rtcwake -m no -s 120 -v
      $ sudo systemctl suspend  <this does ifconfig eth1 down>
      $ ifconfig eth1 up
      xilinx-psgtr fd400000.phy: lane 0 (type 10, protocol 5): PLL lock timeout
      phy phy-fd400000.phy.0: phy poweron failed --> -110
      
      The macb driver is called in this order:
      1. macb_close: Stops the network interface. This function resets
         the MACB IP and disables the PHY and the network interface.
      
      2. macb_suspend: Called in the kernel suspend flow. Because the
         network interface has been disabled (netif_running(ndev) is
         false), it does nothing and returns directly.
      
      3. The system goes into the suspend state. Some time later, the
         system is woken up by the RTC wakeup device.
      
      4. macb_resume: Does nothing because the network interface has
         been disabled.
      
      5. macb_open: Called to enable the network interface again. The
         ethernet interface is initialized in this API, but the SERDES,
         which is powered off by PMUFW during FPD-off suspend, is not
         initialized again, so we hit the GT PLL lock issue on open.
      
      To resolve this PLL timeout issue, always do PS GTR initialization
      when the ethernet device is configured as a non-wakeup source.
      
      Fixes: f22bd29b ("net: macb: Fix ZynqMP SGMII non-wakeup source resume failure")
      Fixes: 8b73fa3a ("net: macb: Added ZynqMP-specific initialization")
      Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
      Link: https://lore.kernel.org/r/1691414091-2260697-1-git-send-email-radhey.shyam.pandey@amd.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/panel: simple: Fix AUO G121EAN01 panel timings according to the docs · 06af678c
      Luca Ceresoli authored
      [ Upstream commit e8470c0a ]
      
      Commit 03e909ac ("drm/panel: simple: Add support for AUO G121EAN01.4
      panel") added support for this panel model, but the timings it implements
      are very different from what the datasheet describes. I checked both the
      G121EAN01.0 datasheet from [0] and the G121EAN01.4 one from [1] and they
      all have the same timings: for example the LVDS clock typical value is 74.4
      MHz, not 66.7 MHz as implemented.
      
      Replace the timings with the ones from the documentation. These timings
      have been tested and the clock frequencies verified with an oscilloscope to
      ensure they are correct.
      
      Also use struct display_timing instead of struct drm_display_mode in order
      to also specify the minimum and maximum values.
      
      [0] https://embedded.avnet.com/product/g121ean01-0/
      [1] https://embedded.avnet.com/product/g121ean01-4/
      
      Fixes: 03e909ac ("drm/panel: simple: Add support for AUO G121EAN01.4 panel")
      Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
      Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
      Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
      Link: https://patchwork.freedesktop.org/patch/msgid/20230804151239.835216-1-luca.ceresoli@bootlin.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • selftests: mirror_gre_changes: Tighten up the TTL test match · 2f07f130
      Petr Machata authored
      [ Upstream commit 855067de ]
      
      This test verifies whether the encapsulated packets have the correct
      configured TTL. It does so by sending ICMP packets through the test
      topology and mirroring them to a gretap netdevice. On a busy host
      however, more than just the test ICMP packets may end up flowing
      through the topology, get mirrored, and counted. This leads to
      potential spurious failures as the test observes many more mirrored
      packets than the sent test packets, and assumes a bug.
      
      Fix this by tightening up the mirror action match. Change it from
      matchall to a flower classifier matching on ICMP packets specifically.
      
      Fixes: 45315673 ("selftests: forwarding: Test changes in mirror-to-gretap")
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Tested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: phy: fix IRQ-based wake-on-lan over hibernate / power off · cd4460b2
      Russell King (Oracle) authored
      [ Upstream commit cc941e54 ]
      
      Uwe reports:
      "Most PHYs signal WoL using an interrupt. So disabling interrupts [at
      shutdown] breaks WoL at least on PHYs covered by the marvell driver."
      
      From discussion with Ioana, the problem that was originally being solved was:
      "The board in question is a LS1021ATSN which has two AR8031 PHYs that
      share an interrupt line. In case only one of the PHYs is probed and
      there are pending interrupts on the PHY#2 an IRQ storm will happen
      since there is no entity to clear the interrupt from PHY#2's registers.
      PHY#1's driver will get stuck in .handle_interrupt() indefinitely."
      
      It was further confirmed that "the two AR8031 PHYs are on the same MDIO
      bus."
      
      With WoL using interrupts to wake the system, in such a case, the
      system will begin booting with an asserted interrupt. Thus, we need to
      cope with an interrupt asserted during boot.
      
      Solve this instead by disabling interrupts during PHY probe. This will
      ensure in Ioana's situation that both PHYs of the same type sharing an
      interrupt line on a common MDIO bus will have their interrupt outputs
      disabled when the driver probes the device, but before we hook in any
      interrupt handlers - thus avoiding the interrupt storm.
      
      A better fix would be for platform firmware to disable the interrupting
      devices at source during boot, before control is handed to the kernel.
      
      Fixes: e2f016cf ("net: phy: add a shutdown procedure")
      Link: 20230804071757.383971-1-u.kleine-koenig@pengutronix.de
      Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      cd4460b2
    • Xiang Yang's avatar
      net: pcs: Add missing put_device call in miic_create · a41e5a79
      Xiang Yang authored
      [ Upstream commit 829c6524 ]
      
      The reference to pdev->dev is taken by of_find_device_by_node, so
      it should be released when no longer needed.
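
A minimal userspace sketch of the get/put balance described above. The names `fake_device`, `fake_get`, `fake_put` and `lookup_and_use` are hypothetical stand-ins and are not kernel APIs; the sketch only illustrates the general rule that a lookup which takes a reference must be matched by a release, and it does not reproduce the actual miic_create() change.

```c
#include <stddef.h>

/* Hypothetical stand-in for a refcounted device; not the kernel's struct device. */
struct fake_device {
    int refcount;
    int usable;   /* nonzero when the device can actually be used */
};

/* Take a reference, as a lookup helper would do on success. */
static struct fake_device *fake_get(struct fake_device *dev)
{
    if (dev)
        dev->refcount++;
    return dev;
}

/* Drop a reference previously taken with fake_get(). */
static void fake_put(struct fake_device *dev)
{
    if (dev)
        dev->refcount--;
}

/* Look up the device, use it, and release the reference on every path. */
static int lookup_and_use(struct fake_device *dev)
{
    struct fake_device *ref = fake_get(dev);
    int ret;

    if (!ref)
        return -1;

    /* The error result also falls through to the put below. */
    ret = ref->usable ? 0 : -2;

    fake_put(ref);
    return ret;
}
```

After any call to `lookup_and_use()`, successful or not, the reference count is back at its starting value, which is the property a missing put breaks.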
      
      Fixes: 7dc54d3b ("net: pcs: add Renesas MII converter driver")
      Signed-off-by: Xiang Yang <xiangyang3@huawei.com>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a41e5a79
    • Jason Wang's avatar
      virtio-net: set queues after driver_ok · 120a89c3
      Jason Wang authored
      [ Upstream commit 51b81317 ]
      
      Commit 25266128 ("virtio-net: fix race between set queues and
      probe") tries to fix the race between set queues and probe by calling
      _virtnet_set_queues() before DRIVER_OK is set. This violates the virtio
      spec. Fix this by setting the queues after virtio_device_ready().
      
      Note that rtnl needs to be held for userspace requests to change the
      number of queues. So we are serialized in this way.
      
      Fixes: 25266128 ("virtio-net: fix race between set queues and probe")
      Reported-by: Dragos Tatulea <dtatulea@nvidia.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      120a89c3
    • Laurent Vivier's avatar
      virtio_net: notify MAC address change on device initialization · 45085ba9
      Laurent Vivier authored
      [ Upstream commit 9f62d221 ]
      
      In virtnet_probe(), if the device doesn't provide a MAC address the
      driver assigns a random one.
      As we modify the MAC address we need to notify the device to allow it
      to update all the related information.
      
      The problem can be seen with vDPA and mlx5_vdpa driver as it doesn't
      assign a MAC address by default. The virtio_net device uses a random
      MAC address (we can see it with "ip link"), but we can't ping a net
      namespace from another one using the virtio-vdpa device because the
      new MAC address has not been provided to the hardware:
      RX packets are dropped since they don't go through the receive filters,
      TX packets go through unaffected.
      
      Signed-off-by: Laurent Vivier <lvivier@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Stable-dep-of: 51b81317 ("virtio-net: set queues after driver_ok")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      45085ba9
    • Lin Ma's avatar
      xfrm: add forgotten nla_policy for XFRMA_MTIMER_THRESH · a442cd17
      Lin Ma authored
      [ Upstream commit 5e242470 ]
      
      The previous commit 4e484b3e ("xfrm: rate limit SA mapping change
      message to user space") added one additional attribute named
      XFRMA_MTIMER_THRESH and described its type at compat_policy
      (net/xfrm/xfrm_compat.c).
      
      However, the author forgot to also describe the nla_policy at
      xfrma_policy (net/xfrm/xfrm_user.c). Hence, this supposed NLA_U32 (4
      bytes) value can be faked as empty (0 bytes) by a malicious user, which
      leads to a 4-byte out-of-bounds read and a heap information leak when
      parsing nlattrs.
      
      To exploit this, a malicious user can spray the SLUB objects and then
      leverage this 4-byte OOB read to leak the heap data into
      x->mapping_maxage (see xfrm_update_ae_params(...)), and leak it to
      userspace via copy_to_user_state_extra(...).
      
      The above bug is assigned CVE-2023-3773. To fix it, this commit just
      completes the nla_policy description for XFRMA_MTIMER_THRESH, which
      enforces the length check and avoids such OOB read.
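
The length check can be illustrated with a simplified userspace sketch. The `fake_attr` header and the `fake_attr_*` helpers below are hypothetical and only loosely modelled on a type-length-value attribute; they are not the kernel's netlink API. The point is that a reader expecting a 4-byte value must confirm that the payload really is 4 bytes long before touching it.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified attribute header; len counts the header too. */
struct fake_attr {
    uint16_t len;
    uint16_t type;
};

#define FAKE_HDRLEN ((int)sizeof(struct fake_attr))

/* Payload length in bytes; 0 for a header-only attribute. */
static int fake_attr_payload_len(const struct fake_attr *a)
{
    return (int)a->len - FAKE_HDRLEN;
}

/*
 * Read a u32 payload only if the attribute really carries 4 bytes.
 * Returns 0 on success, -1 if the payload is too short.
 */
static int fake_attr_get_u32(const struct fake_attr *a, uint32_t *out)
{
    if (fake_attr_payload_len(a) < (int)sizeof(uint32_t))
        return -1;
    memcpy(out, (const unsigned char *)a + FAKE_HDRLEN, sizeof(*out));
    return 0;
}
```

Without the length test, a header-only attribute would make the `memcpy` read 4 bytes of whatever follows it in memory, which is the kind of over-read the commit message describes.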
      
      Fixes: 4e484b3e ("xfrm: rate limit SA mapping change message to user space")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a442cd17
    • Lin Ma's avatar
      xfrm: add NULL check in xfrm_update_ae_params · 87b655f4
      Lin Ma authored
      [ Upstream commit 00374d9b ]
      
      Normally, x->replay_esn and x->preplay_esn should be allocated at
      xfrm_alloc_replay_state_esn(...) in xfrm_state_construct(...), hence the
      xfrm_update_ae_params(...) is okay to update them. However, the current
      implementation of xfrm_new_ae(...) allows a malicious user to directly
      dereference a NULL pointer and crash the kernel like below.
      
      BUG: kernel NULL pointer dereference, address: 0000000000000000
      PGD 8253067 P4D 8253067 PUD 8e0e067 PMD 0
      Oops: 0002 [#1] PREEMPT SMP KASAN NOPTI
      CPU: 0 PID: 98 Comm: poc.npd Not tainted 6.4.0-rc7-00072-gdad9774deaf1 #8
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.o4
      RIP: 0010:memcpy_orig+0xad/0x140
      Code: e8 4c 89 5f e0 48 8d 7f e0 73 d2 83 c2 20 48 29 d6 48 29 d7 83 fa 10 72 34 4c 8b 06 4c 8b 4e 08 c
      RSP: 0018:ffff888008f57658 EFLAGS: 00000202
      RAX: 0000000000000000 RBX: ffff888008bd0000 RCX: ffffffff8238e571
      RDX: 0000000000000018 RSI: ffff888007f64844 RDI: 0000000000000000
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff888008f57818
      R13: ffff888007f64aa4 R14: 0000000000000000 R15: 0000000000000000
      FS:  00000000014013c0(0000) GS:ffff88806d600000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000000 CR3: 00000000054d8000 CR4: 00000000000006f0
      Call Trace:
       <TASK>
       ? __die+0x1f/0x70
       ? page_fault_oops+0x1e8/0x500
       ? __pfx_is_prefetch.constprop.0+0x10/0x10
       ? __pfx_page_fault_oops+0x10/0x10
       ? _raw_spin_unlock_irqrestore+0x11/0x40
       ? fixup_exception+0x36/0x460
       ? _raw_spin_unlock_irqrestore+0x11/0x40
       ? exc_page_fault+0x5e/0xc0
       ? asm_exc_page_fault+0x26/0x30
       ? xfrm_update_ae_params+0xd1/0x260
       ? memcpy_orig+0xad/0x140
       ? __pfx__raw_spin_lock_bh+0x10/0x10
       xfrm_update_ae_params+0xe7/0x260
       xfrm_new_ae+0x298/0x4e0
       ? __pfx_xfrm_new_ae+0x10/0x10
       ? __pfx_xfrm_new_ae+0x10/0x10
       xfrm_user_rcv_msg+0x25a/0x410
       ? __pfx_xfrm_user_rcv_msg+0x10/0x10
       ? __alloc_skb+0xcf/0x210
       ? stack_trace_save+0x90/0xd0
       ? filter_irq_stacks+0x1c/0x70
       ? __stack_depot_save+0x39/0x4e0
       ? __kasan_slab_free+0x10a/0x190
       ? kmem_cache_free+0x9c/0x340
       ? netlink_recvmsg+0x23c/0x660
       ? sock_recvmsg+0xeb/0xf0
       ? __sys_recvfrom+0x13c/0x1f0
       ? __x64_sys_recvfrom+0x71/0x90
       ? do_syscall_64+0x3f/0x90
       ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
       ? copyout+0x3e/0x50
       netlink_rcv_skb+0xd6/0x210
       ? __pfx_xfrm_user_rcv_msg+0x10/0x10
       ? __pfx_netlink_rcv_skb+0x10/0x10
       ? __pfx_sock_has_perm+0x10/0x10
       ? mutex_lock+0x8d/0xe0
       ? __pfx_mutex_lock+0x10/0x10
       xfrm_netlink_rcv+0x44/0x50
       netlink_unicast+0x36f/0x4c0
       ? __pfx_netlink_unicast+0x10/0x10
       ? netlink_recvmsg+0x500/0x660
       netlink_sendmsg+0x3b7/0x700
      
      This NULL pointer dereference bug is assigned CVE-2023-3772. This commit
      adds an additional NULL check in xfrm_update_ae_params to fix it.
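
A minimal sketch of such a guard, using hypothetical stand-in types (`fake_state`, `fake_replay`) rather than the real xfrm structures. This sketch rejects the request when no buffer was allocated; silently skipping the update would be an equally plausible choice, and the commit message does not say which one the actual patch takes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins; the real xfrm structures are more involved. */
struct fake_replay {
    unsigned int seq;
    unsigned int window;
};

struct fake_state {
    struct fake_replay *replay;   /* may be NULL if never allocated */
};

/*
 * Copy user-supplied replay parameters into the state.
 * Returns -1 instead of dereferencing NULL when the state was
 * created without a replay buffer.
 */
static int fake_update_replay(struct fake_state *x, const struct fake_replay *src)
{
    if (!src)
        return 0;    /* nothing to update */
    if (!x->replay)
        return -1;   /* no destination buffer: refuse the copy */
    memcpy(x->replay, src, sizeof(*src));
    return 0;
}
```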
      
      Fixes: d8647b79 ("xfrm: Add user interface for esn and big anti-replay windows")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      87b655f4
    • Zhengchao Shao's avatar
      ip_vti: fix potential slab-use-after-free in decode_session6 · 2b05bf5d
      Zhengchao Shao authored
      [ Upstream commit 6018a266 ]
      
      When the qdisc of an ip_vti device is set to the sfb type, the cb field
      of the sent skb may be modified during enqueuing. A
      slab-use-after-free may then occur when the ip_vti device sends IPv6 packets.
      As commit f8556919 ("xfrm6: Fix the nexthdr offset in
      _decode_session6.") showed, xfrm_decode_session was originally intended
      only for the receive path. IP6CB(skb)->nhoff is not set during
      transmission. Therefore, set the cb field in the skb to 0 before
      sending packets.
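
The idea of clearing the control block before the transmit-side decode can be sketched in userspace as follows. `fake_skb`, `fake_cb`, `fake_enqueue` and `fake_xmit` are hypothetical stand-ins invented for illustration; only the general pattern (zero the scratch area, then let the decoder read it) is taken from the text above.

```c
#include <string.h>

/* Hypothetical view of the control block as the decoder interprets it. */
struct fake_cb {
    unsigned short nhoff;
    unsigned short flags;
};

/* Hypothetical packet buffer with a shared scratch area. */
struct fake_skb {
    union {
        unsigned char raw[48];   /* scratch space any layer may overwrite */
        struct fake_cb ip6;      /* view used by the decoder */
    } cb;
};

/* A lower layer (for example a queueing discipline) scribbles on the scratch area. */
static void fake_enqueue(struct fake_skb *skb)
{
    memset(skb->cb.raw, 0xa5, sizeof(skb->cb.raw));
}

/* Transmit path: clear the control block so the decoder sees no stale offset. */
static unsigned short fake_xmit(struct fake_skb *skb)
{
    memset(&skb->cb, 0, sizeof(skb->cb));
    return skb->cb.ip6.nhoff;   /* value the decoder would consume */
}
```

Without the `memset`, the decoder would consume whatever the earlier layer left behind and could use it as an offset into freed or unrelated memory.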
      
      Fixes: f8556919 ("xfrm6: Fix the nexthdr offset in _decode_session6.")
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2b05bf5d
    • Zhengchao Shao's avatar
      ip6_vti: fix slab-use-after-free in decode_session6 · 55ad2309
      Zhengchao Shao authored
      [ Upstream commit 9fd41f1b ]
      
      When the qdisc of an ipv6_vti device is set to the sfb type, the cb field
      of the sent skb may be modified during enqueuing. A
      slab-use-after-free may then occur when the ipv6_vti device sends IPv6 packets.
      
      The stack information is as follows:
      BUG: KASAN: slab-use-after-free in decode_session6+0x103f/0x1890
      Read of size 1 at addr ffff88802e08edc2 by task swapper/0/0
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.4.0-next-20230707-00001-g84e2cad7f979 #410
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
      Call Trace:
      <IRQ>
      dump_stack_lvl+0xd9/0x150
      print_address_description.constprop.0+0x2c/0x3c0
      kasan_report+0x11d/0x130
      decode_session6+0x103f/0x1890
      __xfrm_decode_session+0x54/0xb0
      vti6_tnl_xmit+0x3e6/0x1ee0
      dev_hard_start_xmit+0x187/0x700
      sch_direct_xmit+0x1a3/0xc30
      __qdisc_run+0x510/0x17a0
      __dev_queue_xmit+0x2215/0x3b10
      neigh_connected_output+0x3c2/0x550
      ip6_finish_output2+0x55a/0x1550
      ip6_finish_output+0x6b9/0x1270
      ip6_output+0x1f1/0x540
      ndisc_send_skb+0xa63/0x1890
      ndisc_send_rs+0x132/0x6f0
      addrconf_rs_timer+0x3f1/0x870
      call_timer_fn+0x1a0/0x580
      expire_timers+0x29b/0x4b0
      run_timer_softirq+0x326/0x910
      __do_softirq+0x1d4/0x905
      irq_exit_rcu+0xb7/0x120
      sysvec_apic_timer_interrupt+0x97/0xc0
      </IRQ>
      Allocated by task 9176:
      kasan_save_stack+0x22/0x40
      kasan_set_track+0x25/0x30
      __kasan_slab_alloc+0x7f/0x90
      kmem_cache_alloc_node+0x1cd/0x410
      kmalloc_reserve+0x165/0x270
      __alloc_skb+0x129/0x330
      netlink_sendmsg+0x9b1/0xe30
      sock_sendmsg+0xde/0x190
      ____sys_sendmsg+0x739/0x920
      ___sys_sendmsg+0x110/0x1b0
      __sys_sendmsg+0xf7/0x1c0
      do_syscall_64+0x39/0xb0
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      Freed by task 9176:
      kasan_save_stack+0x22/0x40
      kasan_set_track+0x25/0x30
      kasan_save_free_info+0x2b/0x40
      ____kasan_slab_free+0x160/0x1c0
      slab_free_freelist_hook+0x11b/0x220
      kmem_cache_free+0xf0/0x490
      skb_free_head+0x17f/0x1b0
      skb_release_data+0x59c/0x850
      consume_skb+0xd2/0x170
      netlink_unicast+0x54f/0x7f0
      netlink_sendmsg+0x926/0xe30
      sock_sendmsg+0xde/0x190
      ____sys_sendmsg+0x739/0x920
      ___sys_sendmsg+0x110/0x1b0
      __sys_sendmsg+0xf7/0x1c0
      do_syscall_64+0x39/0xb0
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      The buggy address belongs to the object at ffff88802e08ed00
      which belongs to the cache skbuff_small_head of size 640
      The buggy address is located 194 bytes inside of
      freed 640-byte region [ffff88802e08ed00, ffff88802e08ef80)
      
      As commit f8556919 ("xfrm6: Fix the nexthdr offset in
      _decode_session6.") showed, xfrm_decode_session was originally intended
      only for the receive path. IP6CB(skb)->nhoff is not set during
      transmission. Therefore, set the cb field in the skb to 0 before
      sending packets.
      
      Fixes: f8556919 ("xfrm6: Fix the nexthdr offset in _decode_session6.")
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      55ad2309
    • Zhengchao Shao's avatar
      xfrm: fix slab-use-after-free in decode_session6 · 0d27567f
      Zhengchao Shao authored
      [ Upstream commit 53223f2e ]
      
      When the qdisc of the xfrm device is set to the sfb type, the cb field
      of the sent skb may be modified during enqueuing. A
      slab-use-after-free may then occur when the xfrm device sends IPv6 packets.
      
      The stack information is as follows:
      BUG: KASAN: slab-use-after-free in decode_session6+0x103f/0x1890
      Read of size 1 at addr ffff8881111458ef by task swapper/3/0
      CPU: 3 PID: 0 Comm: swapper/3 Not tainted 6.4.0-next-20230707 #409
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
      Call Trace:
      <IRQ>
      dump_stack_lvl+0xd9/0x150
      print_address_description.constprop.0+0x2c/0x3c0
      kasan_report+0x11d/0x130
      decode_session6+0x103f/0x1890
      __xfrm_decode_session+0x54/0xb0
      xfrmi_xmit+0x173/0x1ca0
      dev_hard_start_xmit+0x187/0x700
      sch_direct_xmit+0x1a3/0xc30
      __qdisc_run+0x510/0x17a0
      __dev_queue_xmit+0x2215/0x3b10
      neigh_connected_output+0x3c2/0x550
      ip6_finish_output2+0x55a/0x1550
      ip6_finish_output+0x6b9/0x1270
      ip6_output+0x1f1/0x540
      ndisc_send_skb+0xa63/0x1890
      ndisc_send_rs+0x132/0x6f0
      addrconf_rs_timer+0x3f1/0x870
      call_timer_fn+0x1a0/0x580
      expire_timers+0x29b/0x4b0
      run_timer_softirq+0x326/0x910
      __do_softirq+0x1d4/0x905
      irq_exit_rcu+0xb7/0x120
      sysvec_apic_timer_interrupt+0x97/0xc0
      </IRQ>
      <TASK>
      asm_sysvec_apic_timer_interrupt+0x1a/0x20
      RIP: 0010:intel_idle_hlt+0x23/0x30
      Code: 1f 84 00 00 00 00 00 f3 0f 1e fa 41 54 41 89 d4 0f 1f 44 00 00 66 90 0f 1f 44 00 00 0f 00 2d c4 9f ab 00 0f 1f 44 00 00 fb f4 <fa> 44 89 e0 41 5c c3 66 0f 1f 44 00 00 f3 0f 1e fa 41 54 41 89 d4
      RSP: 0018:ffffc90000197d78 EFLAGS: 00000246
      RAX: 00000000000a83c3 RBX: ffffe8ffffd09c50 RCX: ffffffff8a22d8e5
      RDX: 0000000000000001 RSI: ffffffff8d3f8080 RDI: ffffe8ffffd09c50
      RBP: ffffffff8d3f8080 R08: 0000000000000001 R09: ffffed1026ba6d9d
      R10: ffff888135d36ceb R11: 0000000000000001 R12: 0000000000000001
      R13: ffffffff8d3f8100 R14: 0000000000000001 R15: 0000000000000000
      cpuidle_enter_state+0xd3/0x6f0
      cpuidle_enter+0x4e/0xa0
      do_idle+0x2fe/0x3c0
      cpu_startup_entry+0x18/0x20
      start_secondary+0x200/0x290
      secondary_startup_64_no_verify+0x167/0x16b
      </TASK>
      Allocated by task 939:
      kasan_save_stack+0x22/0x40
      kasan_set_track+0x25/0x30
      __kasan_slab_alloc+0x7f/0x90
      kmem_cache_alloc_node+0x1cd/0x410
      kmalloc_reserve+0x165/0x270
      __alloc_skb+0x129/0x330
      inet6_ifa_notify+0x118/0x230
      __ipv6_ifa_notify+0x177/0xbe0
      addrconf_dad_completed+0x133/0xe00
      addrconf_dad_work+0x764/0x1390
      process_one_work+0xa32/0x16f0
      worker_thread+0x67d/0x10c0
      kthread+0x344/0x440
      ret_from_fork+0x1f/0x30
      The buggy address belongs to the object at ffff888111145800
      which belongs to the cache skbuff_small_head of size 640
      The buggy address is located 239 bytes inside of
      freed 640-byte region [ffff888111145800, ffff888111145a80)
      
      As commit f8556919 ("xfrm6: Fix the nexthdr offset in
      _decode_session6.") showed, xfrm_decode_session was originally intended
      only for the receive path. IP6CB(skb)->nhoff is not set during
      transmission. Therefore, set the cb field in the skb to 0 before
      sending packets.
      
      Fixes: f8556919 ("xfrm6: Fix the nexthdr offset in _decode_session6.")
      Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0d27567f
    • Lin Ma's avatar
      net: xfrm: Amend XFRMA_SEC_CTX nla_policy structure · 71dfe71d
      Lin Ma authored
      [ Upstream commit d1e0e61d ]
      
      According to the code of all consumers of attrs[XFRMA_SEC_CTX], such as
      
      * verify_sec_ctx_len(), convert to xfrm_user_sec_ctx*
      * xfrm_state_construct(), call security_xfrm_state_alloc whose prototype
      is int security_xfrm_state_alloc(.., struct xfrm_user_sec_ctx *sec_ctx);
      * copy_from_user_sec_ctx(), convert to xfrm_user_sec_ctx *
      ...
      
      It seems that the expected parsing result for XFRMA_SEC_CTX should be
      the structure xfrm_user_sec_ctx, and the current xfrm_sec_ctx is confusing
      and misleading (luckily, they happen to have the same size of 8 bytes).
      
      This commit amends the policy structure to xfrm_user_sec_ctx to avoid
      ambiguity.
      
      Fixes: cf5cb79f ("[XFRM] netlink: Establish an attribute policy")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      71dfe71d
    • Lin Ma's avatar
      net: af_key: fix sadb_x_filter validation · 479884b4
      Lin Ma authored
      [ Upstream commit 75065a89 ]
      
      When running xfrm_state_walk_init(), it is okay for the xfrm_address_filter
      being used to have a splen/dplen equal to sizeof(xfrm_address_t)<<3.
      This commit replaces >= with > to make sure the boundary checking is
      correct.
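
The off-by-one can be seen in a small sketch. `fake_addr_t` is a hypothetical 16-byte address type chosen so that the bit limit comes out as 128; the two helper names are likewise illustrative and do not appear in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 16-byte address, large enough for IPv6. */
typedef struct {
    uint8_t bytes[16];
} fake_addr_t;

#define FAKE_MAX_PREFIX_BITS (sizeof(fake_addr_t) << 3)   /* 128 */

/* Too strict: rejecting on ">=" refuses a full-length 128-bit prefix. */
static bool prefix_ok_strict(unsigned int plen)
{
    return !(plen >= FAKE_MAX_PREFIX_BITS);
}

/* Inclusive bound: rejecting only on ">" lets a prefix cover the whole address. */
static bool prefix_ok(unsigned int plen)
{
    return !(plen > FAKE_MAX_PREFIX_BITS);
}
```

A prefix of exactly 128 bits is the case the two versions disagree on: the strict check rejects it, the inclusive check accepts it, and both reject 129.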
      
      Fixes: 37bd2242 ("af_key: pfkey_dump needs parameter validation")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      479884b4
    • Lin Ma's avatar
      net: xfrm: Fix xfrm_address_filter OOB read · 9a005627
      Lin Ma authored
      [ Upstream commit dfa73c17 ]
      
      We found below OOB crash:
      
      [   44.211730] ==================================================================
      [   44.212045] BUG: KASAN: slab-out-of-bounds in memcmp+0x8b/0xb0
      [   44.212045] Read of size 8 at addr ffff88800870f320 by task poc.xfrm/97
      [   44.212045]
      [   44.212045] CPU: 0 PID: 97 Comm: poc.xfrm Not tainted 6.4.0-rc7-00072-gdad9774deaf1-dirty #4
      [   44.212045] Call Trace:
      [   44.212045]  <TASK>
      [   44.212045]  dump_stack_lvl+0x37/0x50
      [   44.212045]  print_report+0xcc/0x620
      [   44.212045]  ? __virt_addr_valid+0xf3/0x170
      [   44.212045]  ? memcmp+0x8b/0xb0
      [   44.212045]  kasan_report+0xb2/0xe0
      [   44.212045]  ? memcmp+0x8b/0xb0
      [   44.212045]  kasan_check_range+0x39/0x1c0
      [   44.212045]  memcmp+0x8b/0xb0
      [   44.212045]  xfrm_state_walk+0x21c/0x420
      [   44.212045]  ? __pfx_dump_one_state+0x10/0x10
      [   44.212045]  xfrm_dump_sa+0x1e2/0x290
      [   44.212045]  ? __pfx_xfrm_dump_sa+0x10/0x10
      [   44.212045]  ? __kernel_text_address+0xd/0x40
      [   44.212045]  ? kasan_unpoison+0x27/0x60
      [   44.212045]  ? mutex_lock+0x60/0xe0
      [   44.212045]  ? __pfx_mutex_lock+0x10/0x10
      [   44.212045]  ? kasan_save_stack+0x22/0x50
      [   44.212045]  netlink_dump+0x322/0x6c0
      [   44.212045]  ? __pfx_netlink_dump+0x10/0x10
      [   44.212045]  ? mutex_unlock+0x7f/0xd0
      [   44.212045]  ? __pfx_mutex_unlock+0x10/0x10
      [   44.212045]  __netlink_dump_start+0x353/0x430
      [   44.212045]  xfrm_user_rcv_msg+0x3a4/0x410
      [   44.212045]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
      [   44.212045]  ? __pfx_xfrm_user_rcv_msg+0x10/0x10
      [   44.212045]  ? __pfx_xfrm_dump_sa+0x10/0x10
      [   44.212045]  ? __pfx_xfrm_dump_sa_done+0x10/0x10
      [   44.212045]  ? __stack_depot_save+0x382/0x4e0
      [   44.212045]  ? filter_irq_stacks+0x1c/0x70
      [   44.212045]  ? kasan_save_stack+0x32/0x50
      [   44.212045]  ? kasan_save_stack+0x22/0x50
      [   44.212045]  ? kasan_set_track+0x25/0x30
      [   44.212045]  ? __kasan_slab_alloc+0x59/0x70
      [   44.212045]  ? kmem_cache_alloc_node+0xf7/0x260
      [   44.212045]  ? kmalloc_reserve+0xab/0x120
      [   44.212045]  ? __alloc_skb+0xcf/0x210
      [   44.212045]  ? netlink_sendmsg+0x509/0x700
      [   44.212045]  ? sock_sendmsg+0xde/0xe0
      [   44.212045]  ? __sys_sendto+0x18d/0x230
      [   44.212045]  ? __x64_sys_sendto+0x71/0x90
      [   44.212045]  ? do_syscall_64+0x3f/0x90
      [   44.212045]  ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
      [   44.212045]  ? netlink_sendmsg+0x509/0x700
      [   44.212045]  ? sock_sendmsg+0xde/0xe0
      [   44.212045]  ? __sys_sendto+0x18d/0x230
      [   44.212045]  ? __x64_sys_sendto+0x71/0x90
      [   44.212045]  ? do_syscall_64+0x3f/0x90
      [   44.212045]  ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
      [   44.212045]  ? kasan_save_stack+0x22/0x50
      [   44.212045]  ? kasan_set_track+0x25/0x30
      [   44.212045]  ? kasan_save_free_info+0x2e/0x50
      [   44.212045]  ? __kasan_slab_free+0x10a/0x190
      [   44.212045]  ? kmem_cache_free+0x9c/0x340
      [   44.212045]  ? netlink_recvmsg+0x23c/0x660
      [   44.212045]  ? sock_recvmsg+0xeb/0xf0
      [   44.212045]  ? __sys_recvfrom+0x13c/0x1f0
      [   44.212045]  ? __x64_sys_recvfrom+0x71/0x90
      [   44.212045]  ? do_syscall_64+0x3f/0x90
      [   44.212045]  ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
      [   44.212045]  ? copyout+0x3e/0x50
      [   44.212045]  netlink_rcv_skb+0xd6/0x210
      [   44.212045]  ? __pfx_xfrm_user_rcv_msg+0x10/0x10
      [   44.212045]  ? __pfx_netlink_rcv_skb+0x10/0x10
      [   44.212045]  ? __pfx_sock_has_perm+0x10/0x10
      [   44.212045]  ? mutex_lock+0x8d/0xe0
      [   44.212045]  ? __pfx_mutex_lock+0x10/0x10
      [   44.212045]  xfrm_netlink_rcv+0x44/0x50
      [   44.212045]  netlink_unicast+0x36f/0x4c0
      [   44.212045]  ? __pfx_netlink_unicast+0x10/0x10
      [   44.212045]  ? netlink_recvmsg+0x500/0x660
      [   44.212045]  netlink_sendmsg+0x3b7/0x700
      [   44.212045]  ? __pfx_netlink_sendmsg+0x10/0x10
      [   44.212045]  ? __pfx_netlink_sendmsg+0x10/0x10
      [   44.212045]  sock_sendmsg+0xde/0xe0
      [   44.212045]  __sys_sendto+0x18d/0x230
      [   44.212045]  ? __pfx___sys_sendto+0x10/0x10
      [   44.212045]  ? rcu_core+0x44a/0xe10
      [   44.212045]  ? __rseq_handle_notify_resume+0x45b/0x740
      [   44.212045]  ? _raw_spin_lock_irq+0x81/0xe0
      [   44.212045]  ? __pfx___rseq_handle_notify_resume+0x10/0x10
      [   44.212045]  ? __pfx_restore_fpregs_from_fpstate+0x10/0x10
      [   44.212045]  ? __pfx_blkcg_maybe_throttle_current+0x10/0x10
      [   44.212045]  ? __pfx_task_work_run+0x10/0x10
      [   44.212045]  __x64_sys_sendto+0x71/0x90
      [   44.212045]  do_syscall_64+0x3f/0x90
      [   44.212045]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
      [   44.212045] RIP: 0033:0x44b7da
      [   44.212045] RSP: 002b:00007ffdc8838548 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      [   44.212045] RAX: ffffffffffffffda RBX: 00007ffdc8839978 RCX: 000000000044b7da
      [   44.212045] RDX: 0000000000000038 RSI: 00007ffdc8838770 RDI: 0000000000000003
      [   44.212045] RBP: 00007ffdc88385b0 R08: 00007ffdc883858c R09: 000000000000000c
      [   44.212045] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
      [   44.212045] R13: 00007ffdc8839968 R14: 00000000004c37d0 R15: 0000000000000001
      [   44.212045]  </TASK>
      [   44.212045]
      [   44.212045] Allocated by task 97:
      [   44.212045]  kasan_save_stack+0x22/0x50
      [   44.212045]  kasan_set_track+0x25/0x30
      [   44.212045]  __kasan_kmalloc+0x7f/0x90
      [   44.212045]  __kmalloc_node_track_caller+0x5b/0x140
      [   44.212045]  kmemdup+0x21/0x50
      [   44.212045]  xfrm_dump_sa+0x17d/0x290
      [   44.212045]  netlink_dump+0x322/0x6c0
      [   44.212045]  __netlink_dump_start+0x353/0x430
      [   44.212045]  xfrm_user_rcv_msg+0x3a4/0x410
      [   44.212045]  netlink_rcv_skb+0xd6/0x210
      [   44.212045]  xfrm_netlink_rcv+0x44/0x50
      [   44.212045]  netlink_unicast+0x36f/0x4c0
      [   44.212045]  netlink_sendmsg+0x3b7/0x700
      [   44.212045]  sock_sendmsg+0xde/0xe0
      [   44.212045]  __sys_sendto+0x18d/0x230
      [   44.212045]  __x64_sys_sendto+0x71/0x90
      [   44.212045]  do_syscall_64+0x3f/0x90
      [   44.212045]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
      [   44.212045]
      [   44.212045] The buggy address belongs to the object at ffff88800870f300
      [   44.212045]  which belongs to the cache kmalloc-64 of size 64
      [   44.212045] The buggy address is located 32 bytes inside of
      [   44.212045]  allocated 36-byte region [ffff88800870f300, ffff88800870f324)
      [   44.212045]
      [   44.212045] The buggy address belongs to the physical page:
      [   44.212045] page:00000000e4de16ee refcount:1 mapcount:0 mapping:000000000 ...
      [   44.212045] flags: 0x100000000000200(slab|node=0|zone=1)
      [   44.212045] page_type: 0xffffffff()
      [   44.212045] raw: 0100000000000200 ffff888004c41640 dead000000000122 0000000000000000
      [   44.212045] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
      [   44.212045] page dumped because: kasan: bad access detected
      [   44.212045]
      [   44.212045] Memory state around the buggy address:
      [   44.212045]  ffff88800870f200: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      [   44.212045]  ffff88800870f280: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
      [   44.212045] >ffff88800870f300: 00 00 00 00 04 fc fc fc fc fc fc fc fc fc fc fc
      [   44.212045]                                ^
      [   44.212045]  ffff88800870f380: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   44.212045]  ffff88800870f400: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      [   44.212045] ==================================================================
      
      By investigating the code, we find the root cause of this OOB is the lack
      of checks in xfrm_dump_sa(). The buggy code allows a malicious user to pass
      arbitrary values of filter->splen/dplen. Hence, with crafted xfrm states,
      the attacker can achieve an 8-byte heap OOB read, which causes an info leak.
      
        if (attrs[XFRMA_ADDRESS_FILTER]) {
          filter = kmemdup(nla_data(attrs[XFRMA_ADDRESS_FILTER]),
              sizeof(*filter), GFP_KERNEL);
          if (filter == NULL)
            return -ENOMEM;
          // NO MORE CHECKS HERE !!!
        }
      
      This patch fixes the OOB by adding necessary boundary checks, just like
      the code in pfkey_dump() function.
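
To see why an unchecked prefix length leads to an over-read, the following sketch computes how many bytes a prefix comparison would touch. It assumes the comparison proceeds in 32-bit words (whole words first, then one partial word), and it assumes a 16-byte address; both the helper names and these assumptions are illustrative and are not taken from the source.

```c
#include <stdbool.h>
#include <stddef.h>

#define FAKE_ADDR_BYTES 16u   /* assumed address size, as for IPv6 */

/* Bytes a word-wise prefix comparison would read for a given bit length. */
static size_t prefix_bytes_touched(unsigned int prefix_bits)
{
    size_t whole_words = prefix_bits >> 5;            /* complete 32-bit words */
    size_t partial = (prefix_bits & 0x1f) ? 1 : 0;    /* one more word if bits remain */

    return (whole_words + partial) * 4;
}

/* A prefix length is safe only if the comparison stays inside the address. */
static bool prefix_len_safe(unsigned int prefix_bits)
{
    return prefix_bytes_touched(prefix_bits) <= FAKE_ADDR_BYTES;
}
```

Under these assumptions a length of 128 bits stays within the 16-byte address, while a length of 255 would span 32 bytes, twice the size of the address being compared, so an upper bound on the length has to be enforced before any comparison runs.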
      
      Fixes: d3623099 ("ipsec: add support of limited SA dump")
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9a005627
    • Tam Nguyen's avatar
      i2c: designware: Handle invalid SMBus block data response length value · 5a47c2fa
      Tam Nguyen authored
      commit 69f035c4 upstream.
      
      In the I2C_FUNC_SMBUS_BLOCK_DATA case, the invalid length byte value
      (outside of 1-32) of the SMBus block data response from the Slave device
      is not correctly handled by the I2C Designware driver.
      
      In case IC_EMPTYFIFO_HOLD_MASTER_EN==1, which cannot be detected
      from the registers, the Master can be disabled only if the STOP bit
      is set. Without STOP bit set, the Master remains active, holding the bus
      until receiving a block data response length. This hangs the bus and
      is unrecoverable.
      
      Avoid this by issuing another dump read to reach the stop condition when
      an invalid length byte is received.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Tam Nguyen <tamnguyenchi@os.amperecomputing.com>
      Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      Link: https://lore.kernel.org/r/20230726080001.337353-3-tamnguyenchi@os.amperecomputing.com
      Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Quan Nguyen's avatar
      i2c: designware: Correct length byte validation logic · 52114963
      Quan Nguyen authored
      commit 49d4db39 upstream.
      
      Commit 0daede80 ("i2c: designware: Convert driver to using regmap API")
      changed the logic to validate the whole 32-bit value read from the
      DW_IC_DATA_CMD register instead of its 8-bit LSB, without reason.

      Later, commit f53f15ba ("i2c: designware: Get right data length")
      introduced a partial fix, but it is not enough: the "tmp > 0" check still
      tests tmp as a 32-bit value and is wrong when IC_DATA_CMD[11] is set.
      
      Revert the logic to just before commit 0daede80
      ("i2c: designware: Convert driver to using regmap API").
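
      The difference between testing the raw 32-bit value and testing only its
      low byte can be sketched as follows. This is an illustration under
      stated assumptions, not the driver code: the helper names are
      hypothetical, and the 1-32 range is the SMBus block length limit.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 11 of the 32-bit register value, as named in the commit message. */
#define DATA_CMD_BIT11 (UINT32_C(1) << 11)

static_assert(DATA_CMD_BIT11 > 0xff, "bit 11 lies outside the low byte");

/* Extract the length byte: only the low 8 bits carry data. */
static uint8_t length_byte(uint32_t data_cmd)
{
	return (uint8_t)(data_cmd & 0xff);
}

/* Validate the 8-bit length, not the raw 32-bit register value. */
static int length_is_valid(uint32_t data_cmd)
{
	uint8_t len = length_byte(data_cmd);

	return len > 0 && len <= 32;
}
```

      With bit 11 set and a zero length byte, the raw value is greater than 0
      even though the length itself is invalid, which is the case a plain
      "tmp > 0" test on the 32-bit value gets wrong.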
      
      Fixes: f53f15ba ("i2c: designware: Get right data length")
      Fixes: 0daede80 ("i2c: designware: Convert driver to using regmap API")
      Cc: stable@vger.kernel.org
      Signed-off-by: Tam Nguyen <tamnguyenchi@os.amperecomputing.com>
      Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
      Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
      Link: https://lore.kernel.org/r/20230726080001.337353-2-tamnguyenchi@os.amperecomputing.com
      Reviewed-by: Andi Shyti <andi.shyti@kernel.org>
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xiaoshoukui's avatar
      btrfs: fix BUG_ON condition in btrfs_cancel_balance · ceb9ba8e
      xiaoshoukui authored
      commit 29eefa6d upstream.
      
      Pausing and canceling a balance can race while interrupting the balance,
      leading to a BUG_ON panic in btrfs_cancel_balance. The BUG_ON condition
      in btrfs_cancel_balance does not take this race scenario into account.

      However, the race condition has no other side effects. We can fix that.

      It can be reproduced with a panic trace like this:
      
        kernel BUG at fs/btrfs/volumes.c:4618!
        RIP: 0010:btrfs_cancel_balance+0x5cf/0x6a0
        Call Trace:
         <TASK>
         ? do_nanosleep+0x60/0x120
         ? hrtimer_nanosleep+0xb7/0x1a0
         ? sched_core_clone_cookie+0x70/0x70
         btrfs_ioctl_balance_ctl+0x55/0x70
         btrfs_ioctl+0xa46/0xd20
         __x64_sys_ioctl+0x7d/0xa0
         do_syscall_64+0x38/0x80
         entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
        Race scenario as follows:
        > mutex_unlock(&fs_info->balance_mutex);
        > --------------------
        > .......issue pause and cancel req in another thread
        > --------------------
        > ret = __btrfs_balance(fs_info);
        >
        > mutex_lock(&fs_info->balance_mutex);
        > if (ret == -ECANCELED && atomic_read(&fs_info->balance_pause_req)) {
        >         btrfs_info(fs_info, "balance: paused");
        >         btrfs_exclop_balance(fs_info, BTRFS_EXCLOP_BALANCE_PAUSED);
        > }
      
      CC: stable@vger.kernel.org # 4.19+
      Signed-off-by: xiaoshoukui <xiaoshoukui@ruijie.com.cn>
      Reviewed-by: David Sterba <dsterba@suse.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Josef Bacik's avatar
      btrfs: fix incorrect splitting in btrfs_drop_extent_map_range · 9f68e210
      Josef Bacik authored
      commit c962098c upstream.
      
      In production we were seeing a variety of WARN_ON()'s in the extent_map
      code, specifically in btrfs_drop_extent_map_range() when we have to call
      add_extent_mapping() for our second split.
      
      Consider the following extent map layout
      
      	PINNED
      	[0, 16K)  [32K, 48K)
      
      and then we call btrfs_drop_extent_map_range for [0, 36K), with
      skip_pinned == true.  The initial loop will have
      
      	start = 0
      	end = 36K
      	len = 36K
      
      we will find the [0, 16K) extent, but since it is pinned we will skip
      it, using this code
      
      	start = em_end;
      	if (end != (u64)-1)
      		len = start + len - em_end;
      
      em_end here is 16K, so now the values are
      
      	start = 16K
      	len = 16K + 36K - 16K = 36K
      
      len should instead be 20K.  This is a problem when we find the next
      extent at [32K, 48K): we need to split this extent to leave [36K, 48K),
      however the code for the split looks like this
      
      	split->start = start + len;
      	split->len = em_end - (start + len);
      
      In this case we have
      
      	em_end = 48K
      	split->start = 16K + 36K       // this should be 16K + 20K
      	split->len = 48K - (16K + 36K) // this overflows as 16K + 36K is 52K
      
      and now we have an invalid extent_map in the tree that potentially
      overlaps other entries in the extent map.  Even in the non-overlapping
      case we will have split->start set improperly, which will cause problems
      with any block related calculations.
      
      We don't actually need len in this loop, we can simply use end as our
      end point, and only adjust start up when we find a pinned extent we need
      to skip.
      
      Adjust the logic to do this, which keeps us from inserting an invalid
      extent map.
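
      The arithmetic above can be replayed with a small user-space sketch. The
      function names are hypothetical and the end-based helper is only one
      reading of "use end as our end point", not the actual patch:

```c
#include <assert.h>
#include <stdint.h>

#define KIB UINT64_C(1024)

/* len update from the commit message, applied after skipping a pinned extent. */
static uint64_t len_after_skip(uint64_t start, uint64_t len, uint64_t em_end)
{
	start = em_end;              /* start moves to the end of the pinned extent */
	return start + len - em_end; /* equals len, because start == em_end here */
}

/* Split length computed from start + len, as in the code quoted above. */
static uint64_t split_len_from_len(uint64_t em_end, uint64_t start, uint64_t len)
{
	return em_end - (start + len); /* wraps around if start + len > em_end */
}

/* Sketch of the end-based alternative: the split covers [end, em_end). */
static uint64_t split_len_from_end(uint64_t em_end, uint64_t end)
{
	assert(em_end >= end);
	return em_end - end;
}
```

      With start = 0, len = 36K and em_end = 16K, len_after_skip() still
      returns 36K, so the later split length wraps around. Computing the split
      from end = 36K and em_end = 48K gives the expected 12K for [36K, 48K).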
      
      We only skip_pinned in the relocation case, so this is relatively rare,
      except in the case where you are running relocation a lot, which can
      happen with auto relocation on.
      
      Fixes: 55ef6899 ("Btrfs: Fix btrfs_drop_extent_cache for skip pinned case")
      CC: stable@vger.kernel.org # 4.14+
      Reviewed-by: Filipe Manana <fdmanana@suse.com>
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Sherry Sun's avatar
      tty: serial: fsl_lpuart: Clear the error flags by writing 1 for lpuart32 platforms · 0693c8f1
      Sherry Sun authored
      commit 28206984 upstream.
      
      Do not read the data register to clear the error flags on lpuart32
      platforms: the additional read may cause a receive FIFO underflow,
      since the DMA has already read the data register.
      All lpuart32 platforms support writing 1 to clear those error bits,
      so use this method to clear the error flags.
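
      Write-1-to-clear semantics can be illustrated with a simulated register.
      This is a generic sketch, not the lpuart driver code: the flag names and
      bit positions are hypothetical, and a real driver would write the mask to
      the hardware status register rather than to a struct in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error-flag bit positions, for illustration only. */
#define FLAG_OVERRUN (UINT32_C(1) << 0)
#define FLAG_FRAMING (UINT32_C(1) << 1)
#define FLAG_PARITY  (UINT32_C(1) << 2)

/* Simulated status register with write-1-to-clear semantics. */
struct w1c_reg {
	uint32_t value;
};

/* Writing 1 to a bit clears it; writing 0 leaves the bit unchanged. */
static void w1c_write(struct w1c_reg *reg, uint32_t mask)
{
	assert(reg != NULL);
	reg->value &= ~mask;
}
```

      Clearing flags this way does not touch the data register, so it cannot
      consume a byte that the DMA has already read.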
      
      Fixes: 42b68768 ("serial: fsl_lpuart: DMA support for 32-bit variant")
      Cc: stable <stable@kernel.org>
      Signed-off-by: Sherry Sun <sherry.sun@nxp.com>
      Link: https://lore.kernel.org/r/20230801022304.24251-1-sherry.sun@nxp.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Yi Yang's avatar
      tty: n_gsm: fix the UAF caused by race condition in gsm_cleanup_mux · 31311a9a
      Yi Yang authored
      commit 3c4f8333 upstream.
      
      In commit 9b9c8195 ("tty: n_gsm: fix UAF in gsm_cleanup_mux"), the UAF
      problem is not completely fixed. There is a race condition in
      gsm_cleanup_mux(), which caused this UAF.
      
      The UAF problem is triggered by the following race:
      task[5046]                     task[5054]
      -----------------------        -----------------------
      gsm_cleanup_mux();
      dlci = gsm->dlci[0];
      mutex_lock(&gsm->mutex);
                                     gsm_cleanup_mux();
                                     dlci = gsm->dlci[0]; //Didn't take the lock
      gsm_dlci_release(gsm->dlci[i]);
      gsm->dlci[i] = NULL;
      mutex_unlock(&gsm->mutex);
                                     mutex_lock(&gsm->mutex);
                                     dlci->dead = true; //UAF
      
      Fix it by assigning values after mutex_lock().
      
      Link: https://syzkaller.appspot.com/text?tag=CrashReport&x=176188b5a80000
      Cc: stable <stable@kernel.org>
      Fixes: 9b9c8195 ("tty: n_gsm: fix UAF in gsm_cleanup_mux")
      Fixes: aa371e96 ("tty: n_gsm: fix restart handling via CLD command")
      Signed-off-by: Yi Yang <yiyang13@huawei.com>
      Co-developed-by: Qiumiao Zhang <zhangqiumiao1@huawei.com>
      Signed-off-by: Qiumiao Zhang <zhangqiumiao1@huawei.com>
      Link: https://lore.kernel.org/r/20230811031121.153237-1-yiyang13@huawei.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>