  1. Jul 27, 2023
    • Florian Westphal's avatar
      netfilter: nft_set_pipapo: fix improper element removal · 48dbb5d2
      Florian Westphal authored
      [ Upstream commit 87b5a5c2 ]
      
      The end key should be equal to the start key unless NFT_SET_EXT_KEY_END is present.
      
      It's possible to add elements that only have a start key
      ("{ 1.0.0.0 . 2.0.0.0 }") without an interval end.
      
      Insertion treats this via:
      
      if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END))
         end = (const u8 *)nft_set_ext_key_end(ext)->data;
      else
         end = start;
      
      but the removal side always uses nft_set_ext_key_end().
      This is wrong and leads to garbage remaining in the set after removal;
      the next lookup/insert attempt will give:
      
      BUG: KASAN: slab-use-after-free in pipapo_get+0x8eb/0xb90
      Read of size 1 at addr ffff888100d50586 by task nft-pipapo_uaf_/1399
      Call Trace:
       kasan_report+0x105/0x140
       pipapo_get+0x8eb/0xb90
       nft_pipapo_insert+0x1dc/0x1710
       nf_tables_newsetelem+0x31f5/0x4e00
       ..
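
The rule stated in this message (removal has to derive the end key the same way insertion does) can be sketched in userspace C. This is only an illustration: `struct elem` and `elem_end()` are hypothetical names, not the kernel's `nft_set_ext` API and not the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a set element; not the kernel's nft_set_ext. */
struct elem {
	const uint8_t *key;      /* start key, always present */
	const uint8_t *key_end;  /* end key, NULL when no interval end was given */
};

/* One helper shared by insertion and removal, so both agree on the end key. */
static const uint8_t *elem_end(const struct elem *e)
{
	return e->key_end ? e->key_end : e->key;
}
```

Routing both paths through a single helper is one way to keep them from drifting apart again; the source does not say whether the fix is structured this way.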
      
      Fixes: 3c4287f6 ("nf_tables: Add set type for arbitrary concatenation of ranges")
      Reported-by: lonial con <kongln9170@gmail.com>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      48dbb5d2
    • Florian Westphal's avatar
      netfilter: nf_tables: can't schedule in nft_chain_validate · d78a3755
      Florian Westphal authored
      [ Upstream commit 314c8284 ]
      
      nft_chain_validate() can be called via nft set element list iteration, which may
      acquire the rcu and/or bh read lock (depending on the set type).
      
      BUG: sleeping function called from invalid context at net/netfilter/nf_tables_api.c:3353
      in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1232, name: nft
      preempt_count: 0, expected: 0
      RCU nest depth: 1, expected: 0
      2 locks held by nft/1232:
       #0: ffff8881180e3ea8 (&nft_net->commit_mutex){+.+.}-{3:3}, at: nf_tables_valid_genid
       #1: ffffffff83f5f540 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire
      Call Trace:
       nft_chain_validate
       nft_lookup_validate_setelem
       nft_pipapo_walk
       nft_lookup_validate
       nft_chain_validate
       nft_immediate_validate
       nft_chain_validate
       nf_tables_validate
       nf_tables_abort
      
      There is no choice but to move the rescheduling point to nf_tables_validate().
      
      Fixes: 81ea0106 ("netfilter: nf_tables: add rescheduling points during loop detection walks")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d78a3755
    • Florian Westphal's avatar
      netfilter: nf_tables: fix spurious set element insertion failure · 08ca12fa
      Florian Westphal authored
      [ Upstream commit ddbd8be6 ]
      
      On some platforms there is a padding hole in the nft_verdict
      structure, between the verdict code and the chain pointer.
      
      On element insertion, if the new element clashes with an existing one and
      NLM_F_EXCL flag isn't set, we want to ignore the -EEXIST error as long as
      the data associated with duplicated element is the same as the existing
      one.  The data equality check uses memcmp.
      
      For normal data (NFT_DATA_VALUE) this works fine, but for NFT_DATA_VERDICT
      the padding area leads to a spurious failure even if the verdict data is the
      same.
      
      This then makes the insertion fail with 'already exists' error, even
      though the new "key : data" matches an existing entry and userspace
      told the kernel that it doesn't want to receive an error indication.
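
The padding problem can be illustrated in userspace C. This is a sketch under assumptions: `struct verdict` is a hypothetical model of a verdict code followed by a chain pointer, and comparing the meaningful fields is just one way to sidestep padding (zeroing the whole structure before filling it is another). The source does not say which approach the patch takes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: on typical LP64 targets there are 4 padding bytes
 * between `code` and `chain`, and memcmp() would compare them too. */
struct verdict {
	uint32_t code;
	const void *chain;
};

/* Compare only the meaningful fields, so padding bytes cannot matter. */
static bool verdict_equal(const struct verdict *a, const struct verdict *b)
{
	return a->code == b->code && a->chain == b->chain;
}

/* Fill a verdict on top of arbitrary filler to mimic stale padding bytes. */
static void verdict_fill(struct verdict *v, uint8_t filler,
			 uint32_t code, const void *chain)
{
	memset(v, filler, sizeof(*v));
	v->code = code;
	v->chain = chain;
}
```

Two verdicts built over different filler bytes still compare equal with `verdict_equal()`, whereas a raw `memcmp()` over the whole structure may not.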
      
      Fixes: c016c7e4 ("netfilter: nf_tables: honor NLM_F_EXCL flag in set element insertion")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      08ca12fa
    • Vitaly Rodionov's avatar
      ALSA: hda/realtek: Fix generic fixup definition for cs35l41 amp · 540075ce
      Vitaly Rodionov authored
      [ Upstream commit f7b069cf ]
      
      The generic fixup for CS35L41 amplifiers should not have a vendor-specific
      chained fixup. For ThinkPad laptops with the LED issue, we can just add
      a specific fixup.
      
      Fixes: a6ac60b3 (ALSA: hda/realtek: Fix mute led issue on thinkpad with cs35l41 s-codec)
      Signed-off-by: Vitaly Rodionov <vitalyr@opensource.cirrus.com>
      Link: https://lore.kernel.org/r/20230720082022.13033-1-vitalyr@opensource.cirrus.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      540075ce
    • Kuniyuki Iwashima's avatar
      llc: Don't drop packet from non-root netns. · b7117b72
      Kuniyuki Iwashima authored
      [ Upstream commit 6631463b ]
      
      Now these upper layer protocol handlers can be called from llc_rcv()
      as sap->rcv_func(), which is registered by llc_sap_open().
      
        * function which is passed to register_8022_client()
          -> no in-kernel user calls register_8022_client().
      
        * snap_rcv()
          `- proto->rcvfunc() : registered by register_snap_client()
             -> aarp_rcv() and atalk_rcv() drop packets from non-root netns
      
        * stp_pdu_rcv()
          `- garp_protos[]->rcv() : registered by stp_proto_register()
             -> garp_pdu_rcv() and br_stp_rcv() are netns-aware
      
      So, we can safely remove the netns restriction in llc_rcv().
      
      Fixes: e730c155 ("[NET]: Make packet reception network namespace safe")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b7117b72
    • Zhang Shurong's avatar
      fbdev: au1200fb: Fix missing IRQ check in au1200fb_drv_probe · b492d371
      Zhang Shurong authored
      [ Upstream commit 4e88761f ]
      
      This function does not check the return value of platform_get_irq() and may pass
      negative error codes to request_irq(), which takes an unsigned IRQ number,
      causing it to fail with -EINVAL and override the original error code.
      
      Fix this by not calling request_irq() with invalid IRQ numbers.
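
The pattern described here can be sketched in userspace C. Everything below is hypothetical: `fake_platform_get_irq()` and `fake_request_irq()` only stand in for the kernel functions named in the message, and `probe()` is not the au1200fb probe routine.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for platform_get_irq() and request_irq(). */
static int fake_get_irq_result;  /* value the fake lookup returns */
static int request_irq_calls;    /* how often the fake request ran */

static int fake_platform_get_irq(void)
{
	return fake_get_irq_result;
}

static int fake_request_irq(unsigned int irq)
{
	(void)irq;
	request_irq_calls++;
	return 0;
}

/* Probe sketch: return the original error before the IRQ is ever requested. */
static int probe(void)
{
	int irq = fake_platform_get_irq();

	if (irq < 0)
		return irq;

	return fake_request_irq((unsigned int)irq);
}
```

With the early return in place, the caller sees the lookup's own error code rather than a later -EINVAL.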
      
      Fixes: 1630d85a ("au1200fb: fix hardcoded IRQ")
      Signed-off-by: Zhang Shurong <zhang_shurong@foxmail.com>
      Signed-off-by: Helge Deller <deller@gmx.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      b492d371
    • Daniel Golle's avatar
      net: ethernet: mtk_eth_soc: always mtk_get_ib1_pkt_type · 41d5ae32
      Daniel Golle authored
      [ Upstream commit 9f9d4c1a ]
      
      The entries and bind debugfs files would display wrong data on NETSYS_V2 and
      later because, instead of using mtk_get_ib1_pkt_type, the driver would use
      MTK_FOE_IB1_PACKET_TYPE, which corresponds to NETSYS_V1(.x) SoCs.
      Use mtk_get_ib1_pkt_type so entries and bind records display correctly.
      
      Fixes: 03a3180e ("net: ethernet: mtk_eth_soc: introduce flow offloading support for mt7986")
      Signed-off-by: Daniel Golle <daniel@makrotopia.org>
      Acked-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Link: https://lore.kernel.org/r/c0ae03d0182f4d27b874cbdf0059bc972c317f3c.1689727134.git.daniel@makrotopia.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      41d5ae32
    • Kuniyuki Iwashima's avatar
      Revert "tcp: avoid the lookup process failing to get sk in ehash table" · 70a2d37c
      Kuniyuki Iwashima authored
      [ Upstream commit 81b3ade5 ]
      
      This reverts commit 3f4ca5fa.
      
      Commit 3f4ca5fa ("tcp: avoid the lookup process failing to get sk in
      ehash table") reversed the order in how a socket is inserted into ehash
      to fix an issue that ehash-lookup could fail when reqsk/full sk/twsk are
      swapped.  However, it introduced another lookup failure.
      
      The full socket in ehash is allocated from a slab with SLAB_TYPESAFE_BY_RCU
      and does not have SOCK_RCU_FREE, so the socket could be reused even while
      it is being referenced on another CPU doing RCU lookup.
      
      Let's say a socket is reused and inserted into the same hash bucket during
      lookup.  After the blamed commit, a new socket is inserted at the end of
      the list.  If that happens, we will skip sockets placed after the previous
      position of the reused socket, resulting in ehash lookup failure.
      
      As described in Documentation/RCU/rculist_nulls.rst, we should insert a
      new socket at the head of the list to avoid such an issue.
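
The difference between head and tail insertion can be shown with a deliberately simplified, single-threaded simulation. It is a sketch only: `struct node` and the helpers are hypothetical, there is no RCU and no nulls marker, and the "reader" is just a saved pointer that resumes after the node it was parked on has been reused in the same bucket.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	int id;
	struct node *next;
};

static void insert_head(struct node **head, struct node *n)
{
	n->next = *head;
	*head = n;
}

static void insert_tail(struct node **head, struct node *n)
{
	struct node **pp = head;

	while (*pp)
		pp = &(*pp)->next;
	n->next = NULL;
	*pp = n;
}

static void unlink_node(struct node **head, struct node *n)
{
	struct node **pp = head;

	while (*pp && *pp != n)
		pp = &(*pp)->next;
	if (*pp)
		*pp = n->next;
}

/*
 * A reader is parked on the first node of a three-node bucket when that node
 * is removed, reused and reinserted. Returns 1 if the reader, continuing from
 * the reused node, still reaches the node whose id is `target`.
 */
static int reader_finds(int reinsert_at_tail, int target)
{
	struct node c = { 3, NULL }, b = { 2, &c }, a = { 1, &b };
	struct node *head = &a;
	struct node *cursor = &a;   /* reader stopped here */

	unlink_node(&head, &a);     /* socket freed ... */
	if (reinsert_at_tail)       /* ... and reused in the same bucket */
		insert_tail(&head, &a);
	else
		insert_head(&head, &a);

	for (cursor = cursor->next; cursor; cursor = cursor->next)
		if (cursor->id == target)
			return 1;
	return 0;
}
```

In this model, `reader_finds(1, 3)` misses node 3 because the reused node now terminates the list, while `reader_finds(0, 3)` still walks over every remaining node.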
      
      This issue, the swap-lookup-failure, and another variant reported in [0]
      can all be handled properly by adding a locked ehash lookup suggested by
      Eric Dumazet [1].
      
      However, this issue could occur for every packet, thus more likely than
      the other two races, so let's revert the change for now.
      
      Link: https://lore.kernel.org/netdev/20230606064306.9192-1-duanmuquan@baidu.com/ [0]
      Link: https://lore.kernel.org/netdev/CANn89iK8snOz8TYOhhwfimC7ykYA78GA3Nyv8x06SZYa1nKdyA@mail.gmail.com/ [1]
      Fixes: 3f4ca5fa ("tcp: avoid the lookup process failing to get sk in ehash table")
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230717215918.15723-1-kuniyu@amazon.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      70a2d37c
    • Yuanjun Gong's avatar
      net:ipv6: check return value of pskb_trim() · cba5b13f
      Yuanjun Gong authored
      [ Upstream commit 4258faa1 ]
      
      Go to tx_err if an unexpected result is returned by pskb_trim()
      in ip6erspan_tunnel_xmit().
      
      Fixes: 5a963eb6 ("ip6_gre: Add ERSPAN native tunnel support")
      Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      cba5b13f
    • Wang Ming's avatar
      net: ipv4: Use kfree_sensitive instead of kfree · ef0fe1af
      Wang Ming authored
      [ Upstream commit daa75144 ]
      
      The key might contain the private part of the key, so it is better to use
      kfree_sensitive() to free it.
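
The idea behind freeing sensitive data can be sketched in userspace C. This is an analogue under assumptions, not the kernel implementation: `wipe()` and `free_sensitive()` are hypothetical helpers, and the explicit length parameter is a simplification, since kfree_sensitive() takes only the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Overwrite a buffer through a volatile pointer so the stores are not elided. */
static void wipe(void *p, size_t len)
{
	volatile unsigned char *b = p;

	while (len--)
		*b++ = 0;
}

/* Hypothetical userspace analogue: clear the key material, then free it. */
static void free_sensitive(void *p, size_t len)
{
	if (!p)
		return;
	wipe(p, len);
	free(p);
}
```

Clearing before freeing means the key bytes do not linger in memory that the allocator may hand out again.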
      
      Fixes: 38320c70 ("[IPSEC]: Use crypto_aead and authenc in ESP")
      Signed-off-by: Wang Ming <machel@vivo.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ef0fe1af
    • Eric Dumazet's avatar
      tcp: annotate data-races around tcp_rsk(req)->ts_recent · 7552f8e7
      Eric Dumazet authored
      [ Upstream commit eba20811 ]
      
      TCP request sockets are lockless, tcp_rsk(req)->ts_recent
      can change while being read by another cpu as syzbot noticed.
      
      This is harmless, but we should annotate the known races.
      
      Note that tcp_check_req() changes req->ts_recent a bit early,
      we might change this in the future.
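
What "annotating" a lockless field looks like can be sketched in userspace C. The macros below are simplified stand-ins that assume a GNU-compatible compiler (`__typeof__`); the kernel's READ_ONCE()/WRITE_ONCE() do more than this, and `struct req` is a hypothetical fragment holding only the racy field.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified userspace stand-ins; not the kernel definitions. */
#define READ_ONCE(x)      (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, v)  (*(volatile __typeof__(x) *)&(x) = (v))

/* Hypothetical request-socket fragment with just the annotated field. */
struct req {
	uint32_t ts_recent;
};

/* Lockless writer: a single marked store. */
static void update_ts(struct req *r, uint32_t ts)
{
	WRITE_ONCE(r->ts_recent, ts);
}

/* Lockless reader: a single marked load. */
static uint32_t snapshot_ts(const struct req *r)
{
	return READ_ONCE(r->ts_recent);
}
```

The marked accesses document that the race is known and intended, and keep the compiler from tearing or repeating the load or store.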
      
      BUG: KCSAN: data-race in tcp_check_req / tcp_check_req
      
      write to 0xffff88813c8afb84 of 4 bytes by interrupt on cpu 1:
      tcp_check_req+0x694/0xc70 net/ipv4/tcp_minisocks.c:762
      tcp_v4_rcv+0x12db/0x1b70 net/ipv4/tcp_ipv4.c:2071
      ip_protocol_deliver_rcu+0x356/0x6d0 net/ipv4/ip_input.c:205
      ip_local_deliver_finish+0x13c/0x1a0 net/ipv4/ip_input.c:233
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip_local_deliver+0xec/0x1c0 net/ipv4/ip_input.c:254
      dst_input include/net/dst.h:468 [inline]
      ip_rcv_finish net/ipv4/ip_input.c:449 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip_rcv+0x197/0x270 net/ipv4/ip_input.c:569
      __netif_receive_skb_one_core net/core/dev.c:5493 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5607
      process_backlog+0x21f/0x380 net/core/dev.c:5935
      __napi_poll+0x60/0x3b0 net/core/dev.c:6498
      napi_poll net/core/dev.c:6565 [inline]
      net_rx_action+0x32b/0x750 net/core/dev.c:6698
      __do_softirq+0xc1/0x265 kernel/softirq.c:571
      do_softirq+0x7e/0xb0 kernel/softirq.c:472
      __local_bh_enable_ip+0x64/0x70 kernel/softirq.c:396
      local_bh_enable+0x1f/0x20 include/linux/bottom_half.h:33
      rcu_read_unlock_bh include/linux/rcupdate.h:843 [inline]
      __dev_queue_xmit+0xabb/0x1d10 net/core/dev.c:4271
      dev_queue_xmit include/linux/netdevice.h:3088 [inline]
      neigh_hh_output include/net/neighbour.h:528 [inline]
      neigh_output include/net/neighbour.h:542 [inline]
      ip_finish_output2+0x700/0x840 net/ipv4/ip_output.c:229
      ip_finish_output+0xf4/0x240 net/ipv4/ip_output.c:317
      NF_HOOK_COND include/linux/netfilter.h:292 [inline]
      ip_output+0xe5/0x1b0 net/ipv4/ip_output.c:431
      dst_output include/net/dst.h:458 [inline]
      ip_local_out net/ipv4/ip_output.c:126 [inline]
      __ip_queue_xmit+0xa4d/0xa70 net/ipv4/ip_output.c:533
      ip_queue_xmit+0x38/0x40 net/ipv4/ip_output.c:547
      __tcp_transmit_skb+0x1194/0x16e0 net/ipv4/tcp_output.c:1399
      tcp_transmit_skb net/ipv4/tcp_output.c:1417 [inline]
      tcp_write_xmit+0x13ff/0x2fd0 net/ipv4/tcp_output.c:2693
      __tcp_push_pending_frames+0x6a/0x1a0 net/ipv4/tcp_output.c:2877
      tcp_push_pending_frames include/net/tcp.h:1952 [inline]
      __tcp_sock_set_cork net/ipv4/tcp.c:3336 [inline]
      tcp_sock_set_cork+0xe8/0x100 net/ipv4/tcp.c:3343
      rds_tcp_xmit_path_complete+0x3b/0x40 net/rds/tcp_send.c:52
      rds_send_xmit+0xf8d/0x1420 net/rds/send.c:422
      rds_send_worker+0x42/0x1d0 net/rds/threads.c:200
      process_one_work+0x3e6/0x750 kernel/workqueue.c:2408
      worker_thread+0x5f2/0xa10 kernel/workqueue.c:2555
      kthread+0x1d7/0x210 kernel/kthread.c:379
      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
      
      read to 0xffff88813c8afb84 of 4 bytes by interrupt on cpu 0:
      tcp_check_req+0x32a/0xc70 net/ipv4/tcp_minisocks.c:622
      tcp_v4_rcv+0x12db/0x1b70 net/ipv4/tcp_ipv4.c:2071
      ip_protocol_deliver_rcu+0x356/0x6d0 net/ipv4/ip_input.c:205
      ip_local_deliver_finish+0x13c/0x1a0 net/ipv4/ip_input.c:233
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip_local_deliver+0xec/0x1c0 net/ipv4/ip_input.c:254
      dst_input include/net/dst.h:468 [inline]
      ip_rcv_finish net/ipv4/ip_input.c:449 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip_rcv+0x197/0x270 net/ipv4/ip_input.c:569
      __netif_receive_skb_one_core net/core/dev.c:5493 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5607
      process_backlog+0x21f/0x380 net/core/dev.c:5935
      __napi_poll+0x60/0x3b0 net/core/dev.c:6498
      napi_poll net/core/dev.c:6565 [inline]
      net_rx_action+0x32b/0x750 net/core/dev.c:6698
      __do_softirq+0xc1/0x265 kernel/softirq.c:571
      run_ksoftirqd+0x17/0x20 kernel/softirq.c:939
      smpboot_thread_fn+0x30a/0x4a0 kernel/smpboot.c:164
      kthread+0x1d7/0x210 kernel/kthread.c:379
      ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
      
      value changed: 0x1cd237f1 -> 0x1cd237f2
      
      Fixes: 079096f1 ("tcp/dccp: install syn_recv requests into ehash table")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230717144445.653164-3-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7552f8e7
    • Eric Dumazet's avatar
      tcp: annotate data-races around tcp_rsk(req)->txhash · d546247e
      Eric Dumazet authored
      [ Upstream commit 5e526552 ]
      
      TCP request sockets are lockless, some of their fields
      can change while being read by another cpu as syzbot noticed.
      
      This is usually harmless, but we should annotate the known
      races.
      
      This patch takes care of tcp_rsk(req)->txhash,
      a separate one is needed for tcp_rsk(req)->ts_recent.
      
      BUG: KCSAN: data-race in tcp_make_synack / tcp_rtx_synack
      
      write to 0xffff8881362304bc of 4 bytes by task 32083 on cpu 1:
      tcp_rtx_synack+0x9d/0x2a0 net/ipv4/tcp_output.c:4213
      inet_rtx_syn_ack+0x38/0x80 net/ipv4/inet_connection_sock.c:880
      tcp_check_req+0x379/0xc70 net/ipv4/tcp_minisocks.c:665
      tcp_v6_rcv+0x125b/0x1b20 net/ipv6/tcp_ipv6.c:1673
      ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437
      ip6_input_finish net/ipv6/ip6_input.c:482 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491
      dst_input include/net/dst.h:468 [inline]
      ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309
      __netif_receive_skb_one_core net/core/dev.c:5452 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566
      netif_receive_skb_internal net/core/dev.c:5652 [inline]
      netif_receive_skb+0x4a/0x310 net/core/dev.c:5711
      tun_rx_batched+0x3bf/0x400
      tun_get_user+0x1d24/0x22b0 drivers/net/tun.c:1997
      tun_chr_write_iter+0x18e/0x240 drivers/net/tun.c:2043
      call_write_iter include/linux/fs.h:1871 [inline]
      new_sync_write fs/read_write.c:491 [inline]
      vfs_write+0x4ab/0x7d0 fs/read_write.c:584
      ksys_write+0xeb/0x1a0 fs/read_write.c:637
      __do_sys_write fs/read_write.c:649 [inline]
      __se_sys_write fs/read_write.c:646 [inline]
      __x64_sys_write+0x42/0x50 fs/read_write.c:646
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      read to 0xffff8881362304bc of 4 bytes by task 32078 on cpu 0:
      tcp_make_synack+0x367/0xb40 net/ipv4/tcp_output.c:3663
      tcp_v6_send_synack+0x72/0x420 net/ipv6/tcp_ipv6.c:544
      tcp_conn_request+0x11a8/0x1560 net/ipv4/tcp_input.c:7059
      tcp_v6_conn_request+0x13f/0x180 net/ipv6/tcp_ipv6.c:1175
      tcp_rcv_state_process+0x156/0x1de0 net/ipv4/tcp_input.c:6494
      tcp_v6_do_rcv+0x98a/0xb70 net/ipv6/tcp_ipv6.c:1509
      tcp_v6_rcv+0x17b8/0x1b20 net/ipv6/tcp_ipv6.c:1735
      ip6_protocol_deliver_rcu+0x92f/0xf30 net/ipv6/ip6_input.c:437
      ip6_input_finish net/ipv6/ip6_input.c:482 [inline]
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ip6_input+0xbd/0x1b0 net/ipv6/ip6_input.c:491
      dst_input include/net/dst.h:468 [inline]
      ip6_rcv_finish+0x1e2/0x2e0 net/ipv6/ip6_input.c:79
      NF_HOOK include/linux/netfilter.h:303 [inline]
      ipv6_rcv+0x74/0x150 net/ipv6/ip6_input.c:309
      __netif_receive_skb_one_core net/core/dev.c:5452 [inline]
      __netif_receive_skb+0x90/0x1b0 net/core/dev.c:5566
      netif_receive_skb_internal net/core/dev.c:5652 [inline]
      netif_receive_skb+0x4a/0x310 net/core/dev.c:5711
      tun_rx_batched+0x3bf/0x400
      tun_get_user+0x1d24/0x22b0 drivers/net/tun.c:1997
      tun_chr_write_iter+0x18e/0x240 drivers/net/tun.c:2043
      call_write_iter include/linux/fs.h:1871 [inline]
      new_sync_write fs/read_write.c:491 [inline]
      vfs_write+0x4ab/0x7d0 fs/read_write.c:584
      ksys_write+0xeb/0x1a0 fs/read_write.c:637
      __do_sys_write fs/read_write.c:649 [inline]
      __se_sys_write fs/read_write.c:646 [inline]
      __x64_sys_write+0x42/0x50 fs/read_write.c:646
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      value changed: 0x91d25731 -> 0xe79325cd
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 32078 Comm: syz-executor.4 Not tainted 6.5.0-rc1-syzkaller-00033-geb26cbb1a754 #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023
      
      Fixes: 58d607d3 ("tcp: provide skb->hash to synack packets")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20230717144445.653164-2-edumazet@google.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d546247e
    • Antoine Tenart's avatar
      net: ipv4: use consistent txhash in TIME_WAIT and SYN_RECV · 67f6f186
      Antoine Tenart authored
      [ Upstream commit c0a8966e ]
      
      When using IPv4/TCP, skb->hash comes from sk->sk_txhash except in
      TIME_WAIT and SYN_RECV where it's not set in the reply skb from
      ip_send_unicast_reply. Those packets will have a mismatched hash with
      others from the same flow as their hashes will be 0. IPv6 does not have
      the same issue as the hash is set from the socket txhash in those cases.
      
      This commit sets the hash in the reply skb from ip_send_unicast_reply,
      which makes the IPv4 code behave like IPv6.
      
      Signed-off-by: Antoine Tenart <atenart@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Stable-dep-of: 5e526552 ("tcp: annotate data-races around tcp_rsk(req)->txhash")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      67f6f186
    • Florian Kauer's avatar
      igc: Prevent garbled TX queue with XDP ZEROCOPY · d5c3b02f
      Florian Kauer authored
      [ Upstream commit 78adb4bc ]
      
      In normal operation, each populated queue item has
      next_to_watch pointing to the last TX desc of the packet,
      while each cleaned item has it set to 0. In particular,
      next_to_use that points to the next (necessarily clean)
      item to use has next_to_watch set to 0.
      
      When the TX queue is used both by an application using
      AF_XDP with ZEROCOPY as well as a second non-XDP application
      generating high traffic, the queue pointers can get in
      an invalid state where next_to_use points to an item
      where next_to_watch is NOT set to 0.
      
      However, the implementation assumes in several places
      that this is never the case, so if it does happen,
      bad things follow. First, within the loop inside
      igc_clean_tx_irq(), next_to_clean can overtake next_to_use.
      This prevents any further transmission via
      this queue, and it never gets unblocked or signaled.
      Second, if the queue is in this garbled state,
      the inner loop of igc_clean_tx_ring() will never terminate,
      completely hogging a CPU core.
      
      The reason is that igc_xdp_xmit_zc() reads next_to_use
      before acquiring the lock and writes it back
      (potentially unmodified) later. If it got modified
      before locking, the outdated next_to_use is written back,
      pointing to an item that was already used elsewhere
      (and whose next_to_watch was thus already written).
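
The ordering issue can be sketched in userspace C with a small ring whose producer index is sampled only once the lock is held. This is a model under assumptions: `struct ring`, `ring_claim()` and the spin lock built from `atomic_flag` are hypothetical and are not the igc driver's structures or locking primitives.

```c
#include <assert.h>
#include <stdatomic.h>

#define RING_SIZE 8u

/* Hypothetical TX ring shared by two producers; not the igc structures. */
struct ring {
	atomic_flag lock;
	unsigned int next_to_use;
};

static void ring_lock(struct ring *r)
{
	while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
		;
}

static void ring_unlock(struct ring *r)
{
	atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Claim `count` slots; the index is read only after the lock is taken. */
static unsigned int ring_claim(struct ring *r, unsigned int count)
{
	unsigned int first;

	ring_lock(r);
	first = r->next_to_use;                        /* read under the lock */
	r->next_to_use = (first + count) % RING_SIZE;  /* write back under it */
	ring_unlock(r);
	return first;
}
```

Had `first` been read before `ring_lock()`, a second producer could advance `next_to_use` in between, and the stale value would then be written back over the newer one.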
      
      Fixes: 9acf59a7 ("igc: Enable TX via AF_XDP zero-copy")
      Signed-off-by: Florian Kauer <florian.kauer@linutronix.de>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
      Tested-by: Kurt Kanzenbach <kurt@linutronix.de>
      Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Tested-by: Naama Meir <naamax.meir@linux.intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Link: https://lore.kernel.org/r/20230717175444.3217831-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d5c3b02f
    • Kurt Kanzenbach's avatar
      igc: Avoid transmit queue timeout for XDP · c6255814
      Kurt Kanzenbach authored
      [ Upstream commit 95b68148 ]
      
      High XDP load triggers the netdev watchdog:
      
      |NETDEV WATCHDOG: enp3s0 (igc): transmit queue 2 timed out
      
      The reason is that the Tx queue transmission start (txq->trans_start) is not updated
      in the XDP code path. Therefore, update it in all XDP transmission functions.
      
      Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
      Tested-by: Naama Meir <naamax.meir@linux.intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Stable-dep-of: 78adb4bc ("igc: Prevent garbled TX queue with XDP ZEROCOPY")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c6255814
    • Alexander Duyck's avatar
      bpf, arm64: Fix BTI type used for freplace attached functions · 7a4c7932
      Alexander Duyck authored
      [ Upstream commit a3f25d61 ]
      
      When running an freplace attached bpf program on an arm64 system we were
      seeing the following issue:
        Unhandled 64-bit el1h sync exception on CPU47, ESR 0x0000000036000003 -- BTI
      
      After a bit of work to track it down I determined that what appeared to be
      happening is that the 'bti c' at the start of the program was somehow being
      reached after a 'br' instruction. Further digging pointed me toward the
      fact that the function was attached via freplace. This in turn led me to
      build_plt which I believe is invoking the long jump which is triggering
      this error.
      
      To resolve it we can replace the 'bti c' with 'bti jc' and add a comment
      explaining why this has to be modified as such.
      
      Fixes: b2ad54e1 ("bpf, arm64: Implement bpf_arch_text_poke() for arm64")
      Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
      Acked-by: Xu Kuohai <xukuohai@huawei.com>
      Link: https://lore.kernel.org/r/168926677665.316237.9953845318337455525.stgit@ahduyck-xeon-server.home.arpa
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7a4c7932
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Repeat check_max_stack_depth for async callbacks · 275d5743
      Kumar Kartikeya Dwivedi authored
      [ Upstream commit b5e9ad52 ]
      
      While the check_max_stack_depth function explores call chains emanating
      from the main prog, which is typically enough to cover all possible call
      chains, it doesn't explore those rooted at async callbacks unless the
      async callback is also called directly, since unlike non-async
      callbacks it skips their instruction exploration as they don't
      contribute to stack depth.
      
      It could be the case that the async callback leads to a callchain which
      exceeds the stack depth, but this is never reachable while only
      exploring the entry point from main subprog. Hence, repeat the check for
      the main subprog *and* all async callbacks marked by the symbolic
      execution pass of the verifier, as execution of the program may begin at
      any of them.
      
      Consider functions with following stack depths:
      main: 256
      async: 256
      foo: 256
      
      main:
          rX = async
          bpf_timer_set_callback(...)
      
      async:
          foo()
      
      Here, async is not descended as it does not contribute to stack depth of
      main (since it is referenced using bpf_pseudo_func and not
      bpf_pseudo_call). However, when async is invoked asynchronously, it will
      end up breaching the MAX_BPF_STACK limit by calling foo.
      
      Hence, in addition to main, we also need to explore call chains
      beginning at all async callback subprogs in a program.
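
The idea of repeating the depth check from every possible entry point can be sketched in userspace C. This is a toy model, not the verifier: `struct subprog`, `chain_depth()` and `stack_ok()` are hypothetical, `STACK_LIMIT` is an illustrative constant, and the depths in the usage note are chosen so the async chain clearly exceeds it. The recursion assumes the call graph has no cycles.

```c
#include <assert.h>
#include <stdbool.h>

#define STACK_LIMIT 512   /* illustrative limit for this sketch */
#define MAX_CALLEES 4

/* Hypothetical subprogram record; not the verifier's data structures. */
struct subprog {
	int stack_depth;
	bool is_async_cb;          /* may be entered on its own, e.g. by a timer */
	int callees[MAX_CALLEES];  /* directly called subprogs, -1 terminated */
};

/* Deepest stack use of any call chain starting at subprog `idx`. */
static int chain_depth(const struct subprog *sp, int idx)
{
	int deepest = 0;

	for (int i = 0; i < MAX_CALLEES && sp[idx].callees[i] >= 0; i++) {
		int d = chain_depth(sp, sp[idx].callees[i]);

		if (d > deepest)
			deepest = d;
	}
	return sp[idx].stack_depth + deepest;
}

/* Check main (index 0) and, optionally, every async callback as a root. */
static bool stack_ok(const struct subprog *sp, int n, bool check_async_roots)
{
	for (int i = 0; i < n; i++) {
		bool is_root = i == 0 || (check_async_roots && sp[i].is_async_cb);

		if (is_root && chain_depth(sp, i) > STACK_LIMIT)
			return false;
	}
	return true;
}
```

With three subprogs of depth 300 each, where main makes no direct calls and the async callback calls foo, `stack_ok()` accepts the program when only main is used as a root and rejects it once async callbacks are treated as roots too.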
      
      Fixes: 7ddc80a4 ("bpf: Teach stack depth check about async callbacks.")
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230717161530.1238-3-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      275d5743
    • Kumar Kartikeya Dwivedi's avatar
      bpf: Fix subprog idx logic in check_max_stack_depth · 5e13be20
      Kumar Kartikeya Dwivedi authored
      [ Upstream commit ba7b3e7d ]
      
      The assignment to idx in check_max_stack_depth happens once we see a
      bpf_pseudo_call or bpf_pseudo_func. This is not an issue as the rest of
      the code performs a few checks and then pushes the frame to the frame
      stack, except the case of async callbacks. If the async callback case
      causes the loop iteration to be skipped, the idx assignment will be
      incorrect on the next iteration of the loop. The value stored in the
      frame stack (as the subprogno of the current subprog) will be incorrect.
      
      This leads to incorrect checks and incorrect tail_call_reachable
      marking. Save the target subprog in a new variable and only assign to
      idx once we are done with the is_async_cb check which may skip pushing
      of frame to the frame stack and subsequent stack depth checks and tail
      call markings.
      
      Fixes: 7ddc80a4 ("bpf: Teach stack depth check about async callbacks.")
      Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
      Link: https://lore.kernel.org/r/20230717161530.1238-2-memxor@gmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5e13be20
    • Geetha sowjanya's avatar
      octeontx2-pf: Dont allocate BPIDs for LBK interfaces · 521c84b9
      Geetha sowjanya authored
      [ Upstream commit 8fcd7c7b ]
      
      Current driver enables backpressure for LBK interfaces.
      But these interfaces do not support this feature.
      Hence, this patch fixes the issue by skipping the
      backpressure configuration for these interfaces.
      
      Fixes: 75f36270 ("octeontx2-pf: Support to enable/disable pause frames via ethtool").
      Signed-off-by: Geetha sowjanya <gakula@marvell.com>
      Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
      Link: https://lore.kernel.org/r/20230716093741.28063-1-gakula@marvell.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      521c84b9
    • Ido Schimmel's avatar
      vrf: Fix lockdep splat in output path · 6a183c72
      Ido Schimmel authored
      [ Upstream commit 2033ab90 ]
      
      Cited commit converted the neighbour code to use the standard RCU
      variant instead of the RCU-bh variant, but the VRF code still uses
      rcu_read_lock_bh() / rcu_read_unlock_bh() around the neighbour lookup
      code in its IPv4 and IPv6 output paths, resulting in lockdep splats
      [1][2]. Can be reproduced using [3].
      
      Fix by switching to rcu_read_lock() / rcu_read_unlock().
      
      [1]
      =============================
      WARNING: suspicious RCU usage
      6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted
      -----------------------------
      include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      2 locks held by ping/183:
       #0: ffff888105ea1d80 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0xc6c/0x33c0
       #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output+0x2e3/0x2030
      
      stack backtrace:
      CPU: 0 PID: 183 Comm: ping Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0xc1/0xf0
       lockdep_rcu_suspicious+0x211/0x3b0
       vrf_output+0x1380/0x2030
       ip_push_pending_frames+0x125/0x2a0
       raw_sendmsg+0x200d/0x33c0
       inet_sendmsg+0xa2/0xe0
       __sys_sendto+0x2aa/0x420
       __x64_sys_sendto+0xe5/0x1c0
       do_syscall_64+0x38/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      [2]
      =============================
      WARNING: suspicious RCU usage
      6.5.0-rc1-custom-g9c099e6dbf98 #403 Not tainted
      -----------------------------
      include/net/neighbour.h:302 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      2 locks held by ping6/182:
       #0: ffff888114b63000 (sk_lock-AF_INET6){+.+.}-{0:0}, at: rawv6_sendmsg+0x1602/0x3e50
       #1: ffffffff85b46820 (rcu_read_lock_bh){....}-{1:2}, at: vrf_output6+0xe9/0x1310
      
      stack backtrace:
      CPU: 0 PID: 182 Comm: ping6 Not tainted 6.5.0-rc1-custom-g9c099e6dbf98 #403
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc37 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0xc1/0xf0
       lockdep_rcu_suspicious+0x211/0x3b0
       vrf_output6+0xd32/0x1310
       ip6_local_out+0xb4/0x1a0
       ip6_send_skb+0xbc/0x340
       ip6_push_pending_frames+0xe5/0x110
       rawv6_sendmsg+0x2e6e/0x3e50
       inet_sendmsg+0xa2/0xe0
       __sys_sendto+0x2aa/0x420
       __x64_sys_sendto+0xe5/0x1c0
       do_syscall_64+0x38/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      [3]
      #!/bin/bash
      
      ip link add name vrf-red up numtxqueues 2 type vrf table 10
      ip link add name swp1 up master vrf-red type dummy
      ip address add 192.0.2.1/24 dev swp1
      ip address add 2001:db8:1::1/64 dev swp1
      ip neigh add 192.0.2.2 lladdr 00:11:22:33:44:55 nud perm dev swp1
      ip neigh add 2001:db8:1::2 lladdr 00:11:22:33:44:55 nud perm dev swp1
      ip vrf exec vrf-red ping 192.0.2.2 -c 1 &> /dev/null
      ip vrf exec vrf-red ping6 2001:db8:1::2 -c 1 &> /dev/null
      
      Fixes: 09eed119 ("neighbour: switch to standard rcu, instead of rcu_bh")
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Link: https://lore.kernel.org/netdev/CA+G9fYtEr-=GbcXNDYo3XOkwR+uYgehVoDjsP0pFLUpZ_AZcyg@mail.gmail.com/
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230715153605.4068066-1-idosch@nvidia.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6a183c72
    • Jiapeng Chong's avatar
      security: keys: Modify mismatched function name · cecb533d
      Jiapeng Chong authored
      [ Upstream commit 2a415274 ]
      
      No functional modification involved.
      
      security/keys/trusted-keys/trusted_tpm2.c:203: warning: expecting prototype for tpm_buf_append_auth(). Prototype was for tpm2_buf_append_auth() instead.
      
      Fixes: 2e19e101 ("KEYS: trusted: Move TPM2 trusted keys code")
      Reported-by: Abaci Robot <abaci@linux.alibaba.com>
      Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=5524
      Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
      Reviewed-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      cecb533d
    • Ahmed Zaki's avatar
      iavf: fix reset task race with iavf_remove() · 532fbfc9
      Ahmed Zaki authored
      [ Upstream commit c34743da ]
      
      The reset task is currently scheduled from the watchdog or adminq tasks.
      First, all direct calls to schedule the reset task are replaced with
      iavf_schedule_reset(), which is modified to accept a flag showing the
      type of reset.
      
      To prevent the reset task from starting once iavf_remove() starts, we need
      to check the __IAVF_IN_REMOVE_TASK bit before we schedule it. This is now
      easily added to iavf_schedule_reset().
      
      Finally, remove the check for IAVF_FLAG_RESET_NEEDED in the watchdog task.
      It is redundant since all callers that set the flag immediately schedule
      the reset task.
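
      The gating described above can be sketched in user space as follows.
      This is a minimal model under stated assumptions, not the driver
      code: toy_adapter, its fields and toy_schedule_reset() are
      hypothetical stand-ins, with a counter in place of queue_work() and a
      C11 atomic flag in place of the __IAVF_IN_REMOVE_TASK bit.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the adapter state relevant to reset scheduling. */
struct toy_adapter {
    atomic_bool in_remove;      /* models the __IAVF_IN_REMOVE_TASK bit */
    atomic_uint flags;          /* pending reset reasons */
    unsigned int resets_queued; /* models calls to queue_work() */
};

static void toy_adapter_init(struct toy_adapter *a)
{
    atomic_init(&a->in_remove, false);
    atomic_init(&a->flags, 0u);
    a->resets_queued = 0;
}

/*
 * Single entry point for scheduling a reset. Returns true if the reset was
 * queued, false if removal has started and the request was dropped.
 * Note: the check and the queueing are not one atomic step here; a real
 * remove path still has to cancel or flush work queued just before the
 * bit was set.
 */
static bool toy_schedule_reset(struct toy_adapter *a, unsigned int reset_flag)
{
    assert(reset_flag != 0);
    if (atomic_load(&a->in_remove))
        return false;
    atomic_fetch_or(&a->flags, reset_flag);
    a->resets_queued++;
    return true;
}
```

      Routing every caller through one helper is what makes the check
      cheap to add: the remove-in-progress test lives in a single place
      instead of at each call site.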
      
      Fixes: 3ccd54ef ("iavf: Fix init state closure on remove")
      Fixes: 14756b2a ("iavf: Fix __IAVF_RESETTING state usage")
      Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      532fbfc9
    • Ahmed Zaki's avatar
      iavf: fix a deadlock caused by rtnl and driver's lock circular dependencies · 63d14a43
      Ahmed Zaki authored
      [ Upstream commit d1639a17 ]
      
      A driver's lock (crit_lock) is used to serialize all the driver's tasks.
      Lockdep, however, shows a circular dependency between rtnl and
      crit_lock. This happens when an ndo that already holds the rtnl requests
      the driver to reset, since the reset task (in some paths) tries to grab
      rtnl to either change the real number of queues or update netdev features.
      
        [566.241851] ======================================================
        [566.241893] WARNING: possible circular locking dependency detected
        [566.241936] 6.2.14-100.fc36.x86_64+debug #1 Tainted: G           OE
        [566.241984] ------------------------------------------------------
        [566.242025] repro.sh/2604 is trying to acquire lock:
        [566.242061] ffff9280fc5ceee8 (&adapter->crit_lock){+.+.}-{3:3}, at: iavf_close+0x3c/0x240 [iavf]
        [566.242167]
                     but task is already holding lock:
        [566.242209] ffffffff9976d350 (rtnl_mutex){+.+.}-{3:3}, at: iavf_remove+0x6b5/0x730 [iavf]
        [566.242300]
                     which lock already depends on the new lock.
      
        [566.242353]
                     the existing dependency chain (in reverse order) is:
        [566.242401]
                     -> #1 (rtnl_mutex){+.+.}-{3:3}:
        [566.242451]        __mutex_lock+0xc1/0xbb0
        [566.242489]        iavf_init_interrupt_scheme+0x179/0x440 [iavf]
        [566.242560]        iavf_watchdog_task+0x80b/0x1400 [iavf]
        [566.242627]        process_one_work+0x2b3/0x560
        [566.242663]        worker_thread+0x4f/0x3a0
        [566.242696]        kthread+0xf2/0x120
        [566.242730]        ret_from_fork+0x29/0x50
        [566.242763]
                     -> #0 (&adapter->crit_lock){+.+.}-{3:3}:
        [566.242815]        __lock_acquire+0x15ff/0x22b0
        [566.242869]        lock_acquire+0xd2/0x2c0
        [566.242901]        __mutex_lock+0xc1/0xbb0
        [566.242934]        iavf_close+0x3c/0x240 [iavf]
        [566.242997]        __dev_close_many+0xac/0x120
        [566.243036]        dev_close_many+0x8b/0x140
        [566.243071]        unregister_netdevice_many_notify+0x165/0x7c0
        [566.243116]        unregister_netdevice_queue+0xd3/0x110
        [566.243157]        iavf_remove+0x6c1/0x730 [iavf]
        [566.243217]        pci_device_remove+0x33/0xa0
        [566.243257]        device_release_driver_internal+0x1bc/0x240
        [566.243299]        pci_stop_bus_device+0x6c/0x90
        [566.243338]        pci_stop_and_remove_bus_device+0xe/0x20
        [566.243380]        pci_iov_remove_virtfn+0xd1/0x130
        [566.243417]        sriov_disable+0x34/0xe0
        [566.243448]        ice_free_vfs+0x2da/0x330 [ice]
        [566.244383]        ice_sriov_configure+0x88/0xad0 [ice]
        [566.245353]        sriov_numvfs_store+0xde/0x1d0
        [566.246156]        kernfs_fop_write_iter+0x15e/0x210
        [566.246921]        vfs_write+0x288/0x530
        [566.247671]        ksys_write+0x74/0xf0
        [566.248408]        do_syscall_64+0x58/0x80
        [566.249145]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
        [566.249886]
                       other info that might help us debug this:
      
        [566.252014]  Possible unsafe locking scenario:
      
        [566.253432]        CPU0                    CPU1
        [566.254118]        ----                    ----
        [566.254800]   lock(rtnl_mutex);
        [566.255514]                                lock(&adapter->crit_lock);
        [566.256233]                                lock(rtnl_mutex);
        [566.256897]   lock(&adapter->crit_lock);
        [566.257388]
                        *** DEADLOCK ***
      
      The deadlock can be triggered by a script that is continuously resetting
      the VF adapter while doing other operations requiring RTNL, e.g:
      
      	while :; do
      		ip link set $VF up
      		ethtool --set-channels $VF combined 2
      		ip link set $VF down
      		ip link set $VF up
      		ethtool --set-channels $VF combined 4
      		ip link set $VF down
      	done
      
      Any operation that triggers a reset can be substituted for
      "ethtool --set-channels".
      
      As a fix, add a new task "finish_config" that does all the work which
      needs the rtnl lock. With the exception of iavf_remove(), all work that
      requires rtnl should be called from this task.
      
      As for iavf_remove(), at the point where we need to call
      unregister_netdevice() (and grab rtnl_lock), we make sure the finish_config
      task is not running (cancel_work_sync()) to safely grab rtnl. Subsequent
      finish_config work cannot restart after that since the task is guarded
      by the __IAVF_IN_REMOVE_TASK bit in iavf_schedule_finish_config().
      
      Fixes: 5ac49f3c ("iavf: use mutexes for locking of critical sections")
      Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      63d14a43
    • Marcin Szycik's avatar
      iavf: Wait for reset in callbacks which trigger it · d5549575
      Marcin Szycik authored
      [ Upstream commit c2ed2403 ]
      
      Adding the interface to bonding right after changing the MTU on
      the interface failed. The bonding interface was unable to open
      the interface because it was still in the __RESETTING state
      following the MTU change.
      
      Add new reset_waitqueue to indicate that reset has finished.
      
      Add waiting for reset to finish in callbacks which trigger hw reset:
      iavf_set_priv_flags(), iavf_change_mtu() and iavf_set_ringparam().
      We use a 5000ms timeout period because on Hyper-V based systems,
      this operation takes around 3000-4000ms. In normal circumstances,
      it doesn't take more than 500ms to complete.
      
      Add a function iavf_wait_for_reset() to share the wait-for-reset code,
      and use it in iavf_set_channels() as well, which already waits for reset.
      We don't add error handling in iavf_set_channels(), as this could
      leave the device in an incorrect state if the reset was scheduled
      but timed out or the waiting function was interrupted by a signal.
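
      A user-space analogue of waiting on a "reset finished" signal with a
      timeout is sketched below. It assumes POSIX threads and uses a
      condition variable where the driver uses a wait queue.
      toy_reset_wait, toy_reset_done() and toy_wait_for_reset() are
      hypothetical names, and the sketch does not model interruption by a
      signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical model: a flag plus a condition variable as the wait queue. */
struct toy_reset_wait {
    pthread_mutex_t lock;
    pthread_cond_t done_cond;   /* plays the role of reset_waitqueue */
    bool resetting;
};

static void toy_reset_wait_init(struct toy_reset_wait *w, bool resetting)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->done_cond, NULL);
    w->resetting = resetting;
}

/* Called by the reset path once the reset has finished. */
static void toy_reset_done(struct toy_reset_wait *w)
{
    pthread_mutex_lock(&w->lock);
    w->resetting = false;
    pthread_cond_broadcast(&w->done_cond);
    pthread_mutex_unlock(&w->lock);
}

/* Returns 0 once the reset has finished, or ETIMEDOUT after timeout_ms. */
static int toy_wait_for_reset(struct toy_reset_wait *w, long timeout_ms)
{
    struct timespec deadline;
    int ret = 0;

    assert(timeout_ms >= 0);
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&w->lock);
    while (w->resetting && ret == 0)
        ret = pthread_cond_timedwait(&w->done_cond, &w->lock, &deadline);
    if (!w->resetting)
        ret = 0;    /* the reset finished, even if the deadline also passed */
    pthread_mutex_unlock(&w->lock);
    return ret;
}
```

      With the numbers quoted above, a caller would pass 5000 as
      timeout_ms; a reset that normally completes within 500ms returns
      well before the deadline.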
      
      Fixes: 4e5e6b5d ("iavf: Fix return of set the new channel count")
      Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
      Co-developed-by: Dawid Wesierski <dawidx.wesierski@intel.com>
      Signed-off-by: Dawid Wesierski <dawidx.wesierski@intel.com>
      Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
      Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com>
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d5549575
    • Przemek Kitszel's avatar
      iavf: make functions static where possible · e631b18c
      Przemek Kitszel authored
      [ Upstream commit a4aadf0f ]
      
      Make all possible functions static.
      
      Move iavf_force_wb() up to avoid forward declaration.
      
      Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Stable-dep-of: c2ed2403 ("iavf: Wait for reset in callbacks which trigger it")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e631b18c
    • Ahmed Zaki's avatar
      iavf: use internal state to free traffic IRQs · 5e9db32e
      Ahmed Zaki authored
      [ Upstream commit a77ed5c5 ]
      
      If the system tries to close the netdev while iavf_reset_task() is
      running, __LINK_STATE_START will be cleared and netif_running() will
      return false in iavf_reinit_interrupt_scheme(). This will result in
      iavf_free_traffic_irqs() not being called and a leak as follows:
      
          [7632.489326] remove_proc_entry: removing non-empty directory 'irq/999', leaking at least 'iavf-enp24s0f0v0-TxRx-0'
          [7632.490214] WARNING: CPU: 0 PID: 10 at fs/proc/generic.c:718 remove_proc_entry+0x19b/0x1b0
      
      is shown when pci_disable_msix() is later called. Fix by using the
      internal adapter state. The traffic IRQs will always exist if
      state == __IAVF_RUNNING.
      
      Fixes: 5b36e8d0 ("i40evf: Enable VF to request an alternate queue allocation")
      Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      5e9db32e
    • Ding Hui's avatar
      iavf: Fix out-of-bounds when setting channels on remove · 65ecebc9
      Ding Hui authored
      [ Upstream commit 7c4bced3 ]
      
      If the channel count is increased during iavf_remove() and waiting for
      the reset to finish times out, the function returns an error but has
      already changed num_active_queues directly. This leads to out-of-bounds
      accesses like those in the logs below, because num_active_queues is
      then greater than the number of tx/rx_rings[] actually allocated.
      
      Reproducer:
      
        [root@host ~]# cat repro.sh
        #!/bin/bash
      
        pf_dbsf="0000:41:00.0"
        vf0_dbsf="0000:41:02.0"
        g_pids=()
      
        function do_set_numvf()
        {
            echo 2 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs
            sleep $((RANDOM%3+1))
            echo 0 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs
            sleep $((RANDOM%3+1))
        }
      
        function do_set_channel()
        {
            local nic=$(ls -1 --indicator-style=none /sys/bus/pci/devices/${vf0_dbsf}/net/)
            [ -z "$nic" ] && { sleep $((RANDOM%3)) ; return 1; }
            ifconfig $nic 192.168.18.5 netmask 255.255.255.0
            ifconfig $nic up
            ethtool -L $nic combined 1
            ethtool -L $nic combined 4
            sleep $((RANDOM%3))
        }
      
        function on_exit()
        {
            local pid
            for pid in "${g_pids[@]}"; do
                kill -0 "$pid" &>/dev/null && kill "$pid" &>/dev/null
            done
            g_pids=()
        }
      
        trap "on_exit; exit" EXIT
      
        while :; do do_set_numvf ; done &
        g_pids+=($!)
        while :; do do_set_channel ; done &
        g_pids+=($!)
      
        wait
      
      Result:
      
      [ 3506.152887] iavf 0000:41:02.0: Removing device
      [ 3510.400799] ==================================================================
      [ 3510.400820] BUG: KASAN: slab-out-of-bounds in iavf_free_all_tx_resources+0x156/0x160 [iavf]
      [ 3510.400823] Read of size 8 at addr ffff88b6f9311008 by task repro.sh/55536
      [ 3510.400823]
      [ 3510.400830] CPU: 101 PID: 55536 Comm: repro.sh Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
      [ 3510.400832] Hardware name: Powerleader PR2008AL/H12DSi-N6, BIOS 2.0 04/09/2021
      [ 3510.400835] Call Trace:
      [ 3510.400851]  dump_stack+0x71/0xab
      [ 3510.400860]  print_address_description+0x6b/0x290
      [ 3510.400865]  ? iavf_free_all_tx_resources+0x156/0x160 [iavf]
      [ 3510.400868]  kasan_report+0x14a/0x2b0
      [ 3510.400873]  iavf_free_all_tx_resources+0x156/0x160 [iavf]
      [ 3510.400880]  iavf_remove+0x2b6/0xc70 [iavf]
      [ 3510.400884]  ? iavf_free_all_rx_resources+0x160/0x160 [iavf]
      [ 3510.400891]  ? wait_woken+0x1d0/0x1d0
      [ 3510.400895]  ? notifier_call_chain+0xc1/0x130
      [ 3510.400903]  pci_device_remove+0xa8/0x1f0
      [ 3510.400910]  device_release_driver_internal+0x1c6/0x460
      [ 3510.400916]  pci_stop_bus_device+0x101/0x150
      [ 3510.400919]  pci_stop_and_remove_bus_device+0xe/0x20
      [ 3510.400924]  pci_iov_remove_virtfn+0x187/0x420
      [ 3510.400927]  ? pci_iov_add_virtfn+0xe10/0xe10
      [ 3510.400929]  ? pci_get_subsys+0x90/0x90
      [ 3510.400932]  sriov_disable+0xed/0x3e0
      [ 3510.400936]  ? bus_find_device+0x12d/0x1a0
      [ 3510.400953]  i40e_free_vfs+0x754/0x1210 [i40e]
      [ 3510.400966]  ? i40e_reset_all_vfs+0x880/0x880 [i40e]
      [ 3510.400968]  ? pci_get_device+0x7c/0x90
      [ 3510.400970]  ? pci_get_subsys+0x90/0x90
      [ 3510.400982]  ? pci_vfs_assigned.part.7+0x144/0x210
      [ 3510.400987]  ? __mutex_lock_slowpath+0x10/0x10
      [ 3510.400996]  i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e]
      [ 3510.401001]  sriov_numvfs_store+0x214/0x290
      [ 3510.401005]  ? sriov_totalvfs_show+0x30/0x30
      [ 3510.401007]  ? __mutex_lock_slowpath+0x10/0x10
      [ 3510.401011]  ? __check_object_size+0x15a/0x350
      [ 3510.401018]  kernfs_fop_write+0x280/0x3f0
      [ 3510.401022]  vfs_write+0x145/0x440
      [ 3510.401025]  ksys_write+0xab/0x160
      [ 3510.401028]  ? __ia32_sys_read+0xb0/0xb0
      [ 3510.401031]  ? fput_many+0x1a/0x120
      [ 3510.401032]  ? filp_close+0xf0/0x130
      [ 3510.401038]  do_syscall_64+0xa0/0x370
      [ 3510.401041]  ? page_fault+0x8/0x30
      [ 3510.401043]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      [ 3510.401073] RIP: 0033:0x7f3a9bb842c0
      [ 3510.401079] Code: 73 01 c3 48 8b 0d d8 cb 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 24 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24
      [ 3510.401080] RSP: 002b:00007ffc05f1fe18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 3510.401083] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f3a9bb842c0
      [ 3510.401085] RDX: 0000000000000002 RSI: 0000000002327408 RDI: 0000000000000001
      [ 3510.401086] RBP: 0000000002327408 R08: 00007f3a9be53780 R09: 00007f3a9c8a4700
      [ 3510.401086] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002
      [ 3510.401087] R13: 0000000000000001 R14: 00007f3a9be52620 R15: 0000000000000001
      [ 3510.401090]
      [ 3510.401093] Allocated by task 76795:
      [ 3510.401098]  kasan_kmalloc+0xa6/0xd0
      [ 3510.401099]  __kmalloc+0xfb/0x200
      [ 3510.401104]  iavf_init_interrupt_scheme+0x26f/0x1310 [iavf]
      [ 3510.401108]  iavf_watchdog_task+0x1d58/0x4050 [iavf]
      [ 3510.401114]  process_one_work+0x56a/0x11f0
      [ 3510.401115]  worker_thread+0x8f/0xf40
      [ 3510.401117]  kthread+0x2a0/0x390
      [ 3510.401119]  ret_from_fork+0x1f/0x40
      [ 3510.401122]  0xffffffffffffffff
      [ 3510.401123]
      
      In timeout handling, we should keep the original num_active_queues
      and reset num_req_queues to 0.
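
      The intended timeout behaviour can be sketched with a toy model. This
      is an illustration under stated assumptions, not the driver code:
      toy_chan_state and toy_set_channels() are hypothetical, and the
      reset_ok parameter stands in for "the wait for reset completed before
      the timeout".

```c
#include <assert.h>

/* Hypothetical subset of the adapter state involved in a channel change. */
struct toy_chan_state {
    int num_active_queues;  /* must match the rings actually allocated */
    int num_req_queues;     /* pending request, 0 when there is none */
};

/*
 * Request num_req channels. On timeout (reset_ok == 0) the pending request
 * is dropped and num_active_queues is left alone, so it keeps matching the
 * rings that exist. Returns 0 on success, -1 on timeout.
 */
static int toy_set_channels(struct toy_chan_state *s, int num_req, int reset_ok)
{
    assert(num_req > 0);
    s->num_req_queues = num_req;

    /* ... a reset would be scheduled and waited for here ... */
    if (!reset_ok) {
        s->num_req_queues = 0;
        return -1;
    }

    /* Only a completed reset reallocates rings and updates the count. */
    s->num_active_queues = num_req;
    return 0;
}
```

      The invariant being protected is that num_active_queues never exceeds
      the number of allocated rings, since the free paths use it as a loop
      bound.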
      
      Fixes: 4e5e6b5d ("iavf: Fix return of set the new channel count")
      Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
      Cc: Donglin Peng <pengdonglin@sangfor.com.cn>
      Cc: Huang Cun <huangcun@sangfor.com.cn>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      65ecebc9
    • Ding Hui's avatar
      iavf: Fix use-after-free in free_netdev · 8d781a9c
      Ding Hui authored
      [ Upstream commit 5f4fa167 ]
      
      We do netif_napi_add() for all allocated q_vectors[], but potentially
      do netif_napi_del() for part of them, then kfree q_vectors and leave
      invalid pointers at dev->napi_list.
      
      Reproducer:
      
        [root@host ~]# cat repro.sh
        #!/bin/bash
      
        pf_dbsf="0000:41:00.0"
        vf0_dbsf="0000:41:02.0"
        g_pids=()
      
        function do_set_numvf()
        {
            echo 2 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs
            sleep $((RANDOM%3+1))
            echo 0 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs
            sleep $((RANDOM%3+1))
        }
      
        function do_set_channel()
        {
            local nic=$(ls -1 --indicator-style=none /sys/bus/pci/devices/${vf0_dbsf}/net/)
            [ -z "$nic" ] && { sleep $((RANDOM%3)) ; return 1; }
            ifconfig $nic 192.168.18.5 netmask 255.255.255.0
            ifconfig $nic up
            ethtool -L $nic combined 1
            ethtool -L $nic combined 4
            sleep $((RANDOM%3))
        }
      
        function on_exit()
        {
            local pid
            for pid in "${g_pids[@]}"; do
                kill -0 "$pid" &>/dev/null && kill "$pid" &>/dev/null
            done
            g_pids=()
        }
      
        trap "on_exit; exit" EXIT
      
        while :; do do_set_numvf ; done &
        g_pids+=($!)
        while :; do do_set_channel ; done &
        g_pids+=($!)
      
        wait
      
      Result:
      
      [ 4093.900222] ==================================================================
      [ 4093.900230] BUG: KASAN: use-after-free in free_netdev+0x308/0x390
      [ 4093.900232] Read of size 8 at addr ffff88b4dc145640 by task repro.sh/6699
      [ 4093.900233]
      [ 4093.900236] CPU: 10 PID: 6699 Comm: repro.sh Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
      [ 4093.900238] Hardware name: Powerleader PR2008AL/H12DSi-N6, BIOS 2.0 04/09/2021
      [ 4093.900239] Call Trace:
      [ 4093.900244]  dump_stack+0x71/0xab
      [ 4093.900249]  print_address_description+0x6b/0x290
      [ 4093.900251]  ? free_netdev+0x308/0x390
      [ 4093.900252]  kasan_report+0x14a/0x2b0
      [ 4093.900254]  free_netdev+0x308/0x390
      [ 4093.900261]  iavf_remove+0x825/0xd20 [iavf]
      [ 4093.900265]  pci_device_remove+0xa8/0x1f0
      [ 4093.900268]  device_release_driver_internal+0x1c6/0x460
      [ 4093.900271]  pci_stop_bus_device+0x101/0x150
      [ 4093.900273]  pci_stop_and_remove_bus_device+0xe/0x20
      [ 4093.900275]  pci_iov_remove_virtfn+0x187/0x420
      [ 4093.900277]  ? pci_iov_add_virtfn+0xe10/0xe10
      [ 4093.900278]  ? pci_get_subsys+0x90/0x90
      [ 4093.900280]  sriov_disable+0xed/0x3e0
      [ 4093.900282]  ? bus_find_device+0x12d/0x1a0
      [ 4093.900290]  i40e_free_vfs+0x754/0x1210 [i40e]
      [ 4093.900298]  ? i40e_reset_all_vfs+0x880/0x880 [i40e]
      [ 4093.900299]  ? pci_get_device+0x7c/0x90
      [ 4093.900300]  ? pci_get_subsys+0x90/0x90
      [ 4093.900306]  ? pci_vfs_assigned.part.7+0x144/0x210
      [ 4093.900309]  ? __mutex_lock_slowpath+0x10/0x10
      [ 4093.900315]  i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e]
      [ 4093.900318]  sriov_numvfs_store+0x214/0x290
      [ 4093.900320]  ? sriov_totalvfs_show+0x30/0x30
      [ 4093.900321]  ? __mutex_lock_slowpath+0x10/0x10
      [ 4093.900323]  ? __check_object_size+0x15a/0x350
      [ 4093.900326]  kernfs_fop_write+0x280/0x3f0
      [ 4093.900329]  vfs_write+0x145/0x440
      [ 4093.900330]  ksys_write+0xab/0x160
      [ 4093.900332]  ? __ia32_sys_read+0xb0/0xb0
      [ 4093.900334]  ? fput_many+0x1a/0x120
      [ 4093.900335]  ? filp_close+0xf0/0x130
      [ 4093.900338]  do_syscall_64+0xa0/0x370
      [ 4093.900339]  ? page_fault+0x8/0x30
      [ 4093.900341]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      [ 4093.900357] RIP: 0033:0x7f16ad4d22c0
      [ 4093.900359] Code: 73 01 c3 48 8b 0d d8 cb 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 24 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24
      [ 4093.900360] RSP: 002b:00007ffd6491b7f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 4093.900362] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f16ad4d22c0
      [ 4093.900363] RDX: 0000000000000002 RSI: 0000000001a41408 RDI: 0000000000000001
      [ 4093.900364] RBP: 0000000001a41408 R08: 00007f16ad7a1780 R09: 00007f16ae1f2700
      [ 4093.900364] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002
      [ 4093.900365] R13: 0000000000000001 R14: 00007f16ad7a0620 R15: 0000000000000001
      [ 4093.900367]
      [ 4093.900368] Allocated by task 820:
      [ 4093.900371]  kasan_kmalloc+0xa6/0xd0
      [ 4093.900373]  __kmalloc+0xfb/0x200
      [ 4093.900376]  iavf_init_interrupt_scheme+0x63b/0x1320 [iavf]
      [ 4093.900380]  iavf_watchdog_task+0x3d51/0x52c0 [iavf]
      [ 4093.900382]  process_one_work+0x56a/0x11f0
      [ 4093.900383]  worker_thread+0x8f/0xf40
      [ 4093.900384]  kthread+0x2a0/0x390
      [ 4093.900385]  ret_from_fork+0x1f/0x40
      [ 4093.900387]  0xffffffffffffffff
      [ 4093.900387]
      [ 4093.900388] Freed by task 6699:
      [ 4093.900390]  __kasan_slab_free+0x137/0x190
      [ 4093.900391]  kfree+0x8b/0x1b0
      [ 4093.900394]  iavf_free_q_vectors+0x11d/0x1a0 [iavf]
      [ 4093.900397]  iavf_remove+0x35a/0xd20 [iavf]
      [ 4093.900399]  pci_device_remove+0xa8/0x1f0
      [ 4093.900400]  device_release_driver_internal+0x1c6/0x460
      [ 4093.900401]  pci_stop_bus_device+0x101/0x150
      [ 4093.900402]  pci_stop_and_remove_bus_device+0xe/0x20
      [ 4093.900403]  pci_iov_remove_virtfn+0x187/0x420
      [ 4093.900404]  sriov_disable+0xed/0x3e0
      [ 4093.900409]  i40e_free_vfs+0x754/0x1210 [i40e]
      [ 4093.900415]  i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e]
      [ 4093.900416]  sriov_numvfs_store+0x214/0x290
      [ 4093.900417]  kernfs_fop_write+0x280/0x3f0
      [ 4093.900418]  vfs_write+0x145/0x440
      [ 4093.900419]  ksys_write+0xab/0x160
      [ 4093.900420]  do_syscall_64+0xa0/0x370
      [ 4093.900421]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      [ 4093.900422]  0xffffffffffffffff
      [ 4093.900422]
      [ 4093.900424] The buggy address belongs to the object at ffff88b4dc144200
                      which belongs to the cache kmalloc-8k of size 8192
      [ 4093.900425] The buggy address is located 5184 bytes inside of
                      8192-byte region [ffff88b4dc144200, ffff88b4dc146200)
      [ 4093.900425] The buggy address belongs to the page:
      [ 4093.900427] page:ffffea00d3705000 refcount:1 mapcount:0 mapping:ffff88bf04415c80 index:0x0 compound_mapcount: 0
      [ 4093.900430] flags: 0x10000000008100(slab|head)
      [ 4093.900433] raw: 0010000000008100 dead000000000100 dead000000000200 ffff88bf04415c80
      [ 4093.900434] raw: 0000000000000000 0000000000030003 00000001ffffffff 0000000000000000
      [ 4093.900434] page dumped because: kasan: bad access detected
      [ 4093.900435]
      [ 4093.900435] Memory state around the buggy address:
      [ 4093.900436]  ffff88b4dc145500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 4093.900437]  ffff88b4dc145580: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 4093.900438] >ffff88b4dc145600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 4093.900438]                                            ^
      [ 4093.900439]  ffff88b4dc145680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 4093.900440]  ffff88b4dc145700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      [ 4093.900440] ==================================================================
      
      Although patch #2 (of 2) avoids the issue triggered by this repro.sh,
      other risks remain: if num_active_queues is unexpectedly changed to
      less than the number of allocated q_vectors[], the mismatched
      netif_napi_add/del() can also cause a UAF.

      Since we call netif_napi_add() unconditionally for all allocated
      q_vectors in iavf_alloc_q_vectors(), fix it by making
      netif_napi_del() match netif_napi_add().
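
      The add/del pairing can be shown with a toy model. This is a sketch
      only: toy_vectors, toy_alloc_q_vectors() and toy_free_q_vectors() are
      hypothetical, and a simple counter stands in for dev->napi_list. The
      free path iterates over the number of vectors that were allocated and
      registered, not over num_active_queues.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_q_vector {
    int napi_registered;
};

struct toy_vectors {
    struct toy_q_vector *q_vectors;
    int num_allocated;      /* vectors allocated and registered */
    int num_active_queues;  /* may change independently of the above */
    int napi_list_len;      /* models entries on dev->napi_list */
};

/* Allocate n vectors and register every one of them. */
static int toy_alloc_q_vectors(struct toy_vectors *v, int n)
{
    assert(n > 0);
    v->q_vectors = calloc((size_t)n, sizeof(*v->q_vectors));
    if (!v->q_vectors)
        return -1;
    v->num_allocated = n;
    v->num_active_queues = n;
    v->napi_list_len = 0;
    for (int i = 0; i < n; i++) {
        v->q_vectors[i].napi_registered = 1;
        v->napi_list_len++;
    }
    return 0;
}

/* Unregister every allocated vector before freeing the array. */
static void toy_free_q_vectors(struct toy_vectors *v)
{
    for (int i = 0; i < v->num_allocated; i++) {
        if (v->q_vectors[i].napi_registered) {
            v->q_vectors[i].napi_registered = 0;
            v->napi_list_len--;
        }
    }
    free(v->q_vectors);
    v->q_vectors = NULL;
    v->num_allocated = 0;
}
```

      If the loop were bounded by num_active_queues instead, lowering that
      value after allocation would leave registered entries pointing into
      freed memory, which is the use-after-free reported above.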
      
      Fixes: 5eae00c5 ("i40evf: main driver core")
      Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
      Cc: Donglin Peng <pengdonglin@sangfor.com.cn>
      Cc: Huang Cun <huangcun@sangfor.com.cn>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Madhu Chittim <madhu.chittim@intel.com>
      Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
      Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8d781a9c
    • Andrzej Hajda's avatar
      drm/i915/perf: add sentinel to xehp_oa_b_counters · 21d92025
      Andrzej Hajda authored
      [ Upstream commit 785b3f66 ]
      
      Arrays passed to reg_in_range_table should end with an empty record.
      
      This patch fixes a KASAN-detected bug with the signature:
      BUG: KASAN: global-out-of-bounds in xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915]
      Read of size 4 at addr ffffffffa1555d90 by task perf/1518
      
      CPU: 4 PID: 1518 Comm: perf Tainted: G U 6.4.0-kasan_438-g3303d06107f3+ #1
      Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P DDR5 SODIMM SBS RVP, BIOS MTLPFWI1.R00.3223.D80.2305311348 05/31/2023
      Call Trace:
      <TASK>
      ...
      xehp_is_valid_b_counter_addr+0x2c7/0x350 [i915]
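
      The sentinel requirement can be sketched as follows. This is a
      simplified user-space illustration, not the i915 code: struct
      fake_range, fake_in_range_table() and the example addresses are all
      hypothetical. The lookup stops only when it reaches an all-zero record,
      so a table without one makes the loop read past the end of the array.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range record; an all-zero entry marks the end of a table. */
struct fake_range {
	uint32_t start;
	uint32_t end;
};

/* Walk the table until the empty record; there is no separate length. */
static bool fake_in_range_table(uint32_t addr, const struct fake_range *table)
{
	while (table->start || table->end) {
		if (addr >= table->start && addr <= table->end)
			return true;
		table++;
	}
	return false;
}

/* Example table with made-up addresses, closed by the empty record. */
static const struct fake_range fake_b_counters[] = {
	{ .start = 0x1000, .end = 0x10ff },
	{ .start = 0x2000, .end = 0x20ff },
	{ 0, 0 }, /* sentinel: omitting this causes an out-of-bounds read */
};
```

      A lookup for an address that matches no range walks every entry, which
      is why a missing sentinel shows up as a global-out-of-bounds read.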
      
      Fixes: 0fa9349d ("drm/i915/perf: complete programming whitelisting for XEHPSDV")
      Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
      Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
      Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20230711153410.1224997-1-andrzej.hajda@intel.com
      (cherry picked from commit 2f42c5af)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • r8169: fix ASPM-related problem for chip version 42 and 43 · 899057ac
      Heiner Kallweit authored
      [ Upstream commit 162d626f ]
      
      The referenced commit missed that for chip versions 42 and 43 ASPM
      remained disabled in the respective rtl_hw_start_...() routines.
      This resulted in the problems described in the referenced bug
      ticket. Therefore reinstate the previous logic.
      
      Fixes: 5fc3f6c9 ("r8169: consolidate disabling ASPM before EPHY access")
      Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217635
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: dsa: microchip: correct KSZ8795 static MAC table access · 024825df
      Tristram Ha authored
      [ Upstream commit 4bdf79d6 ]
      
      The KSZ8795 driver code was modified for use on KSZ8863/73, which has
      different register definitions.  Some of the new KSZ8795 register
      information is wrong compared to the previous code.
      
      KSZ8795 also behaves differently in that the STATIC_MAC_TABLE_USE_FID
      and STATIC_MAC_TABLE_FID bits are off by 1 when reading the MAC table
      compared to writing it.  To compensate, special code was added to shift
      the register value by 1 before applying those bits.  This is wrong when
      the code is running on KSZ8863, so this special code is only executed
      when KSZ8795 is detected.
      
      Fixes: 4b20a07e ("net: dsa: microchip: ksz8795: add support for ksz88xx chips")
      Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>
      Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: sched: cls_bpf: Undo tcf_bind_filter in case of an error · 6777dfaf
      Victor Nogueira authored
      [ Upstream commit 26a22194 ]
      
      If cls_bpf_offload errors out, we must also undo tcf_bind_filter that
      was done before the error.
      
      Fix that by calling tcf_unbind_filter in errout_parms.
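
      The bind/unbind pairing on the error path can be sketched with the
      usual goto-based unwinding. This is a user-space model under stated
      assumptions: struct fake_filter and the fake_* functions are
      hypothetical, and only the label name errout_parms is taken from the
      text above.

```c
/* Hypothetical filter state: bound is 1 while the filter is bound to a class. */
struct fake_filter {
	int bound;
};

static void fake_bind_filter(struct fake_filter *f)
{
	f->bound = 1;
}

static void fake_unbind_filter(struct fake_filter *f)
{
	f->bound = 0;
}

/* Stand-in for the offload step; fails on request. */
static int fake_offload(int should_fail)
{
	return should_fail ? -1 : 0;
}

/* Bind first, then offload; if offload fails, undo the bind before returning. */
static int fake_change_filter(struct fake_filter *f, int offload_fails)
{
	int ret;

	fake_bind_filter(f);

	ret = fake_offload(offload_fails);
	if (ret)
		goto errout_parms;

	return 0;

errout_parms:
	fake_unbind_filter(f);
	return ret;
}
```

      Without the unbind at the label, a failed call would return an error
      while leaving the filter bound.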
      
      Fixes: eadb4148 ("net: cls_bpf: add support for marking filters as hardware-only")
      Signed-off-by: Victor Nogueira <victor@mojatatu.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: sched: cls_u32: Undo refcount decrement in case update failed · cec095b3
      Victor Nogueira authored
      [ Upstream commit e8d3d78c ]
      
      In the case of an update, when TCA_U32_LINK is set, u32_set_parms will
      decrement the refcount of the ht_down (struct tc_u_hnode) pointer
      present in the older u32 filter which we are replacing. However, if
      u32_replace_hw_knode errors out, the update command fails and that
      ht_down pointer's refcount remains decremented. To fix that, when
      u32_replace_hw_knode fails, check if ht_down's refcount was decremented
      and undo the decrement.
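
      The refcount rollback can be modeled in user space as below. This is a
      simplified sketch, not the cls_u32 code: the fake_* types and
      fake_replace_filter() are hypothetical, and the model collapses the
      old and new filter into a single object for brevity.

```c
#include <stddef.h>

/* Hypothetical hash-node with a plain reference counter. */
struct fake_hnode {
	int refcnt;
};

/* Hypothetical filter node that may link down to a hash-node. */
struct fake_knode {
	struct fake_hnode *ht_down;
};

/*
 * Replace the link of a filter. When link_set is non-zero, the old link
 * loses a reference and the new one gains one. If the hardware step fails,
 * both changes are rolled back so the counters match the unchanged filter.
 */
static int fake_replace_filter(struct fake_knode *n, struct fake_hnode *new_link,
			       int link_set, int hw_fails)
{
	struct fake_hnode *ht_old = NULL;

	if (link_set) {
		ht_old = n->ht_down;
		if (new_link)
			new_link->refcnt++;
		if (ht_old)
			ht_old->refcnt--;
	}

	if (hw_fails) {
		/* Undo the refcount changes made above. */
		if (link_set) {
			if (new_link)
				new_link->refcnt--;
			if (ht_old)
				ht_old->refcnt++;
		}
		return -1;
	}

	if (link_set)
		n->ht_down = new_link;
	return 0;
}
```

      After a failed call the old link's counter is back at its original
      value, so a later delete of the untouched filter drops exactly the
      references it holds.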
      
      Fixes: d34e3e18 ("net: cls_u32: Add support for skip-sw flag to tc u32 classifier.")
      Signed-off-by: Victor Nogueira <victor@mojatatu.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: sched: cls_u32: Undo tcf_bind_filter if u32_replace_hw_knode · 025159ed
      Victor Nogueira authored
      [ Upstream commit 9cb36fae ]
      
      When u32_replace_hw_knode fails, we need to undo the tcf_bind_filter
      operation done at u32_set_parms.
      
      Fixes: d34e3e18 ("net: cls_u32: Add support for skip-sw flag to tc u32 classifier.")
      Signed-off-by: Victor Nogueira <victor@mojatatu.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: sched: cls_matchall: Undo tcf_bind_filter in case of failure after mall_set_parms · 1134ceab
      Victor Nogueira authored
      [ Upstream commit b3d0e048 ]
      
      If an error occurs after mall_set_parms has executed successfully, we
      must undo the tcf_bind_filter call it issues.
      
      Fix that by calling tcf_unbind_filter at the err_replace_hw_filter label.
      
      Fixes: ec2507d2 ("net/sched: cls_matchall: Fix error path")
      Signed-off-by: Victor Nogueira <victor@mojatatu.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Reviewed-by: Pedro Tammela <pctammela@mojatatu.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • regulator: da9063: fix null pointer deref with partial DT config · 04a025b1
      Martin Fuzzey authored
      [ Upstream commit 98e2dd5f ]
      
      When some of the da9063 regulators do not have corresponding DT nodes,
      a null pointer dereference occurs on boot because such regulators have
      no init_data, causing the pointers calculated in
      da9063_check_xvp_constraints() to be invalid.
      
      Do not dereference them in this case.
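
      A guard of this kind can be sketched as follows. This is a user-space
      illustration only: the fake_* structs and the min/max comparison are
      hypothetical and do not reproduce the driver's actual constraint
      checks. It shows returning early when init_data is absent instead of
      computing pointers from it.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down versions of the regulator config structures. */
struct fake_constraints {
	int min_uV;
	int max_uV;
};

struct fake_init_data {
	struct fake_constraints constraints;
};

struct fake_config {
	const struct fake_init_data *init_data; /* NULL when no DT node exists */
};

/* Return 0 if the constraints are acceptable or absent, -1 otherwise. */
static int fake_check_constraints(const struct fake_config *config)
{
	const struct fake_constraints *c;

	/* No DT node means no init_data: nothing to validate. */
	if (!config->init_data)
		return 0;

	c = &config->init_data->constraints;
	return (c->min_uV <= c->max_uV) ? 0 : -1;
}
```

      The key design point is ordering: the NULL test comes before any
      pointer is derived from init_data.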
      
      Fixes: b8717a80 ("regulator: da9063: implement setter for voltage monitoring")
      Signed-off-by: Martin Fuzzey <martin.fuzzey@flowbird.group>
      Link: https://lore.kernel.org/r/20230616143736.2946173-1-martin.fuzzey@flowbird.group
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ASoC: SOF: ipc3-dtrace: uninitialized data in dfsentry_trace_filter_write() · d9234856
      Dan Carpenter authored
      [ Upstream commit 469e2f28 ]
      
      This doesn't check how many bytes the simple_write_to_buffer() writes to
      the buffer.  The only thing that we know is that the first byte is
      initialized and the last byte of the buffer is set to NUL.  However
      the middle bytes could be uninitialized.
      
      There is no need to use simple_write_to_buffer().  This code does not
      support partial writes but instead passes "pos = 0" as the starting
      offset regardless of what the user passed as "*ppos".  Just use the
      copy_from_user() function and initialize the whole buffer.
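
      The "copy everything, then terminate" approach can be sketched in user
      space as below. This is an analogy, not the driver code: memcpy()
      stands in for copy_from_user(), malloc() for the kernel allocator, and
      fake_copy_filter_string() is a hypothetical name.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Copy exactly count bytes from src into a fresh buffer and NUL-terminate
 * it. Every byte of the buffer is written, so none is left uninitialized.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static char *fake_copy_filter_string(const char *src, size_t count)
{
	char *buf = malloc(count + 1);

	if (!buf)
		return NULL;

	memcpy(buf, src, count); /* stand-in for copy_from_user() */
	buf[count] = '\0';
	return buf;
}
```

      Because the copy always starts at offset 0 and covers the whole
      requested length, there is no partial-write case to account for.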
      
      Fixes: 671e0b90 ("ASoC: SOF: Clone the trace code to ipc3-dtrace as fw_tracing implementation")
      Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
      Link: https://lore.kernel.org/r/74148292-ce4d-4e01-a1a7-921e6767da14@moroto.mountain
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ice: prevent NULL pointer deref during reload · ca03b327
      Michal Swiatkowski authored
      [ Upstream commit b3e7b3a6 ]
      
      Calling ethtool during reload can lead to a NULL pointer dereference,
      because the VSI isn't configured for some time while the netdev is
      still alive.
      
      To fix it, add the rtnl lock around VSI deconfig and config. Set
      ::num_q_vectors to 0 after freeing and add a check for ::tx/rx_rings
      in ring-related ethtool ops.
      
      Add proper unroll of filters in ice_start_eth().
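
      The ring check mentioned above might look roughly like this in a
      user-space model. Everything here is hypothetical (the fake_* types,
      fake_get_ringparam(), and the choice to report zeros); it only
      illustrates testing the ring arrays before dereferencing them while
      the VSI is torn down.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ring and VSI layouts, reduced to what the example needs. */
struct fake_ring {
	uint16_t count;
};

struct fake_vsi {
	struct fake_ring **tx_rings; /* NULL while the VSI is deconfigured */
	struct fake_ring **rx_rings;
};

struct fake_ringparam {
	uint32_t tx_pending;
	uint32_t rx_pending;
};

/* Report ring sizes, or zeros if the rings do not exist at the moment. */
static void fake_get_ringparam(const struct fake_vsi *vsi,
			       struct fake_ringparam *out)
{
	out->tx_pending = 0;
	out->rx_pending = 0;

	/* During reload the arrays may be gone: do not dereference them. */
	if (!vsi->tx_rings || !vsi->rx_rings)
		return;

	out->tx_pending = vsi->tx_rings[0]->count;
	out->rx_pending = vsi->rx_rings[0]->count;
}
```

      A check like this only covers the read side; the commit message also
      relies on the rtnl lock to serialize reconfiguration against ethtool
      callers.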
      
      Reproduction:
      $ watch -n 0.1 -d 'ethtool -g enp24s0f0np0'
      $ devlink dev reload pci/0000:18:00.0 action driver_reinit
      
      Call trace before fix:
      [66303.926205] BUG: kernel NULL pointer dereference, address: 0000000000000000
      [66303.926259] #PF: supervisor read access in kernel mode
      [66303.926286] #PF: error_code(0x0000) - not-present page
      [66303.926311] PGD 0 P4D 0
      [66303.926332] Oops: 0000 [#1] PREEMPT SMP PTI
      [66303.926358] CPU: 4 PID: 933821 Comm: ethtool Kdump: loaded Tainted: G           OE      6.4.0-rc5+ #1
      [66303.926400] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.00.01.0014.070920180847 07/09/2018
      [66303.926446] RIP: 0010:ice_get_ringparam+0x22/0x50 [ice]
      [66303.926649] Code: 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 48 8b 87 c0 09 00 00 c7 46 04 e0 1f 00 00 c7 46 10 e0 1f 00 00 48 8b 50 20 <48> 8b 12 0f b7 52 3a 89 56 14 48 8b 40 28 48 8b 00 0f b7 40 58 48
      [66303.926722] RSP: 0018:ffffad40472f39c8 EFLAGS: 00010246
      [66303.926749] RAX: ffff98a8ada05828 RBX: ffff98a8c46dd060 RCX: ffffad40472f3b48
      [66303.926781] RDX: 0000000000000000 RSI: ffff98a8c46dd068 RDI: ffff98a8b23c4000
      [66303.926811] RBP: ffffad40472f3b48 R08: 00000000000337b0 R09: 0000000000000000
      [66303.926843] R10: 0000000000000001 R11: 0000000000000100 R12: ffff98a8b23c4000
      [66303.926874] R13: ffff98a8c46dd060 R14: 000000000000000f R15: ffffad40472f3a50
      [66303.926906] FS:  00007f6397966740(0000) GS:ffff98b390900000(0000) knlGS:0000000000000000
      [66303.926941] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [66303.926967] CR2: 0000000000000000 CR3: 000000011ac20002 CR4: 00000000007706e0
      [66303.926999] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [66303.927029] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [66303.927060] PKRU: 55555554
      [66303.927075] Call Trace:
      [66303.927094]  <TASK>
      [66303.927111]  ? __die+0x23/0x70
      [66303.927140]  ? page_fault_oops+0x171/0x4e0
      [66303.927176]  ? exc_page_fault+0x7f/0x180
      [66303.927209]  ? asm_exc_page_fault+0x26/0x30
      [66303.927244]  ? ice_get_ringparam+0x22/0x50 [ice]
      [66303.927433]  rings_prepare_data+0x62/0x80
      [66303.927469]  ethnl_default_doit+0xe2/0x350
      [66303.927501]  genl_family_rcv_msg_doit.isra.0+0xe3/0x140
      [66303.927538]  genl_rcv_msg+0x1b1/0x2c0
      [66303.927561]  ? __pfx_ethnl_default_doit+0x10/0x10
      [66303.927590]  ? __pfx_genl_rcv_msg+0x10/0x10
      [66303.927615]  netlink_rcv_skb+0x58/0x110
      [66303.927644]  genl_rcv+0x28/0x40
      [66303.927665]  netlink_unicast+0x19e/0x290
      [66303.927691]  netlink_sendmsg+0x254/0x4d0
      [66303.927717]  sock_sendmsg+0x93/0xa0
      [66303.927743]  __sys_sendto+0x126/0x170
      [66303.927780]  __x64_sys_sendto+0x24/0x30
      [66303.928593]  do_syscall_64+0x5d/0x90
      [66303.929370]  ? __count_memcg_events+0x60/0xa0
      [66303.930146]  ? count_memcg_events.constprop.0+0x1a/0x30
      [66303.930920]  ? handle_mm_fault+0x9e/0x350
      [66303.931688]  ? do_user_addr_fault+0x258/0x740
      [66303.932452]  ? exc_page_fault+0x7f/0x180
      [66303.933193]  entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      Fixes: 5b246e53 ("ice: split probe into smaller functions")
      Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
      Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ice: Unregister netdev and devlink_port only once · 9751240e
      Petr Oros authored
      [ Upstream commit 24a3298a ]
      
      Since commit 6624e780 ("ice: split ice_vsi_setup into smaller
      functions"), ice_vsi_release() does things twice. It unregisters the
      netdev, which is also unregistered in ice_deinit_eth().
      
      It also unregisters the devlink_port, which is likewise unregistered
      in ice_deinit_eth(). This double deregistration is hidden because
      devl_port_unregister() ignores the return value of xa_erase().
      
      [   68.642167] Call Trace:
      [   68.650385]  ice_devlink_destroy_pf_port+0xe/0x20 [ice]
      [   68.655656]  ice_vsi_release+0x445/0x690 [ice]
      [   68.660147]  ice_deinit+0x99/0x280 [ice]
      [   68.664117]  ice_remove+0x1b6/0x5c0 [ice]
      
      [  171.103841] Call Trace:
      [  171.109607]  ice_devlink_destroy_pf_port+0xf/0x20 [ice]
      [  171.114841]  ice_remove+0x158/0x270 [ice]
      [  171.118854]  pci_device_remove+0x3b/0xc0
      [  171.122779]  device_release_driver_internal+0xc7/0x170
      [  171.127912]  driver_detach+0x54/0x8c
      [  171.131491]  bus_remove_driver+0x77/0xd1
      [  171.135406]  pci_unregister_driver+0x2d/0xb0
      [  171.139670]  ice_module_exit+0xc/0x55f [ice]
      
      Fixes: 6624e780 ("ice: split ice_vsi_setup into smaller functions")
      Signed-off-by: Petr Oros <poros@redhat.com>
      Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • cifs: fix mid leak during reconnection after timeout threshold · 57d25e99
      Shyam Prasad N authored
      [ Upstream commit 69cba9d3 ]
      
      When the number of responses with status of STATUS_IO_TIMEOUT
      exceeds a specified threshold (NUM_STATUS_IO_TIMEOUT), we reconnect
      the connection. But we do not return the mid, or the credits
      returned for the mid, or reduce the number of in-flight requests.
      
      This bug could cause the server->in_flight count to go bad,
      and also cause a leak in the mids.
      
      This change moves the check to a few lines below where the
      response is decrypted, even if the response is read from the
      transform header. This way, the code for returning the mids
      can be reused.
      
      Also, the cifs_reconnect was reconnecting just the transport
      connection before. In case of multi-channel, this may not be
      what we want to do after several timeouts. Changed that to
      reconnect the session and the tree too.
      
      Also renamed NUM_STATUS_IO_TIMEOUT to a more appropriate name
      MAX_STATUS_IO_TIMEOUT.
      
      Fixes: 8e670f77 ("Handle STATUS_IO_TIMEOUT gracefully")
      Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>