Skip to content
  1. Dec 11, 2023
    • Vlad Buslov's avatar
      net/sched: act_ct: Take per-cb reference to tcf_ct_flow_table · 125f1c7f
      Vlad Buslov authored
      The referenced change added custom cleanup code to act_ct to delete any
      callbacks registered on the parent block when deleting the
      tcf_ct_flow_table instance. However, the underlying issue is that the
      drivers don't obtain the reference to the tcf_ct_flow_table instance when
      registering callbacks which means that not only driver callbacks may still
      be on the table when deleting it but also that the driver can still have
      pointers to its internal nf_flowtable and can use it concurrently which
      results either warning in netfilter[0] or use-after-free.
      
      Fix the issue by taking a reference to the underlying struct
      tcf_ct_flow_table instance when registering the callback and release the
      reference when unregistering. Expose new API required for such reference
      counting by adding two new callbacks to nf_flowtable_type and implementing
      them for act_ct flowtable_ct type. This fixes the issue by extending the
      lifetime of nf_flowtable until all users have unregistered.
      
      [0]:
      [106170.938634] ------------[ cut here ]------------
      [106170.939111] WARNING: CPU: 21 PID: 3688 at include/net/netfilter/nf_flow_table.h:262 mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
      [106170.940108] Modules linked in: act_ct nf_flow_table act_mirred act_skbedit act_tunnel_key vxlan cls_matchall nfnetlink_cttimeout act_gact cls_flower sch_ingress mlx5_vdpa vringh vhost_iotlb vdpa bonding openvswitch nsh rpcrdma rdma_ucm
      ib_iser libiscsi scsi_transport_iscsi ib_umad rdma_cm ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat br_netfilter rpcsec_gss_krb5 auth_rpcgss oid_regis
      try overlay mlx5_core
      [106170.943496] CPU: 21 PID: 3688 Comm: kworker/u48:0 Not tainted 6.6.0-rc7_for_upstream_min_debug_2023_11_01_13_02 #1
      [106170.944361] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      [106170.945292] Workqueue: mlx5e mlx5e_rep_neigh_update [mlx5_core]
      [106170.945846] RIP: 0010:mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
      [106170.946413] Code: 89 ef 48 83 05 71 a4 14 00 01 e8 f4 06 04 e1 48 83 05 6c a4 14 00 01 48 83 c4 28 5b 5d 41 5c 41 5d c3 48 83 05 d1 8b 14 00 01 <0f> 0b 48 83 05 d7 8b 14 00 01 e9 96 fe ff ff 48 83 05 a2 90 14 00
      [106170.947924] RSP: 0018:ffff88813ff0fcb8 EFLAGS: 00010202
      [106170.948397] RAX: 0000000000000000 RBX: ffff88811eabac40 RCX: ffff88811eabad48
      [106170.949040] RDX: ffff88811eab8000 RSI: ffffffffa02cd560 RDI: 0000000000000000
      [106170.949679] RBP: ffff88811eab8000 R08: 0000000000000001 R09: ffffffffa0229700
      [106170.950317] R10: ffff888103538fc0 R11: 0000000000000001 R12: ffff88811eabad58
      [106170.950969] R13: ffff888110c01c00 R14: ffff888106b40000 R15: 0000000000000000
      [106170.951616] FS:  0000000000000000(0000) GS:ffff88885fd40000(0000) knlGS:0000000000000000
      [106170.952329] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [106170.952834] CR2: 00007f1cefd28cb0 CR3: 000000012181b006 CR4: 0000000000370ea0
      [106170.953482] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [106170.954121] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [106170.954766] Call Trace:
      [106170.955057]  <TASK>
      [106170.955315]  ? __warn+0x79/0x120
      [106170.955648]  ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
      [106170.956172]  ? report_bug+0x17c/0x190
      [106170.956537]  ? handle_bug+0x3c/0x60
      [106170.956891]  ? exc_invalid_op+0x14/0x70
      [106170.957264]  ? asm_exc_invalid_op+0x16/0x20
      [106170.957666]  ? mlx5_del_flow_rules+0x10/0x310 [mlx5_core]
      [106170.958172]  ? mlx5_tc_ct_block_flow_offload_add+0x1240/0x1240 [mlx5_core]
      [106170.958788]  ? mlx5_tc_ct_del_ft_cb+0x267/0x2b0 [mlx5_core]
      [106170.959339]  ? mlx5_tc_ct_del_ft_cb+0xc6/0x2b0 [mlx5_core]
      [106170.959854]  ? mapping_remove+0x154/0x1d0 [mlx5_core]
      [106170.960342]  ? mlx5e_tc_action_miss_mapping_put+0x4f/0x80 [mlx5_core]
      [106170.960927]  mlx5_tc_ct_delete_flow+0x76/0xc0 [mlx5_core]
      [106170.961441]  mlx5_free_flow_attr_actions+0x13b/0x220 [mlx5_core]
      [106170.962001]  mlx5e_tc_del_fdb_flow+0x22c/0x3b0 [mlx5_core]
      [106170.962524]  mlx5e_tc_del_flow+0x95/0x3c0 [mlx5_core]
      [106170.963034]  mlx5e_flow_put+0x73/0xe0 [mlx5_core]
      [106170.963506]  mlx5e_put_flow_list+0x38/0x70 [mlx5_core]
      [106170.964002]  mlx5e_rep_update_flows+0xec/0x290 [mlx5_core]
      [106170.964525]  mlx5e_rep_neigh_update+0x1da/0x310 [mlx5_core]
      [106170.965056]  process_one_work+0x13a/0x2c0
      [106170.965443]  worker_thread+0x2e5/0x3f0
      [106170.965808]  ? rescuer_thread+0x410/0x410
      [106170.966192]  kthread+0xc6/0xf0
      [106170.966515]  ? kthread_complete_and_exit+0x20/0x20
      [106170.966970]  ret_from_fork+0x2d/0x50
      [106170.967332]  ? kthread_complete_and_exit+0x20/0x20
      [106170.967774]  ret_from_fork_asm+0x11/0x20
      [106170.970466]  </TASK>
      [106170.970726] ---[ end trace 0000000000000000 ]---
      
      Fixes: 77ac5e40
      
       ("net/sched: act_ct: remove and free nf_table callbacks")
      Signed-off-by: default avatarVlad Buslov <vladbu@nvidia.com>
      Reviewed-by: default avatarPaul Blakey <paulb@nvidia.com>
      Acked-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      125f1c7f
    • Zhipeng Lu's avatar
      octeontx2-af: fix a use-after-free in rvu_nix_register_reporters · 28a7cb04
      Zhipeng Lu authored
      The rvu_dl will be freed in rvu_nix_health_reporters_destroy(rvu_dl)
      after the create_workqueue fails, and after that free, the rvu_dl will
      be translate back through the following call chain:
      
      rvu_nix_health_reporters_destroy
        |-> rvu_nix_health_reporters_create
             |-> rvu_health_reporters_create
                   |-> rvu_register_dl (label err_dl_health)
      
      Finally. in the err_dl_health label, rvu_dl being freed again in
      rvu_health_reporters_destroy(rvu) by rvu_nix_health_reporters_destroy.
      In the second calls of rvu_nix_health_reporters_destroy, however,
      it uses rvu_dl->rvu_nix_health_reporter, which is already freed at
      the end of rvu_nix_health_reporters_destroy in the first call.
      
      So this patch prevents the first destroy by instantly returning -ENONMEN
      when create_workqueue fails. In addition, since the failure of
      create_workqueue is the only entrence of label err, it has been
      integrated into the error-handling path of create_workqueue.
      
      Fixes: 5ed66306
      
       ("octeontx2-af: Add devlink health reporters for NIX")
      Signed-off-by: default avatarZhipeng Lu <alexious@zju.edu.cn>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      28a7cb04
    • Radu Bulie's avatar
      net: fec: correct queue selection · 9fc95fe9
      Radu Bulie authored
      The old implementation extracted VLAN TCI info from the payload
      before the VLAN tag has been pushed in the payload.
      
      Another problem was that the VLAN TCI was extracted even if the
      packet did not have VLAN protocol header.
      
      This resulted in invalid VLAN TCI and as a consequence a random
      queue was computed.
      
      This patch fixes the above issues and use the VLAN TCI from the
      skb if it is present or VLAN TCI from payload if present. If no
      VLAN header is present queue 0 is selected.
      
      Fixes: 52c4a1a8
      
       ("net: fec: add ndo_select_queue to fix TX bandwidth fluctuations")
      Signed-off-by: default avatarRadu Bulie <radu-andrei.bulie@nxp.com>
      Signed-off-by: default avatarWei Fang <wei.fang@nxp.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9fc95fe9
  2. Dec 10, 2023
    • Pavel Begunkov's avatar
      io_uring/af_unix: disable sending io_uring over sockets · 69db702c
      Pavel Begunkov authored
      File reference cycles have caused lots of problems for io_uring
      in the past, and it still doesn't work exactly right and races with
      unix_stream_read_generic(). The safest fix would be to completely
      disallow sending io_uring files via sockets via SCM_RIGHT, so there
      are no possible cycles invloving registered files and thus rendering
      SCM accounting on the io_uring side unnecessary.
      
      Cc: stable@vger.kernel.org
      Fixes: 0091bfc8
      
       ("io_uring/af_unix: defer registered files gc to io_uring release")
      Reported-and-suggested-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarPavel Begunkov <asml.silence@gmail.com>
      Acked-by: default avatarJakub Kicinski <kuba@kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      69db702c
    • Chengfeng Ye's avatar
      atm: solos-pci: Fix potential deadlock on &tx_queue_lock · 15319a4e
      Chengfeng Ye authored
      As &card->tx_queue_lock is acquired under softirq context along the
      following call chain from solos_bh(), other acquisition of the same
      lock inside process context should disable at least bh to avoid double
      lock.
      
      <deadlock #2>
      pclose()
      --> spin_lock(&card->tx_queue_lock)
      <interrupt>
         --> solos_bh()
         --> fpga_tx()
         --> spin_lock(&card->tx_queue_lock)
      
      This flaw was found by an experimental static analysis tool I am
      developing for irq-related deadlock.
      
      To prevent the potential deadlock, the patch uses spin_lock_bh()
      on &card->tx_queue_lock under process context code consistently to
      prevent the possible deadlock scenario.
      
      Fixes: 213e85d3
      
       ("solos-pci: clean up pclose() function")
      Signed-off-by: default avatarChengfeng Ye <dg573847474@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      15319a4e
    • Chengfeng Ye's avatar
      atm: solos-pci: Fix potential deadlock on &cli_queue_lock · d5dba32b
      Chengfeng Ye authored
      As &card->cli_queue_lock is acquired under softirq context along the
      following call chain from solos_bh(), other acquisition of the same
      lock inside process context should disable at least bh to avoid double
      lock.
      
      <deadlock #1>
      console_show()
      --> spin_lock(&card->cli_queue_lock)
      <interrupt>
         --> solos_bh()
         --> spin_lock(&card->cli_queue_lock)
      
      This flaw was found by an experimental static analysis tool I am
      developing for irq-related deadlock.
      
      To prevent the potential deadlock, the patch uses spin_lock_bh()
      on the card->cli_queue_lock under process context code consistently
      to prevent the possible deadlock scenario.
      
      Fixes: 9c54004e
      
       ("atm: Driver for Solos PCI ADSL2+ card.")
      Signed-off-by: default avatarChengfeng Ye <dg573847474@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d5dba32b
  3. Dec 09, 2023
  4. Dec 08, 2023