Skip to content
  1. Nov 17, 2021
    • Raed Salem's avatar
      net/mlx5: E-Switch, return error if encap isn't supported · c4c31767
      Raed Salem authored
      On regular ConnectX HCAs getting encap mode isn't supported when the
      E-Switch is in NONE mode. Current code would return no error code when
      trying to get encap mode in such case which is wrong.
      
      Fix by returning error value to indicate failure to caller in such case.
      
      Fixes: 8e0aa4bc
      
       ("net/mlx5: E-switch, Protect eswitch mode changes")
      Signed-off-by: default avatarRaed Salem <raeds@nvidia.com>
      Reviewed-by: default avatarMark Bloch <mbloch@nvidia.com>
      Reviewed-by: default avatarMaor Dickman <maord@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      c4c31767
    • Maher Sanalla's avatar
      net/mlx5: Lag, update tracker when state change event received · ae396d85
      Maher Sanalla authored
      Currently, In NETDEV_CHANGELOWERSTATE/NETDEV_CHANGEUPPERSTATE events
      handling, tracking is not fully completed if the LAG device is not ready
      at the time the events occur. But, we must keep track of the upper and
      lower states after receiving the events because RoCE needs this info in
      mlx5_lag_get_roce_netdev() - in order to return the corresponding port
      that its running on. Returning the wrong (not most recent) port will lead
      to gids table being incorrect.
      
      For example: If during the attachment of a slave to the bond, the other
      non-attached port performs pci_reload, then the LAG device is not ready,
      but that should not result in dismissing attached slave tracker update
      automatically (which is performed in mlx5_handle_changelowerstate()), Since
      these events might not come later, which can lead to both bond ports
      having tx_enabled=0 - which is not a valid state of LAG bond.
      
      Fixes: 9b412cc3
      
       ("net/mlx5e: Add LAG warning if bond slave is not lag master")
      Signed-off-by: default avatarMaher Sanalla <msanalla@nvidia.com>
      Reviewed-by: default avatarMark Bloch <mbloch@nvidia.com>
      Reviewed-by: default avatarJianbo Liu <jianbol@nvidia.com>
      Reviewed-by: default avatarRoi Dayan <roid@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      ae396d85
    • Roi Dayan's avatar
      net/mlx5e: CT, Fix multiple allocations and memleak of mod acts · 806401c2
      Roi Dayan authored
      CT clear action offload adds additional mod hdr actions to the
      flow's original mod actions in order to clear the registers which
      hold ct_state.
      When such flow also includes encap action, a neigh update event
      can cause the driver to unoffload the flow and then reoffload it.
      
      Each time this happens, the ct clear handling adds that same set
      of mod hdr actions to reset ct_state until the max of mod hdr
      actions is reached.
      
      Also the driver never releases the allocated mod hdr actions and
      causing a memleak.
      
      Fix above two issues by moving CT clear mod acts allocation
      into the parsing actions phase and only use it when offloading the rule.
      The release of mod acts will be done in the normal flow_put().
      
       backtrace:
          [<000000007316e2f3>] krealloc+0x83/0xd0
          [<00000000ef157de1>] mlx5e_mod_hdr_alloc+0x147/0x300 [mlx5_core]
          [<00000000970ce4ae>] mlx5e_tc_match_to_reg_set_and_get_id+0xd7/0x240 [mlx5_core]
          [<0000000067c5fa17>] mlx5e_tc_match_to_reg_set+0xa/0x20 [mlx5_core]
          [<00000000d032eb98>] mlx5_tc_ct_entry_set_registers.isra.0+0x36/0xc0 [mlx5_core]
          [<00000000fd23b869>] mlx5_tc_ct_flow_offload+0x272/0x1f10 [mlx5_core]
          [<000000004fc24acc>] mlx5e_tc_offload_fdb_rules.part.0+0x150/0x620 [mlx5_core]
          [<00000000dc741c17>] mlx5e_tc_encap_flows_add+0x489/0x690 [mlx5_core]
          [<00000000e92e49d7>] mlx5e_rep_update_flows+0x6e4/0x9b0 [mlx5_core]
          [<00000000f60f5602>] mlx5e_rep_neigh_update+0x39a/0x5d0 [mlx5_core]
      
      Fixes: 1ef3018f
      
       ("net/mlx5e: CT: Support clear action")
      Signed-off-by: default avatarRoi Dayan <roid@nvidia.com>
      Reviewed-by: default avatarPaul Blakey <paulb@nvidia.com>
      Reviewed-by: default avatarMaor Dickman <maord@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      806401c2
    • Avihai Horon's avatar
      net/mlx5: Fix flow counters SF bulk query len · 38a54cae
      Avihai Horon authored
      When doing a flow counters bulk query, the number of counters to query
      must be aligned to 4. Current SF bulk query len is not aligned to 4,
      which leads to an error when trying to query more than 4 counters.
      
      Fix it by aligning SF bulk query len to 4.
      
      Fixes: 2fdeb4f4
      
       ("net/mlx5: Reduce flow counters bulk query buffer size for SFs")
      Signed-off-by: default avatarAvihai Horon <avihaih@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      38a54cae
    • Mark Bloch's avatar
      net/mlx5: E-Switch, rebuild lag only when needed · 2eb0cb31
      Mark Bloch authored
      A user can enable VFs without changing E-Switch mode, this can happen
      when a user moves straight to switchdev mode and only once in switchdev
      VFs are enabled via the sysfs interface.
      
      The cited commit assumed this isn't possible and exposed a single
      API function where the E-switch calls into the lag code, breaks the lag
      and prevents any other lag operations to take place until the
      E-switch update has ended.
      
      Breaking the hardware lag when it isn't needed can make it such that
      hardware lag can't be enabled again.
      
      In the sysfs call path check if the current E-Switch mode is NONE,
      in the context of the function it can only mean the E-Switch is moving
      out of NONE mode and the hardware lag should be disabled and enabled
      once the mode change has ended. If the mode isn't NONE it means
      VFs are about to be enabled and such operation doesn't require
      toggling the hardware lag.
      
      Fixes: cac1eb2c
      
       ("net/mlx5: Lag, properly lock eswitch if needed")
      Signed-off-by: default avatarMark Bloch <mbloch@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      2eb0cb31
    • Neta Ostrovsky's avatar
      net/mlx5: Update error handler for UCTX and UMEM · ba50cd94
      Neta Ostrovsky authored
      In the fast unload flow, the device state is set to internal error,
      which indicates that the driver started the destroy process.
      In this case, when a destroy command is being executed, it should return
      MLX5_CMD_STAT_OK.
      Fix MLX5_CMD_OP_DESTROY_UCTX and MLX5_CMD_OP_DESTROY_UMEM to return OK
      instead of EIO.
      
      This fixes a call trace in the umem release process -
      [ 2633.536695] Call Trace:
      [ 2633.537518]  ib_uverbs_remove_one+0xc3/0x140 [ib_uverbs]
      [ 2633.538596]  remove_client_context+0x8b/0xd0 [ib_core]
      [ 2633.539641]  disable_device+0x8c/0x130 [ib_core]
      [ 2633.540615]  __ib_unregister_device+0x35/0xa0 [ib_core]
      [ 2633.541640]  ib_unregister_device+0x21/0x30 [ib_core]
      [ 2633.542663]  __mlx5_ib_remove+0x38/0x90 [mlx5_ib]
      [ 2633.543640]  auxiliary_bus_remove+0x1e/0x30 [auxiliary]
      [ 2633.544661]  device_release_driver_internal+0x103/0x1f0
      [ 2633.545679]  bus_remove_device+0xf7/0x170
      [ 2633.546640]  device_del+0x181/0x410
      [ 2633.547606]  mlx5_rescan_drivers_locked.part.10+0x63/0x160 [mlx5_core]
      [ 2633.548777]  mlx5_unregister_device+0x27/0x40 [mlx5_core]
      [ 2633.549841]  mlx5_uninit_one+0x21/0xc0 [mlx5_core]
      [ 2633.550864]  remove_one+0x69/0xe0 [mlx5_core]
      [ 2633.551819]  pci_device_remove+0x3b/0xc0
      [ 2633.552731]  device_release_driver_internal+0x103/0x1f0
      [ 2633.553746]  unbind_store+0xf6/0x130
      [ 2633.554657]  kernfs_fop_write+0x116/0x190
      [ 2633.555567]  vfs_write+0xa5/0x1a0
      [ 2633.556407]  ksys_write+0x4f/0xb0
      [ 2633.557233]  do_syscall_64+0x5b/0x1a0
      [ 2633.558071]  entry_SYSCALL_64_after_hwframe+0x65/0xca
      [ 2633.559018] RIP: 0033:0x7f9977132648
      [ 2633.559821] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 55 6f 2d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
      [ 2633.562332] RSP: 002b:00007fffb1a83888 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [ 2633.563472] RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f9977132648
      [ 2633.564541] RDX: 000000000000000c RSI: 000055b90546e230 RDI: 0000000000000001
      [ 2633.565596] RBP: 000055b90546e230 R08: 00007f9977406860 R09: 00007f9977a54740
      [ 2633.566653] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f99774056e0
      [ 2633.567692] R13: 000000000000000c R14: 00007f9977400880 R15: 000000000000000c
      [ 2633.568725] ---[ end trace 10b4fe52945e544d ]---
      
      Fixes: 6a6fabbf
      
       ("net/mlx5: Update pci error handler entries and command translation")
      Signed-off-by: default avatarNeta Ostrovsky <netao@nvidia.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      ba50cd94
    • Yevgeny Kliteynik's avatar
      net/mlx5: DR, Fix check for unsupported fields in match param · 455832d4
      Yevgeny Kliteynik authored
      The existing loop doesn't cast the buffer while scanning it, which
      results in out-of-bounds read and failure to create the matcher.
      
      Fixes: 941f1979
      
       ("net/mlx5: DR, Add check for unsupported fields in match param")
      Signed-off-by: default avatarYevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      455832d4
    • Yevgeny Kliteynik's avatar
      net/mlx5: DR, Handle eswitch manager and uplink vports separately · 9091b821
      Yevgeny Kliteynik authored
      When querying eswitch manager vport capabilities as "other = 1",
      we encounter a FW compatibility issue with older FW versions.
      To maintain backward compatibility, eswitch manager vport should
      be queried as "other = 0" vport both for ECPF and non-ECPF cases.
      
      This patch fixes these queries and improves the code readability
      by handling eswitch manager and uplink vports separately, avoiding
      the excessive 'if' conditions. Also, uplink caps are stored similar
      to esw manager and not as part of xarray.
      
      Fixes: dd4acb2a
      
       ("net/mlx5: DR, Add missing query for vport 0")
      Signed-off-by: default avatarYevgeny Kliteynik <kliteyn@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      9091b821
    • Valentine Fatiev's avatar
      net/mlx5e: nullify cq->dbg pointer in mlx5_debug_cq_remove() · 76ded29d
      Valentine Fatiev authored
      Prior to this patch in case mlx5_core_destroy_cq() failed it proceeds
      to rest of destroy operations. mlx5_core_destroy_cq() could be called again
      by user and cause additional call of mlx5_debug_cq_remove().
      cq->dbg was not nullify in previous call and cause the crash.
      
      Fix it by nullify cq->dbg pointer after removal.
      
      Also proceed to destroy operations only if FW return 0
      for MLX5_CMD_OP_DESTROY_CQ command.
      
      general protection fault, probably for non-canonical address 0x2000300004058: 0000 [#1] SMP PTI
      CPU: 5 PID: 1228 Comm: python Not tainted 5.15.0-rc5_for_upstream_min_debug_2021_10_14_11_06 #1
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      RIP: 0010:lockref_get+0x1/0x60
      Code: 5d e9 53 ff ff ff 48 8d 7f 70 e8 0a 2e 48 00 c7 85 d0 00 00 00 02
      00 00 00 c6 45 70 00 fb 5d c3 c3 cc cc cc cc cc cc cc cc 53 <48> 8b 17
      48 89 fb 85 d2 75 3d 48 89 d0 bf 64 00 00 00 48 89 c1 48
      RSP: 0018:ffff888137dd7a38 EFLAGS: 00010206
      RAX: 0000000000000000 RBX: ffff888107d5f458 RCX: 00000000fffffffe
      RDX: 000000000002c2b0 RSI: ffffffff8155e2e0 RDI: 0002000300004058
      RBP: ffff888137dd7a88 R08: 0002000300004058 R09: ffff8881144a9f88
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881141d4000
      R13: ffff888137dd7c68 R14: ffff888137dd7d58 R15: ffff888137dd7cc0
      FS:  00007f4644f2a4c0(0000) GS:ffff8887a2d40000(0000)
      knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055b4500f4380 CR3: 0000000114f7a003 CR4: 0000000000170ea0
      Call Trace:
        simple_recursive_removal+0x33/0x2e0
        ? debugfs_remove+0x60/0x60
        debugfs_remove+0x40/0x60
        mlx5_debug_cq_remove+0x32/0x70 [mlx5_core]
        mlx5_core_destroy_cq+0x41/0x1d0 [mlx5_core]
        devx_obj_cleanup+0x151/0x330 [mlx5_ib]
        ? __pollwait+0xd0/0xd0
        ? xas_load+0x5/0x70
        ? xa_load+0x62/0xa0
        destroy_hw_idr_uobject+0x20/0x80 [ib_uverbs]
        uverbs_destroy_uobject+0x3b/0x360 [ib_uverbs]
        uobj_destroy+0x54/0xa0 [ib_uverbs]
        ib_uverbs_cmd_verbs+0xaf2/0x1160 [ib_uverbs]
        ? uverbs_finalize_object+0xd0/0xd0 [ib_uverbs]
        ib_uverbs_ioctl+0xc4/0x1b0 [ib_uverbs]
        __x64_sys_ioctl+0x3e4/0x8e0
      
      Fixes: 94b960b9
      
       ("net/mlx5e: Fix memory leak in mlx5_core_destroy_cq() error path")
      Signed-off-by: default avatarValentine Fatiev <valentinef@nvidia.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      76ded29d
    • Paul Blakey's avatar
      net/mlx5: E-Switch, Fix resetting of encap mode when entering switchdev · d7751d64
      Paul Blakey authored
      E-Switch encap mode is relevant only when in switchdev mode.
      The RDMA driver can query the encap configuration via
      mlx5_eswitch_get_encap_mode(). Make sure it returns the currently
      used mode and not the set one.
      
      This reverts the cited commit which reset the encap mode
      on entering switchdev and fixes the original issue properly.
      
      Fixes: 9a64144d
      
       ("net/mlx5: E-Switch, Fix default encap mode")
      Signed-off-by: default avatarPaul Blakey <paulb@nvidia.com>
      Reviewed-by: default avatarMark Bloch <mbloch@nvidia.com>
      Reviewed-by: default avatarMaor Dickman <maord@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      d7751d64
    • Vlad Buslov's avatar
      net/mlx5e: Wait for concurrent flow deletion during neigh/fib events · 362980ea
      Vlad Buslov authored
      Function mlx5e_take_tmp_flow() skips flows with zero reference count. This
      can cause syndrome 0x179e84 when the called from neigh or route update code
      and the skipped flow is not removed from the hardware by the time
      underlying encap/decap resource is deleted. Add new completion
      'del_hw_done' that is completed when flow is unoffloaded. This is safe to
      do because flow with reference count zero needs to be detached from
      encap/decap entry before its memory is deallocated, which requires taking
      the encap_tbl_lock mutex that is held by the event handlers code.
      
      Fixes: 8914add2
      
       ("net/mlx5e: Handle FIB events to update tunnel endpoint device")
      Signed-off-by: default avatarVlad Buslov <vladbu@nvidia.com>
      Reviewed-by: default avatarRoi Dayan <roid@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      362980ea
    • Tariq Toukan's avatar
      net/mlx5e: kTLS, Fix crash in RX resync flow · cc4a9cc0
      Tariq Toukan authored
      For the TLS RX resync flow, we maintain a list of TLS contexts
      that require some attention, to communicate their resync information
      to the HW.
      Here we fix list corruptions, by protecting the entries against
      movements coming from resync_handle_seq_match(), until their resync
      handling in napi is fully completed.
      
      Fixes: e9ce991b
      
       ("net/mlx5e: kTLS, Add resiliency to RX resync failures")
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Reviewed-by: default avatarMaxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@nvidia.com>
      cc4a9cc0
  2. Nov 16, 2021
  3. Nov 15, 2021