Skip to content
  1. Mar 27, 2020
  2. Mar 26, 2020
    • Parav Pandit's avatar
      net/mlx5: E-switch, Protect eswitch mode changes · 8e0aa4bc
      Parav Pandit authored
      
      
      Currently eswitch mode change is occurring from 2 different execution
      contexts as below.
      1. sriov sysfs enable/disable
      2. devlink eswitch set commands
      
      Both of them need to access eswitch related data structures in
      synchronized manner.
      Without any synchronization below race condition exist.
      
      SR-IOV enable/disable with devlink eswitch mode change:
      
                cpu-0                         cpu-1
                -----                         -----
      mlx5_device_disable_sriov()        mlx5_devlink_eswitch_mode_set()
        mlx5_eswitch_disable()             esw_offloads_stop()
          esw_offloads_disable()             mlx5_eswitch_disable()
                                               esw_offloads_disable()
      
      Hence, they are synchronized using a new mode_lock.
      eswitch's state_lock is not used as it can lead to a deadlock scenario
      below and state_lock is only for vport and fdb exclusive access.
      
      ip link set vf <param>
        netlink rcv_msg() - Lock A
          rtnl_lock
          vfinfo()
            esw->state_lock() - Lock B
      devlink eswitch_set
         devlink_mutex
           esw->state_lock() - Lock B
             attach_netdev()
               register_netdev()
                 rtnl_lock - Lock A
      
      Alternatives considered:
      1. Acquiring rtnl lock before taking esw->state_lock to follow similar
      locking sequence as ip link flow during eswitch mode set.
      rtnl lock is not good idea for two reasons.
      (a) Holding rtnl lock for several hundred device commands is not good
          idea.
      (b) It leads to below and more similar deadlocks.
      
      devlink eswitch_set
         devlink_mutex
           rtnl_lock - Lock A
             esw->state_lock() - Lock B
               eswitch_disable()
                 reload()
                   ib_register_device()
                     ib_cache_setup_one()
                       rtnl_lock()
      
      2. Exporting devlink lock may lead to undesired use of it in vendor
      driver(s) in future.
      
      3. Unloading representors outside of the mode_lock requires
      serialization with other process trying to enable the eswitch.
      
      4. Differing the representors life cycle to a different workqueue
      requires synchronization with func_change_handler workqueue.
      
      Reviewed-by: default avatarRoi Dayan <roid@mellanox.com>
      Reviewed-by: default avatarBodong Wang <bodong@mellanox.com>
      Reviewed-by: default avatarMark Bloch <markb@mellanox.com>
      Signed-off-by: default avatarParav Pandit <parav@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      8e0aa4bc