  1. Jul 27, 2021
    • Revert "net: dsa: Allow drivers to filter packets they can decode source port from" · edac6f63
      Vladimir Oltean authored
      This reverts commit cc1939e4.
      
      Currently 2 classes of DSA drivers are able to send/receive packets
      directly through the DSA master:
      - drivers with DSA_TAG_PROTO_NONE
      - sja1105
      
      Now that sja1105 has gained the ability to perform traffic termination
      even under the tricky case (VLAN-aware bridge), and that is much more
      functional (we can perform VLAN-aware bridging with foreign interfaces),
      there is no reason to keep this code in the receive path of the network
      core. So delete it.
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: add bridge TX data plane offload based on tag_8021q · b6ad86e6
      Vladimir Oltean authored
      
      
      The main desire for having this feature in sja1105 is to support network
      stack termination for traffic coming from a VLAN-aware bridge.
      
      For sja1105, offloading the bridge data plane means sending packets
      as-is, with the proper VLAN tag, to the chip. The chip will look up its
      FDB and forward them to the correct destination port.
      
      But we support bridge data plane offload even for VLAN-unaware bridges,
      and the implementation there is different. In fact, VLAN-unaware
      bridging is governed by tag_8021q, so it makes sense to have the
      .bridge_fwd_offload_add() implementation fully within tag_8021q.
      The key difference is that we only support 1 VLAN-aware bridge, but we
      support multiple VLAN-unaware bridges. So we need to make sure that the
      forwarding domain is not crossed by packets injected from the stack.
      
      For this, we introduce the concept of a tag_8021q TX VLAN for bridge
      forwarding offload. As opposed to the regular TX VLANs which contain
      only 2 ports (the user port and the CPU port), a bridge data plane TX
      VLAN is "multicast" (or "imprecise"): it contains all the ports that are
      part of a certain bridge, and the hardware will select where the packet
      goes within this "imprecise" forwarding domain.
      
      Each VLAN-unaware bridge has its own "imprecise" TX VLAN, so we make use
      of the unique "bridge_num" provided by DSA for the data plane offload.
      We use the same 3 bits from the tag_8021q VLAN ID format to encode this
      bridge number.
      
      Note that these 3 bit positions have been used before for sub-VLANs in
      best-effort VLAN filtering mode. The difference is that for best-effort,
      the sub-VLANs were only valid on RX (and it was documented that the
      sub-VLAN field needed to be transmitted as zero). Whereas for the bridge
      data plane offload, these 3 bits are only valid on TX.
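
A minimal sketch of how a 3-bit bridge number could share a 12-bit VLAN ID with a direction field. The bit positions, the macro names and both function names are assumptions made for this example; the commit message only states that 3 bits of the tag_8021q VLAN ID carry the bridge number.

```c
#include <stdint.h>

/*
 * Hypothetical VID layout, for illustration only:
 *   bits 11:10  direction (2 = TX)
 *   bits  9:7   bridge number (3 bits, meaningful on TX only)
 *   bits  6:0   unused in this sketch
 */
#define EX_DIR_SHIFT	10
#define EX_DIR_TX	2u
#define EX_BRIDGE_SHIFT	7
#define EX_BRIDGE_MASK	0x7u

/* Returns the "imprecise" TX VID for a bridge, or -1 if it does not fit. */
static int ex_bridge_tx_vid(unsigned int bridge_num)
{
	if (bridge_num > EX_BRIDGE_MASK)
		return -1;

	return (int)((EX_DIR_TX << EX_DIR_SHIFT) |
		     (bridge_num << EX_BRIDGE_SHIFT));
}

/* Recovers the bridge number from a VID built by ex_bridge_tx_vid(). */
static unsigned int ex_vid_to_bridge_num(uint16_t vid)
{
	return (vid >> EX_BRIDGE_SHIFT) & EX_BRIDGE_MASK;
}
```

With only 3 bits available, at most 8 distinct bridge numbers fit, which is why the sketch rejects larger values instead of silently wrapping them.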
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: add support for imprecise RX · 884be12f
      Vladimir Oltean authored
      
      
      This is already common knowledge by now, but the sja1105 does not have
      hardware support for DSA tagging for data plane packets, and tag_8021q
      sets up a unique pvid per port, transmitted as VLAN-tagged towards the
      CPU, for the source port to be decoded nonetheless.
      
      When the port is part of a VLAN-aware bridge, the pvid committed to
      hardware is taken from the bridge and not from tag_8021q, so we need to
      work with that the best we can.
      
      Configure the switches to send all packets to the CPU as VLAN-tagged
      (even ones that were originally untagged on the wire) and make use of
      dsa_untag_bridge_pvid() to get rid of it before we send those packets up
      the network stack.
      
      With the classified VLAN used by hardware known to the tagger, we first
      peek at the VID in an attempt to figure out if the packet was received
      from a VLAN-unaware port (standalone or under a VLAN-unaware bridge),
      in which case we can continue to call dsa_8021q_rcv(). If that is not
      the case, the packet probably came from a VLAN-aware bridge. So we call
      the DSA helper that finds for us a "designated bridge port" - one that
      is a member of the VLAN ID from the packet, and is in the proper STP
      state - basically these are all checks performed by br_handle_frame() in
      the software RX data path.
      
      The bridge will accept the packet as valid even if the source port was
      maybe wrong. So it will maybe learn the MAC SA of the packet on the
      wrong port, and its software FDB will be out of sync with the hardware
      FDB. So replies towards this same MAC DA will not work, because the
      bridge will send towards a different netdev.
      
      This is where the bridge data plane offload ("imprecise TX") added by
      the next patch comes in handy. The software FDB is wrong, true, but the
      hardware FDB isn't, and by offloading the bridge forwarding plane we
      have a chance to right a wrong, and have the hardware look up the FDB
      for us for the reply packet. So it all cancels out.
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: deny more than one VLAN-aware bridge · 19fa937a
      Vladimir Oltean authored
      
      
      With tag_sja1105.c's only ability being to perform an imprecise RX
      procedure and identify whether a packet comes from a VLAN-aware bridge
      or not, we have no way to determine whether a packet with VLAN ID 5
      comes from, say, br0 or br1. Actually we could, but it would mean that
      we need to restrict all VLANs from br0 to be different from all VLANs
      from br1, and this includes the default_pvid, which makes a setup with 2
      VLAN-aware bridges highly impractical.
      
      The fact of the matter is that this isn't even that big of a practical
      limitation, since even with a single VLAN-aware bridge we can pretty
      much enforce forwarding isolation based on the VLAN port membership.
      
      So in the end, tell the user that they need to model their setup using a
      single VLAN-aware bridge.
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: deny 8021q uppers on ports · 4fbc08bd
      Vladimir Oltean authored
      
      
      Now that best-effort VLAN filtering is gone and we are left with
      imprecise RX and imprecise TX in VLAN-aware mode, where the tagger
      just guesses the source port based on plausibility of the VLAN ID, 8021q
      uppers installed on top of a standalone port, while other ports of that
      switch are under a VLAN-aware bridge, don't quite "just work".
      
      In fact it could be possible to restrict the VLAN IDs used by the 8021q
      uppers to not be shared with VLAN IDs used by that VLAN-aware bridge,
      but then the tagger needs to be patched to search for 8021q uppers too,
      not just for the "designated bridge port" which will be introduced in a
      later patch.
      
      I haven't given a possible implementation full thought; it seems maybe
      possible but not worth the effort right now. The only certain thing is
      that currently the tagger won't be able to figure out the source port
      for these packets because they will come with the VLAN ID of the 8021q
      upper and are no longer retagged to a tag_8021q sub-VLAN like the
      best-effort VLAN filtering code used to do. So just deny these for the
      moment.
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: delete vlan delta save/restore logic · 6dfd23d3
      Vladimir Oltean authored
      With the best_effort_vlan_filtering mode now gone, the driver does not
      have 3 operating modes anymore (VLAN-unaware, VLAN-aware and best effort),
      but only 2.
      
      The idea is that we will gain support for network stack I/O through a
      VLAN-aware bridge, using the data plane offload framework (imprecise RX,
      imprecise TX). So the VLAN-aware use case will be more functional.
      
      But standalone ports that are part of the same switch when some other
      ports are under a VLAN-aware bridge should work too. Termination on
      those should work through the tag_8021q RX VLAN and TX VLAN.
      
      This was not possible using the old logic, because:
      - in VLAN-unaware mode, only the tag_8021q VLANs were committed to hw
      - in VLAN-aware mode, only the bridge VLANs were committed to hw
      - in best-effort VLAN mode, both the tag_8021q and bridge VLANs were
        committed to hw
      
      The strategy for the new VLAN-aware mode is to allow the bridge and the
      tag_8021q VLANs to coexist in the VLAN table at the same time.
      
      [ yes, we need to make sure that the bridge cannot install a tag_8021q
        VLAN, but ]
      
      This means that the save/restore logic introduced by commit ec5ae610
      ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
      does not serve a purpose any longer. We can delete it and restore the
      old code that simply adds a VLAN to the VLAN table and calls it a day.
      
      Note that we keep the sja1105_commit_pvid() function from those days,
      but adapt it slightly. Ports that are under a VLAN-aware bridge use the
      bridge's pvid, ports that are standalone or under a VLAN-unaware bridge
      use the tag_8021q pvid, for local termination or VLAN-unaware forwarding.
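
The pvid rule from the paragraph above can be sketched as a small decision function. This is a simplified illustration and not the driver's sja1105_commit_pvid(); the struct and its field names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-port state, for illustration only. */
struct ex_pvid_port {
	bool bridged;			/* port is under some bridge */
	bool bridge_vlan_aware;		/* that bridge has vlan_filtering on */
	uint16_t bridge_pvid;		/* pvid configured by the bridge */
	uint16_t tag_8021q_pvid;	/* unique per-port pvid from tag_8021q */
};

/* Picks the pvid that should be committed to hardware for this port. */
static uint16_t ex_port_pvid(const struct ex_pvid_port *p)
{
	/* Only ports under a VLAN-aware bridge follow the bridge's pvid. */
	if (p->bridged && p->bridge_vlan_aware)
		return p->bridge_pvid;

	/* Standalone ports and VLAN-unaware bridge ports use tag_8021q. */
	return p->tag_8021q_pvid;
}
```

Under this model, toggling vlan_filtering changes only the outcome of this selection; the VLAN table contents stay as they are.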
      
      Now, when the vlan_filtering property is toggled for the bridge, the
      pvid of the ports beneath it is the only thing that's changing; we no
      longer delete some VLANs and restore others.
      
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: sja1105: remove redundant re-assignment of pointer table · d63f8877
      Colin Ian King authored
      
      
      The pointer table is being re-assigned with a value that is never
      read. The assignment is redundant and can be removed.
      
      Addresses-Coverity: ("Unused value")
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: add a helper for retrieving port VLANs from the data path · ee80dd2e
      Vladimir Oltean authored
      
      
      Introduce a brother of br_vlan_get_info() which is protected by the RCU
      mechanism, as opposed to br_vlan_get_info() which relies on taking the
      write-side rtnl_mutex.
      
      This is needed for drivers which need to find out whether a bridge port
      has a VLAN configured or not. For example, certain DSA switches might
      not offer complete source port identification to the CPU on RX, just the
      VLAN in which the packet was received. Based on this VLAN, we cannot set
      an accurate skb->dev ingress port, but at least we can configure one
      that behaves the same as the correct one would (this is possible because
      DSA sets skb->offload_fwd_mark = 1).
      
      When we look at the bridge RX handler (br_handle_frame), we see that
      what matters regarding skb->dev is the VLAN ID and the port STP state.
      So we need to select an skb->dev that has the same bridge VLAN as the
      packet we're receiving, and is in the LEARNING or FORWARDING STP state.
      The latter is easy, but for the former, we should somehow keep a shadow
      list of the bridge VLANs on each port, and a lookup table between VLAN
      ID and the 'designated port for imprecise RX'. That is rather
      complicated to keep in sync properly (the designated port per VLAN needs
      to be updated on the addition and removal of a VLAN, as well as on the
      join/leave events of the bridge on that port).
      
      So, to avoid all that complexity, let's just iterate through our finite
      number of ports and ask the bridge, for each packet: "do you have this
      VLAN configured on this port?".
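
A rough sketch of the per-packet scan described above: walk the ports and pick the first one that has the packet's VLAN configured and is in the LEARNING or FORWARDING state. The types and function names are hypothetical, and the per-port VLAN array stands in for the question that would really be put to the bridge.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum ex_stp_state {
	EX_STP_DISABLED,
	EX_STP_BLOCKING,
	EX_STP_LISTENING,
	EX_STP_LEARNING,
	EX_STP_FORWARDING,
};

/* Hypothetical port model; the VLAN list stands in for the bridge's answer. */
struct ex_rx_port {
	enum ex_stp_state stp_state;
	const uint16_t *vlans;
	size_t num_vlans;
};

/* "Do you have this VLAN configured on this port?" */
static bool ex_port_has_vlan(const struct ex_rx_port *p, uint16_t vid)
{
	for (size_t i = 0; i < p->num_vlans; i++)
		if (p->vlans[i] == vid)
			return true;
	return false;
}

/* Returns the index of a plausible ingress port for vid, or -1 if none. */
static int ex_find_designated_port(const struct ex_rx_port *ports,
				   size_t num_ports, uint16_t vid)
{
	for (size_t i = 0; i < num_ports; i++) {
		const struct ex_rx_port *p = &ports[i];

		if (p->stp_state != EX_STP_LEARNING &&
		    p->stp_state != EX_STP_FORWARDING)
			continue;
		if (ex_port_has_vlan(p, vid))
			return (int)i;
	}
	return -1;
}
```

The scan costs one pass over the ports per packet, which is the trade the commit message accepts in exchange for not maintaining a shadow VLAN list.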
      
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
      Cc: Ido Schimmel <idosch@nvidia.com>
      Cc: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: bridge: update BROPT_VLAN_ENABLED before notifying switchdev in br_vlan_filter_toggle · f7cdb3ec
      Vladimir Oltean authored
      
      
      SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING is notified by the bridge from
      two places:
      - nbp_vlan_init(), during bridge port creation
      - br_vlan_filter_toggle(), during a netlink/sysfs/ioctl change requested
        by user space
      
      If a switchdev driver uses br_vlan_enabled(br_dev) inside its handler
      for the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING attribute notifier,
      different things will be seen depending on whether the bridge calls from
      the first path or the second:
      - in nbp_vlan_init(), br_vlan_enabled() reflects the current state of
        the bridge
      - in br_vlan_filter_toggle(), br_vlan_enabled() reflects the past state
        of the bridge
      
      This can lead in some cases to complications in driver implementation,
      which can be avoided if these could reliably use br_vlan_enabled().
      
      Nothing seems to depend on this behavior, and it seems overall more
      straightforward for br_vlan_enabled() to return the proper value even
      during the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING notifier, so
      temporarily enable the bridge option, then revert it if the switchdev
      notifier failed.
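
The ordering described above (update the option, notify, revert on failure) can be sketched as follows. This is a simplified model with hypothetical names, not the bridge code itself; the two handlers at the end are example drivers added only to exercise it.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bridge and its VLAN filtering option. */
struct ex_bridge {
	bool vlan_enabled;
};

/* Hypothetical stand-in for a driver's attribute notifier handler. */
typedef int (*ex_notifier_fn)(const struct ex_bridge *br, bool requested);

static int ex_vlan_filter_toggle(struct ex_bridge *br, bool val,
				 ex_notifier_fn notify)
{
	bool old = br->vlan_enabled;
	int err;

	if (old == val)
		return 0;

	/* Update first, so the handler observes the new state. */
	br->vlan_enabled = val;

	err = notify(br, val);
	if (err) {
		/* The driver refused: restore the previous state. */
		br->vlan_enabled = old;
		return err;
	}
	return 0;
}

/* Example handler: succeeds only if it already sees the requested state. */
static int ex_notify_check_state(const struct ex_bridge *br, bool requested)
{
	return br->vlan_enabled == requested ? 0 : -1;
}

/* Example handler: always refuses the change. */
static int ex_notify_reject(const struct ex_bridge *br, bool requested)
{
	(void)br;
	(void)requested;
	return -1;
}
```

With the update placed after the notification instead, ex_notify_check_state() would see the old value, which is the inconsistency the commit removes.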
      
      Cc: Roopa Prabhu <roopa@nvidia.com>
      Cc: Nikolay Aleksandrov <nikolay@nvidia.com>
      Cc: Ido Schimmel <idosch@nvidia.com>
      Cc: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-updates-2021-07-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 9bff6684
      David S. Miller authored
      
      
      mlx5-updates-2021-07-24
      
      This series aims to reduce coupling in mlx5e, particularly between RX
      resources (TIRs, RQTs) and numerous code units that use them.
      
      This refactoring is required for upcoming features, ADQ and TX lag
      hashing.
      
      The issue with the current code is that TIRs and RQTs are unmanaged,
      different places all over the driver create, destroy, track and
      configure them, often in an uncoordinated way. The responsibilities of
      different units become vague, leading to a lot of hidden dependencies
      between numerous units and tight coupling between them, which is prone
      to bugs and hard to maintain.
      
      The result of this refactoring is:
      
      1. Creating a manager for RX resources, that controls their lifecycle
      and provides a clear API, which restricts the set of actions that other
      units can do.
      
      2. Using an object-oriented approach for TIRs, RQTs and RX resource
      manager (struct mlx5e_rx_res).
      
      3. Fixing a few bugs and misbehaviors found during the refactoring.
      
      4. Reducing the amount of dependencies, removing hidden dependencies,
      making them one-directional and organizing the code in clear abstraction
      layers.
      
      5. Explicitly exposing the remaining weird dependencies.
      
      6. Simplifying and organizing code that creates and modifies TIRs and
      RQTs.
      
      Saeed Mahameed says:
      
      ====================
      mlx5 updates 2021-07-24
      
      This series provides some refactoring to mlx5e RX resource management,
      it is required for upcoming ADQ and TX lag hashing features.
      
      The first two patches in this series:
        net/mlx5e: Prohibit inner indir TIRs in IPoIB
        net/mlx5e: Block LRO if firmware asks for tunneled LRO
      were supposed to go to net, but due to dependency and timing they were
      included here.
      I would appreciate it if you'd apply them to net and mark for -stable.
      
      For more information please see tag log below.
      
      Please pull and let me know if there is any problem.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'linux-can-next-for-5.15-20210725' of... · d20e5880
      David S. Miller authored
      
      Merge tag 'linux-can-next-for-5.15-20210725' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
      
      linux-can-next-for-5.15-20210725
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can-next 2021-07-25
      
      this is a pull request of 46 patches for net-next/master.
      
      The first 6 patches target the CAN J1939 protocol. One is from
      gushengxian, fixing a grammatical error; 5 are by me: they fix a
      checkpatch warning, make use of the fallthrough pseudo-keyword, and
      use consistent variable naming.
      
      The next 3 patches target the rx-offload helper, are by me and improve
      the performance and fix the local softirq work pending error, when
      napi_schedule() is called from threaded IRQ context.
      
      The next 3 patches, by Vincent Mailhol and me, update the CAN
      bittiming and transmitter delay compensation: the documentation for
      struct can_tdc is fixed, data_bittiming is cleared if FD mode is
      turned off, and a redundant check is removed.
      
      Followed by 4 patches targeting the m_can driver. Faiz Abbas's patches
      add support for CAN PHY via the generic phy subsystem. Yang Yingliang
      converts the driver to use devm_platform_ioremap_resource_byname().
      And a patch by me which removes the unused support for custom bit
      timing.
      
      Andy Shevchenko contributes 2 patches for the mcp251xfd driver to
      prepare the driver for ACPI support. A patch by me adds support for
      shared IRQ handlers.
      
      Zhen Lei contributes 3 patches to convert the esd_usb2, janz-ican3 and
      the at91_can driver to make use of the DEVICE_ATTR_RO/RW() macros.
      
      The next 8 patches are by Peng Li and provide general cleanups for the
      at91_can driver.
      
      The next 7 patches target the peak driver. First 2 cleanup patches by
      me for the peak_pci driver, followed by Stephane Grosjean's patch to
      print the name and firmware version of the detected hardware. The
      peak_usb driver gets a cleanup patch, loopback and one-shot mode and
      an upgrading of the bus state change handling in Stephane Grosjean's
      patches.
      
      Vincent Mailhol provides 6 cleanup patches for the etas_es58x driver.
      
      In the last 3 patches Angelo Dureghello adds support for the mcf5441x
      SoC to the flexcan driver.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Use the new TIR API for kTLS · 09f83569
      Maxim Mikityanskiy authored
      
      
      One of the previous commits introduced a dedicated object for a TIR.
      kTLS code creates a TIR per connection using the low-level mlx5_core
      API. This commit converts it to the new mlx5e_tir API.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Move management of indir traffic types to rx_res · 65d6b6e5
      Maxim Mikityanskiy authored
      
      
      This commit moves the responsibility of keeping the RSS configuration
      for different traffic types to en/rx_res.{c,h}, hiding the
      implementation details behind the new getters, and abandons all usage of
      struct mlx5e_tirc_config, which is no longer useful and superseded by
      struct mlx5e_rss_params_traffic_type.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Convert TIR to a dedicated object · a6696735
      Maxim Mikityanskiy authored
      
      
      Code related to TIR is now encapsulated into a dedicated object and put
      into new files en/tir.{c,h}. All usages are converted.
      
      The Builder pattern is used to initialize a TIR. It allows creating a
      multitude of different configurations, turning on and off some specific
      features in different combinations, without having long parameter lists,
      initializers per usage and repeating code in initializers.
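
For readers unfamiliar with the Builder pattern in C, here is a minimal sketch of the idea: a small builder object accumulates optional features through setters, and one create step consumes it. Every name below is hypothetical and chosen for illustration; none of it is the mlx5e_tir API.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical feature flags, for illustration only. */
#define EX_TIR_F_INDIRECT	(1u << 0)
#define EX_TIR_F_LRO		(1u << 1)
#define EX_TIR_F_TUNNELED	(1u << 2)

struct ex_tir_builder {
	uint32_t rqtn;	/* which RQT the TIR should point to */
	uint32_t flags;	/* optional features turned on so far */
};

struct ex_tir {
	uint32_t rqtn;
	uint32_t flags;
};

/* Starts from a clean state with only the mandatory parameter set. */
static void ex_tir_builder_init(struct ex_tir_builder *b, uint32_t rqtn)
{
	memset(b, 0, sizeof(*b));
	b->rqtn = rqtn;
}

/* Each setter turns on one feature, so callers combine them freely. */
static void ex_tir_builder_set_indirect(struct ex_tir_builder *b)
{
	b->flags |= EX_TIR_F_INDIRECT;
}

static void ex_tir_builder_set_lro(struct ex_tir_builder *b)
{
	b->flags |= EX_TIR_F_LRO;
}

static void ex_tir_builder_set_tunneled(struct ex_tir_builder *b)
{
	b->flags |= EX_TIR_F_TUNNELED;
}

/* The single create step consumes whatever was configured. */
static struct ex_tir ex_tir_create(const struct ex_tir_builder *b)
{
	struct ex_tir tir = { .rqtn = b->rqtn, .flags = b->flags };

	return tir;
}
```

A caller that needs only LRO calls one setter, while a caller that needs an indirect, tunneled TIR calls two, and neither needs a dedicated initializer or a long parameter list.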
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Maor Dickman <maord@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Create struct mlx5e_rss_params_hash · 6fe5ff2c
      Maxim Mikityanskiy authored
      
      
      This commit introduces a new struct to store RSS hash parameters: hash
      function and hash key. The existing usages are changed to use the new
      struct.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Remove mdev from mlx5e_build_indir_tir_ctx_common() · 4b3e42ee
      Maxim Mikityanskiy authored
      
      
      In order to drop a dependency on mdev and make the function more
      universal, stop passing mdev to mlx5e_build_indir_tir_ctx_common() and
      pass transport domain directly instead. It also prepares this function
      to be used in other contexts that need a custom transport domain, such
      as hairpin.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Remove lro_param from mlx5e_build_indir_tir_ctx_common() · a402e3a7
      Maxim Mikityanskiy authored
      
      
      In order to reduce the list of parameters and to define clearer
      responsibility for mlx5e_build_indir_tir_ctx_common(), stop passing
      lro_param and instead call mlx5e_build_tir_ctx_lro() directly where
      needed.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Remove mlx5e_priv usage from mlx5e_build_*tir_ctx*() · 983c9da2
      Maxim Mikityanskiy authored
      
      
      The functions that build TIR context for TIR create and modify commands
      used to depend on struct mlx5e_priv and fetch some values directly from
      different places. It increased coupling of code and the chance of weird
      misbehavior due to hidden complex dependencies.
      
      As the first step, this commit removes the priv parameter from these
      functions. Instead, the necessary values are passed directly.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Use mlx5e_rqt_get_rqtn to access RQT hardware id · 093d4bc1
      Maxim Mikityanskiy authored
      
      
      In order to abstract from implementation details of mlx5e_rqt, use the
      mlx5e_rqt_get_rqtn getter instead of accessing the field directly.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Take RQT out of TIR and group RX resources · 0570c1c9
      Maxim Mikityanskiy authored
      
      
      RQT is not part of TIR, as multiple TIRs may point to the same RQT, as
      it happens with indir_tir and inner_indir_tir. These instances of a TIR
      don't use the embedded RQT.
      
      This commit takes RQT out of TIR, making them independent. The RQTs are
      placed into struct mlx5e_rx_res, and items in that struct are regrouped
      by functionality: RSS, channels and PTP.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Move RX resources to a separate struct · 3f22d6c7
      Maxim Mikityanskiy authored
      
      
      This commit moves RQTs and TIRs to a separate struct that is allocated
      dynamically in profiles that support these RX resources (all profiles,
      except IPoIB PKey). It also allows removing rqt_enabled flags, as RQTs
      are always enabled in profiles that support RX resources.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Move mlx5e_build_rss_params() call to init_rx · 4ad31849
      Maxim Mikityanskiy authored
      
      
      RSS params belong to the RX side initialization. Move them from
      profile->init to profile->init_rx stage to allow the next commit to move
      rss_params out of priv to a dynamically-allocated struct.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Convert RQT to a dedicated object · 06e9f13a
      Maxim Mikityanskiy authored
      
      
      Code related to RQT is now encapsulated into a dedicated object and put
      into new files en/rqt.{c,h}. All usages are converted.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Check if inner FT is supported outside of create/destroy functions · bc5506a1
      Maxim Mikityanskiy authored
      
      
      Move the mlx5e_tunnel_inner_ft_supported() check for inner flow tables
      support outside of mlx5e_create_inner_ttc_table() and
      mlx5e_destroy_inner_ttc_table(). This avoids accessing invalid
      TIRNs of inner indirect TIRs. In a later commit these accesses will be
      replaced by getters that will WARN if inner indirect TIRs don't exist.
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5: Take TIR destruction out of the TIR list lock · 69994ef3
      Maxim Mikityanskiy authored
      
      
      res->td.list_lock protects the list of TIRs. There is no point in calling
      mlx5_core_destroy_tir() and invoke a firmware command under this lock.
      This commit moves this call outside of the lock and puts it after
      deleting the TIR from the list to ensure that TIRs are always alive
      while in the list.
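
The ordering described above can be sketched with a toy list: unlink the entry while holding the lock, drop the lock, and only then run the slow destroy step. All names are hypothetical, and the flag-based "lock" is a single-threaded stand-in for a real mutex, used here only so that the ordering can be checked with assertions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Single-threaded stand-in for a mutex; it only records whether it is held. */
struct ex_lock {
	bool held;
};

static void ex_lock_acquire(struct ex_lock *l)
{
	assert(!l->held);
	l->held = true;
}

static void ex_lock_release(struct ex_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct ex_tir_node {
	struct ex_tir_node *next;
	bool destroyed;
};

struct ex_tir_list {
	struct ex_lock lock;
	struct ex_tir_node *head;
};

/* Stand-in for the slow firmware command; must not run under the lock. */
static void ex_fw_destroy_tir(const struct ex_tir_list *res,
			      struct ex_tir_node *tir)
{
	assert(!res->lock.held);
	tir->destroyed = true;
}

static void ex_destroy_tir(struct ex_tir_list *res, struct ex_tir_node *tir)
{
	struct ex_tir_node **pp;

	/* Only the list manipulation happens under the lock. */
	ex_lock_acquire(&res->lock);
	for (pp = &res->head; *pp; pp = &(*pp)->next) {
		if (*pp == tir) {
			*pp = tir->next;
			tir->next = NULL;
			break;
		}
	}
	ex_lock_release(&res->lock);

	/* The node is already unlinked, so every listed TIR is still alive. */
	ex_fw_destroy_tir(res, tir);
}
```

Unlinking first is what preserves the invariant from the commit message: anyone walking the list under the lock never encounters a TIR that has already been destroyed.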
      
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Block LRO if firmware asks for tunneled LRO · 26ab7b38
      Maxim Mikityanskiy authored
      This commit does a cleanup in LRO configuration.
      
      LRO is a parameter of an RQ, but its state is changed by modifying a TIR
      related to the RQ.
      
      The current status: LRO for tunneled packets is not supported in the
      driver, inner TIRs may enable LRO on creation, but LRO status of inner
      TIRs isn't changed in mlx5e_modify_tirs_lro(). This is inconsistent, but
      as long as the firmware doesn't declare support for tunneled LRO, it
      works, because the same RQs are shared between the inner and outer TIRs.
      
      This commit does two fixes:
      
      1. If the firmware has the tunneled LRO capability, LRO is blocked
      altogether, because it's not possible to block it for inner TIRs only,
      when the same RQs are shared between inner and outer TIRs, and the
      driver won't be able to handle tunneled LRO traffic.
      
      2. mlx5e_modify_tirs_lro() is patched to modify LRO state for all TIRs,
      including inner ones, because all TIRs related to an RQ should agree on
      their LRO state.
      
      Fixes: 7b3722fa ("net/mlx5e: Support RSS for GRE tunneled packets")
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
    • net/mlx5e: Prohibit inner indir TIRs in IPoIB · 9c43f386
      Maxim Mikityanskiy authored
      TIR's rx_hash_field_selector_inner can be enabled only when
      tunneled_offload_en = 1. tunneled_offload_en is filled according to the
      tunneled_offload_en field in struct mlx5e_params, which is false in the
      IPoIB profile. On the other hand, the IPoIB profile passes inner_ttc =
      true to mlx5e_create_indirect_tirs, which potentially allows the latter
      function to attempt to create inner indirect TIRs without having
      tunneled_offload_en set.
      
      This commit prohibits this behavior by passing inner_ttc = false to
      mlx5e_create_indirect_tirs. The latter function won't attempt to create
      inner indirect TIRs.
      
      As inner indirect TIRs are not created in the IPoIB profile (this commit
      blocks it explicitly, and even before they would have failed to be
      created), the call to mlx5e_create_inner_ttc_table in
      mlx5i_create_flow_steering is a no-op and can be removed.
      
      Fixes: 46dc933c ("net/mlx5e: Provide explicit directive if to create inner indirect tirs")
      Fixes: 458821c7 ("net/mlx5e: IPoIB, Add inner TTC table to IPoIB flow steering")
      Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
  2. Jul 26, 2021
  3. Jul 25, 2021
    • tipc: fix an use-after-free issue in tipc_recvmsg · cc19862f
      Xin Long authored
      syzbot reported a use-after-free crash:
      
        BUG: KASAN: use-after-free in tipc_recvmsg+0xf77/0xf90 net/tipc/socket.c:1979
        Call Trace:
         tipc_recvmsg+0xf77/0xf90 net/tipc/socket.c:1979
         sock_recvmsg_nosec net/socket.c:943 [inline]
         sock_recvmsg net/socket.c:961 [inline]
         sock_recvmsg+0xca/0x110 net/socket.c:957
         tipc_conn_rcv_from_sock+0x162/0x2f0 net/tipc/topsrv.c:398
         tipc_conn_recv_work+0xeb/0x190 net/tipc/topsrv.c:421
         process_one_work+0x98d/0x1630 kernel/workqueue.c:2276
         worker_thread+0x658/0x11f0 kernel/workqueue.c:2422
      
      As Hoang pointed out, it was caused by skb_cb->bytes_read still being
      accessed after calling tsk_advance_rx_queue() to free the skb in
      tipc_recvmsg().
      
      This patch is to fix it by accessing skb_cb->bytes_read earlier than
      calling tsk_advance_rx_queue().
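
The shape of the fix, reading the needed field before the buffer is released, can be sketched with a toy queue. The types and function names are hypothetical stand-ins and this is not the tipc code.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the skb and the socket receive queue. */
struct ex_buf {
	int bytes_read;
};

struct ex_queue {
	struct ex_buf *head;
};

/* Releases the buffer at the head of the queue. */
static void ex_advance_queue(struct ex_queue *q)
{
	free(q->head);
	q->head = NULL;
}

/* Returns how many bytes had been read from the head buffer. */
static int ex_recv(struct ex_queue *q)
{
	struct ex_buf *buf = q->head;
	int copied;

	if (!buf)
		return 0;

	/* Read everything needed from the buffer before it is released. */
	copied = buf->bytes_read;

	ex_advance_queue(q);
	/* buf is dangling from here on and must not be dereferenced. */

	return copied;
}
```

Swapping the two statements, so that buf->bytes_read is read after ex_advance_queue(), reproduces the class of bug that KASAN reported.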
      
      Fixes: f4919ff5 ("tipc: keep the skb in rcv queue until the whole data is read")
      Reported-by: <syzbot+e6741b97d5552f97c24d@syzkaller.appspotmail.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Jon Maloy <jmaloy@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>