  1. Apr 12, 2021
    • veth: refine napi usage · 47e550e0
      Paolo Abeni authored
      
      
      After the previous patch, when enabling GRO, locally generated
      TCP traffic experiences some measurable overhead, as it traverses
      the GRO engine without any chance of aggregation.
      
      This change refines the NAPI receive path admission test, to avoid
      unnecessary GRO overhead in most scenarios, when GRO is enabled
      on a veth peer.
      
      Only skbs that are eligible for aggregation enter the GRO layer,
      the others will go through the traditional receive path.
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • veth: allow enabling NAPI even without XDP · d3256efd
      Paolo Abeni authored
      
      
      Currently the veth device has the GRO feature bit set, even if
      no GRO aggregation is possible with the default configuration,
      as the veth device does not hook into the GRO engine.
      
      Flipping the GRO feature bit from user-space is a no-op, unless
      XDP is enabled. In such a scenario GRO could actually take place, but
      TSO is forced to off on the peer device.
      
      This change allows user-space to really control the GRO feature, with
      no need for an XDP program.
      
      The GRO feature bit is now cleared by default - so that there are no
      user-visible behavior changes with the default configuration.
      
      When the GRO bit is set, the per-queue NAPI instances are initialized
      and registered. On xmit, when napi instances are available, we try
      to use them.
      
      Some additional checks are in place to ensure we initialize/delete NAPIs
      only when needed in case of overlapping XDP and GRO configuration
      changes.
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • veth: use skb_orphan_partial instead of skb_orphan · c75fb320
      Paolo Abeni authored
      As described by commit 9c4c3252 ("skbuff: preserve sock
      reference when scrubbing the skb."), orphaning a skb
      in the TX path will cause OoO.
      
      Let's use skb_orphan_partial() instead of skb_orphan(), so
      that we keep the sk around for queue's selection sake and we
      still avoid the problem fixed with commit 4bf9ffa0 ("veth: Orphan skb
      before GRO")
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ethtool-eeprom' · 7dc85b59
      David S. Miller authored
      
      
      Moshe Shemesh says:
      
      ====================
      ethtool: Extend module EEPROM dump API
      
      Ethtool supports module EEPROM dumps via the `ethtool -m <dev>` command.
      But in its current state its functionality is limited: the offset and
      length parameters, which specify a linear region of EEPROM data to dump,
      are not enough given the emergence of complex module EEPROM layouts
      such as CMIS 4.0.
      Moreover, CMIS 4.0 extends the number of pages that may be accessible by
      introducing another parameter for page addressing: banks.
      
      Besides, currently module EEPROM is represented as a chunk of
      concatenated pages, where lower 128 bytes of all pages, except page 00h,
      are omitted. Offset and length are used to address parts of this fake
      linear memory. But in practice drivers, which implement
      get_module_info() and get_module_eeprom() ethtool ops still calculate
      page number and set I2C address on their own.
      
      This series tackles these issues by adding an ethtool op that allows
      passing the page number, bank number and I2C address to the driver in
      addition to the offset and length parameters, adding the corresponding
      netlink infrastructure, and implementing the new interface in the mlx5
      driver.
      
      This allows extending the userspace 'ethtool -m' CLI with new
      parameters: page, bank and i2c. New command line format:
       ethtool -m <dev> [hex on|off] [raw on|off] [offset N] [length N] [page N] [bank N] [i2c N]
      
      As a consequence of this series, it becomes possible to dump one
      arbitrary EEPROM page at a time, in contrast to dumps of concatenated
      pages. Therefore, offset and length change their semantics and may only
      be used to specify a part of the data within a half-page boundary, whose
      size is currently limited to 128 bytes.
      
      As for drivers that support the legacy get_module_info() and
      get_module_eeprom() pair, the series addresses them by implementing a
      fallback mechanism. As mentioned earlier, such drivers derive a page
      number from the 'global' offset, so the reverse calculation can be done
      without their involvement thanks to standardization. If the kernel
      netlink handler of the 'ethtool -m' command detects that the new ethtool
      op is not supported by the driver, it calculates the offset from the
      given page number and page offset and calls the old ops, if they are
      available.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: wire in generic SFP module access · c97a31f6
      Andrew Lunn authored
      
      
      If the device has an SFP bus attached, call its
      sfp_get_module_eeprom_by_page() function, otherwise use the ethtool op
      for the device. This follows how the IOCTL works.
      
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • phy: sfp: add netlink SFP support to generic SFP code · d740513f
      Andrew Lunn authored
      
      
      The new netlink API for reading SFP data requires a new op to be
      implemented. The idea of the new netlink SFP code is that userspace is
      responsible for parsing the EEPROM data and requesting pages, rather
      than having the kernel decide which pages are interesting and return
      them. This allows greater flexibility for newer formats.
      
      Currently the generic SFP code only supports simple SFPs. Allow i2c
      addresses 0x50 and 0x51 to be accessed; page and bank must always be
      0. This interface will later be extended when, for example, QSFP
      support is added.
      
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: Add fallback to get_module_eeprom from netlink command · 96d971e3
      Vladyslav Tarasiuk authored
      
      
      In case netlink get_module_eeprom_by_page() callback is not implemented
      by the driver, try to call old get_module_info() and get_module_eeprom()
      pair. Recalculate parameters to get_module_eeprom() offset and len using
      page number and their sizes. Return error if this can't be done.
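
      The recalculation described above can be sketched roughly as follows.
      This is a minimal illustration, not the kernel's code: it assumes the
      linear layout described in the series cover letter (lower half of page
      00h first, then the upper 128 bytes of each page back to back), and
      eeprom_linear_offset(), HALF_PAGE_LEN and total_len are hypothetical
      names introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

#define HALF_PAGE_LEN 128u	/* half-page size quoted in the series description */

/*
 * Hypothetical helper: map (page, offset within page) onto the legacy
 * linear view, where the lower half of page 0 sits at 0..127 and the
 * upper halves of pages 0, 1, 2, ... follow back to back.
 * Returns 0 and stores the result in *linear, or -1 if the request
 * cannot be expressed in the linear view.
 */
static int eeprom_linear_offset(uint32_t page, uint32_t offset, uint32_t len,
				uint32_t total_len, uint32_t *linear)
{
	uint64_t start;

	if (len == 0 || offset >= 2 * HALF_PAGE_LEN)
		return -1;
	/* Lower halves of pages other than 0 are absent from the linear view. */
	if (page != 0 && offset < HALF_PAGE_LEN)
		return -1;
	/* A request must stay inside one half page. */
	if (len > HALF_PAGE_LEN - offset % HALF_PAGE_LEN)
		return -1;

	start = (uint64_t)page * HALF_PAGE_LEN + offset;
	/* total_len stands in for the size a legacy get_module_info() reports. */
	if (start + len > total_len)
		return -1;

	*linear = (uint32_t)start;
	return 0;
}
```

      Under these assumptions, page 1 at offset 128 maps to linear offset 256,
      while a request for the lower half of page 1 is rejected, matching the
      "return error if this can't be done" case.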
      
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethtool: Export helpers for getting EEPROM info · 95dfc7ef
      Andrew Lunn authored
      
      
      There are two ways to retrieve information from SFP EEPROMs.  Many
      devices make use of the common code, and assign the sfp_bus pointer in
      the netdev to point to the bus holding the SFP device. Some MAC
      drivers directly implement ops in their ethtool structure.
      
      Export within net/ethtool the two helpers used to call these methods,
      so that they can also be used in the new netlink code.
      
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Add support for DSFP module EEPROM dumps · 4c88fa41
      Vladyslav Tarasiuk authored
      
      
      Allow the driver to recognise DSFP transceiver module ID and therefore
      allow its EEPROM dumps using ethtool.
      
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Implement get_module_eeprom_by_page() · e109d2b2
      Vladyslav Tarasiuk authored
      
      
      Implement ethtool_ops::get_module_eeprom_by_page() to enable
      support of new SFP standards.
      
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5: Refactor module EEPROM query · e19b0a34
      Vladyslav Tarasiuk authored
      
      
      Prepare for ethtool_ops::get_module_eeprom_data() implementation by
      extracting common part of mlx5_query_module_eeprom() into a separate
      function.
      
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: Allow network drivers to dump arbitrary EEPROM data · c781ff12
      Vladyslav Tarasiuk authored
      
      
      Define get_module_eeprom_by_page() ethtool callback and implement
      netlink infrastructure.
      
      get_module_eeprom_by_page() allows network drivers to dump a part of
      module's EEPROM specified by page and bank numbers along with offset and
      length. It is effectively a netlink replacement for get_module_info()
      and get_module_eeprom() pair, which is needed due to emergence of
      complex non-linear EEPROM layouts.
      
      Signed-off-by: Vladyslav Tarasiuk <vladyslavt@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Apr 10, 2021
    • Merge branch 'net-ipa-a-few-small-fixes' · cbd31253
      Jakub Kicinski authored
      Alex Elder says:
      
      ====================
      net: ipa: a few small fixes
      
      This series implements some minor bug fixes or improvements.
      
      The first patch removes an apparently unnecessary restriction, which
      results in an error on a 32-bit ARM build.
      
      The second makes a definition used for SDM845 match what is used in
      the downstream code.
      
      The third just ensures two netdev pointers are only non-null when
      valid.
      
      The fourth simplifies a little code, knowing that a called function
      never returns an error.
      
      The fifth and sixth just remove some empty/placeholder functions.
      
      And the last patch fixes a comment, makes a function private, and
      removes an unnecessary double-negation of a Boolean variable.  This
      patch produces a warning from checkpatch, indicating that a pair of
      parentheses is unnecessary.  I agree with that advice, but it
      conflicts with a suggestion from the compiler.  I left the "problem"
      in place to avoid the compiler warning.
      ====================
      
      Link: https://lore.kernel.org/r/20210409180722.1176868-1-elder@linaro.org
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ipa: three small fixes · 602a1c76
      Alex Elder authored
      
      
      Some time ago changes were made to stop referring to clearing the
      hardware pipeline as a "tag process."  Fix a comment to use the
      newer terminology.
      
      Get rid of a pointless double-negation of the Boolean toward_ipa
      flag in ipa_endpoint_config().
      
      Make ipa_endpoint_exit_one() private; it's only referenced inside
      "ipa_endpoint.c".
      
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ipa: get rid of empty GSI functions · 57ab8ca4
      Alex Elder authored
      
      
      There are placeholder functions in the GSI code that do nothing.
      Remove these, knowing we can add something back in their place if
      they're really needed someday.
      
      Some of these are inverse functions (such as teardown to match setup).
      Explicitly comment that there is no inverse in these cases.
      
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ipa: get rid of empty IPA functions · 74858b63
      Alex Elder authored
      
      
      There are placeholder functions in the IPA code that do nothing.
      For the most part these are inverse functions, for example, once the
      routing or filter tables are set up there is no need to perform any
      matching teardown activity at shutdown, or in the case of an error.
      
      These can be safely removed, resulting in some code simplification.
      Add comments in these spots making it explicit that there is no
      inverse.
      
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ipa: ipa_stop() does not return an error · 077e770f
      Alex Elder authored
      
      
      In ipa_modem_stop(), if the modem netdev pointer is non-null we call
      ipa_stop().  We check for an error and if one is returned we handle
      it.  But ipa_stop() never returns an error, so this extra handling
      is unnecessary.  Simplify the code in ipa_modem_stop() based on the
      knowledge no error handling is needed at this spot.
      
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ipa: only set endpoint netdev pointer when in use · 57f63faf
      Alex Elder authored
      
      
      In ipa_modem_start(), we set endpoint netdev pointers before the
      network device is registered.  If registration fails, we don't undo
      those assignments.  Instead, wait to assign the netdev pointer until
      after registration succeeds.
      
      Set these endpoint netdev pointers to NULL in ipa_modem_stop()
      before unregistering the network device.
      
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ipa: update sequence type for modem TX endpoint · 49e76a41
      Alex Elder authored
      
      
      On IPA v3.5.1, the sequencer type for the modem TX endpoint does not
      define the replication portion in the same way the downstream code
      does.  This difference doesn't affect the behavior of the upstream
      code, but I'd prefer the two code bases use the same configuration
      value here.
      
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: ipa: relax pool entry size requirement · 7ad3bd52
      Alex Elder authored
      
      
      I no longer know why a validation check ensured the size of an entry
      passed to gsi_trans_pool_init() was restricted to be a multiple of 8.
      For 32-bit builds, this condition doesn't always hold, and for DMA
      pools, the size is rounded up to a power of 2 anyway.
      
      Remove this restriction.
      
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 8859a44e
      Jakub Kicinski authored
      
      
      Conflicts:
      
      MAINTAINERS
       - keep Chandrasekar
      drivers/net/ethernet/mellanox/mlx5/core/en_main.c
       - simple fix + trust the code re-added to param.c in -next is fine
      include/linux/bpf.h
       - trivial
      include/linux/ethtool.h
       - trivial, fix kdoc while at it
      include/linux/skmsg.h
       - move to relevant place in tcp.c, comment re-wrapped
      net/core/skmsg.c
       - add the sk = sk // sk = NULL around calls
      net/tipc/crypto.c
       - trivial
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • enetc: Use generic rule to map Tx rings to interrupt vectors · 6c5e6b4c
      Claudiu Manoil authored
      
      
      Even if the current mapping is correct for the 1 CPU and 2 CPU cases
      (currently enetc is included in SoCs with up to 2 CPUs only), better
      use a generic rule for the mapping to cover all possible cases.
      The number of CPUs is the same as the number of interrupt vectors:
      
      Per device Tx rings -
      device_tx_ring[idx], where idx = 0..n_rings_total-1
      
      Per interrupt vector Tx rings -
      int_vector[i].ring[j], where i = 0..n_int_vects-1
      			     j = 0..n_rings_per_v-1
      
      Mapping rule -
      n_rings_per_v = n_rings_total / n_int_vects
      for i = 0..n_int_vects - 1:
      	for j = 0..n_rings_per_v - 1:
      		idx = n_int_vects * j + i
      		int_vector[i].ring[j] <- device_tx_ring[idx]
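
      The mapping rule above can be written as a small standalone C function.
      This is only a sketch of the rule as stated in the commit message, not
      the enetc driver code; struct int_vector, map_tx_rings() and the two
      MAX_* limits are hypothetical names chosen for the example.

```c
#include <assert.h>

#define MAX_INT_VECTS 4		/* arbitrary limits for the sketch */
#define MAX_RINGS_PER_V 8

/* Hypothetical container: the device Tx ring indices served by one vector. */
struct int_vector {
	int ring[MAX_RINGS_PER_V];
	int n_rings;
};

/*
 * Assign device Tx rings to interrupt vectors with
 *   idx = n_int_vects * j + i
 * so that consecutive device rings land on different vectors.
 * Returns 0 on success, -1 if the arguments do not fit the sketch's limits.
 */
static int map_tx_rings(struct int_vector *vec, int n_int_vects,
			int n_rings_total)
{
	int n_rings_per_v, i, j;

	if (n_int_vects <= 0 || n_int_vects > MAX_INT_VECTS || n_rings_total < 0)
		return -1;

	n_rings_per_v = n_rings_total / n_int_vects;
	if (n_rings_per_v > MAX_RINGS_PER_V)
		return -1;

	for (i = 0; i < n_int_vects; i++) {
		vec[i].n_rings = n_rings_per_v;
		for (j = 0; j < n_rings_per_v; j++)
			vec[i].ring[j] = n_int_vects * j + i;
	}
	return 0;
}
```

      With 2 interrupt vectors and 8 Tx rings, vector 0 ends up with device
      rings 0, 2, 4, 6 and vector 1 with rings 1, 3, 5, 7.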
      
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20210409071613.28912-1-claudiu.manoil@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: enetc: fix TX ring interrupt storm · a93580a0
      Vladimir Oltean authored
      The blamed commit introduced a bit in the TX software buffer descriptor
      structure for determining whether a BD is final or not; we rearm the TX
      interrupt vector for every frame (hence final BD) transmitted.
      
      But there is a problem with the patch: it replaced a condition whose
      expression is a bool which was evaluated at the beginning of the "while"
      loop with a bool expression that is evaluated on the spot: tx_swbd->is_eof.
      
      The problem with the latter expression is that the tx_swbd has already
      been incremented at that stage, so the tx_swbd->is_eof check is in fact
      with the _next_ software BD. Which is _not_ final.
      
      The effect is that the CPU is in 100% load with ksoftirqd because it
      does not acknowledge the TX interrupt, so the handler keeps getting
      called again and again.
      
      The fix is to restore the code structure, and keep the local bool is_eof
      variable, just to assign it the tx_swbd->is_eof value instead of
      !!tx_swbd->skb.
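
      The ordering problem can be reproduced in a reduced model. The code
      below is not the enetc driver; struct tx_swbd is cut down to the one
      flag that matters, and clean_tx_buggy()/clean_tx_fixed() are
      hypothetical names. Each function returns how many times the interrupt
      would have been rearmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reduced, hypothetical version of a TX software buffer descriptor. */
struct tx_swbd {
	bool is_eof;	/* true on the final BD of a frame */
};

/* Buggy shape: the flag is read after the pointer has been advanced. */
static int clean_tx_buggy(const struct tx_swbd *ring, size_t n)
{
	const struct tx_swbd *tx_swbd = ring;
	int rearms = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		/* ... release the buffer described by *tx_swbd ... */
		tx_swbd++;
		/* Inspects the next BD (bounds check added only for the demo). */
		if (i + 1 < n && tx_swbd->is_eof)
			rearms++;
	}
	return rearms;
}

/* Fixed shape: latch the flag in a local before advancing. */
static int clean_tx_fixed(const struct tx_swbd *ring, size_t n)
{
	const struct tx_swbd *tx_swbd = ring;
	int rearms = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		bool is_eof = tx_swbd->is_eof;

		/* ... release the buffer described by *tx_swbd ... */
		tx_swbd++;
		if (is_eof)
			rearms++;
	}
	return rearms;
}
```

      For a single-BD frame the buggy variant never sees the final-BD flag,
      which corresponds to the unacknowledged interrupt described above.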
      
      Fixes: d504498d ("net: enetc: add a dedicated is_eof bit in the TX software BD")
      Reported-by: Alex Marginean <alexandru.marginean@nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Link: https://lore.kernel.org/r/20210409192759.3895104-1-olteanv@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux · 95b5c291
      Jakub Kicinski authored
      Saeed Mahameed says:
      
      ====================
      mlx5-next 2021-04-09
      
      This PR contains changes from the mlx5-next branch,
      already reviewed on the netdev and rdma mailing lists, links below.
      
      1) From Leon, Dynamically assign MSI-X vectors count
      Already Acked by Bjorn Helgaas.
      https://patchwork.kernel.org/project/netdevbpf/cover/20210314124256.70253-1-leon@kernel.org/
      
      2) Cleanup series:
      https://patchwork.kernel.org/project/netdevbpf/cover/20210311070915.321814-1-saeed@kernel.org/
      
      From Mark, E-Switch cleanups and refactoring, and the addition
      of single FDB mode needed HW bits.
      
      From Mikhael, Remove unused struct field
      
      From Saeed, Cleanup W=1 prototype warning
      
      From Zheng, Esw related cleanup
      
      From Tariq, Use order-0 page allocation for EQs
      
      * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux:
        net/mlx5: Implement sriov_get_vf_total_msix/count() callbacks
        net/mlx5: Dynamically assign MSI-X vectors count
        net/mlx5: Add dynamic MSI-X capabilities bits
        PCI/IOV: Add sysfs MSI-X vector assignment interface
        net/mlx5: Use order-0 allocations for EQs
        net/mlx5: Add IFC bits needed for single FDB mode
        net/mlx5: E-Switch, Refactor send to vport to be more generic
        RDMA/mlx5: Use representor E-Switch when getting netdev and metadata
        net/mlx5: E-Switch, Add eswitch pointer to each representor
        net/mlx5: E-Switch, Add match on vhca id to default send rules
        net/mlx5: Remove unused mlx5_core_health member recover_work
        net/mlx5: simplify the return expression of mlx5_esw_offloads_pair()
        net/mlx5: Cleanup prototype warning
      ====================
      
      Link: https://lore.kernel.org/r/20210409200704.10886-1-saeed@kernel.org
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: enetc: fix array underflow in error handling code · 626b598a
      Dan Carpenter authored
      This loop will try to unmap enetc_unmap_tx_buff[-1] and crash.
      
      Fixes: 9d2b68cc ("net: enetc: add support for XDP_REDIRECT")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/YHBHfCY/yv3EnM9z@mwanda
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • cxgb4: remove unneeded if-null-free check · 524e001b
      Qiheng Lin authored
      
      
      Eliminate the following coccicheck warning:
      
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c:529:3-9: WARNING:
       NULL check before some freeing functions is not needed.
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_u32.c:533:2-8: WARNING:
       NULL check before some freeing functions is not needed.
      drivers/net/ethernet/chelsio/cxgb4/cxgb4_cudbg.c:161:2-7: WARNING:
       NULL check before some freeing functions is not needed.
      drivers/net/ethernet/chelsio/cxgb4/clip_tbl.c:327:3-9: WARNING:
       NULL check before some freeing functions is not needed.
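
      The pattern behind this warning is easy to show in userspace C, where
      free(NULL) is defined to do nothing; the kernel's kfree() family
      likewise accepts NULL. The snippet is an analogue for illustration
      only, and struct table and table_release() are hypothetical names, not
      cxgb4 code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical owner of one heap allocation. */
struct table {
	int *entries;
};

static void table_release(struct table *t)
{
	/* No "if (t->entries)" guard needed: free(NULL) is a no-op. */
	free(t->entries);
	t->entries = NULL;
}
```

      Calling table_release() on a table whose entries pointer is already
      NULL is therefore safe, which is why the explicit check can go.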
      
      Signed-off-by: Qiheng Lin <linqiheng@huawei.com>
      Link: https://lore.kernel.org/r/20210409115339.4598-1-linqiheng@huawei.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'net-make-phy-pm-ops-a-no-op-if-mac-driver-manages-phy-pm' · 6597b5c2
      Jakub Kicinski authored
      Heiner Kallweit says:
      
      ====================
      net: make PHY PM ops a no-op if MAC driver manages PHY PM
      
      Resume callback of the PHY driver is called after the one for the MAC
      driver. The PHY driver resume callback calls phy_init_hw(), and this is
      potentially problematic if the MAC driver calls phy_start() in its resume
      callback. One issue was reported with the fec driver and a KSZ8081 PHY
      which seems to become unstable if a soft reset is triggered during aneg.
      
      The new flag allows MAC drivers to indicate that they take care of
      suspending/resuming the PHY. Then the MAC PM callbacks can handle
      any dependency between MAC and PHY PM.
      ====================
      
      Link: https://lore.kernel.org/r/9e695411-ab1d-34fe-8b90-3e8192ab84f6@gmail.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • r8169: use mac-managed PHY PM · 5c2280fc
      Heiner Kallweit authored
      
      
      Use the new mac_managed_pm flag to indicate that the driver takes care
      of PHY power management.
      
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: fec: use mac-managed PHY PM · 557d5dc8
      Heiner Kallweit authored
      
      
      Use the new mac_managed_pm flag to work around an issue with KSZ8081 PHY
      that becomes unstable when a soft reset is triggered during aneg.
      
      Reported-by: Joakim Zhang <qiangqing.zhang@nxp.com>
      Tested-by: Joakim Zhang <qiangqing.zhang@nxp.com>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM · fba863b8
      Heiner Kallweit authored
      
      
      Resume callback of the PHY driver is called after the one for the MAC
      driver. The PHY driver resume callback calls phy_init_hw(), and this is
      potentially problematic if the MAC driver calls phy_start() in its resume
      callback. One issue was reported with the fec driver and a KSZ8081 PHY
      which seems to become unstable if a soft reset is triggered during aneg.
      
      The new flag allows MAC drivers to indicate that they take care of
      suspending/resuming the PHY. Then the MAC PM callbacks can handle
      any dependency between MAC and PHY PM.
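
      The effect of the flag can be modelled in a few lines of C. This is a
      toy model, not phylib code: struct phy_dev, hw_init_count and
      phy_pm_resume() are hypothetical, and only the mac_managed_pm name is
      taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a PHY device; only the flag name comes from the series. */
struct phy_dev {
	bool mac_managed_pm;	/* set by the MAC driver that owns PHY PM */
	int hw_init_count;	/* counts hardware re-initializations */
};

/* PHY-side resume callback: a no-op when the MAC driver manages PHY PM. */
static int phy_pm_resume(struct phy_dev *phy)
{
	if (phy->mac_managed_pm)
		return 0;

	/* Otherwise re-initialize the hardware, as the PHY resume path would. */
	phy->hw_init_count++;
	return 0;
}
```

      With the flag set, the PHY callback leaves the hardware alone and the
      MAC driver's own resume path decides when the PHY is restarted.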
      
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Revert "tcp: Reset tcp connections in SYN-SENT state" · a7150e38
      Eric Dumazet authored
      This reverts commit e880f8b3.
      
      1) Patch has not been properly tested, and is wrong [1]
      2) Patch submission did not include TCP maintainer (this is me)
      
      [1]
      divide error: 0000 [#1] PREEMPT SMP KASAN
      CPU: 0 PID: 8426 Comm: syz-executor478 Not tainted 5.12.0-rc4-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:__tcp_select_window+0x56d/0xad0 net/ipv4/tcp_output.c:3015
      Code: 44 89 ff e8 d5 cd f0 f9 45 39 e7 0f 8d 20 ff ff ff e8 f7 c7 f0 f9 44 89 e3 e9 13 ff ff ff e8 ea c7 f0 f9 44 89 e0 44 89 e3 99 <f7> 7c 24 04 29 d3 e9 fc fe ff ff e8 d3 c7 f0 f9 41 f7 dc bf 1f 00
      RSP: 0018:ffffc9000184fac0 EFLAGS: 00010293
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffffffff87832e76 RDI: 0000000000000003
      RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
      R10: ffffffff87832e14 R11: 0000000000000000 R12: 0000000000000000
      R13: 1ffff92000309f5c R14: 0000000000000000 R15: 0000000000000000
      FS:  00000000023eb300(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00007fc2b5f426c0 CR3: 000000001c5cf000 CR4: 00000000001506f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       tcp_select_window net/ipv4/tcp_output.c:264 [inline]
       __tcp_transmit_skb+0xa82/0x38f0 net/ipv4/tcp_output.c:1351
       tcp_transmit_skb net/ipv4/tcp_output.c:1423 [inline]
       tcp_send_active_reset+0x475/0x8e0 net/ipv4/tcp_output.c:3449
       tcp_disconnect+0x15a9/0x1e60 net/ipv4/tcp.c:2955
       inet_shutdown+0x260/0x430 net/ipv4/af_inet.c:905
       __sys_shutdown_sock net/socket.c:2189 [inline]
       __sys_shutdown_sock net/socket.c:2183 [inline]
       __sys_shutdown+0xf1/0x1b0 net/socket.c:2201
       __do_sys_shutdown net/socket.c:2209 [inline]
       __se_sys_shutdown net/socket.c:2207 [inline]
       __x64_sys_shutdown+0x50/0x70 net/socket.c:2207
       do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fixes: e880f8b3 ("tcp: Reset tcp connections in SYN-SENT state")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Manoj Basapathi <manojbm@codeaurora.org>
      Cc: Sauvik Saha <ssaha@codeaurora.org>
      Link: https://lore.kernel.org/r/20210409170237.274904-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: dccp: use net_generic storage · b98b3304
      Florian Westphal authored
      
      
      DCCP is virtually never used, so no need to use space in struct net for it.
      
      Put the pernet ipv4/v6 socket in the dccp ipv4/ipv6 modules instead.
      
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Link: https://lore.kernel.org/r/20210408174502.1625-1-fw@strlen.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 4e04e751
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Networking fixes for 5.12-rc7, including fixes from can, ipsec,
        mac80211, wireless, and bpf trees.
      
        No scary regressions here or in the works, but small fixes for 5.12
        changes keep coming.
      
        Current release - regressions:
      
         - virtio: do not pull payload in skb->head
      
         - virtio: ensure mac header is set in virtio_net_hdr_to_skb()
      
         - Revert "net: correct sk_acceptq_is_full()"
      
         - mptcp: revert "mptcp: provide subflow aware release function"
      
         - ethernet: lan743x: fix ethernet frame cutoff issue
      
         - dsa: fix type was not set for devlink port
      
         - ethtool: remove link_mode param and derive link params from driver
      
         - sched: htb: fix null pointer dereference on a null new_q
      
         - wireless: iwlwifi: Fix softirq/hardirq disabling in
           iwl_pcie_enqueue_hcmd()
      
         - wireless: iwlwifi: fw: fix notification wait locking
      
         - wireless: brcmfmac: p2p: Fix deadlock introduced by avoiding the
           rtnl dependency
      
        Current release - new code bugs:
      
         - napi: fix hangup on napi_disable for threaded napi
      
         - bpf: take module reference for trampoline in module
      
         - wireless: mt76: mt7921: fix airtime reporting and related tx hangs
      
         - wireless: iwlwifi: mvm: rfi: don't lock mvm->mutex when sending
           config command
      
        Previous releases - regressions:
      
         - rfkill: revert back to old userspace API by default
      
         - nfc: fix infinite loop, refcount & memory leaks in LLCP sockets
      
         - let skb_orphan_partial wake-up waiters
      
         - xfrm/compat: Cleanup WARN()s that can be user-triggered
      
         - vxlan, geneve: do not modify the shared tunnel info when PMTU
           triggers an ICMP reply
      
         - can: fix msg_namelen values depending on CAN_REQUIRED_SIZE
      
         - can: uapi: mark union inside struct can_frame packed
      
         - sched: cls: fix action overwrite reference counting
      
         - sched: cls: fix err handler in tcf_action_init()
      
         - ethernet: mlxsw: fix ECN marking in tunnel decapsulation
      
         - ethernet: nfp: Fix a use after free in nfp_bpf_ctrl_msg_rx
      
         - ethernet: i40e: fix receiving of single packets in xsk zero-copy
           mode
      
         - ethernet: cxgb4: avoid collecting SGE_QBASE regs during traffic
      
        Previous releases - always broken:
      
         - bpf: Refuse non-O_RDWR flags in BPF_OBJ_GET
      
         - bpf: Refcount task stack in bpf_get_task_stack
      
         - bpf, x86: Validate computation of branch displacements
      
         - ieee802154: fix many similar syzbot-found bugs
             - fix NULL dereferences in netlink attribute handling
             - reject unsupported operations on monitor interfaces
             - fix error handling in llsec_key_alloc()
      
         - xfrm: make ipv4 pmtu check honor ip header df
      
         - xfrm: make hash generation lock per network namespace
      
         - xfrm: esp: delete NETIF_F_SCTP_CRC bit from features for esp
           offload
      
         - ethtool: fix incorrect datatype in set_eee ops
      
         - xdp: fix xdp_return_frame() kernel BUG throw for page_pool memory
           model
      
         - openvswitch: fix send of uninitialized stack memory in ct limit
           reply
      
        Misc:
      
         - udp: add get handling for UDP_GRO sockopt"
      
      * tag 'net-5.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (182 commits)
        net: fix hangup on napi_disable for threaded napi
        net: hns3: Trivial spell fix in hns3 driver
        lan743x: fix ethernet frame cutoff issue
        net: ipv6: check for validity before dereferencing cfg->fc_nlinfo.nlh
        net: dsa: lantiq_gswip: Configure all remaining GSWIP_MII_CFG bits
        net: dsa: lantiq_gswip: Don't use PHY auto polling
        net: sched: sch_teql: fix null-pointer dereference
        ipv6: report errors for iftoken via netlink extack
        net: sched: fix err handler in tcf_action_init()
        net: sched: fix action overwrite reference counting
        Revert "net: sched: bump refcount for new action in ACT replace mode"
        ice: fix memory leak of aRFS after resuming from suspend
        i40e: Fix sparse warning: missing error code 'err'
        i40e: Fix sparse error: 'vsi->netdev' could be null
        i40e: Fix sparse error: uninitialized symbol 'ring'
        i40e: Fix sparse errors in i40e_txrx.c
        i40e: Fix parameters in aq_get_phy_register()
        nl80211: fix beacon head validation
        bpf, x86: Validate computation of branch displacements for x86-32
        bpf, x86: Validate computation of branch displacements for x86-64
        ...
      4e04e751
    • Linus Torvalds's avatar
      Merge tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block · 3b978435
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Two minor fixups for the reissue logic, and one for making sure that
        unbounded work is canceled on io-wq exit"
      
      * tag 'io_uring-5.12-2021-04-09' of git://git.kernel.dk/linux-block:
        io-wq: cancel unbounded works on io-wq destroy
        io_uring: fix rw req completion
        io_uring: clear F_REISSUE right after getting it
      3b978435
    • Linus Torvalds's avatar
      Merge tag 'devicetree-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · a2521822
      Linus Torvalds authored
      Pull devicetree fixes from Rob Herring:
      
       - Fix fw_devlink failure with ".*,nr-gpios" properties
      
       - Doc link reference fixes from Mauro
      
       - Fixes for unaligned FDT handling found on OpenRISC. First, avoid a
         crash with better error handling when unflattening an unaligned FDT.
         Second, fix memory allocations for FDTs to ensure alignment.
      
      * tag 'devicetree-fixes-for-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
        of: property: fw_devlink: do not link ".*,nr-gpios"
        dt-bindings:iio:adc: update motorola,cpcap-adc.yaml reference
        dt-bindings: fix references for iio-bindings.txt
        dt-bindings: don't use ../dir for doc references
        of: unittest: overlay: ensure proper alignment of copied FDT
        of: properly check for error returned by fdt_get_name()
      a2521822
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm · a85f165e
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Was relatively quiet this week, but still a few pulls came in, pretty
        much small fixes across the board, a couple of regression fixes in the
        amdgpu/radeon code, msm has a few minor fixes across the board, a
        panel regression fix also.
      
        amdgpu:
         - DCN3 fix
         - Fix CAC setting regression for TOPAZ
         - Fix ttm regression
      
        radeon:
         - Fix ttm regression
      
        msm:
         - a5xx/a6xx timestamp fix
         - microcode version check
         - fail path fix
         - block programming fix
         - error removal fix
      
        i915:
         - Fix invalid access to ACPI _DSM objects
      
        xen:
         - Fix use-after-free in xen
         - minor duplicate definition cleanup
      
        vc4:
         - Reduce fifo threshold on hvs4 to fix a fifo full error
         - minor redundant assignment cleanup
      
        panel:
         - Disable TE support for Droid4 and N950"
      
      * tag 'drm-fixes-2021-04-10' of git://anongit.freedesktop.org/drm/drm:
        drm/vc4: crtc: Reduce PV fifo threshold on hvs4
        drm/vc4: plane: Remove redundant assignment
        drm/amdgpu/smu7: fix CAC setting on TOPAZ
        drm/radeon: Fix size overflow
        drm/amdgpu: Fix size overflow
        drm/i915: Fix invalid access to ACPI _DSM objects
        drm/amd/display: Add missing mask for DCN3
        drm/panel: panel-dsi-cm: disable TE for now
        drm/msm/disp/dpu1: program 3d_merge only if block is attached
        drm/msm: a6xx: fix version check for the A650 SQE microcode
        drm/msm: Fix a5xx/a6xx timestamps
        drm/msm: Fix removal of valid error case when checking speed_bin
        drm/msm: Set drvdata to NULL when msm_drm_init() fails
        drivers: gpu: drm: xen_drm_front_drm_info is declared twice
        gpu/xen: Fix a use after free in xen_drm_drv_init
      a85f165e
    • Paolo Abeni's avatar
      net: fix hangup on napi_disable for threaded napi · 27f0ad71
      Paolo Abeni authored
      
      
      napi_disable() is subject to a hangup when threaded mode is
      enabled and the napi is under heavy traffic.
      
      If the relevant napi has been scheduled and napi_disable()
      kicks in before the next napi_thread_wait() completes, so
      that the latter quits due to the napi_disable_pending() condition,
      the existing code leaves the NAPI_STATE_SCHED bit set and the
      napi_disable() loop waiting for that bit will hang.
      
      This patch addresses the issue by dropping the NAPI_STATE_DISABLE
      bit test in napi_thread_wait(). The later napi_threaded_poll()
      iteration will take care of clearing the NAPI_STATE_SCHED.
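
The hang and the fix described above can be illustrated with a toy, single-threaded model of the two state bits. This is only a sketch under stated assumptions: `struct toy_napi` and every `toy_*` function are hypothetical names invented for illustration, not kernel code, and the real thread sleeps instead of returning when there is nothing to poll.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for NAPI_STATE_SCHED and NAPI_STATE_DISABLE. */
struct toy_napi {
	bool sched;            /* napi has been scheduled for polling */
	bool disable_pending;  /* napi_disable() has been requested */
};

/*
 * Model of the wait step in the napi thread. Returns 0 when the thread
 * should go on to poll and -1 when it does not poll on this pass.
 * test_disable selects the pre-patch behaviour (true) or the patched
 * behaviour (false).
 */
static int toy_thread_wait(const struct toy_napi *n, bool test_disable)
{
	if (test_disable && n->disable_pending)
		return -1;      /* pre-patch: quit while SCHED is still set */
	return n->sched ? 0 : -1;
}

/* Model of one poll pass: completing the poll clears the SCHED bit. */
static void toy_threaded_poll(struct toy_napi *n)
{
	n->sched = false;
}

/*
 * Start with a scheduled napi and a pending disable, run one thread
 * iteration, then report whether a napi_disable() caller waiting for
 * the SCHED bit to clear could make progress.
 */
static bool toy_disable_can_finish(bool test_disable)
{
	struct toy_napi n = { .sched = true, .disable_pending = true };

	if (toy_thread_wait(&n, test_disable) == 0)
		toy_threaded_poll(&n);
	return !n.sched;
}
```

In this model, `toy_disable_can_finish(true)` leaves the scheduled bit set (the hang), while `toy_disable_can_finish(false)` lets the poll pass clear it.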
      
      This also addresses a related problem reported by Jakub:
      before this patch a napi_disable()/napi_enable() pair killed
      the napi thread, effectively disabling the threaded mode.
      On the patched kernel napi_disable() simply stops scheduling
      the relevant thread.
      
      v1 -> v2:
        - let the main napi_threaded_poll() loop clear the SCHED bit
      
      Reported-by: Jakub Kicinski <kuba@kernel.org>
      Fixes: 29863d41 ("net: implement threaded-able napi poll loop support")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/883923fa22745a9589e8610962b7dc59df09fb1f.1617981844.git.pabeni@redhat.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      27f0ad71
    • Salil Mehta's avatar
      net: hns3: Trivial spell fix in hns3 driver · cd7e963d
      Salil Mehta authored
      
      
      Some trivial spelling mistakes which caught my eye during the
      review of the code.
      
      Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
      Link: https://lore.kernel.org/r/20210409074223.32480-1-salil.mehta@huawei.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      cd7e963d
    • Sven Van Asbroeck's avatar
      lan743x: fix ethernet frame cutoff issue · 3bc41d6d
      Sven Van Asbroeck authored
      
      
      The ethernet frame length is calculated incorrectly. Depending on
      the value of RX_HEAD_PADDING, this may result in ethernet frames
      that are too short (cut off at the end), or too long (garbage added
      to the end).
      
      Fix by calculating the ethernet frame length correctly. For added
      clarity, use the ETH_FCS_LEN constant in the calculation.
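
As a rough illustration of a length calculation that depends only on the FCS, the sketch below derives the payload length from a hardware-reported frame length. It assumes the reported length covers the frame plus the 4-byte FCS and does not include any head padding; that assumption, along with the names `toy_payload_len` and `TOY_ETH_FCS_LEN`, is hypothetical and not taken from the driver.

```c
#include <stddef.h>

/* Same value as the kernel's ETH_FCS_LEN constant (4 bytes). */
#define TOY_ETH_FCS_LEN 4

/*
 * Hypothetical helper: strip the trailing FCS from a reported frame
 * length, clamping at zero so a runt frame cannot underflow.
 */
static size_t toy_payload_len(size_t hw_frame_len)
{
	if (hw_frame_len < TOY_ETH_FCS_LEN)
		return 0;
	return hw_frame_len - TOY_ETH_FCS_LEN;
}
```

Because no padding term appears in the calculation, the result in this sketch cannot change when the padding value changes.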
      
      Many thanks to Heiner Kallweit for suggesting this solution.
      
      Suggested-by: Heiner Kallweit <hkallweit1@gmail.com>
      Fixes: 3e21a10f ("lan743x: trim all 4 bytes of the FCS; not just 2")
      Link: https://lore.kernel.org/lkml/20210408172353.21143-1-TheSven73@gmail.com/
      
      
      Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
      Reviewed-by: George McCollister <george.mccollister@gmail.com>
      Tested-by: George McCollister <george.mccollister@gmail.com>
      Link: https://lore.kernel.org/r/20210409003904.8957-1-TheSven73@gmail.com
      
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      3bc41d6d
    • Ilya Lipnitskiy's avatar
      of: property: fw_devlink: do not link ".*,nr-gpios" · d473d32c
      Ilya Lipnitskiy authored
      [<vendor>,]nr-gpios property is used by some GPIO drivers[0] to indicate
      the number of GPIOs present on a system, not to define a GPIO. nr-gpios is
      not configured by #gpio-cells and can't be parsed along with other
      "*-gpios" properties.
      
      nr-gpios without the "<vendor>," prefix is not allowed by the DT
      spec[1], so only add an exception for the ",nr-gpios" suffix and let the
      error message continue to be printed for non-compliant implementations.
      
      [0] nr-gpios is referenced in Documentation/devicetree/bindings/gpio:
       - gpio-adnp.txt
       - gpio-xgene-sb.txt
       - gpio-xlp.txt
       - snps,dw-apb-gpio.yaml
      
      [1] Link: https://github.com/devicetree-org/dt-schema/blob/cb53a16a1eb3e2169ce170c071e47940845ec26e/schemas/gpio/gpio-consumer.yaml#L20
      
      Fixes errors such as:
        OF: /palmbus@300000/gpio@600: could not find phandle
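
The suffix exception described in this commit message can be sketched as a standalone name check. This is a minimal illustration, not the kernel's parser: `toy_has_suffix` and `toy_is_gpio_specifier_prop` are hypothetical names, and only the "-gpios" suffix is handled here.

```c
#include <stdbool.h>
#include <string.h>

/* Return true when str ends with suffix. */
static bool toy_has_suffix(const char *str, const char *suffix)
{
	size_t len = strlen(str);
	size_t suffix_len = strlen(suffix);

	return len >= suffix_len &&
	       strcmp(str + len - suffix_len, suffix) == 0;
}

/*
 * Hypothetical check: should this property name be parsed as a list of
 * GPIO specifiers? A "<vendor>,nr-gpios" property holds a count, so it
 * is excluded even though its name ends in "-gpios".
 */
static bool toy_is_gpio_specifier_prop(const char *prop_name)
{
	if (toy_has_suffix(prop_name, ",nr-gpios"))
		return false;
	return toy_has_suffix(prop_name, "-gpios");
}
```

A bare "nr-gpios" name has no vendor prefix, so in this sketch it still matches the "-gpios" rule, which mirrors the stated intent of letting the error keep being reported for non-compliant bindings.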
      
      Fixes: 7f00be96 ("of: property: Add device link support for interrupt-parent, dmas and -gpio(s)")
      Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
      Cc: Saravana Kannan <saravanak@google.com>
      Cc: stable@vger.kernel.org # v5.5+
      Link: https://lore.kernel.org/r/20210405222540.18145-1-ilya.lipnitskiy@gmail.com
      
      
      Signed-off-by: Rob Herring <robh@kernel.org>
      d473d32c