  1. Sep 16, 2020
  2. Sep 15, 2020
    • tcp: schedule EPOLLOUT after a partial sendmsg · afb83012
      Soheil Hassas Yeganeh authored

      For EPOLLET, applications must call sendmsg until they get EAGAIN.
      Otherwise, there is no guarantee that EPOLLOUT is sent if there was
      a failure upon memory allocation.
      
      As a result on high-speed NICs, userspace observes multiple small
      sendmsgs after a partial sendmsg until EAGAIN, since TCP can send
      1-2 TSOs in between two sendmsg syscalls:
      
      // One large partial send due to memory allocation failure.
      sendmsg(20MB)   = 2MB
      // Many small sends until EAGAIN.
      sendmsg(18MB)   = 64KB
      sendmsg(17.9MB) = 128KB
      sendmsg(17.8MB) = 64KB
      ...
      sendmsg(...)    = EAGAIN
      // At this point, userspace can assume an EPOLLOUT.
      
      To fix this, set SOCK_NOSPACE in all partial sendmsg scenarios
      to guarantee that EPOLLOUT is sent after a partial sendmsg.
      
      After this commit userspace can assume that it will receive an EPOLLOUT
      after the first partial sendmsg. This EPOLLOUT will benefit from
      sk_stream_write_space() logic delaying the EPOLLOUT until significant
      space is available in write queue.
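
The userspace side of this change can be sketched as follows. This is a minimal illustration, not code from the patch: `send_until_blocked` and its `stop_on_partial` flag are hypothetical names, and `sock` is assumed to be any non-blocking socket-like object registered with EPOLLET.

```python
def send_until_blocked(sock, data, stop_on_partial=True):
    """Push data into a non-blocking socket; return the bytes accepted.

    stop_on_partial=True assumes a partial send is always followed by
    EPOLLOUT (the behaviour this commit describes). With False, the loop
    keeps calling send() until EAGAIN, as EPOLLET required before.
    """
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        remaining = len(view) - sent
        try:
            n = sock.send(view[sent:])
        except BlockingIOError:  # EAGAIN / EWOULDBLOCK
            break
        sent += n
        if stop_on_partial and n < remaining:
            break  # partial send: wait for EPOLLOUT instead of retrying
    return sent
```

With `stop_on_partial=True` the caller returns to its epoll loop after the first short write, which avoids the run of small sendmsg calls shown in the trace above.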
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: return EPOLLOUT from tcp_poll only when notsent_bytes is half the limit · 8ba3c9d1
      Soheil Hassas Yeganeh authored

      If there was any event available on the TCP socket, tcp_poll()
      will be called to retrieve all the events.  In tcp_poll(), we call
      sk_stream_is_writeable() which returns true as long as we are at least
      one byte below notsent_lowat.  This results in quite a few
      spurious EPOLLOUT events and, in turn, frequent tiny sendmsg() calls.
      
      Similar to sk_stream_write_space(), use __sk_stream_is_writeable
      with a wake value of 1, so that we set EPOLLOUT only if half the
      space is available for write.
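
The effect of the wake value can be illustrated with a simplified model of the not-sent check. This is a sketch under assumptions: `notsent_check` is a hypothetical name, the shift-based comparison is my reading of how a wake value of 1 yields "half the limit", and the separate write-space check done by the real helper is omitted.

```python
def notsent_check(notsent_bytes, notsent_lowat, wake):
    """Simplified model: writeable if (notsent_bytes << wake) < notsent_lowat.

    wake=0: writeable as soon as we are one byte below notsent_lowat.
    wake=1: writeable only once notsent_bytes is below half of notsent_lowat.
    """
    return (notsent_bytes << wake) < notsent_lowat


LOWAT = 128 * 1024  # example notsent_lowat value, chosen for illustration

# One byte below the limit: reported writeable only without the wake shift.
just_below = LOWAT - 1
# Just under half the limit: reported writeable in both cases.
under_half = LOWAT // 2 - 1
```

Under this model, a socket sitting one byte below the limit no longer reports EPOLLOUT when wake is 1, which is what suppresses the tiny sendmsg() calls.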
      
      Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ionic: fix up debugfs after queue swap · ed6d9b02
      Shannon Nelson authored

      Clean and rebuild the debugfs info for the queues being swapped.
      
      Fixes: a34e25ab ("ionic: change the descriptor ring length without full reset")
      Signed-off-by: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • __netif_receive_skb_core: don't untag vlan from skb on DSA master · b14a9fc4
      Vladimir Oltean authored

      A DSA master interface has upper network devices, each representing an
      Ethernet switch port attached to it. Demultiplexing the source ports and
      setting skb->dev accordingly is done through the catch-all ETH_P_XDSA
      packet_type handler. Catch-all because DSA vendors have various header
      implementations, which can be placed anywhere in the frame: before the
      DMAC, before the EtherType, before the FCS, etc. So, the ETH_P_XDSA
      handler acts like an rx_handler more than anything.
      
      It is unlikely for the DSA master interface to have any other upper than
      the DSA switch interfaces themselves. Only maybe a bridge upper*, but it
      is very likely that the DSA master will have no 8021q upper. So
      __netif_receive_skb_core() will try to untag the VLAN, despite the fact
      that the DSA switch interface might have an 8021q upper. So the skb will
      never reach that.
      
      So far, this hasn't been a problem because most of the possible
      placements of the DSA switch header mentioned in the first paragraph
      will displace the VLAN header when the DSA master receives the frame, so
      __netif_receive_skb_core() will not actually execute any VLAN-specific
      code for it. This only becomes a problem when the DSA switch header does
      not displace the VLAN header (for example with a tail tag).
      
      This patch bypasses the untagging of the skb when there is a DSA
      switch attached to this net device. So, DSA is the only
      packet_type handler which requires seeing the VLAN header. Once skb->dev
      is changed, __netif_receive_skb_core() will be invoked again and
      untagging, or delivery to an 8021q upper, will happen in the RX path of
      the DSA switch interface itself.
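
The decision described above can be modelled as a toy predicate. This is only an illustration of the rule in the commit message, not the kernel check itself: `should_untag_on_rx` and `dev_is_dsa_master` are hypothetical names.

```python
ETH_P_8021Q = 0x8100   # IEEE 802.1Q VLAN EtherType
ETH_P_8021AD = 0x88A8  # IEEE 802.1ad (QinQ) EtherType


def should_untag_on_rx(protocol, dev_is_dsa_master):
    """Toy model: untag VLAN frames on RX unless the device is a DSA master.

    On a DSA master the tag is left in place so that the DSA switch
    interface (or its 8021q upper) can process it on the second pass.
    """
    is_vlan = protocol in (ETH_P_8021Q, ETH_P_8021AD)
    return is_vlan and not dev_is_dsa_master
```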
      
      *see commit 9eb8eff0 ("net: bridge: allow enslaving some DSA master
      network devices"). This is actually the reason why I prefer keeping DSA
      as a packet_type handler of ETH_P_XDSA rather than converting to an
      rx_handler. Currently the rx_handler code doesn't support chaining, and
      this is a problem because a DSA master might be bridged.
      
      Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-next-dsa-mt7530-add-support-for-MT7531' · 0ca6d8b7
      David S. Miller authored

      Landen Chao says:
      
      ====================
      net-next: dsa: mt7530: add support for MT7531
      
      This patch series adds support for MT7531.
      
      MT7531 is the next generation of MT7530 and can be found on Mediatek
      router platforms such as MT7622 or MT7629.

      It is also a 7-port switch with 5 embedded gigabit PHYs, 2 CPU ports, and
      the same MAC logic as MT7530. CPU port 6 only supports the SGMII interface.
      CPU port 5 supports either RGMII or SGMII depending on the HW SKU, but
      cannot be muxed to the PHY of port 0/4 as on MT7530. Because of the SGMII
      interface support, the PLL and pad settings differ from MT7530.
      
      The MT7531 SGMII interface can be configured in the following modes:
      - 'SGMII AN mode' with in-band negotiation capability,
          which is compatible with PHY_INTERFACE_MODE_SGMII.
      - 'SGMII force mode' without in-band negotiation,
          which is compatible with the 8B/10B encoding of
          PHY_INTERFACE_MODE_1000BASEX with fixed full-duplex and fixed pause.
      - 2.5 times faster clocked 'SGMII force mode' without in-band negotiation,
          which is compatible with the 8B/10B encoding of
          PHY_INTERFACE_MODE_2500BASEX with fixed full-duplex and fixed pause.
      
      v4 -> v5
      - Add fixed-link node to dsa cpu port in dts file by suggestion of
        Vladimir Oltean.
      
      v3 -> v4
      - Adjust the coding style as suggested by Jakub Kicinski:
        remove an unnecessary jump label, merge consecutive numeric 'switch
        cases' into one line, and order the variables longest to shortest
        (reverse xmas tree).
      
      v2 -> v3
      - Keep the same setup logic for mt7530/mt7621 because this series of
        patches only adds mt7531 hardware support.
      - Do not adjust the rgmii delay when a vendor phy driver is present, to
        prevent double adjustment, as suggested by Andrew Lunn.
      - Remove redundant 'Example 4' from dt-bindings by suggestion of
        Rob Herring.
      - Fix typo.
      
      v1 -> v2
      - change phylink_validate callback function to support full-duplex
        gigabit only to match hardware capability.
      - add description of SGMII interface.
      - configure mt7531 cpu port in fastest speed by default.
      - parse SGMII control word for in-band negotiation mode.
      - configure RGMII delay based on phy.rst.
      - Rename the definition in the header file to avoid potential conflicts.
      - Add wrapper function for mdio read/write to support both C22 and C45.
      - correct fixed-link speed of 2500base-x in dts.
      - add MT7531 port mirror setting.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arm64: dts: mt7622: add mt7531 dsa to bananapi-bpi-r64 board · 79a675e6
      Landen Chao authored

      Add mt7531 dsa to the bananapi-bpi-r64 board to support 5 gigabit
      Ethernet ports.
      
      Signed-off-by: Landen Chao <landen.chao@mediatek.com>
      Tested-By: Frank Wunderlich <frank-w@public-files.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arm64: dts: mt7622: add mt7531 dsa to mt7622-rfb1 board · 6af06448
      Landen Chao authored

      Add mt7531 dsa to the mt7622-rfb1 board to support 5 gigabit Ethernet
      ports. mt7622 only supports 1 sgmii interface, so only one of gmac0 or
      gmac1 can be configured as the sgmii interface. This patch connects
      mt7622 gmac0 and mt7531 port6 through the sgmii interface.
      
      Signed-off-by: Landen Chao <landen.chao@mediatek.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>