  1. Aug 16, 2023
  2. Aug 15, 2023
  3. Aug 14, 2023
    • net: phy: Introduce PSGMII PHY interface mode · 83b5f025
      Gabor Juhos authored
      
      
      The PSGMII interface is similar to QSGMII. The main difference
      is that the PSGMII interface combines five SGMII lines into a
      single link while in QSGMII only four lines are combined.
      
      Similarly to QSGMII, this interface mode might also need
      special handling within the MAC driver.
      
      It is commonly used by Qualcomm with their QCA807x PHY series and
      modern WiSoC-s.
      
      Add definitions for the PHY layer to allow expressing this type
      of connection between the MAC and PHY.
      
      Signed-off-by: Gabor Juhos <j4g8y7@gmail.com>
      Signed-off-by: Robert Marko <robert.marko@sartura.hr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: ethernet-controller: add PSGMII mode · de875d35
      Robert Marko authored
      
      
      Add a new PSGMII mode which is similar to QSGMII with the difference being
      that it combines 5 SGMII lines into a single link compared to 4 on QSGMII.
      
      It is commonly used by Qualcomm on their QCA807x PHY series.
      
      Signed-off-by: Robert Marko <robert.marko@sartura.hr>
      Acked-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlxsw-redirection' · 2d93c30c
      David S. Miller authored
      Petr Machata says:
      
      ====================
      mlxsw: Support traffic redirection from a locked bridge port
      
      Ido Schimmel writes:
      
      It is possible to add a filter that redirects traffic from the ingress
      of a bridge port that is locked (i.e., performs security / SMAC lookup)
      and has learning enabled. For example:
      
       # ip link add name br0 type bridge
       # ip link set dev swp1 master br0
       # bridge link set dev swp1 learning on locked on mab on
       # tc qdisc add dev swp1 clsact
       # tc filter add dev swp1 ingress pref 1 proto ip flower skip_sw src_ip 192.0.2.1 action mirred egress redirect dev swp2
      
      In the kernel's Rx path, this filter is evaluated before the Rx handler
      of the bridge, which means that redirected traffic should not be
      affected by bridge port configuration such as learning.
      
      However, the hardware data path is a bit different and the redirect
      action (FORWARDING_ACTION in hardware) merely attaches a pointer to the
      packet, which is later used by the L2 lookup stage to understand how to
      forward the packet. Between both stages - ingress ACL and L2 lookup -
      learning and security lookup are performed, which means that redirected
      traffic is affected by bridge port configuration, unlike in the kernel's
      data path.
      
      The learning discrepancy was handled in commit 577fa14d ("mlxsw:
      spectrum: Do not process learned records with a dummy FID") by simply
      ignoring learning notifications generated by the redirected traffic. A
      similar solution is not possible for the security / SMAC lookup since
      - unlike learning - the CPU is not involved and packets that failed the
      lookup are dropped by the device.
      
      Instead, solve this by prepending the ignore action to the redirect
      action and use it to instruct the device to disable both learning and
      the security / SMAC lookup for redirected traffic.
      
      Patch #1 adds the ignore action.
      
      Patch #2 prepends the action to the redirect action in flower offload
      code.
      
      Patch #3 removes the workaround in commit 577fa14d ("mlxsw:
      spectrum: Do not process learned records with a dummy FID") since it is
      no longer needed.
      
      Patch #4 adds a test case.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: forwarding: Add test case for traffic redirection from a locked port · 38c43a1c
      Ido Schimmel authored
      
      
      Check that traffic can be redirected from a locked bridge port and that
      it does not create locked FDB entries.
      
      Cc: Hans J. Schultz <netdev@kapio-technology.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Stop ignoring learning notifications from redirected traffic · 9793a5a9
      Ido Schimmel authored
      As explained in the previous patch, with the ignore action prepended to
      the redirect action, it is no longer possible for redirected traffic to
      generate learning notifications.
      
      Therefore, remove the workaround that was added in commit 577fa14d
      ("mlxsw: spectrum: Do not process learned records with a dummy FID") as
      it is no longer needed.
      
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum_flower: Disable learning and security lookup when redirecting · 0433670e
      Ido Schimmel authored
      It is possible to add a filter that redirects traffic from the ingress
      of a bridge port that is locked (i.e., performs security / SMAC lookup)
      and has learning enabled. For example:
      
       # ip link add name br0 type bridge
       # ip link set dev swp1 master br0
       # bridge link set dev swp1 learning on locked on mab on
       # tc qdisc add dev swp1 clsact
       # tc filter add dev swp1 ingress pref 1 proto ip flower skip_sw src_ip 192.0.2.1 action mirred egress redirect dev swp2
      
      In the kernel's Rx path, this filter is evaluated before the Rx handler
      of the bridge, which means that redirected traffic should not be
      affected by bridge port configuration such as learning.
      
      However, the hardware data path is a bit different and the redirect
      action (FORWARDING_ACTION in hardware) merely attaches a pointer to the
      packet, which is later used by the L2 lookup stage to understand how to
      forward the packet. Between both stages - ingress ACL and L2 lookup -
      learning and security lookup are performed, which means that redirected
      traffic is affected by bridge port configuration, unlike in the kernel's
      data path.
      
      The learning discrepancy was handled in commit 577fa14d ("mlxsw:
      spectrum: Do not process learned records with a dummy FID") by simply
      ignoring learning notifications generated by the redirected traffic. A
      similar solution is not possible for the security / SMAC lookup since
      - unlike learning - the CPU is not involved and packets that failed the
      lookup are dropped by the device.
      
      Instead, solve this by prepending the ignore action to the redirect
      action and use it to instruct the device to disable both learning and
      the security / SMAC lookup for redirected traffic.
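The stage ordering described above can be illustrated with a toy model. This is a sketch only, not mlxsw code: `Port`, `Packet`, `process()` and the verdict strings are hypothetical names chosen for illustration, and the security lookup is reduced to a membership test on a set of known source MACs.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass
class Port:
    learning: bool = True
    locked: bool = True


@dataclass
class Packet:
    smac: str
    src_ip: str
    redirect_dev: Optional[str] = None  # pointer attached by the redirect action
    ignore: bool = False                # set when the ignore action is prepended


def ingress_acl(pkt: Packet, prepend_ignore: bool) -> None:
    """Stage 1: a rule matching src_ip 192.0.2.1 that redirects to swp2."""
    if pkt.src_ip == "192.0.2.1":
        if prepend_ignore:
            pkt.ignore = True
        pkt.redirect_dev = "swp2"


def process(pkt: Packet, port: Port, known_smacs: Set[str],
            prepend_ignore: bool) -> Tuple[str, bool]:
    """Return (verdict, learning_notification_generated)."""
    ingress_acl(pkt, prepend_ignore)

    # Stage 2: learning and security / SMAC lookup sit between the ACL
    # and the L2 lookup, so they also see redirected packets unless ignored.
    notified = False
    if not pkt.ignore:
        unknown = pkt.smac not in known_smacs
        if port.locked and unknown:
            return "drop", False  # failed the security lookup
        notified = port.learning and unknown

    # Stage 3: L2 lookup consumes the pointer left by the redirect action.
    if pkt.redirect_dev is not None:
        return "redirect:" + pkt.redirect_dev, notified
    return "bridge", notified
```

In this model a redirected packet with an unknown source MAC is dropped on a locked port unless the ignore flag is set, which mirrors the discrepancy the patch addresses.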
      
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: core_acl_flex_actions: Add IGNORE_ACTION · d0d449c7
      Ido Schimmel authored
      
      
      Add the IGNORE_ACTION which is used to ignore basic switching functions
      such as learning on a per-packet basis.
      
      The action will be prepended to the FORWARDING_ACTION in subsequent
      patches.
      
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: Petr Machata <petrm@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: xgmac: show more MAC HW features in debugfs · 58c1e0ba
      Furong Xu authored
      
      
      1. Show TSSTSSEL(Timestamp System Time Source),
      ADDMACADRSEL(additional MAC addresses), SMASEL(SMA/MDIO Interface),
      HDSEL(Half-duplex Support) in debugfs.
      2. Show exact number of additional MAC address registers for XGMAC2 core.
      3. XGMAC2 core does not have different IP checksum offload types, so just
      show rx_coe instead of rx_coe_type1 or rx_coe_type2.
      4. XGMAC2 core does not have rxfifo_over_2048 definition, skip it.
      
      Signed-off-by: Furong Xu <0x1207@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-stats-helpers' · a9142847
      David S. Miller authored
      
      
      Li Zetao says:
      
      ====================
      Use helper functions to update stats
      
      The patch set uses the helper functions dev_sw_netstats_rx_add() and
      dev_sw_netstats_tx_add() to update stats, which is equivalent to
      open-coding the same updates in each driver.
      ====================
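As a rough illustration of why a shared helper is equivalent to open-coded updates, the sketch below models per-CPU software stats in Python. It is not kernel code: `SwNetstats`, `Device`, the explicit `cpu` argument and the helper argument lists are assumptions made for this example, only loosely modelled on the helpers named above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SwNetstats:
    rx_packets: int = 0
    rx_bytes: int = 0
    tx_packets: int = 0
    tx_bytes: int = 0


class Device:
    """Hypothetical device holding one stats block per CPU."""

    def __init__(self, ncpus: int) -> None:
        self.tstats: List[SwNetstats] = [SwNetstats() for _ in range(ncpus)]


def sw_netstats_rx_add(dev: Device, cpu: int, length: int) -> None:
    # One received packet of `length` bytes, accounted on the given CPU.
    stats = dev.tstats[cpu]
    stats.rx_packets += 1
    stats.rx_bytes += length


def sw_netstats_tx_add(dev: Device, cpu: int, packets: int, length: int) -> None:
    # `packets` transmitted packets totalling `length` bytes.
    stats = dev.tstats[cpu]
    stats.tx_packets += packets
    stats.tx_bytes += length


def totals(dev: Device) -> SwNetstats:
    """Sum the per-CPU blocks, as a reader of the stats would."""
    total = SwNetstats()
    for s in dev.tstats:
        total.rx_packets += s.rx_packets
        total.rx_bytes += s.rx_bytes
        total.tx_packets += s.tx_packets
        total.tx_bytes += s.tx_bytes
    return total
```

A driver that calls the two helpers ends up with the same counters as one that updates the four fields by hand, which is the point of the conversion.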
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: Use helper functions to update stats · 3c0930b4
      Li Zetao authored
      
      
      Use the helper functions dev_sw_netstats_rx_add() and
      dev_sw_netstats_tx_add() to update stats, which improves
      code readability.
      
      Signed-off-by: Li Zetao <lizetao1@huawei.com>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: macsec: Use helper functions to update stats · bf98bbe9
      Li Zetao authored
      
      
      Use the helper functions dev_sw_netstats_rx_add() and
      dev_sw_netstats_tx_add() to update stats, which improves
      code readability.
      
      Signed-off-by: Li Zetao <lizetao1@huawei.com>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vmxnet3: Add XDP support. · 54f00cce
      William Tu authored
      The patch adds native-mode XDP support: XDP DROP, PASS, TX, and REDIRECT.
      
      Background:
      The vmxnet3 rx consists of three rings: ring0, ring1, and dataring.
      For r0 and r1, buffers at r0 are allocated using alloc_skb APIs and DMA
      mapped to the ring's descriptor. If LRO is enabled and the packet size is
      larger than 3K (VMXNET3_MAX_SKB_BUF_SIZE), then r1 is used to map the rest
      of the buffer beyond VMXNET3_MAX_SKB_BUF_SIZE. Each buffer in r1 is
      allocated using alloc_page. So for LRO packets, the payload will be in one
      buffer from r0 and multiple buffers from r1; for non-LRO packets, only one
      descriptor in r0 is used for packet sizes less than 3K.
      
      When receiving a packet, the first descriptor will have the sop (start of
      packet) bit set, and the last descriptor will have the eop (end of packet)
      bit set. Non-LRO packets will have only one descriptor with both sop and
      eop set.
      
      Other than r0 and r1, vmxnet3 dataring is specifically designed for
      handling packets with small size, usually 128 bytes, defined in
      VMXNET3_DEF_RXDATA_DESC_SIZE, by simply copying the packet from the backend
      driver in ESXi to the ring's memory region at front-end vmxnet3 driver, in
      order to avoid memory mapping/unmapping overhead. In summary, packet size:
          A. < 128B: use dataring
          B. 128B - 3K: use ring0 (VMXNET3_RX_BUF_SKB)
          C. > 3K: use ring0 and ring1 (VMXNET3_RX_BUF_SKB + VMXNET3_RX_BUF_PAGE)
      As a result, the patch adds XDP support for packets using dataring
      and r0 (case A and B), not the large packet size when LRO is enabled.
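The size classes above can be written down as a small lookup. This is an illustrative sketch, not driver code: the function names are hypothetical, "3K" is assumed to mean 3 * 1024 bytes, and a packet of exactly 3K is assumed to still fit in ring0.

```python
from typing import Tuple

DATARING_LIMIT = 128     # bytes, the "usually 128 bytes" dataring size
SKB_BUF_LIMIT = 3 * 1024  # bytes, assumed value of "3K"


def rx_rings(pkt_len: int) -> Tuple[str, ...]:
    """Return the rx rings a packet of pkt_len bytes would use."""
    if pkt_len < DATARING_LIMIT:
        return ("dataring",)            # case A
    if pkt_len <= SKB_BUF_LIMIT:
        return ("ring0",)               # case B
    return ("ring0", "ring1")           # case C, only with LRO enabled


def xdp_supported(pkt_len: int) -> bool:
    """XDP is added for cases A and B, not for packets spilling into ring1."""
    return "ring1" not in rx_rings(pkt_len)
```

For example, `rx_rings(64)` falls into case A and `rx_rings(1500)` into case B, and both are covered by the XDP support described here.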
      
      XDP Implementation:
      When the user loads an XDP prog, the vmxnet3 driver checks configurations
      such as MTU and LRO, and re-allocates the rx buffers to reserve the extra
      headroom, XDP_PACKET_HEADROOM, for the XDP frame. The XDP prog will then
      be associated with every rx queue of the device. Note that when using the
      dataring for small packets, vmxnet3 (front-end driver) doesn't control the
      buffer allocation; as a result, we allocate a new page and copy the packet
      from the dataring to the XDP frame.
      
      The receive side of XDP is implemented for cases A and B by invoking the
      bpf program at vmxnet3_rq_rx_complete and handling its returned action.
      The vmxnet3_process_xdp() and vmxnet3_process_xdp_small() functions handle
      the ring0 and dataring cases respectively, and decide the next step for
      the packet afterward.
      
      For TX, vmxnet3 has a split header design. Outgoing packets are parsed
      first and protocol headers (L2/L3/L4) are copied to the backend. The
      rest of the payload is DMA mapped. Since XDP_TX does not parse the
      packet protocol, the entire XDP frame is DMA mapped for transmission
      and transmitted in a batch. Later on, the frame is freed and recycled
      back to the memory pool.
      
      Performance:
      Tested using two VMs inside one ESXi vSphere 7.0 machine, using single
      core on each vmxnet3 device, sender using DPDK testpmd tx-mode attached
      to vmxnet3 device, sending 64B or 512B UDP packet.
      
      VM1 txgen:
      $ dpdk-testpmd -l 0-3 -n 1 -- -i --nb-cores=3 \
      --forward-mode=txonly --eth-peer=0,<mac addr of vm2>
      option: add "--txonly-multi-flow"
      option: use --txpkts=512 or 64 byte
      
      VM2 running XDP:
      $ ./samples/bpf/xdp_rxq_info -d ens160 -a <options> --skb-mode
      $ ./samples/bpf/xdp_rxq_info -d ens160 -a <options>
      options: XDP_DROP, XDP_PASS, XDP_TX
      
      To test REDIRECT to cpu 0, use
      $ ./samples/bpf/xdp_redirect_cpu -d ens160 -c 0 -e drop
      
      Single core performance comparison with skb-mode.
      64B:      skb-mode -> native-mode
      XDP_DROP: 1.6Mpps -> 2.4Mpps
      XDP_PASS: 338Kpps -> 367Kpps
      XDP_TX:   1.1Mpps -> 2.3Mpps
      REDIRECT-drop: 1.3Mpps -> 2.3Mpps
      
      512B:     skb-mode -> native-mode
      XDP_DROP: 863Kpps -> 1.3Mpps
      XDP_PASS: 275Kpps -> 376Kpps
      XDP_TX:   554Kpps -> 1.2Mpps
      REDIRECT-drop: 659Kpps -> 1.2Mpps
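To make the comparison easier to read, the speedup of native mode over skb-mode can be computed per row. The figures below are copied from the tables above and converted to Mpps; the helper names are hypothetical.

```python
# Throughput in Mpps as (skb-mode, native-mode), copied from the tables above.
RESULTS = {
    "64B": {
        "XDP_DROP": (1.6, 2.4),
        "XDP_PASS": (0.338, 0.367),
        "XDP_TX": (1.1, 2.3),
        "REDIRECT-drop": (1.3, 2.3),
    },
    "512B": {
        "XDP_DROP": (0.863, 1.3),
        "XDP_PASS": (0.275, 0.376),
        "XDP_TX": (0.554, 1.2),
        "REDIRECT-drop": (0.659, 1.2),
    },
}


def speedup(skb_mode: float, native_mode: float) -> float:
    """Ratio of native-mode to skb-mode throughput."""
    return native_mode / skb_mode


def speedup_table(results=RESULTS):
    """Return {packet size: {test: speedup rounded to two decimals}}."""
    return {
        size: {test: round(speedup(*pair), 2) for test, pair in rows.items()}
        for size, rows in results.items()
    }


for size, rows in speedup_table().items():
    for test, ratio in rows.items():
        print(f"{size:>4} {test:<14} {ratio:.2f}x")
```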
      
      Demo: https://youtu.be/4lm1CSCi78Q
      
      
      
      Future work:
      - XDP frag support
      - use napi_consume_skb() instead of dev_kfree_skb_any at unmap
      - stats using u64_stats_t
      - using bitfield macro BIT()
      - optimization for DMA synchronization using actual frame length,
        instead of always max_len
      
      Signed-off-by: William Tu <u9012063@gmail.com>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'ovs-drop-reasons' · 76fa3635
      David S. Miller authored
      Adrian Moreno says:
      
      ====================
      openvswitch: add drop reasons
      
      There is currently a gap in drop visibility in the openvswitch module.
      This series tries to improve this by adding a new drop reason subsystem
      for OVS.
      
      Apart from adding a new drop reason subsystem and some common drop
      reasons, this series takes Eric's preliminary work [1] on adding an
      explicit drop action and integrates it into the same subsystem.
      
      A limitation of this series is that it does not report upcall errors.
      The reason is that there could be many sources of upcall drops and the
      most common one, which is the netlink buffer overflow, cannot be
      reported via kfree_skb() because the skb is freed in the netlink layer
      (see [2]). Therefore, using a reason for the rare events and not the
      common one would be even more misleading. I'd propose we add (in a
      follow up patch) a tracepoint to better report upcall errors.
      
      [1] https://lore.kernel.org/netdev/202306300609.tdRdZscy-lkp@intel.com/T/
      [2] commit 1100248a ("openvswitch: Fix double reporting of drops in dropwatch")
      
      ---
      v4 -> v5:
      - Rebased
      - Added a helper function to explicitly convert drop reason enum types
      
      v3 -> v4:
      - Changed names of errors following Ilya's suggestions
      - Moved the ovs-dpctl.py changes from patch 7/7 to 3/7
      - Added a test to ensure actions following a drop are rejected
      
      rfc2 -> v3:
      - Rebased on top of latest net-next
      
      rfc1 -> rfc2:
      - Fail when an explicit drop is not the last
      - Added a drop reason for action errors
      - Added braces around macros
      - Dropped patch that added support for masks in ovs-dpctl.py as it's now
        included in Aaron's series [2].
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: openvswitch: add explicit drop testcase · 42420291
      Adrian Moreno authored
      
      
      Test that explicit drops generate the right drop reason. Also, verify that
      the kernel rejects flows with actions following an explicit drop.
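The second check can be pictured as a validation rule over a flow's action list. The sketch below is a hypothetical model, not the kernel's validation code or the selftest itself: actions are plain strings, and an explicit drop is assumed to be written as `drop(<n>)`.

```python
from typing import List


def is_explicit_drop(action: str) -> bool:
    """Assumed textual form of an explicit drop action, e.g. 'drop(0)'."""
    return action.startswith("drop(") and action.endswith(")")


def actions_valid(actions: List[str]) -> bool:
    """Accept a flow only if any explicit drop is the last action."""
    for index, action in enumerate(actions):
        if is_explicit_drop(action) and index != len(actions) - 1:
            return False
    return True
```

Under this rule `["drop(0)"]` is accepted while `["drop(0)", "output:2"]` is rejected.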
      
      Acked-by: Aaron Conole <aconole@redhat.com>
      Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: openvswitch: add drop reason testcase · aab1272f
      Adrian Moreno authored
      
      
      Test if the correct drop reason is reported when OVS drops a packet due
      to an explicit flow.
      
      Acked-by: Aaron Conole <aconole@redhat.com>
      Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: add misc error drop reasons · 43d95b30
      Adrian Moreno authored
      
      
      Use drop reasons from include/net/dropreason-core.h when a reasonable
      candidate exists.
      
      Acked-by: Aaron Conole <aconole@redhat.com>
      Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: add meter drop reason · f329d1bc
      Adrian Moreno authored
      
      
      Using an independent drop reason makes it easy to distinguish
      between QoS-triggered and flow-triggered drops.
      
      Acked-by: Aaron Conole <aconole@redhat.com>
      Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: add explicit drop action · e7bc7db9
      Eric Garver authored
      
      
      From: Eric Garver <eric@garver.life>
      
      This adds an explicit drop action. This is used by OVS to drop packets
      for which it cannot determine what to do. An explicit action in the
      kernel allows passing the reason _why_ the packet is being dropped or
      zero to indicate no particular error happened (i.e: OVS intentionally
      dropped the packet).
      
      Since the error codes coming from userspace mean nothing for the kernel,
      we squash all of them into only two drop reasons:
      - OVS_DROP_EXPLICIT_WITH_ERROR to indicate a non-zero value was passed
      - OVS_DROP_EXPLICIT to indicate a zero value was passed (no error)
      
      e.g. trace all OVS dropped skbs
      
       # perf trace -e skb:kfree_skb --filter="reason >= 0x30000"
       [..]
       106.023 ping/2465 skb:kfree_skb(skbaddr: 0xffffa0e8765f2000, \
        location:0xffffffffc0d9b462, protocol: 2048, reason: 196611)
      
      reason: 196611 --> 0x30003 (OVS_DROP_EXPLICIT)
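The conversion from the decimal value in the trace to 0x30003 can be reproduced as follows. The split into a subsystem in the upper 16 bits and a per-subsystem code in the lower 16 bits is an assumption of this sketch; it is consistent with the `reason >= 0x30000` filter used above, and the function name is hypothetical.

```python
from typing import Tuple

SUBSYS_SHIFT = 16       # assumed position of the subsystem field
CODE_MASK = 0xFFFF      # assumed width of the per-subsystem code


def split_reason(reason: int) -> Tuple[int, int]:
    """Split a traced drop reason into (subsystem, code)."""
    return reason >> SUBSYS_SHIFT, reason & CODE_MASK


subsys, code = split_reason(196611)
print(hex(196611), subsys, code)
```

With this split, every reason matching the filter `reason >= 0x30000` has a subsystem value of at least 3.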
      
      Also, this patch allows ovs-dpctl.py to add explicit drop actions as:
        "drop"     -> implicit empty-action drop
        "drop(0)"  -> explicit non-error action drop
        "drop(42)" -> explicit error action drop
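The mapping from these three spellings to a drop reason can be sketched as a tiny classifier. This is not the ovs-dpctl.py parser: `classify_drop()` is a hypothetical helper, and it only applies the squashing rule stated earlier (zero means OVS_DROP_EXPLICIT, any non-zero value means OVS_DROP_EXPLICIT_WITH_ERROR).

```python
import re


def classify_drop(action: str) -> str:
    """Map a drop action string to the reason it would be reported as."""
    if action == "drop":
        # No explicit action at all: the flow simply has nothing to execute.
        return "implicit empty-action drop"

    match = re.fullmatch(r"drop\((\d+)\)", action)
    if match is None:
        raise ValueError(f"not a drop action: {action!r}")

    # Userspace error codes mean nothing to the kernel, so only
    # zero versus non-zero is preserved.
    if int(match.group(1)) == 0:
        return "OVS_DROP_EXPLICIT"
    return "OVS_DROP_EXPLICIT_WITH_ERROR"
```

Any two non-zero codes, such as `drop(42)` and `drop(7)`, land on the same reason in this model.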
      
      Signed-off-by: Eric Garver <eric@garver.life>
      Co-developed-by: Adrian Moreno <amorenoz@redhat.com>
      Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: add action error drop reason · ec7bfb5e
      Adrian Moreno authored
      
      
      Add a drop reason for packets that are dropped because an action
      returns a non-zero error code.
      
      Acked-by: Aaron Conole <aconole@redhat.com>
      Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: openvswitch: add last-action drop reason · 9d802da4
      Adrian Moreno authored
      
      
      Create a new drop reason subsystem for openvswitch and add the first
      drop reason to represent last-action drops.
      
      Last-action drops happen when a flow has an empty action list or there
      is no action that consumes the packet (output, userspace, recirc, etc).
      It is the most common way in which OVS drops packets.
      
      Implementation-wise, most of these skb-consuming actions already call
      "consume_skb" internally and return directly from within the
      do_execute_actions() loop so with minimal changes we can assume that
      any skb that exits the loop normally is a packet drop.
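The control-flow argument can be sketched with a simplified loop. This is a toy model and not the kernel's do_execute_actions(): the action names come from the examples above, the return strings are hypothetical, and a consuming action is assumed to consume the packet only when it is the last action in the list.

```python
from typing import List

# Actions named above as consuming the packet.
CONSUMING = {"output", "userspace", "recirc"}


def execute_actions(actions: List[str]) -> str:
    """Walk the action list and report what happened to the packet."""
    for index, action in enumerate(actions):
        is_last = index == len(actions) - 1
        if action in CONSUMING and is_last:
            # The action takes ownership of the packet and the loop
            # returns directly, so the code after the loop never runs.
            return "consumed:" + action
        # Any other action (or a consuming one that is not last, which is
        # assumed to work on a copy) leaves the packet in the loop.

    # Falling out of the loop means nothing consumed the packet.
    return "drop:last-action"
```

An empty action list and a list ending in a non-consuming action both fall out of the loop and are reported as last-action drops.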
      
      Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mptcp-remove-msk-subflow' · afb0c192
      David S. Miller authored
      Matthieu Baerts says:
      
      ====================
      mptcp: get rid of msk->subflow
      
      The MPTCP protocol maintains an additional struct socket per connection,
      mainly to be able to easily use tcp-level struct socket operations.
      
      This leads to several side effects, beyond the quite unfortunate /
      confusing 'subflow' field name:
      
      - active and passive sockets behaviour is inconsistent: only active ones
        have a not NULL msk->subflow, leading to different error handling and
        different error code returned to the user-space in several places.
      
      - active sockets use an unneeded, larger amount of memory
      
      - passive sockets can't successfully go through accept(), disconnect(),
        accept() sequence, see [1] for more details.
      
      The first 13 patches of this series are from Paolo and address all the
      above, finally getting rid of the blamed field:
      
      - The first patch is a minor clean-up.
      
      - In the next 11 patches, msk->subflow usage is systematically removed
        from the MPTCP protocol, replacing it with direct msk->first usage,
        eventually introducing new core helpers when needed.
      
      - The 13th patch finally disposes of the field, and it's the only patch in
        the series intended to produce functional changes.
      
      The last and 14th patch is from Kuniyuki and it is not linked to the
      previous ones: it is a small clean-up to get rid of an unnecessary check
      in mptcp_init_sock().
      
      [1] https://github.com/multipath-tcp/mptcp_net-next/issues/290
      
      
      ====================
      
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
    • mptcp: Remove unnecessary test for __mptcp_init_sock() · e2636917
      Kuniyuki Iwashima authored
      __mptcp_init_sock() always returns 0 because mptcp_init_sock() used
      to return the value directly.
      
      But after commit 18b683bf ("mptcp: queue data for mptcp level
      retransmission"), __mptcp_init_sock() need not return a value anymore.
      
      Let's remove the unnecessary test for __mptcp_init_sock() and make
      it return void.
      
      Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mptcp: get rid of msk->subflow · 39880bd8
      Paolo Abeni authored
      Such field is now used just as a flag to control the first subflow
      deletion at close() time. Introduce a new bit flag for that and finally
      drop the mentioned field.
      
      As an intended side effect, now the first subflow sock is not freed
      before close() even for passive sockets. The msk has no open/active
      subflows if the first one is closed and the subflow list is singular,
      update accordingly the state check in mptcp_stream_accept().
      
      Among other benefits, the subflow removal reduces the amount of memory
      used on the client side for each mptcp connection, allows passive sockets
      to go through successful accept()/disconnect()/connect() and makes the
      returned error code consistent for failing passive and active sockets.
      
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/290
      
      
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <martineau@kernel.org>
      Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>