  1. Sep 17, 2023
  2. Sep 16, 2023
    • Merge branch 'TCP_INFO-RTO' · fff755e7
      David S. Miller authored
      
      
      Aananth V says:
      
      ====================
      tcp: new TCP_INFO stats for RTO events
      
      The 2023 SIGCOMM paper "Improving Network Availability with Protective
      ReRoute" has indicated Linux TCP's RTO-triggered txhash rehashing can
      effectively reduce application disruption during outages. To better
      measure the efficacy of this feature, this patch set adds three more
      detailed stats during RTO recovery and exports them via TCP_INFO.
      Applications and monitoring systems can leverage this data to measure
      the network path diversity and end-to-end repair latency during network
      outages to improve their network infrastructure.
      
      Patch 1 fixes a bug in TFO SYNACK that we encountered while testing
      these new metrics.
      
      Patch 2 adds the new metrics to tcp_sock and tcp_info.
      
      v2: Addressed feedback from a check bot in patch 2 by removing the
      inline keyword from the tcp_update_rto_time and tcp_update_rto_stats
      functions. Changed a comment in include/net/tcp.h to fit under 80 columns.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fff755e7
    • tcp: new TCP_INFO stats for RTO events · 3868ab0f
      Aananth V authored
      
      
      The 2023 SIGCOMM paper "Improving Network Availability with Protective
      ReRoute" has indicated Linux TCP's RTO-triggered txhash rehashing can
      effectively reduce application disruption during outages. To better
      measure the efficacy of this feature, this patch adds three more
      detailed stats during RTO recovery and exports them via TCP_INFO.
      Applications and monitoring systems can leverage this data to measure
      the network path diversity and end-to-end repair latency during network
      outages to improve their network infrastructure.
      
      The following counters are added to tcp_sock in order to track RTO
      events over the lifetime of a TCP socket.
      
      1. u16 total_rto - Counts the total number of RTO timeouts.
      2. u16 total_rto_recoveries - Counts the total number of RTO recoveries.
      3. u32 total_rto_time - Counts the total time spent (ms) in RTO
                              recoveries. (time spent in CA_Loss and
                              CA_Recovery states)
      
      To compute total_rto_time, we add a new u32 rto_stamp field to
      tcp_sock. rto_stamp records the start timestamp (ms) of the last RTO
      recovery (CA_Loss).
      
      Corresponding fields are also added to the tcp_info struct.
      
      Signed-off-by: Aananth V <aananthv@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3868ab0f
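The accounting described above can be sketched as a small userspace model. This is only an illustration under stated assumptions, not the kernel's code: `struct rto_stats`, `model_rto_fired()` and `model_recovery_done()` are hypothetical names, and the choice to count a recovery when its first timeout fires is an assumption. Only the field names and widths come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the counters named in the commit message.
 * This is not the kernel's tcp_sock layout. */
struct rto_stats {
	uint16_t total_rto;            /* total number of RTO timeouts */
	uint16_t total_rto_recoveries; /* total number of RTO recoveries */
	uint32_t total_rto_time;       /* total time (ms) spent in RTO recoveries */
	uint32_t rto_stamp;            /* start (ms) of the current recovery, 0 if idle */
};

/* An RTO timer fired. Only the first timeout of an episode starts the
 * clock (assumption); later timeouts extend the same episode. */
static void model_rto_fired(struct rto_stats *s, uint32_t now_ms)
{
	assert(s);
	if (!s->rto_stamp) {
		s->rto_stamp = now_ms ? now_ms : 1; /* keep 0 reserved for "idle" */
		s->total_rto_recoveries++;
	}
	s->total_rto++;
}

/* The recovery episode ended: add its duration and clear the stamp.
 * Unsigned subtraction stays correct if the ms clock wraps once. */
static void model_recovery_done(struct rto_stats *s, uint32_t now_ms)
{
	assert(s);
	if (s->rto_stamp) {
		s->total_rto_time += now_ms - s->rto_stamp;
		s->rto_stamp = 0;
	}
}
```

In this model, two timeouts at 1000 ms and 1200 ms followed by recovery at 1500 ms give two timeouts, one recovery and 500 ms of recovery time. A monitoring system could derive a mean repair latency by dividing the total time by the recovery count.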
    • tcp: call tcp_try_undo_recovery when an RTOd TFO SYNACK is ACKed · e326578a
      Aananth V authored
      For passive TCP Fast Open sockets that had SYN/ACK timeout and did not
      send more data in SYN_RECV, upon receiving the final ACK in 3WHS, the
      congestion state may awkwardly stay in CA_Loss mode unless the CA state
      was undone due to TCP timestamp checks. However, if
      tcp_rcv_synrecv_state_fastopen() decides not to undo, then we should
      enter CA_Open, because at that point we have received an ACK covering
      the retransmitted SYNACKs. Currently, the icsk_ca_state is only set to
      CA_Open after we receive an ACK for a data-packet. This is because
      tcp_ack() does not call tcp_fastretrans_alert() (and tcp_process_loss())
      if !prior_packets.
      
      Note that tcp_process_loss() calls tcp_try_undo_recovery(), so having
      tcp_rcv_synrecv_state_fastopen() decide that if we're in CA_Loss we
      should call tcp_try_undo_recovery() is consistent with that, and
      low risk.
      
      Fixes: dad8cea7 ("tcp: fix TFO SYNACK undo to avoid double-timestamp-undo")
      Signed-off-by: Aananth V <aananthv@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e326578a
    • Merge branch 'dsa-microchip-drive-strength-support' · 50675d84
      David S. Miller authored
      
      
      Oleksij Rempel says:
      
      ====================
      net: dsa: microchip: add drive strength support
      
      changes v5:
      - rename milliamp to microamp
      - do not expect negative error code on snprintf
      - set comma after last struct element
      - rename found to have_any_prop
      
      changes v4:
      - integrate microchip feedback to the ksz9477_drive_strengths comment.
      - add Reviewed-by: Rob Herring <robh@kernel.org>
      
      changes v3:
      - yaml: use enum instead of min/max
      - do not use snprintf() on overlapping buffer.
      - unify ksz_drive_strength_to_reg() and ksz_drive_strength_error(). Make
        it usable for KSZ9477 and KSZ8830 variants.
      - use ksz_rmw8() in ksz9477_drive_strength_write()
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      50675d84
    • net: dsa: microchip: Add drive strength configuration · d67d7247
      Oleksij Rempel authored
      
      
      Add device tree based drive strength configuration support. It is needed to
      pass EMI validation on our hardware.
      
      Configuration values are based on the vendor's reference driver.
      
      Tested on KSZ9563R.
      
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d67d7247
    • dt-bindings: net: dsa: microchip: Update ksz device tree bindings for drive strength · e26f40a6
      Oleksij Rempel authored
      
      
      Extend device tree bindings to support drive strength configuration for the
      ksz* switches. Introduced properties:
      - microchip,hi-drive-strength-microamp: Controls the drive strength for
        high-speed interfaces like GMII/RGMII and more.
      - microchip,lo-drive-strength-microamp: Governs the drive strength for
        low-speed interfaces such as LEDs, PME_N, and others.
      - microchip,io-drive-strength-microamp: Controls the drive strength for
        undocumented Pads on KSZ88xx variants.
      
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e26f40a6
    • Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue · b6a7eeb4
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Introduce Intel IDPF driver
      
      Pavan Kumar Linga says:
      
      This patch series introduces the Intel Infrastructure Data Path Function
      (IDPF) driver. It is used for both physical and virtual functions. Except
      for some of the device operations the rest of the functionality is the
      same for both PF and VF. IDPF uses virtchnl version2 opcodes and
      structures defined in the virtchnl2 header file which helps the driver
      to learn the capabilities and register offsets from the device
      Control Plane (CP) instead of assuming the default values.
      
      The format of the series follows the driver init flow to interface open.
      To start with, probe gets called and kicks off the driver initialization
      by spawning the 'vc_event_task' work queue which in turn calls the
      'hard reset' function. As part of that, the mailbox is initialized which
      is used to send/receive the virtchnl messages to/from the CP. Once that is
      done, 'core init' kicks in which requests all the required global resources
      from the CP and spawns the 'init_task' work queue to create the vports.
      
      Based on the capability information received, the driver creates the said
      number of vports (one or many) where each vport is associated to a netdev.
      Also, each vport has its own resources such as queues, vectors etc.
      From there, rest of the netdev_ops and data path are added.
      
      IDPF implements both the single queue model, which is the traditional
      queueing model, and the split queue model. In the split queue model, it
      uses separate queues for completion descriptors and for buffers, which
      helps to implement out-of-order completions. It also helps to implement
      asymmetric queues, for example multiple RX completion queues can be
      processed by a single RX buffer queue and multiple TX buffer queues can
      be processed by a single TX completion queue. In the single queue model,
      the same queue is used for both descriptor completions and buffer
      completions. IDPF also supports features such as generic checksum
      offload and generic receive offload (hardware GRO).
      ---
      v7:
      Patch 2:
       * removed pci_[disable|enable]_pcie_error_reporting as they are dropped
         from the core
      Patch 4, 9:
       * used 'kasprintf' instead of 'snprintf' to avoid providing explicit
         character string size which also fixes "-Wformat-truncation" warnings
      Patch 14:
       * used 'ethtool_sprintf' instead of 'snprintf' to avoid providing explicit
         character string size which also fixes "-Wformat-truncation" warning
       * add string format argument to the 'ethtool_sprintf' to avoid warning on
         "-Wformat-security"
      
      v6: https://lore.kernel.org/netdev/20230825235954.894050-1-pavan.kumar.linga@intel.com/
      Note: 'Acked-by' was only added to patches 1, 2, 12 and not to the other
         patches because of the changes in v6
      
      Patch 3, 4, 5, 6, 7, 8, 9, 11, 13, 14, 15:
       * renamed 'reset_lock' to 'vport_ctrl_lock' to reflect the lock usage
       * to avoid defensive programming, used 'vport_ctrl_lock' for the user
         callbacks that access the 'vport' to prevent the hardware reset thread
         from releasing the 'vport', when the user callback is in progress
       * added some variables to netdev private structure to avoid vport access
         if possible from ethtool and ndo callbacks
       * moved 'mac_filter_list_lock' and MAC related flags to vport_config
         structure and refactored mac filter flow to handle asynchronous
         ndo mac filter callbacks
       * stop the queues before starting the reset flow to avoid TX hangs
       * removed 'sw_mutex' and 'stop_mutex' as they are not needed anymore
       * added missing clear bit in 'init_task' error path
       * renamed labels appropriately
      Patch 8:
       * replaced page_pool_put_page with page_pool_put_full_page
       * for the page pool max_len, used PAGE_SIZE
      Patch 10, 11, 13:
       * made use of the 'netif_txq_maybe_stop', '__netif_txq_completed_wake'
         helper macros
      Patch 13:
       * removed IDPF_HR_RESET_IN_PROG flag check in idpf_tx_singleq_start
         as it is defensive
      Patch 14:
       * removed max descriptor check as the core does that
       * removed unnecessary error messages
       * removed the stats that are common between the ones reported by ethtool
         and ip link
       * replaced snprintf with ethtool_sprintf
       * added a comment to explain the reason for the max queue check
       * as the netdev queues are set on alloc, there is no need to set
         them again on reset unless there is a queue change, so move the
         'idpf_set_real_num_queues' to 'idpf_initiate_soft_reset'
      Patch 15:
       * reworded the 'configure SRIOV' in the commit message
      
      v5: https://lore.kernel.org/netdev/20230816004305.216136-1-anthony.l.nguyen@intel.com/
      Most Patches:
       * wrapped lines at 80 chars where that doesn't affect readability
      Patch 12:
       * in skb_add_rx_frag, offset 'headlen' w.r.t page_offset when adding a
         frag to avoid adding the header again
      Patch 14:
       * added NULL check for 'rxq' when dereferencing it in page_pool_get_stats
      
      v4: https://lore.kernel.org/netdev/20230808003416.3805142-1-anthony.l.nguyen@intel.com/
      Patch 1:
       * s/virtcnl/virtchnl
       * removed the kernel doc for the error code definitions that don't exist
       * reworded the summary part in the virtchnl2 header
      Patch 3:
       * don't set local variable to NULL on error
       * renamed sq_send_command_out label with err_unlock
       * don't use __GFP_ZERO in dma_alloc_coherent
      Patch 4:
       * introduced mailbox workqueue to process mailbox interrupts
      Patch 3, 4, 5, 6, 7, 8, 9, 11, 15:
       * removed unnecessary variable 0-init
      Patch 3, 5, 7, 8, 9, 15:
       * removed defensive programming checks wherever applicable
       * removed IDPF_CAP_FIELD_LAST as it can be treated as defensive
         programming
      Patch 3, 4, 5, 6, 7:
       * replaced IDPF_DFLT_MBX_BUF_SIZE with IDPF_CTLQ_MAX_BUF_LEN
      Patch 2 to 15:
       * add kernel-doc for idpf.h and idpf_txrx.h enums and structures
      Patch 4, 5, 15:
       * adjusted the destroy sequence of the workqueues as per the alloc
         sequence
      Patch 4, 5, 9, 15:
       * scrub unnecessary flags in 'idpf_flags'
         - IDPF_REMOVE_IN_PROG flag can take care of the cases where
           IDPF_REL_RES_IN_PROG is used, removed the latter one
         - IDPF_REQ_[TX|RX]_SPLITQ are replaced with struct variables
         - IDPF_CANCEL_[SERVICE|STATS]_TASK are redundant as the work queue
           doesn't get rescheduled again after 'cancel_delayed_work_sync'
         - IDPF_HR_CORE_RESET is removed as there is no set_bit for this flag
         - IDPF_MB_INTR_TRIGGER is removed as it is not needed anymore with the
           mailbox workqueue implementation
      Patch 7 to 15:
       * replaced the custom buffer recycling code with page pool API
       * switched the header split buffer allocations from using a bunch of
         pages to using one large chunk of DMA memory
       * reordered some of the flows in vport_open to support page pool
      Patch 8, 12:
       * don't suppress the alloc errors by using __GFP_NOWARN
      Patch 9:
       * removed dyn_ctl_clrpba_m as it is not being used
      Patch 14:
       * introduced enum idpf_vport_reset_cause instead of using vport flags
       * introduced page pool stats
      
      v3: https://lore.kernel.org/netdev/20230616231341.2885622-1-anthony.l.nguyen@intel.com/
      Patch 5:
       * instead of void, used 'struct virtchnl2_create_vport' type for
         vport_params_recvd and vport_params_reqd and removed the typecasting
       * used u16/u32 as needed instead of int for variables which cannot be
         negative and updated in all the places wherever applicable
      Patch 6:
       * changed the commit message to "add ptypes and MAC filter support"
       * used the sender Signed-off-by as the last tag on all the patches
       * removed unnecessary variables 0-init
       * instead of fixing the code in this commit, fixed it in the commit
         where the change was introduced first
       * moved get_type_info struct on to the stack instead of memory alloc
       * moved mutex_lock and ptype_info memory alloc outside while loop and
         adjusted the return flow
       * used 'break' instead of 'continue' in ptype id switch case
      
      v2: https://lore.kernel.org/netdev/20230614171428.1504179-1-anthony.l.nguyen@intel.com/
      Patch 2:
       * added "Intel(R)" to the DRV_SUMMARY and Makefile.
      Patch 4, 5, 6, 15:
       * replaced IDPF_VC_MSG_PENDING flag with mutex 'vc_buf_lock' for the
         adapter related virtchnl opcodes.
       * get the mutex lock in the virtchnl send thread itself instead of
         in receive thread.
      Patch 5, 6, 7, 8, 9, 11, 14, 15:
       * replaced IDPF_VPORT_VC_MSG_PENDING flag with mutex 'vc_buf_lock' for
         the vport related virtchnl opcodes.
       * get the mutex lock in the virtchnl send thread itself instead of
         in receive thread.
      Patch 6:
       * converted get_ptype_info logic from 1:N to 1:1 message exchange for
         better handling of mutex lock.
      Patch 15:
       * introduced 'stats_lock' spinlock to avoid concurrent stats update.
      
      v1: https://lore.kernel.org/netdev/20230530234501.2680230-1-anthony.l.nguyen@intel.com/
      
      
      
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b6a7eeb4
    • Merge branch 'loongson1-mac' · 2fa6175d
      David S. Miller authored
      
      
      Keguang Zhang says:
      
      ====================
      Move Loongson1 MAC arch-code to the driver dir
      
      In order to convert Loongson1 MAC platform devices to the devicetree
      nodes, Loongson1 MAC arch-code should be moved to the driver dir.
      Add dt-binding document and update MAINTAINERS file accordingly.
      
      In other words, this patchset is a preparation for converting
      Loongson1 platform devices to devicetree.
      
      Changelog
      V4 -> V5: Replace stmmac_probe_config_dt() with devm_stmmac_probe_config_dt()
                Replace stmmac_pltfr_probe() with devm_stmmac_pltfr_probe()
                Squash patch 4 into patch 2 and 3
      V3 -> V4: Add Acked-by tag from Krzysztof Kozlowski
                Add "|" to description part
                Amend "phy-mode" property
                Drop ls1x_dwmac_syscon definition and its instances
                Drop three redundant fields from the ls1x_dwmac structure
                Drop the ls1x_dwmac_init() method.
                Update the dt-binding document entry of Loongson1 Ethernet
                Some minor improvements
      V2 -> V3: Split the DT-schema file into loongson,ls1b-gmac.yaml
                and loongson,ls1c-emac.yaml (suggested by Serge Semin)
                Change the compatibles to loongson,ls1b-gmac and loongson,ls1c-emac
                Rename loongson,dwmac-syscon to loongson,ls1-syscon
                Amend the title
                Add description
                Add Reviewed-by tag from Krzysztof Kozlowski
                Change compatibles back to loongson,ls1b-syscon
                and loongson,ls1c-syscon
                Determine the device ID by physical
                base address(suggested by Serge Semin)
                Use regmap instead of regmap fields
                Use syscon_regmap_lookup_by_phandle()
                Some minor fixes
                Update the entries of MAINTAINERS
      V1 -> V2: Leave the Ethernet platform data for now
                Make the syscon compatibles more specific
                Fix "clock-names" and "interrupt-names" property
                Rename the syscon property to "loongson,dwmac-syscon"
                Drop "phy-handle" and "phy-mode" requirement
                Revert adding loongson,ls1b-dwmac/loongson,ls1c-dwmac
                to snps,dwmac.yaml
                Fix the build errors due to CONFIG_OF being unset
                Change struct reg_field definitions to const
                Rename the syscon property to "loongson,dwmac-syscon"
                Add MII PHY mode for LS1C
                Improve the commit message
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2fa6175d
    • net: stmmac: Add glue layer for Loongson-1 SoC · d301c66b
      Keguang Zhang authored
      
      
      This glue driver is created based on the arch-code
      implemented earlier with the platform-specific settings.
      
      Use syscon for SYSCON register access.
      
      And modify MAINTAINERS to add a new F: entry for this driver.
      
      Partially based on the previous work by Serge Semin.
      
      Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d301c66b
    • dt-bindings: net: Add Loongson-1 Ethernet Controller · 2af21077
      Keguang Zhang authored
      
      
      Add devicetree binding document for Loongson-1 Ethernet controller.
      And modify MAINTAINERS to add a new F: entry for
      Loongson1 dt-binding documents.
      
      Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com>
      Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2af21077
    • dt-bindings: mfd: syscon: Add compatibles for Loongson-1 syscon · 7e10088b
      Keguang Zhang authored
      
      
      Add Loongson LS1B and LS1C compatibles for system controller.
      
      Signed-off-by: Keguang Zhang <keguang.zhang@gmail.com>
      Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7e10088b
    • sfc: make coding style of PTP addresses consistent with core · 487e1937
      Alex Austin authored
      
      
      Follow the style used in the core kernel (e.g.
      include/linux/etherdevice.h and include/linux/in6.h) for the PTP IPv6
      and Ethernet addresses. No functional changes.
      
      Signed-off-by: Alex Austin <alex.austin@amd.com>
      Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      487e1937
    • net: ethernet: mtk_wed: do not assume offload callbacks are always set · 01b38de1
      Lorenzo Bianconi authored
      
      
      Check if wlan.offload_enable and wlan.offload_disable callbacks are set
      in mtk_wed_flow_add/mtk_wed_flow_remove since mt7996 will not rely
      on them.
      
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Reviewed-by: Simon Horman <horms@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      01b38de1
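The optional-callback pattern this commit describes can be sketched in plain C. The names below (`struct model_wlan_ops`, `model_flow_add()`, `model_flow_remove()` and the two sample callbacks) are hypothetical stand-ins, not the mtk_wed driver's types. Treating a missing callback as success is an assumption made for the illustration.

```c
#include <assert.h>

/* Hypothetical ops table whose callbacks may be left unset (NULL). */
struct model_wlan_ops {
	int (*offload_enable)(void *priv);
	void (*offload_disable)(void *priv);
};

/* Sample callbacks that count active offloads through priv. */
static int model_enable(void *priv)
{
	++*(int *)priv;
	return 0;
}

static void model_disable(void *priv)
{
	--*(int *)priv;
}

/* Check the pointer before calling. With no callback there is nothing
 * to do, so report success (assumption). */
static int model_flow_add(const struct model_wlan_ops *ops, void *priv)
{
	assert(ops);
	if (!ops->offload_enable)
		return 0;
	return ops->offload_enable(priv);
}

static void model_flow_remove(const struct model_wlan_ops *ops, void *priv)
{
	assert(ops);
	if (ops->offload_disable)
		ops->offload_disable(priv);
}
```

A device that leaves both pointers unset then passes through both paths without a NULL dereference.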
    • net: add truesize debug checks in skb_{add|coalesce}_rx_frag() · c123e0d3
      Eric Dumazet authored
      
      
      It can be time consuming to track driver bugs, that might be detected
      too late from this confusing warning in skb_try_coalesce()
      
      	WARN_ON_ONCE(delta < len);
      
      Add sanity check in skb_add_rx_frag() and skb_coalesce_rx_frag()
      to better track bug origin for CONFIG_DEBUG_NET=y builds.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c123e0d3
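The kind of sanity check described here can be illustrated with a simplified model. `struct model_skb` and `model_add_rx_frag()` are hypothetical and are not the kernel's `struct sk_buff` or `skb_add_rx_frag()`. The model rejects a bad fragment so the result is easy to observe, whereas the commit describes a debug-build check intended to flag the bug at its origin.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a socket buffer. */
struct model_skb {
	unsigned int len;      /* payload bytes */
	unsigned int truesize; /* memory accounted to the buffer */
};

/* Append a fragment of `size` payload bytes backed by `truesize` bytes
 * of memory. A fragment cannot hold more payload than the memory it
 * accounts for, so size > truesize points at a driver accounting bug. */
static bool model_add_rx_frag(struct model_skb *skb, unsigned int size,
			      unsigned int truesize)
{
	assert(skb);
	if (size > truesize)
		return false; /* a debug build would report this here */
	skb->len += size;
	skb->truesize += truesize;
	return true;
}
```

Checking at the point where the fragment is added ties the failure to the driver call that supplied the wrong value, instead of to a later coalescing step.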
    • net: use indirect call helpers for sk->sk_prot->release_cb() · 41862d12
      Eric Dumazet authored
      
      
      When adding sk->sk_prot->release_cb() call from __sk_flush_backlog()
      Paolo suggested using indirect call helpers to take care of
      CONFIG_RETPOLINE=y case.
      
      It turns out Google had such mitigation for years in release_sock(),
      it is time to make this public :)
      
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      41862d12
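The idea behind indirect call helpers can be imitated in userspace as a rough sketch: compare the function pointer against the most likely target and call that target directly on a match, so the common case avoids an indirect branch. `MODEL_INDIRECT_CALL_1`, `struct model_sock` and the callbacks below are hypothetical names for illustration and are not the kernel's macros or types. `__builtin_expect` assumes GCC or Clang.

```c
#include <assert.h>

/* Imitation of an indirect-call helper: direct call when the pointer
 * matches the expected target, plain indirect call otherwise. */
#define MODEL_INDIRECT_CALL_1(f, f1, ...) \
	(__builtin_expect((f) == (f1), 1) ? f1(__VA_ARGS__) : (f)(__VA_ARGS__))

struct model_sock {
	int released;
};

/* The expected (most common) release callback. */
static void model_tcp_release_cb(struct model_sock *sk)
{
	sk->released += 1;
}

/* Some other protocol's callback, reached through the fallback path. */
static void model_other_release_cb(struct model_sock *sk)
{
	sk->released += 100;
}

/* Run the release callback if one is set. */
static void model_release(struct model_sock *sk,
			  void (*release_cb)(struct model_sock *))
{
	assert(sk);
	if (release_cb)
		MODEL_INDIRECT_CALL_1(release_cb, model_tcp_release_cb, sk);
}
```

Both paths produce the same result. The difference is only in how the call is emitted, which matters when indirect branches are expensive, as with CONFIG_RETPOLINE=y.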
  3. Sep 15, 2023