  1. Oct 19, 2022
    • net: ethernet: adi: adin1110: Fix SPI transfers · a526a3cc
      Alexandru Tachici authored
      No need to use more than one SPI transfer for reads.
      Use only one from now as ADIN1110/2111 does not tolerate
      CS changes during reads.
      
      The BCM2711/2708 SPI controllers worked fine, but the NXP
      IMX8MM could not keep CS lowered during SPI bursts.
      
      This change aims to make the ADIN1110/2111 driver compatible
      with both SPI controllers, without any loss of bandwidth/other
      capabilities.
      
      Fixes: bc93e19d ("net: ethernet: adi: Add ADIN1110 support")
      Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-bridge-mc-cleanups' · ac3208fb
      David S. Miller authored
      
      
      Ido Schimmel says:
      
      ====================
      bridge: A few multicast cleanups
      
      Clean up a few issues spotted while working on the bridge multicast code
      and running its selftests.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: mcast: Simplify MDB entry creation · d1942cd4
      Ido Schimmel authored
      
      
      Before creating a new MDB entry, br_multicast_new_group() will call
      br_mdb_ip_get() to see if one exists and return it if so.
      
      Therefore, simply call br_multicast_new_group() and omit the call to
      br_mdb_ip_get().
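
The reasoning above can be illustrated with a small user-space model of a get-or-create helper. This is only a sketch: `mdb_get()`, `mdb_new_group()`, the fixed-size table, and the entry layout are hypothetical stand-ins, not the bridge code. It shows why a separate lookup before the create call is redundant when the create call already performs the lookup itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_GROUPS 8

/* Hypothetical, simplified MDB entry (the real structure is much richer). */
struct mdb_entry {
	uint32_t group;	/* IPv4 group address, host byte order */
	int in_use;
};

static struct mdb_entry table[MAX_GROUPS];

/* Lookup only: returns the entry for @group, or NULL if none exists. */
static struct mdb_entry *mdb_get(uint32_t group)
{
	for (size_t i = 0; i < MAX_GROUPS; i++)
		if (table[i].in_use && table[i].group == group)
			return &table[i];
	return NULL;
}

/*
 * Get-or-create: returns the existing entry if there is one, otherwise
 * claims a free slot. Returns NULL when the table is full.
 */
static struct mdb_entry *mdb_new_group(uint32_t group)
{
	struct mdb_entry *e = mdb_get(group);

	if (e)
		return e;

	for (size_t i = 0; i < MAX_GROUPS; i++) {
		if (!table[i].in_use) {
			table[i].group = group;
			table[i].in_use = 1;
			return &table[i];
		}
	}
	return NULL;
}
```

A caller that needs "the entry for this group, created if necessary" can call `mdb_new_group()` alone; calling `mdb_get()` first would only repeat the same search.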
      
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bridge: mcast: Use spin_lock() instead of spin_lock_bh() · 262985fa
      Ido Schimmel authored
      
      
      IGMPv3 / MLDv2 Membership Reports are only processed from the data path
      with softIRQ disabled, so there is no need to call spin_lock_bh(). Use
      spin_lock() instead.
      
      This is consistent with how other IGMP / MLD packets are processed.
      
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: bridge_igmp: Remove unnecessary address deletion · b526b2ea
      Ido Schimmel authored
      
      
      The test group address is added and removed in v2reportleave_test().
      There is no need to delete it again during cleanup as it results in the
      following error message:
      
       # bash -x ./bridge_igmp.sh
       [...]
       + cleanup
       + pre_cleanup
       [...]
       + ip address del dev swp4 239.10.10.10/32
       RTNETLINK answers: Cannot assign requested address
       + h2_destroy
      
      Solve by removing the unnecessary address deletion.
      
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: bridge_vlan_mcast: Delete qdiscs during cleanup · 6fb1faa1
      Ido Schimmel authored
      
      
      The qdiscs are added during setup, but not deleted during cleanup,
      resulting in the following error messages:
      
       # ./bridge_vlan_mcast.sh
       [...]
       # ./bridge_vlan_mcast.sh
       Error: Exclusivity flag on, cannot modify.
       Error: Exclusivity flag on, cannot modify.
      
      Solve by deleting the qdiscs during cleanup.
      
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'dpaa-phylink' · 5cacb2c7
      David S. Miller authored
      
      
      Sean Anderson says:
      
      ====================
      net: dpaa: Convert to phylink
      
      This series converts the DPAA driver to phylink.
      
      I have tried to maintain backwards compatibility with existing device
      trees wherever possible. However, one area where I was unable to
      achieve this was with QSGMII. Please refer to patch 2 for details.
      
      All mac drivers have now been converted. I would greatly appreciate if
      anyone has T-series or P-series boards they can test/debug this series
      on. I only have an LS1046ARDB. Everything but QSGMII should work without
      breakage; QSGMII needs patches 7 and 8. For this reason, the last 4
      patches in this series should be applied together (and should not go
      through separate trees).
      
      Changes in v7:
      - provide phylink_validate_mask_caps() helper
      - Fix oops if memac_pcs_create returned -EPROBE_DEFER
      - Fix using pcs-names instead of pcs-handle-names
      - Fix not checking for -ENODATA when looking for sgmii pcs
      - Fix 81-character line
      - Simplify memac_validate with phylink_validate_mask_caps
      
      Changes in v6:
      - Remove unnecessary $ref from renesas,rzn1-a5psw
      - Remove unnecessary type from pcs-handle-names
      - Add maxItems to pcs-handle
      - Fix 81-character line
      - Fix uninitialized variable in dtsec_mac_config
      
      Changes in v5:
      - Add Lynx PCS binding
      
      Changes in v4:
      - Use pcs-handle-names instead of pcs-names, as discussed
      - Don't fail if phy support was not compiled in
      - Split off rate adaptation series
      - Split off DPAA "preparation" series
      - Split off Lynx 10G support
      - t208x: Mark MAC1 and MAC2 as 10G
      - Add XFI PCS for t208x MAC1/MAC2
      
      Changes in v3:
      - Expand pcs-handle to an array
      - Add vendor prefix 'fsl,' to rgmii and mii properties.
      - Set maxItems for pcs-names
      - Remove phy-* properties from example because dt-schema complains and I
        can't be bothered to figure out how to make it work.
      - Add pcs-handle as a preferred version of pcsphy-handle
      - Deprecate pcsphy-handle
      - Remove mii/rmii properties
      - Put the PCS mdiodev only after we are done with it (since the PCS
        does not perform a get itself).
      - Remove _return label from memac_initialization in favor of returning
        directly
      - Fix grabbing the default PCS not checking for -ENODATA from
        of_property_match_string
      - Set DTSEC_ECNTRL_R100M in dtsec_link_up instead of dtsec_mac_config
      - Remove rmii/mii properties
      - Replace 1000Base... with 1000BASE... to match IEEE capitalization
      - Add compatibles for QSGMII PCSs
      - Split arm and powerpc dts updates
      
      Changes in v2:
      - Better document how we select which PCS to use in the default case
      - Move PCS_LYNX dependency to fman Kconfig
      - Remove unused variable slow_10g_if
      - Restrict valid link modes based on the phy interface. This is easier
        to set up, and mostly captures what I intended to do the first time.
        We now have a custom validate which restricts half-duplex for some SoCs
        for RGMII, but generally just uses the default phylink validate.
      - Configure the SerDes in enable/disable
      - Properly implement all ethtool ops and ioctls. These were mostly
        stubbed out just enough to compile last time.
      - Convert 10GEC and dTSEC as well
      - Fix capitalization of mEMAC in commit messages
      - Add nodes for QSGMII PCSs
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arm64: dts: layerscape: Add nodes for QSGMII PCSs · 4e748b1b
      Sean Anderson authored
      
      
      Now that we actually read registers from QSGMII PCSs, it's important
      that we have the correct address (instead of hoping that we're the MAC
      with all the QSGMII PCSs on its bus). This adds nodes for the QSGMII
      PCSs.  The exact mapping of QSGMII to MACs depends on the SoC.
      
      Since the first QSGMII PCSs share an address with the SGMII and XFI
      PCSs, we only add new nodes for PCSs 2-4. This avoids address conflicts
      on the bus.
      
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • powerpc: dts: qoriq: Add nodes for QSGMII PCSs · 4e31b808
      Sean Anderson authored
      
      
      Now that we actually read registers from QSGMII PCSs, it's important
      that we have the correct address (instead of hoping that we're the MAC
      with all the QSGMII PCSs on its bus). This adds nodes for the QSGMII
      PCSs. They have the same addresses on all SoCs (e.g. if QSGMIIA is
      present it's used for MACs 1 through 4).
      
      Since the first QSGMII PCSs share an address with the SGMII and XFI
      PCSs, we only add new nodes for PCSs 2-4. This avoids address conflicts
      on the bus.
      
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • powerpc: dts: t208x: Mark MAC1 and MAC2 as 10G · 36926a7d
      Sean Anderson authored
      On the T208X SoCs, MAC1 and MAC2 support XGMII. Add some new MAC dtsi
      fragments, and mark the QMAN ports as 10G.
      
      Fixes: da414bb9 ("powerpc/mpc85xx: Add FSL QorIQ DPAA FMan support to the SoC device tree(s)")
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dpaa: Convert to phylink · 5d93cfcf
      Sean Anderson authored
      
      
      This converts DPAA to phylink. All macs are converted. This should work
      with no device tree modifications (including those made in this series),
      except for QSGMII (as noted previously).
      
      The mEMAC configuration is one of the trickier areas. I have tried to
      capture all the restrictions across the various models. Most of the time,
      we assume that if the serdes supports a mode or the phy-interface-mode
      specifies it, then we support it. The only place we can't do this is
      (RG)MII, since there's no serdes. In that case, we rely on a (new)
      devicetree property. There are also several cases where half-duplex is
      broken. Unfortunately, only a single compatible is used for the MAC, so we
      have to use the board compatible instead.
      
      The 10GEC conversion is very straightforward, since it only supports XAUI.
      There is generally nothing to configure.
      
      The dTSEC conversion is broadly similar to mEMAC, but is simpler because we
      don't support configuring the SerDes (though this can be easily added) and
      we don't have multiple PCSs. From what I can tell, there's nothing
      different in the driver or documentation between SGMII and 1000BASE-X
      except for the advertising. Similarly, I couldn't find anything about
      2500BASE-X. In both cases, I treat them like SGMII. These modes aren't used
      by any in-tree boards. Similarly, despite being mentioned in the driver, I
      couldn't find any documented SoCs which supported QSGMII.  I have left it
      unimplemented for now.
      
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fman: memac: Use lynx pcs driver · a7c2a32e
      Sean Anderson authored
      
      
      Although not stated in the datasheet, as far as I can tell PCS for mEMACs
      is a "Lynx." By reusing the existing driver, we can remove the PCS
      management code from the memac driver. This requires calling some PCS
      functions manually which phylink would usually do for us, but we will let
      it do that soon.
      
      One problem is that we don't actually have a PCS for QSGMII. We pretend
      that each mEMAC's MDIO bus has four QSGMII PCSs, but this is not the case.
      Only the "base" mEMAC's MDIO bus has the four QSGMII PCSs. This is not an
      issue yet, because we never get the PCS state. However, it will be once the
      conversion to phylink is complete, since the links will appear to never
      come up. To get around this, we allow specifying multiple PCSs in pcsphy.
      This breaks backwards compatibility with old device trees, but only for
      QSGMII. IMO this is the only reasonable way to figure out what the actual
      QSGMII PCS is.
      
      Additionally, we now also support a separate XFI PCS. This can allow the
      SerDes driver to set different addresses for the SGMII and XFI PCSs so they
      can be accessed at the same time.
      
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: fman: memac: Add serdes support · 0fc83bd7
      Sean Anderson authored
      
      
      This adds support for using a serdes which has to be configured. This is
      primarily in preparation for phylink conversion, which will then change the
      serdes mode dynamically.
      
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phylink: provide phylink_validate_mask_caps() helper · f392a184
      Russell King (Oracle) authored
      
      
      Provide a helper that restricts the link modes according to the
      phylink capabilities.
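
As a rough illustration of what "restricting the link modes according to the capabilities" means, the sketch below models link modes as bits in a plain integer and clears every mode the capabilities do not allow. The bit names, `struct link_state`, and `validate_mask_caps()` are hypothetical simplifications chosen for this example; the kernel helper works on ethtool link-mode bitmaps and derives the allowed set from the interface mode and MAC capabilities.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical link-mode bits for this model only. */
enum {
	MODE_10_HALF   = 1u << 0,
	MODE_10_FULL   = 1u << 1,
	MODE_100_HALF  = 1u << 2,
	MODE_100_FULL  = 1u << 3,
	MODE_1000_HALF = 1u << 4,
	MODE_1000_FULL = 1u << 5,
};

/* Simplified link state: only the advertised modes are modelled. */
struct link_state {
	uint32_t advertising;
};

/*
 * Keep only the modes permitted by @caps_mask, in both the supported
 * set and the advertised set.
 */
static void validate_mask_caps(uint32_t *supported, struct link_state *state,
			       uint32_t caps_mask)
{
	*supported &= caps_mask;
	state->advertising &= caps_mask;
}
```

With a capability mask that contains only full-duplex modes, any half-duplex bits are dropped from both sets, which is the kind of restriction a MAC with broken half-duplex would want.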
      
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      [rebased on net-next/master and added documentation]
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: fman: Add additional interface properties · 045d0501
      Sean Anderson authored
      
      
      At the moment, mEMACs are configured almost completely based on the
      phy-connection-type. That is, if the phy interface is RGMII, it is assumed
      that RGMII is supported. For some interfaces, it is assumed that the
      RCW/bootloader has set up the SerDes properly. This is generally OK, but
      restricts runtime reconfiguration. The actual link state is never
      reported.
      
      To address these shortcomings, the driver will need additional
      information. First, it needs to know how to access the PCS/PMAs (in
      order to configure them and get the link status). The SGMII PCS/PMA is
      the only currently-described PCS/PMA. Add the XFI and QSGMII PCS/PMAs as
      well. The XFI (and 10GBASE-KR) PCS/PMA is a c45 "phy" which sits on the
      same MDIO bus as SGMII PCS/PMA. By default they will have conflicting
      addresses, but they are also not enabled at the same time by default.
      Therefore, we can let the XFI PCS/PMA be the default when
      phy-connection-type is xgmii. This will allow for
      backwards-compatibility.
      
      QSGMII, however, cannot work with the current binding. This is because
      the QSGMII PCS/PMAs are only present on one MAC's MDIO bus. At the
      moment this is worked around by having every MAC write to the PCS/PMA
      addresses (without checking if they are present). This only works if
      each MAC has the same configuration, and only if we don't need to know
      the status. Because the QSGMII PCS/PMA will typically be located on a
      different MDIO bus than the MAC's SGMII PCS/PMA, there is no fallback
      for the QSGMII PCS/PMA.
      
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: Add Lynx PCS binding · 00af103d
      Sean Anderson authored
      
      
      This binding is fairly bare-bones for now, since the Lynx driver doesn't
      parse any properties (or match based on the compatible). We just need it
      in order to prevent the PCS nodes from having phy devices attached to
      them. This is not really a problem, but it is a bit inefficient.
      
      This binding is really for three separate PCSs (SGMII, QSGMII, and XFI).
      However, the driver treats all of them the same. This works because the
      SGMII and XFI devices typically use the same address, and the SerDes
      driver (or RCW) muxes between them. The QSGMII PCSs have the same
      register layout as the SGMII PCSs. To do things properly, we'd probably
      do something like
      
      	ethernet-pcs@0 {
      		#pcs-cells = <1>;
      		compatible = "fsl,lynx-pcs";
      		reg = <0>, <1>, <2>, <3>;
      	};
      
      but that would add complexity, and we can describe the hardware just
      fine using separate PCSs for now.
      
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: Expand pcs-handle to an array · 76025ee5
      Sean Anderson authored
      
      
      This allows multiple phandles to be specified for pcs-handle, such as
      when multiple PCSs are present for a single MAC. To differentiate
      between them, also add a pcs-handle-names property.
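
To make the role of the names property concrete, here is a minimal user-space sketch of resolving a name to an index in a parallel list of handles. `match_name()` is a hypothetical stand-in written for this example, and the example names are assumptions. It returns `-ENODATA` when the name is absent, mirroring the error that the series changelog says must be checked when looking up a PCS by name; the real device-tree lookup has further error cases that are not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the index of @want in @names, or -ENODATA if it is not listed.
 * The index selects the matching entry in the parallel array of handles.
 */
static int match_name(const char *const *names, size_t count, const char *want)
{
	for (size_t i = 0; i < count; i++)
		if (strcmp(names[i], want) == 0)
			return (int)i;
	return -ENODATA;
}
```

A caller would treat `-ENODATA` as "this PCS is not described" and decide separately whether a default applies, rather than treating every negative return as fatal.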
      
      Signed-off-by: Sean Anderson <sean.anderson@seco.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-marvell-yaml' · 88a2b3cb
      David S. Miller authored
      Michał Grzelak says:
      
      ====================
      net: further improvements to marvell,pp2.yaml
      
      This patchset addresses problems with reg ranges and
      additional $refs. It also limits phy-mode and aligns examples.
      
      Best regards,
      Michał
      
      ---
      Changelog:
      v4->v5
      - drop '+' from all patternProperties
      - restrict range of patternProperties to [0-2] in top level
      - drop the $ref in patternProperties:'^...':properties:reg
      - add patternProperties:'^...':properties:reg:maximum:2
      - drop $ref in patternProperties:'^...':properties:phys
      - add patternProperties:'^...':properties:phys:maxItems:1
      - limit phy-mode to the subset found in dts files
      - reflect the order of subnodes' properties in subnodes' required:
      - restrict range of pattern to [0-2] in marvell,armada-7k-pp22 case
      - restrict range of pattern to [0-1] in marvell,armada-375-pp2 case
      - align to 4 spaces all examples:
      - add specified maximum to allOf:if:then-else:properties:reg
      
      v3->v4
      - change commit message of first patch
      - move allOf:$ref to patternProperties:'^...':$ref
      - deprecate port-id in favour of reg
      - move reg to front of properties list in patternProperties
      - reflect the order of properties in required list in
        patternProperties
      - add unevaluatedProperties: false to patternProperties
      - change unevaluated- to additionalProperties at top level
      - add property phys: to ports subnode
      - extend example binding with additional information about phys and sfp
      - hook phys property to phy-consumer.yaml schema
      
      v2->v3
      - move 'reg:description' to 'allOf:if:then'
      - change '#size-cells: true' and '#address-cells: true'
        to '#size-cells: const: 0' and '#address-cells: const: 1'
      - replace all occurrences of pattern "^eth\{hex_num}*"
        with "^(ethernet-)?port@[0-9]+$"
      - add description in 'patternProperties:^...'
      - add 'patternProperties:^...:interrupt-names:minItems: 1'
      - add 'patternProperties:^...:reg:description'
      - update 'patternProperties:^...:port-id:description'
      - add 'patternProperties:^...:required: - reg'
      - update '*:description:' to uppercase
      - add 'allOf:then:required:marvell,system-controller'
      - skip quotation marks from 'allOf:$ref'
      - add 'else' schema to match 'allOf:if:then'
      - restrict 'clocks' in 'allOf:if:then'
      - restrict 'clock-names' in 'allOf:if:then'
      - add #address-cells=<1>; #size-cells=<0>; in 'examples:'
      - change every "ethX" to "ethernet-port@X" in 'examples:'
      - add "reg" and comment in all ports in 'examples:'
      - change /ethernet/eth0/phy-mode in examples://Armada-375
        to "rgmii-id"
      - replace each cpm_ with cp0_ in 'examples:'
      - replace each _syscon0 with _clk0 in 'examples:'
      - remove each eth0X label in 'examples:'
      - update armada-375.dtsi and armada-cp11x.dtsi to match
        marvell,pp2.yaml
      
      v1->v2
      - move 'properties' to the front of the file
      - remove blank line after 'properties'
      - move 'compatible' to the front of 'properties'
      - move 'clocks', 'clock-names' and 'reg' definitions to 'properties'
      - substitute all occurrences of 'marvell,armada-7k-pp2' with
        'marvell,armada-7k-pp22'
      - add properties:#size-cells and properties:#address-cells
      - specify list in 'interrupt-names'
      - remove blank lines after 'patternProperties'
      - remove '^interrupt' and '^#.*-cells$' patterns
      - remove blank line after 'allOf'
      - remove first 'if-then-else' block from 'allOf'
      - negate the condition in allOf:if schema
      - delete 'interrupt-controller' from section 'examples'
      - delete '#interrupt-cells' from section 'examples'
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ARM: dts: armada-375: Update network description to match schema · 844e4498
      Marcin Wojtas authored
      
      
      Update the PP2 ethernet ports subnodes' names to match
      schema enforced by the marvell,pp2.yaml contents.
      
      Add new required properties ('reg') which contains information
      about the port ID, keeping 'port-id' ones for backward
      compatibility.
      
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • arm64: dts: marvell: Update network description to match schema · 2994bf77
      Marcin Wojtas authored
      
      
      Update the PP2 ethernet ports subnodes' names to match
      schema enforced by the marvell,pp2.yaml contents.
      
      Add new required properties ('reg') which contains information
      about the port ID, keeping 'port-id' ones for backward
      compatibility.
      
      Signed-off-by: Marcin Wojtas <mw@semihalf.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • dt-bindings: net: marvell,pp2: convert to json-schema · c4d175c3
      Michał Grzelak authored
      
      
      Convert the marvell,pp2 bindings from text to proper schema.
      
      Move 'marvell,system-controller' and 'dma-coherent' properties from
      port up to the controller node, to match what is actually done in DT.
      
      Rename all subnodes to match "^(ethernet-)?port@[0-2]$" and deprecate
      port-id in favour of 'reg'.
      
      Signed-off-by: Michał Grzelak <mig@semihalf.com>
      Reviewed-by: Rob Herring <robh@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • enic: define constants for legacy interrupts offset · e2ac2a00
      Govindarajulu Varadarajan authored
      
      
      Use macro instead of function calls. These values are constant and will
      not change.
      
      Signed-off-by: Govindarajulu Varadarajan <govind.varadar@gmail.com>
      Link: https://lore.kernel.org/r/20221018005804.188643-1-govind.varadar@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: fec: remove the unused functions · f3d27ae0
      Shenwei Wang authored
      
      
      Removed those unused functions since we simplified the driver
      by using the page pool to manage RX buffers.
      
      Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
      Link: https://lore.kernel.org/r/20221017161236.1563975-1-shenwei.wang@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: remove smc911x driver · a2fd0844
      Arnd Bergmann authored
      This driver was used on Arm and SH machines until 2009, when the
      last platforms moved to the smsc911x driver for the same hardware.
      
      Time to retire this version.
      
      Link: https://lore.kernel.org/netdev/1232010482-3744-1-git-send-email-steve.glendinning@smsc.com/
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Link: https://lore.kernel.org/r/20221017121900.3520108-1-arnd@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 3566a79c
      Jakub Kicinski authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2022-10-18
      
      We've added 33 non-merge commits during the last 14 day(s) which contain
      a total of 31 files changed, 874 insertions(+), 538 deletions(-).
      
      The main changes are:
      
      1) Add RCU grace period chaining to BPF to wait for the completion
         of access from both sleepable and non-sleepable BPF programs,
         from Hou Tao & Paul E. McKenney.
      
      2) Improve helper UAPI by explicitly defining BPF_FUNC_xxx integer
         values. In the wild we have seen OS vendors doing buggy backports
         where helper call numbers mismatched. This is an attempt to make
         backports more foolproof, from Andrii Nakryiko.
      
      3) Add libbpf *_opts API-variants for bpf_*_get_fd_by_id() functions,
         from Roberto Sassu.
      
      4) Fix libbpf's BTF dumper for structs with padding-only fields,
         from Eduard Zingerman.
      
      5) Fix various libbpf bugs which have been found from fuzzing with
         malformed BPF object files, from Shung-Hsi Yu.
      
      6) Clean up an unneeded check on existence of SSE2 in BPF x86-64 JIT,
         from Jie Meng.
      
      7) Fix various ASAN bugs in both libbpf and selftests when running
         the BPF selftest suite on arm64, from Xu Kuohai.
      
      8) Fix missing bpf_iter_vma_offset__destroy() call in BPF iter selftest
         and use in-skeleton link pointer to remove an explicit bpf_link__destroy(),
         from Jiri Olsa.
      
      9) Fix BPF CI breakage by pointing to iptables-legacy instead of relying
         on symlinked iptables which got upgraded to iptables-nft,
         from Martin KaFai Lau.
      
      10) Minor BPF selftest improvements all over the place, from various others.
      
      * tag 'for-netdev' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (33 commits)
        bpf/docs: Update README for most recent vmtest.sh
        bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
        bpf: Use rcu_trace_implies_rcu_gp() in local storage map
        bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
        rcu-tasks: Provide rcu_trace_implies_rcu_gp()
        selftests/bpf: Use sys_pidfd_open() helper when possible
        libbpf: Fix null-pointer dereference in find_prog_by_sec_insn()
        libbpf: Deal with section with no data gracefully
        libbpf: Use elf_getshdrnum() instead of e_shnum
        selftest/bpf: Fix error usage of ASSERT_OK in xdp_adjust_tail.c
        selftests/bpf: Fix error failure of case test_xdp_adjust_tail_grow
        selftest/bpf: Fix memory leak in kprobe_multi_test
        selftests/bpf: Fix memory leak caused by not destroying skeleton
        libbpf: Fix memory leak in parse_usdt_arg()
        libbpf: Fix use-after-free in btf_dump_name_dups
        selftests/bpf: S/iptables/iptables-legacy/ in the bpf_nf and xdp_synproxy test
        selftests/bpf: Alphabetize DENYLISTs
        selftests/bpf: Add tests for _opts variants of bpf_*_get_fd_by_id()
        libbpf: Introduce bpf_link_get_fd_by_id_opts()
        libbpf: Introduce bpf_btf_get_fd_by_id_opts()
        ...
      ====================
      
      Link: https://lore.kernel.org/r/20221018210631.11211-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • bpf/docs: Update README for most recent vmtest.sh · 6c4e777f
      Daniel Müller authored
      Since commit 40b09653 ("selftests/bpf: Adjust vmtest.sh to use local
      kernel configuration") the vmtest.sh script no longer downloads a kernel
      configuration but uses the local, in-repository one.
      This change updates the README, which still mentions the old behavior.
      
      Signed-off-by: Daniel Müller <deso@posteo.net>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Link: https://lore.kernel.org/bpf/20221017232458.1272762-1-deso@posteo.net
    • Merge branch 'Remove unnecessary RCU grace period chaining' · 79d878f7
      Alexei Starovoitov authored
      Hou Tao says:
      
      ====================
      Now bpf uses RCU grace period chaining to wait for the completion of
      access from both sleepable and non-sleepable bpf programs: it first
      calls call_rcu_tasks_trace() to wait for an RCU-tasks-trace grace
      period, then in its callback calls call_rcu() or kfree_rcu() to wait for
      a normal RCU grace period.

      According to the implementation of RCU Tasks Trace, it invokes
      ->postscan_func() to wait for one RCU-tasks-trace grace period, and
      rcu_tasks_trace_postscan() in turn invokes synchronize_rcu() to wait for
      one normal RCU grace period, so one RCU-tasks-trace grace period
      implies one normal RCU grace period. To codify the implication,
      patch #1 introduces rcu_trace_implies_rcu_gp(), and the remaining
      patches use it to remove the unnecessary chaining.

      The other two uses of call_rcu_tasks_trace() are unchanged: for
      __bpf_prog_put_rcu() there is no gp chain and for
      __bpf_tramp_image_put_rcu_tasks() it chains RCU tasks trace GP and RCU
      tasks GP.
      
      An alternative way to remove these unnecessary RCU grace period
      chainings is to use the RCU polling API (e.g. get_state_synchronize_rcu())
      to check whether or not a normal RCU grace period has passed. But it
      needs space for an unsigned long for each freed element or each call,
      which is not affordable for local storage elements, so for now always use
      rcu_trace_implies_rcu_gp().
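
The space cost of the polling alternative can be illustrated with a small user-space model. This is only a sketch: the `model_*` functions and both structs are hypothetical stand-ins written for this illustration, the grace-period counter is a gross simplification of the real cookie semantics, and nothing here is kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: number of grace periods completed so far. */
static unsigned long model_gp_seq;

/* Stand-in for taking a cookie: names the grace period that must complete. */
static unsigned long model_get_state(void)
{
    return model_gp_seq + 1;
}

/* Stand-in for polling a cookie: true once that grace period has completed. */
static bool model_poll_state(unsigned long cookie)
{
    return model_gp_seq >= cookie;
}

/* Test hook: pretend one grace period has elapsed. */
static void model_grace_period_elapses(void)
{
    model_gp_seq++;
}

/* An element freed without polling needs no extra state. */
struct elem_plain {
    void *data;
};

/* With polling, every element (or call) must carry its own cookie. */
struct elem_with_cookie {
    void *data;
    unsigned long gp_cookie;
};
```

In this model each element grows by at least `sizeof(unsigned long)`, which is the per-element cost the cover letter describes as unaffordable for local storage elements.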
      
      Comments are always welcome.
      
      Change Log:
      
      v2:
       * codify the implication of an RCU Tasks Trace grace period instead of
         assuming it
      
      v1: https://lore.kernel.org/bpf/20221011071128.3470622-1-houtao@huaweicloud.com
      
      
      
      Hou Tao (3):
        bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator
        bpf: Use rcu_trace_implies_rcu_gp() in local storage map
        bpf: Use rcu_trace_implies_rcu_gp() for program array freeing
      ====================
      
      Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      79d878f7
    • Hou Tao's avatar
      bpf: Use rcu_trace_implies_rcu_gp() for program array freeing · 4835f9ee
      Hou Tao authored
      
      
      To support both sleepable and normal uprobe bpf programs, the freeing of
      the trace program array chains an RCU-tasks-trace grace period and a
      normal RCU grace period one after the other.
      
      With the introduction of rcu_trace_implies_rcu_gp(),
      __bpf_prog_array_free_sleepable_cb() can check whether or not a normal
      RCU grace period has also passed after an RCU-tasks-trace grace period
      has passed. If so, it is safe to invoke kfree() directly.
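
The check described above can be modelled in user space roughly as follows. This is a minimal sketch, not the kernel implementation: `rcu_trace_implies_rcu_gp()` is replaced by a toggleable stub, `model_kfree()` and `model_kfree_rcu()` are hypothetical stand-ins that only count calls, and `struct prog_array_model` is invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stub state; the real helper takes no configuration. */
static bool model_trace_implies_rcu_gp = true;
static int direct_frees;    /* freed right after the tasks-trace grace period */
static int deferred_frees;  /* handed on to wait for a normal RCU grace period */

/* Stand-in for the kernel helper of the same name. */
static bool rcu_trace_implies_rcu_gp(void)
{
    return model_trace_implies_rcu_gp;
}

/* Hypothetical object standing in for the trace program array. */
struct prog_array_model {
    int nr_progs;
};

/* Stand-in for kfree(): free immediately. */
static void model_kfree(void *p)
{
    direct_frees++;
    free(p);
}

/* Stand-in for kfree_rcu(): the model only records that a second
 * grace period would be needed, then frees. */
static void model_kfree_rcu(void *p)
{
    deferred_frees++;
    free(p);
}

/* Models the callback that runs after an RCU Tasks Trace grace period. */
static void prog_array_free_sleepable_cb(struct prog_array_model *progs)
{
    assert(progs != NULL);
    if (rcu_trace_implies_rcu_gp())
        model_kfree(progs);      /* normal RCU grace period already implied */
    else
        model_kfree_rcu(progs);  /* fall back to chaining a second one */
}
```

Keeping the fallback branch means the code stays correct if the implication between the two kinds of grace period ever stops holding.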
      
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Link: https://lore.kernel.org/r/20221014113946.965131-5-houtao@huaweicloud.com
      
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      4835f9ee
    • Hou Tao's avatar
      bpf: Use rcu_trace_implies_rcu_gp() in local storage map · d39d1445
      Hou Tao authored
      
      
      The local storage map is accessible to both sleepable and non-sleepable
      bpf programs, and its memory is freed by using both call_rcu_tasks_trace()
      and kfree_rcu() to wait for both an RCU-tasks-trace grace period and an
      RCU grace period to pass.
      
      With the introduction of rcu_trace_implies_rcu_gp(), both
      bpf_selem_free_rcu() and bpf_local_storage_free_rcu() can check whether
      or not a normal RCU grace period has also passed after an RCU-tasks-trace
      grace period has passed. If so, it is safe to call kfree()
      directly.
      
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Link: https://lore.kernel.org/r/20221014113946.965131-4-houtao@huaweicloud.com
      
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      d39d1445
    • Hou Tao's avatar
      bpf: Use rcu_trace_implies_rcu_gp() in bpf memory allocator · 59be91e5
      Hou Tao authored
      
      
      The memory free logic in the bpf memory allocator chains an RCU Tasks
      Trace grace period and a normal RCU grace period one after the other, so
      it can ensure that both sleepable and non-sleepable programs have finished.
      
      With the introduction of rcu_trace_implies_rcu_gp(),
      __free_rcu_tasks_trace() can check whether or not a normal RCU grace
      period has also passed after an RCU Tasks Trace grace period has passed.
      If so, it frees these elements directly; otherwise it frees them through
      call_rcu().
      
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Link: https://lore.kernel.org/r/20221014113946.965131-3-houtao@huaweicloud.com
      
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      59be91e5
    • Paul E. McKenney's avatar
      rcu-tasks: Provide rcu_trace_implies_rcu_gp() · e6c86c51
      Paul E. McKenney authored
      
      
      As an accident of implementation, an RCU Tasks Trace grace period also
      acts as an RCU grace period.  However, this could change at any time.
      This commit therefore creates an rcu_trace_implies_rcu_gp() that currently
      returns true to codify this accident.  Code relying on this accident
      must call this function to verify that this accident is still happening.
      
      Reported-by: Hou Tao <houtao@huaweicloud.com>
      Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Martin KaFai Lau <martin.lau@linux.dev>
      Link: https://lore.kernel.org/r/20221014113946.965131-2-houtao@huaweicloud.com
      
      
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      e6c86c51
  2. Oct 18, 2022
  3. Oct 14, 2022