  1. Jul 22, 2021
    • net: dsa: sja1105: make VID 4095 a bridge VLAN too · e40cba94
      Vladimir Oltean authored
      This simple series of commands:
      
      ip link add br0 type bridge vlan_filtering 1
      ip link set swp0 master br0
      
      fails on sja1105 with the following error:
      [   33.439103] sja1105 spi0.1: vlan-lookup-table needs to have at least the default untagged VLAN
      [   33.447710] sja1105 spi0.1: Invalid config, cannot upload
      Warning: sja1105: Failed to change VLAN Ethertype.
      
      For context, sja1105 has 3 operating modes:
      - SJA1105_VLAN_UNAWARE: the dsa_8021q_vlans are committed to hardware
      - SJA1105_VLAN_FILTERING_FULL: the bridge_vlans are committed to hardware
      - SJA1105_VLAN_FILTERING_BEST_EFFORT: both the dsa_8021q_vlans and the
        bridge_vlans are committed to hardware
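
The mapping from operating mode to committed VLAN lists can be sketched as below. This is an illustration only: the `sketch_` names and the bit flags are hypothetical and are not taken from the sja1105 driver.

```c
#include <assert.h>

/* Hypothetical names mirroring the three operating modes listed above. */
enum sketch_vlan_state {
    SKETCH_VLAN_UNAWARE,
    SKETCH_VLAN_FILTERING_FULL,
    SKETCH_VLAN_FILTERING_BEST_EFFORT,
};

/* Hypothetical bit flags for the two software VLAN lists. */
#define SKETCH_LIST_DSA_8021Q (1u << 0)
#define SKETCH_LIST_BRIDGE    (1u << 1)

/* Return which lists are committed to hardware in a given mode. */
static unsigned int sketch_lists_committed(enum sketch_vlan_state state)
{
    switch (state) {
    case SKETCH_VLAN_UNAWARE:
        return SKETCH_LIST_DSA_8021Q;
    case SKETCH_VLAN_FILTERING_FULL:
        return SKETCH_LIST_BRIDGE;
    case SKETCH_VLAN_FILTERING_BEST_EFFORT:
        return SKETCH_LIST_DSA_8021Q | SKETCH_LIST_BRIDGE;
    }
    assert(0 && "unknown VLAN state");
    return 0;
}
```

In full filtering mode only the bridge list is committed, so an empty bridge list means an empty hardware VLAN table.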
      
      Swapping one VLAN list out and another one in happens in
      sja1105_build_vlan_table(), which performs a delta update procedure.
      That function is called from a few places, notably from
      sja1105_vlan_filtering() which is called from the
      SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler.
      
      The above set of 2 commands fails when run on a kernel pre-commit
      8841f6e6 ("net: dsa: sja1105: make devlink property
      best_effort_vlan_filtering true by default"). So the priv->vlan_state
      transition that takes place is between VLAN-unaware and full VLAN
      filtering. So the dsa_8021q_vlans are swapped out and the bridge_vlans
      are swapped in.
      
      So why does it fail?
      
      Well, the bridge driver, through nbp_vlan_init(), first sets up the
      SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING attribute, and only then
      proceeds to call nbp_vlan_add for the default_pvid.
      
      So when we swap out the dsa_8021q_vlans and swap in the bridge_vlans in
      the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler, there are no bridge
      VLANs (yet). So we have wiped the VLAN table clean, and the low-level
      static config checker complains of an invalid configuration. We _will_
      add the bridge VLANs using the dynamic config interface, albeit later,
      when nbp_vlan_add() calls us. So it is natural that it fails.
      
      So why did it ever work?
      
      Surprisingly, it looks like I only tested this configuration with 2
      things set up in a particular way:
      - a network manager that brings all ports up
      - a kernel with CONFIG_VLAN_8021Q=y
      
      It is widely known that commit ad1afb00 ("vlan_dev: VLAN 0 should be
      treated as "no vlan tag" (802.1p packet)") installs VID 0 to every net
      device that comes up. DSA treats these VLANs as bridge VLANs, and
      therefore, in my testing, the list of bridge_vlans was never empty.
      
      However, if CONFIG_VLAN_8021Q is not enabled, or the port is not up when
      it joins a VLAN-aware bridge, the bridge_vlans list will be temporarily
      empty, and the sja1105_static_config_reload() call from
      sja1105_vlan_filtering() will fail.
      
      To fix this, the simplest thing is to keep VID 4095, the one used for
      CPU-injected control packets since commit ed040abc ("net: dsa:
      sja1105: use 4095 as the private VLAN for untagged traffic"), in the
      list of bridge VLANs too, not just the list of tag_8021q VLANs. This
      ensures that the list of bridge VLANs will never be empty.
      
      Fixes: ec5ae610 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
      Reported-by: Radu Pirea (NXP OSS) <radu-nicolae.pirea@oss.nxp.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: disable TFO blackhole logic by default · 213ad73d
      Wei Wang authored
      Multiple complaints have been raised from the TFO users on the internet
      stating that the TFO blackhole logic is too aggressive and gets falsely
      triggered too often.
      (e.g. https://blog.apnic.net/2021/07/05/tcp-fast-open-not-so-fast/)
      Considering that most middleboxes no longer drop TFO packets, we decide
      to disable the blackhole logic by setting
      /proc/sys/net/ipv4/tcp_fastopen_blackhole_timeout_sec to 0 by default.
      
      Fixes: cf1ef3f0 ("net/tcp_fastopen: Disable active side TFO in certain scenarios")
      Signed-off-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Acked-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: do not update transport pathmtu if SPP_PMTUD_ENABLE is not set · 02dc2ee7
      Xin Long authored
      Currently, in sctp_packet_config(), sctp_transport_pmtu_check() is
      called to update the transport pathmtu with the dst's mtu when the dst's
      mtu has been changed by a non-SCTP stack such as xfrm.

      However, this should only happen when SPP_PMTUD_ENABLE is set, no
      matter where the dst's mtu was changed. Fix this by checking the
      SPP_PMTUD_ENABLE flag before calling sctp_transport_pmtu_check().
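
A minimal sketch of the guard described above, written as a userspace illustration rather than the kernel code. The struct, the function, and the `SKETCH_` flag are hypothetical stand-ins; the real flag is defined in the SCTP uapi header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in flag for this sketch only; not the kernel definition. */
#define SKETCH_SPP_PMTUD_ENABLE (1u << 3)

/* Hypothetical, trimmed-down transport state. */
struct sketch_transport {
    uint32_t param_flags;
    uint32_t pathmtu;
};

/*
 * Follow the route (dst) MTU only when path MTU discovery is enabled
 * on the transport. Returns true if pathmtu was changed.
 */
static bool sketch_pmtu_check(struct sketch_transport *t, uint32_t dst_mtu)
{
    assert(t);
    if (!(t->param_flags & SKETCH_SPP_PMTUD_ENABLE))
        return false;   /* PMTUD disabled: leave pathmtu alone */
    if (t->pathmtu == dst_mtu)
        return false;   /* nothing to update */
    t->pathmtu = dst_mtu;
    return true;
}
```

With the flag cleared, a changed dst MTU leaves `pathmtu` untouched, which is the behaviour the patch asks for.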
      
      Thanks Jacek for reporting and looking into this issue.
      
      v1->v2:
        - add the missing "{" to fix the build error.
      
      Fixes: 69fec325 ('Revert "sctp: remove sctp_transport_pmtu_check"')
      Reported-by: Jacek Szafraniec <jacek.szafraniec@nokia.com>
      Tested-by: Jacek Szafraniec <jacek.szafraniec@nokia.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ixp46x: fix ptp build failure · 161dcc02
      Arnd Bergmann authored
      The rework of the ixp46x cpu detection left the network driver in
      a half broken state:
      
      drivers/net/ethernet/xscale/ptp_ixp46x.c: In function 'ptp_ixp_init':
      drivers/net/ethernet/xscale/ptp_ixp46x.c:290:51: error: 'IXP4XX_TIMESYNC_BASE_VIRT' undeclared (first use in this function)
        290 |                 (struct ixp46x_ts_regs __iomem *) IXP4XX_TIMESYNC_BASE_VIRT;
            |                                                   ^~~~~~~~~~~~~~~~~~~~~~~~~
      drivers/net/ethernet/xscale/ptp_ixp46x.c:290:51: note: each undeclared identifier is reported only once for each function it appears in
      drivers/net/ethernet/xscale/ptp_ixp46x.c: At top level:
      drivers/net/ethernet/xscale/ptp_ixp46x.c:323:1: error: data definition has no type or storage class [-Werror]
        323 | module_init(ptp_ixp_init);
      
      I have patches to complete the transition for a future release, but
      for the moment, add the missing include statements to get it to build
      again.
      
      Fixes: 09aa9aab ("soc: ixp4xx: move cpu detection to linux/soc/ixp4xx/cpu.h")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. Jul 21, 2021
    • ibmvnic: Remove the proper scrq flush · bb55362b
      Sukadev Bhattiprolu authored
      Commit 65d6470d ("ibmvnic: clean pending indirect buffs during reset")
      intended to remove the call to ibmvnic_tx_scrq_flush() when the
      ->resetting flag is true and was tested that way. But during the final
      rebase to net-next, the hunk got applied to a block a few lines below
      (which happened to have the same diff context) and the wrong call to
      ibmvnic_tx_scrq_flush() got removed.
      
      Fix that by removing the correct ibmvnic_tx_scrq_flush() and restoring
      the one that was incorrectly removed.
      
      Fixes: 65d6470d ("ibmvnic: clean pending indirect buffs during reset")
      Reported-by: Dany Madden <drt@linux.ibm.com>
      Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'pmtu-esp' · 3ddaed6b
      David S. Miller authored
      
      
      Vadim Fedorenko says:
      
      ====================
      Fix PMTU for ESP-in-UDP encapsulation
      
      Bug 213669 uncovered a regression in PMTU discovery for UDP-encapsulated
      routes and some incorrect usage of udp tunnel fields. This series fixes
      these problems and also adds this case to the selftests.
      
      v3:
       - update checking logic to account for the SCTP use case
      v2:
       - remove refactor code that was in first patch
       - move checking logic to __udp{4,6}_lib_err_encap
       - add more tests, especially routed configuration
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests: net: add ESP-in-UDP PMTU test · ece1278a
      Vadim Fedorenko authored
      
      
      The case of ESP-in-UDP encapsulation was not covered before. Add
      cases for local MTU changes and for an MTU difference on a routed path.
      
      Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp: check encap socket in __udp_lib_err · 9bfce73c
      Vadim Fedorenko authored
      Commit d26796ae ("udp: check udp sock encap_type in __udp_lib_err")
      added checks for encapsulated sockets, but it broke cases where the
      encapsulation does not implement encap_err_lookup, i.e. ESP-in-UDP
      encapsulation. Fix it by calling encap_err_lookup only if the socket
      implements this method; otherwise treat it as a legitimate socket.
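
The "call the hook only if it exists" rule can be sketched as follows. This is a userspace illustration under stated assumptions: the `sketch_` types and stub hooks are hypothetical, and the convention that the hook returns 0 on a match is assumed for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_sock;

/* Assumed convention: the hook returns 0 when the packet belongs to the socket. */
typedef int (*sketch_err_lookup_fn)(const struct sketch_sock *sk, const void *pkt);

/* Hypothetical, trimmed-down socket with an optional per-encapsulation hook. */
struct sketch_sock {
    sketch_err_lookup_fn encap_err_lookup; /* NULL when the encapsulation has no hook */
};

/* Decide whether an ICMP error may be delivered to this encap socket. */
static bool sketch_encap_sock_accepts(const struct sketch_sock *sk, const void *pkt)
{
    assert(sk);
    if (!sk->encap_err_lookup)
        return true;    /* no hook implemented: treat the socket as legitimate */
    return sk->encap_err_lookup(sk, pkt) == 0;
}

/* Stub hooks used only to exercise the sketch. */
static int sketch_lookup_match(const struct sketch_sock *sk, const void *pkt)
{
    (void)sk;
    (void)pkt;
    return 0;
}

static int sketch_lookup_miss(const struct sketch_sock *sk, const void *pkt)
{
    (void)sk;
    (void)pkt;
    return -1;
}
```

A socket without a hook is accepted unconditionally, while a socket with a hook is accepted only when the hook reports a match.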
      
      Fixes: d26796ae ("udp: check udp sock encap_type in __udp_lib_err")
      Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
      Reviewed-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: update active_key for asoc when old key is being replaced · 58acd100
      Xin Long authored
      syzbot reported a call trace:
      
        BUG: KASAN: use-after-free in sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112
        Call Trace:
         sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112
         sctp_set_owner_w net/sctp/socket.c:131 [inline]
         sctp_sendmsg_to_asoc+0x152e/0x2180 net/sctp/socket.c:1865
         sctp_sendmsg+0x103b/0x1d30 net/sctp/socket.c:2027
         inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:821
         sock_sendmsg_nosec net/socket.c:703 [inline]
         sock_sendmsg+0xcf/0x120 net/socket.c:723
      
      This is a use-after-free issue caused by not updating asoc->shkey after
      it was replaced in the key list asoc->endpoint_shared_keys, and the old
      key was freed.

      This patch fixes it by also updating active_key for asoc when the old key
      is being replaced with a new one. Note that this issue doesn't exist in
      sctp_auth_del_key_id(), as it's not allowed to delete the active_key
      from the asoc.
      
      Fixes: 1b1e0bc9 ("sctp: add refcnt support for sh_key")
      Reported-by: <syzbot+b774577370208727d12b@syzkaller.appspotmail.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: Avoid duplicate sysfs entry creation error · e9a72f87
      Sayanta Pattanayak authored
      When registering the MDIO bus for a r8169 device, we use the PCI
      bus/device specifier as a (seemingly) unique device identifier.
      However the very same BDF number can be used on another PCI segment,
      which makes the driver fail probing:
      
      [ 27.544136] r8169 0002:07:00.0: enabling device (0000 -> 0003)
      [ 27.559734] sysfs: cannot create duplicate filename '/class/mdio_bus/r8169-700'
      ....
      [ 27.684858] libphy: mii_bus r8169-700 failed to register
      [ 27.695602] r8169: probe of 0002:07:00.0 failed with error -22
      
      Add the segment number to the device name to make it more unique.
      
      This fixes operation on ARM N1SDP boards, with two boards connected
      together to form an SMP system, and all on-board devices showing up
      twice, just on different PCI segments. A similar issue would occur on
      large systems with many PCI slots and multiple RTL8169 NICs.
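
The name clash and its fix can be illustrated with a small userspace sketch. The helper names are hypothetical, and the exact "r8169-%x-%x" layout of the new name is an assumption made for illustration; only the old "r8169-700" form appears in the log above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Pack bus, device and function numbers the way a PCI BDF is usually encoded. */
static unsigned int sketch_pci_dev_id(unsigned int bus, unsigned int dev,
                                      unsigned int fn)
{
    return (bus << 8) | ((dev & 0x1f) << 3) | (fn & 0x7);
}

/* Old scheme: the name depends on the BDF only, so it repeats across segments. */
static int sketch_old_bus_id(char *buf, size_t len, unsigned int bus,
                             unsigned int dev, unsigned int fn)
{
    assert(buf);
    return snprintf(buf, len, "r8169-%x", sketch_pci_dev_id(bus, dev, fn));
}

/* New scheme (assumed format): prefix the BDF with the PCI segment number. */
static int sketch_new_bus_id(char *buf, size_t len, unsigned int segment,
                             unsigned int bus, unsigned int dev, unsigned int fn)
{
    assert(buf);
    return snprintf(buf, len, "r8169-%x-%x", segment,
                    sketch_pci_dev_id(bus, dev, fn));
}
```

For devices 0000:07:00.0 and 0002:07:00.0 the old helper produces the same string twice, while the new helper produces two distinct names.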
      
      Fixes: f1e911d5 ("r8169: add basic phylib support")
      Signed-off-by: Sayanta Pattanayak <sayanta.pattanayak@arm.com>
      [Andre: expand commit message, use pci_domain_nr()]
      Signed-off-by: Andre Przywara <andre.przywara@arm.com>
      Acked-by: Heiner Kallweit <hkallweit1@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ixgbe: Fix packet corruption due to missing DMA sync · 09cfae9f
      Markus Boehme authored
      When receiving a packet with multiple fragments, hardware may still
      touch the first fragment until the entire packet has been received. The
      driver therefore keeps the first fragment mapped for DMA until end of
      packet has been asserted, and delays its dma_sync call until then.
      
      The driver tries to fit multiple receive buffers on one page. When using
      3K receive buffers (e.g. using Jumbo frames and legacy-rx is turned
      off/build_skb is being used) on an architecture with 4K pages, the
      driver allocates an order 1 compound page and uses one page per receive
      buffer. To determine the correct offset for a delayed DMA sync of the
      first fragment of a multi-fragment packet, the driver then cannot just
      use PAGE_MASK on the DMA address but has to construct a mask based on
      the actual size of the backing page.
      
      Using PAGE_MASK in the 3K RX buffer/4K page architecture configuration
      will always sync the first page of a compound page. With the SWIOTLB
      enabled this can lead to corrupted packets (zeroed out first fragment,
      re-used garbage from another packet) and various consequences, such as
      slow/stalling data transfers and connection resets. For example, testing
      on a link with MTU exceeding 3058 bytes on a host with SWIOTLB enabled
      (e.g. "iommu=soft swiotlb=262144,force"), TCP transfers quickly fizzle
      out without this patch.
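
The difference between the two masks can be shown with plain address arithmetic. This is a userspace sketch, not the driver code: the helper names are hypothetical, a 4K base page is assumed, and the backing page is assumed to be naturally aligned to its own size.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed base page size for this sketch (4K pages, as in the text above). */
#define SKETCH_PAGE_SIZE 4096u

/* Offset computed with a single-page mask: wrong for order > 0 backing pages. */
static uint64_t sketch_offset_in_single_page(uint64_t addr)
{
    return addr & (SKETCH_PAGE_SIZE - 1);
}

/* Offset computed with a mask built from the actual backing page size. */
static uint64_t sketch_offset_in_backing_page(uint64_t addr, unsigned int order)
{
    uint64_t size = (uint64_t)SKETCH_PAGE_SIZE << order;

    assert((size & (size - 1)) == 0);   /* size is a power of two */
    return addr & (size - 1);
}
```

For a buffer that starts in the second 4K half of an order-1 page, the single-page mask yields an offset into the first half, so a sync based on it touches the wrong memory.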
      
      Cc: stable@vger.kernel.org
      Fixes: 0c5661ec ("ixgbe: fix crash in build_skb Rx code path")
      Signed-off-by: Markus Boehme <markubo@amazon.com>
      Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. Jul 20, 2021
  4. Jul 19, 2021
    • Merge branch 'bnxt_en-fixes' · 1dd271d9
      David S. Miller authored
      
      
      Michael Chan says:
      
      ====================
      bnxt_en: Bug fixes
      
      Most of the fixes in this series have to do with error recovery.  They
      include error path handling when the error recovery has to abort, and
      the rediscovery of capabilities (PTP and RoCE) after firmware reset
      that may result in capability changes.
      
      Two other fixes are to reject invalid ETS settings and to validate
      VLAN protocol in the RX path.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Fix PTP capability discovery · de5bf194
      Michael Chan authored
      The current PTP initialization logic does not account for firmware
      reset that may cause PTP capability to change.  The valid pointer
      bp->ptp_cfg is used to indicate that the device is capable of PTP
      and that it has been initialized.  So we must clean up bp->ptp_cfg
      and free it if the firmware after reset does not support PTP.
      
      Fixes: 93cb62d9 ("bnxt_en: Enable hardware PTP support")
      Cc: Richard Cochran <richardcochran@gmail.com>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Move bnxt_ptp_init() to bnxt_open() · d7859afb
      Michael Chan authored
      The device needs to be in ifup state for PTP to function, so move
      bnxt_ptp_init() to bnxt_open().  This means that the PHC will be
      registered during bnxt_open().
      
      This also makes firmware reset work correctly.  PTP configurations
      may change after firmware upgrade or downgrade.  bnxt_open() will
      be called after firmware reset, so it will work properly.
      
      bnxt_ptp_start() is now incorporated into bnxt_ptp_init().  We now
      also need to call bnxt_ptp_clear() in bnxt_close().
      
      Fixes: 93cb62d9 ("bnxt_en: Enable hardware PTP support")
      Cc: Richard Cochran <richardcochran@gmail.com>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Check abort error state in bnxt_half_open_nic() · 11a39259
      Somnath Kotur authored
      bnxt_half_open_nic() is called during ethtool self test and is
      protected by rtnl_lock.  Firmware reset can be happening at the same
      time.  Only critical portions of the entire firmware reset sequence
      are protected by the rtnl_lock.  It is possible that bnxt_half_open_nic()
      can be called when the firmware reset sequence is aborting.  In that
      case, bnxt_half_open_nic() needs to check if the ABORT_ERR flag is set
      and abort if it is.  The ethtool self test will fail but the NIC will be
      brought to a consistent IF_DOWN state.
      
      Without this patch, if bnxt_half_open_nic() were to continue in this
      error state, it may crash like this:
      
        bnxt_en 0000:82:00.1 enp130s0f1np1: FW reset in progress during close, FW reset will be aborted
        Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
        ...
        Process ethtool (pid: 333327, stack limit = 0x0000000046476577)
        Call trace:
        bnxt_alloc_mem+0x444/0xef0 [bnxt_en]
        bnxt_half_open_nic+0x24/0xb8 [bnxt_en]
        bnxt_self_test+0x2dc/0x390 [bnxt_en]
        ethtool_self_test+0xe0/0x1f8
        dev_ethtool+0x1744/0x22d0
        dev_ioctl+0x190/0x3e0
        sock_ioctl+0x238/0x480
        do_vfs_ioctl+0xc4/0x758
        ksys_ioctl+0x84/0xb8
        __arm64_sys_ioctl+0x28/0x38
        el0_svc_handler+0xb0/0x180
        el0_svc+0x8/0xc
      
      Fixes: a1301f08 ("bnxt_en: Check abort error state in bnxt_open_nic().")
      Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bnxt_en: Validate vlan protocol ID on RX packets · 96bdd4b9
      Michael Chan authored
      Only pass supported VLAN protocol IDs for stripped VLAN tags to the
      stack.  The stack will hit WARN() if the protocol ID is unsupported.
      
      Existing firmware sets up the chip to strip 0x8100, 0x88a8, 0x9100.
      Only the first two protocols are supported by the kernel.
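
The validation step can be sketched as a simple predicate over the protocol ID of the stripped tag. The names below are hypothetical stand-ins for illustration; the constants carry the 0x8100 and 0x88a8 values mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in constants for the two protocol IDs the text says are supported. */
#define SKETCH_TPID_8021Q  0x8100u
#define SKETCH_TPID_8021AD 0x88a8u

/* Return true only for protocol IDs that may be passed up with a stripped tag. */
static bool sketch_vlan_proto_supported(uint16_t tpid)
{
    assert(tpid != 0);  /* sketch assumes a tag was actually stripped */
    return tpid == SKETCH_TPID_8021Q || tpid == SKETCH_TPID_8021AD;
}
```

A tag stripped with protocol ID 0x9100 fails the check, so it would not be handed to the stack as a VLAN tag.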
      
      Fixes: a196e96b ("bnxt_en: clean up VLAN feature bit handling")
      Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>