  1. Nov 26, 2021
    • mac80211: fix a memory leak where sta_info is not freed · 8f9dcc29
      Ahmed Zaki authored
      
      
      The following is from a system that went OOM due to a memory leak:
      
      wlan0: Allocated STA 74:83:c2:64:0b:87
      wlan0: Allocated STA 74:83:c2:64:0b:87
      wlan0: IBSS finish 74:83:c2:64:0b:87 (---from ieee80211_ibss_add_sta)
      wlan0: Adding new IBSS station 74:83:c2:64:0b:87
      wlan0: moving STA 74:83:c2:64:0b:87 to state 2
      wlan0: moving STA 74:83:c2:64:0b:87 to state 3
      wlan0: Inserted STA 74:83:c2:64:0b:87
      wlan0: IBSS finish 74:83:c2:64:0b:87 (---from ieee80211_ibss_work)
      wlan0: Adding new IBSS station 74:83:c2:64:0b:87
      wlan0: moving STA 74:83:c2:64:0b:87 to state 2
      wlan0: moving STA 74:83:c2:64:0b:87 to state 3
      .
      .
      wlan0: expiring inactive not authorized STA 74:83:c2:64:0b:87
      wlan0: moving STA 74:83:c2:64:0b:87 to state 2
      wlan0: moving STA 74:83:c2:64:0b:87 to state 1
      wlan0: Removed STA 74:83:c2:64:0b:87
      wlan0: Destroyed STA 74:83:c2:64:0b:87
      
      ieee80211_ibss_finish_sta() is called twice on the same STA from two
      different locations. On the second attempt, the allocated STA is not
      destroyed, creating a kernel memory leak.
      
      This is happening because sta_info_insert_finish() does not call
      sta_info_free() the second time when the STA already exists (returns
      -EEXIST). Note that the caller sta_info_insert_rcu() assumes STA is
      destroyed upon errors.
      
      Same fix is applied to -ENOMEM.
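
      The ownership rule described above can be illustrated with a minimal
      userspace sketch. It is not the mac80211 code: fake_sta,
      fake_sta_alloc(), fake_sta_insert() and the fixed-size table are
      hypothetical stand-ins, and "table full" stands in for -ENOMEM. The
      insert helper takes ownership of the entry and frees it on every
      error path, so the caller never has to.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define FAKE_MAX_STA 4

/* Hypothetical station entry; only the address matters for this sketch. */
struct fake_sta {
	unsigned char addr[6];
};

static struct fake_sta *fake_table[FAKE_MAX_STA];
static int fake_live_allocs;	/* entries allocated and not yet freed */

static struct fake_sta *fake_sta_alloc(const unsigned char *addr)
{
	struct fake_sta *sta = calloc(1, sizeof(*sta));

	if (!sta)
		return NULL;
	memcpy(sta->addr, addr, sizeof(sta->addr));
	fake_live_allocs++;
	return sta;
}

static void fake_sta_free(struct fake_sta *sta)
{
	free(sta);
	fake_live_allocs--;
}

/* Takes ownership of sta: on any error the entry is freed here. */
static int fake_sta_insert(struct fake_sta *sta)
{
	int i, slot = -1, err;

	assert(sta);
	for (i = 0; i < FAKE_MAX_STA; i++) {
		if (!fake_table[i]) {
			if (slot < 0)
				slot = i;	/* remember the first free slot */
		} else if (!memcmp(fake_table[i]->addr, sta->addr, 6)) {
			err = -EEXIST;		/* duplicate address */
			goto out_free;
		}
	}
	if (slot < 0) {
		err = -ENOMEM;			/* table full (sketch only) */
		goto out_free;
	}
	fake_table[slot] = sta;
	return 0;

out_free:
	fake_sta_free(sta);
	return err;
}
```

      Inserting the same address twice returns -EEXIST on the second call
      and leaves exactly one live allocation in this sketch.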
      
      Signed-off-by: Ahmed Zaki <anzaki@gmail.com>
      Link: https://lore.kernel.org/r/20211002145329.3125293-1-anzaki@gmail.com
      [change the error path label to use the existing code]
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: set up the fwd_skb->dev for mesh forwarding · 942bd107
      Xing Song authored
      
      
      Mesh forwarding requires that the fwd_skb->dev is set up for TX handling,
      otherwise the following warning will be generated, so set it up for the
      pending frames.
      
      [   72.835674 ] WARNING: CPU: 0 PID: 1193 at __skb_flow_dissect+0x284/0x1298
      [   72.842379 ] Modules linked in: ksmbd pppoe ppp_async l2tp_ppp ...
      [   72.962020 ] CPU: 0 PID: 1193 Comm: kworker/u5:1 Tainted: P S 5.4.137 #0
      [   72.969938 ] Hardware name: MT7622_MT7531 RFB (DT)
      [   72.974659 ] Workqueue: napi_workq napi_workfn
      [   72.979025 ] pstate: 60000005 (nZCv daif -PAN -UAO)
      [   72.983822 ] pc : __skb_flow_dissect+0x284/0x1298
      [   72.988444 ] lr : __skb_flow_dissect+0x54/0x1298
      [   72.992977 ] sp : ffffffc010c738c0
      [   72.996293 ] x29: ffffffc010c738c0 x28: 0000000000000000
      [   73.001615 ] x27: 000000000000ffc2 x26: ffffff800c2eb818
      [   73.006937 ] x25: ffffffc010a987c8 x24: 00000000000000ce
      [   73.012259 ] x23: ffffffc010c73a28 x22: ffffffc010a99c60
      [   73.017581 ] x21: 000000000000ffc2 x20: ffffff80094da800
      [   73.022903 ] x19: 0000000000000000 x18: 0000000000000014
      [   73.028226 ] x17: 00000000084d16af x16: 00000000d1fc0bab
      [   73.033548 ] x15: 00000000715f6034 x14: 000000009dbdd301
      [   73.038870 ] x13: 00000000ea4dcbc3 x12: 0000000000000040
      [   73.044192 ] x11: 000000000eb00ff0 x10: 0000000000000000
      [   73.049513 ] x9 : 000000000eb00073 x8 : 0000000000000088
      [   73.054834 ] x7 : 0000000000000000 x6 : 0000000000000001
      [   73.060155 ] x5 : 0000000000000000 x4 : 0000000000000000
      [   73.065476 ] x3 : ffffffc010a98000 x2 : 0000000000000000
      [   73.070797 ] x1 : 0000000000000000 x0 : 0000000000000000
      [   73.076120 ] Call trace:
      [   73.078572 ]  __skb_flow_dissect+0x284/0x1298
      [   73.082846 ]  __skb_get_hash+0x7c/0x228
      [   73.086629 ]  ieee80211_txq_may_transmit+0x7fc/0x17b8 [mac80211]
      [   73.092564 ]  ieee80211_tx_prepare_skb+0x20c/0x268 [mac80211]
      [   73.098238 ]  ieee80211_tx_pending+0x144/0x330 [mac80211]
      [   73.103560 ]  tasklet_action_common.isra.16+0xb4/0x158
      [   73.108618 ]  tasklet_action+0x2c/0x38
      [   73.112286 ]  __do_softirq+0x168/0x3b0
      [   73.115954 ]  do_softirq.part.15+0x88/0x98
      [   73.119969 ]  __local_bh_enable_ip+0xb0/0xb8
      [   73.124156 ]  napi_workfn+0x58/0x90
      [   73.127565 ]  process_one_work+0x20c/0x478
      [   73.131579 ]  worker_thread+0x50/0x4f0
      [   73.135249 ]  kthread+0x124/0x128
      [   73.138484 ]  ret_from_fork+0x10/0x1c
      
      Signed-off-by: Xing Song <xing.song@mediatek.com>
      Tested-By: Frank Wunderlich <frank-w@public-files.de>
      Link: https://lore.kernel.org/r/20211123033123.2684-1-xing.song@mediatek.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: fix regression in SSN handling of addba tx · 73111efa
      Felix Fietkau authored
      Some drivers that do their own sequence number allocation (e.g. ath9k) rely
      on being able to modify params->ssn on starting tx ampdu sessions.
      This was broken by a change that modified it to use sta->tid_seq[tid] instead.
      
      Cc: stable@vger.kernel.org
      Fixes: 31d8bb4e ("mac80211: agg-tx: refactor sending addba")
      Reported-by: Eneas U de Queiroz <cotequeiroz@gmail.com>
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Link: https://lore.kernel.org/r/20211124094024.43222-1-nbd@nbd.name
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: fix rate control for retransmitted frames · 18688c80
      Felix Fietkau authored
      
      
      Since retransmission clears info->control, rate control needs to be called
      again, otherwise the driver might crash due to invalid rates.
      
      Cc: stable@vger.kernel.org # 5.14+
      Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Reported-by: Robert W <rwbugreport@lost-in-the-void.net>
      Fixes: 03c3911d ("mac80211: call ieee80211_tx_h_rate_ctrl() when dequeue")
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Tested-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Link: https://lore.kernel.org/r/20211122204323.9787-1-nbd@nbd.name
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: track only QoS data frames for admission control · d5e568c3
      Johannes Berg authored
      
      
      For admission control, obviously all of that only works for
      QoS data frames, otherwise we cannot even access the QoS
      field in the header.
      
      Syzbot reported (see below) an uninitialized value here due
      to a status of a non-QoS nullfunc packet, which isn't even
      long enough to contain the QoS header.
      
      Fix this to only do anything for QoS data packets.
      
      Reported-by: <syzbot+614e82b88a1a4973e534@syzkaller.appspotmail.com>
      Fixes: 02219b3a ("mac80211: add WMM admission control support")
      Link: https://lore.kernel.org/r/20211122124737.dad29e65902a.Ieb04587afacb27c14e0de93ec1bfbefb238cc2a0@changeid
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • mac80211: fix TCP performance on mesh interface · 48c06708
      Maxime Bizon authored

      sta is NULL for mesh point (resolved later), so sk pacing parameters
      were not applied.

      Signed-off-by: Maxime Bizon <mbizon@freebox.fr>
      Link: https://lore.kernel.org/r/66f51659416ac35d6b11a313bd3ffe8b8a43dd55.camel@freebox.fr
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    • Merge branch 'tls-splice_read-fixes' · 49573ff7
      Jakub Kicinski authored
      
      
      Jakub Kicinski says:
      
      ====================
      tls: splice_read fixes
      
      As I work my way to unlocked and zero-copy TLS Rx the obvious bugs
      in the splice_read implementation get harder and harder to ignore.
      This is to say the fixes here are discovered by code inspection,
      I'm not aware of anyone actually using splice_read.
      ====================
      
      Link: https://lore.kernel.org/r/20211124232557.2039757-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: tls: test for correct proto_ops · f884a342
      Jakub Kicinski authored
      
      
      Previous patch fixes overriding callbacks incorrectly. Triggering
      the crash in sendpage_locked would be more spectacular but it's
      hard to get to, so take the easier path of proving this is broken
      and call getname. We're currently getting IPv4 socket info on an
      IPv6 socket.
      
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tls: fix replacing proto_ops · f3911f73
      Jakub Kicinski authored
      We replace proto_ops whenever TLS is configured for RX. But our
      replacement also overrides sendpage_locked, which will crash
      unless TX is also configured. Similarly we plug both of those
      in for TLS_HW (NIC crypto offload) even though TLS_HW has a completely
      different implementation for TX.
      
      Last but not least we always plug in something based on inet_stream_ops
      even though a few of the callbacks differ for IPv6 (getname, release,
      bind).
      
      Use a callback building method similar to what we do for struct proto.
      
      Fixes: c46234eb ("tls: RX path for ktls")
      Fixes: d4ffb02d ("net/tls: enable sk_msg redirect to tls socket egress")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: tls: test splicing decrypted records · 274af0f9
      Jakub Kicinski authored

      Add tests for half-received and peeked records.

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tls: splice_read: fix accessing pre-processed records · e062fe99
      Jakub Kicinski authored
      recvmsg() will put peek()ed and partially read records onto the rx_list.
      splice_read() needs to consult that list otherwise it may miss data.
      Align with recvmsg() and also put partially-read records onto rx_list.
      tls_sw_advance_skb() is pretty pointless now and will be removed in
      net-next.
      
      Fixes: 692d7b5d ("tls: Fix recvmsg() to be able to peek across multiple records")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: tls: test splicing cmsgs · d87d67fd
      Jakub Kicinski authored

      Make sure we correctly reject splicing non-data records.

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tls: splice_read: fix record type check · 520493f6
      Jakub Kicinski authored
      We don't support splicing control records. TLS 1.3 changes moved
      the record type check into the decrypt if(). The skb may already
      be decrypted and still be an alert.
      
      Note that decrypt_skb_update() is idempotent and updates ctx->decrypted
      so the if() is pointless.
      
      Reorder the check for decryption errors with the content type check
      while touching them. This part is not really a bug, because if
      decryption failed in TLS 1.3 content type will be DATA, and for
      TLS 1.2 it will be correct. Nevertheless it's strange to touch output
      before checking if the function has failed.
      
      Fixes: fedf201e ("net: tls: Refactor control message handling on recv")
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: tls: add tests for handling of bad records · ef0fc0b3
      Jakub Kicinski authored

      Test broken records.

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: tls: factor out cmsg send/receive · 31180adb
      Jakub Kicinski authored

      Add helpers for sending and receiving special record types.

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: tls: add helper for creating sock pairs · a125f91f
      Jakub Kicinski authored

      We have the same code 3 times, about to add a fourth copy.

      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. Nov 25, 2021
    • mdio: aspeed: Fix "Link is Down" issue · 9dbe33cf
      Dylan Hung authored
      The issue occurs randomly at runtime: the message "Link is Down"
      appears, but the link soon recovers to "Link is Up".

      The "Link is Down" message results from incorrect data being read
      from the PHY register via the MDIO bus.  The correct sequence for
      reading the data is:
      1. fire the command
      2. wait for command done (this step was missing)
      3. wait for data idle
      4. read data from data register
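
      The four steps above can be sketched against a simulated controller.
      This is a userspace illustration under stated assumptions, not the
      mdio-aspeed driver: struct fake_mdio, the tick counts and all
      fake_*() helpers are hypothetical, and the real register layout is
      not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAKE_MAX_POLLS 100

/* Hypothetical simulated controller; not the aspeed register layout. */
struct fake_mdio {
	int cmd_ticks;		/* polls left until the command is done */
	int data_ticks;		/* polls left until the data register is idle */
	uint16_t result;	/* value the simulated PHY will deliver */
	uint16_t data_reg;	/* stale until the transfer has finished */
};

/* Advance the simulated hardware by one poll interval. */
static void fake_tick(struct fake_mdio *m)
{
	if (m->cmd_ticks > 0)
		m->cmd_ticks--;
	else if (m->data_ticks > 0 && --m->data_ticks == 0)
		m->data_reg = m->result;
}

static bool fake_cmd_done(struct fake_mdio *m)
{
	fake_tick(m);
	return m->cmd_ticks == 0;
}

static bool fake_data_idle(struct fake_mdio *m)
{
	fake_tick(m);
	return m->cmd_ticks == 0 && m->data_ticks == 0;
}

static int fake_mdio_read(struct fake_mdio *m, uint16_t *val)
{
	int tries;

	assert(m && val);

	/* 1. fire the command */
	m->cmd_ticks = 3;
	m->data_ticks = 2;

	/* 2. wait for command done */
	for (tries = 0; !fake_cmd_done(m); tries++)
		if (tries >= FAKE_MAX_POLLS)
			return -1;

	/* 3. wait for data idle */
	for (tries = 0; !fake_data_idle(m); tries++)
		if (tries >= FAKE_MAX_POLLS)
			return -1;

	/* 4. read data from the data register */
	*val = m->data_reg;
	return 0;
}
```

      In this model the data register only holds the new value once both
      waits have completed, which is why the read comes last.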
      
      Cc: stable@vger.kernel.org
      Fixes: f160e994 ("net: phy: Add mdio-aspeed")
      Reviewed-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: Dylan Hung <dylan_hung@aspeedtech.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Link: https://lore.kernel.org/r/20211125024432.15809-1-dylan_hung@aspeedtech.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • igb: fix netpoll exit with traffic · eaeace60
      Jesse Brandeburg authored
      Oleksandr brought a bug report where netpoll causes trace
      messages in the log on igb.
      
      Danielle brought this back up as still occurring, so we'll try
      again.
      
      [22038.710800] ------------[ cut here ]------------
      [22038.710801] igb_poll+0x0/0x1440 [igb] exceeded budget in poll
      [22038.710802] WARNING: CPU: 12 PID: 40362 at net/core/netpoll.c:155 netpoll_poll_dev+0x18a/0x1a0
      
      As Alex suggested, change the driver to return work_done at the
      exit of napi_poll, which should be safe to do in this driver
      because it is not polling multiple queues in this single napi
      context (multiple queues attached to one MSI-X vector). Several
      other drivers contain the same simple sequence, so I hope
      this will not create new problems.
      
      Fixes: 16eb8815 ("igb: Refactor clean_rx_irq to reduce overhead and improve performance")
      Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
      Reported-by: Danielle Ratson <danieller@nvidia.com>
      Suggested-by: Alexander Duyck <alexander.duyck@gmail.com>
      Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
      Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
      Tested-by: Danielle Ratson <danieller@nvidia.com>
      Link: https://lore.kernel.org/r/20211123204000.1597971-1-jesse.brandeburg@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'net-smc-fixes-2021-11-24' · fef30d63
      Jakub Kicinski authored
      
      
      Karsten Graul says:
      
      ====================
      net/smc: fixes 2021-11-24
      
      Patch 1 from DaXing fixes a possible loop in smc_listen().
      Patch 2 prevents a NULL pointer dereferencing while iterating
      over the lower network devices.
      ====================
      
      Link: https://lore.kernel.org/r/20211124123238.471429-1-kgraul@linux.ibm.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/smc: Fix loop in smc_listen · 9ebb0c4b
      Guo DaXing authored
      The kernel_listen function in smc_listen will fail when all the available
      ports are occupied.  At this point smc->clcsock->sk->sk_data_ready has
      been changed to smc_clcsock_data_ready.  When we call smc_listen again,
      now both smc->clcsock->sk->sk_data_ready and smc->clcsk_data_ready point
      to the smc_clcsock_data_ready function.
      
      The smc_clcsock_data_ready() function calls lsmc->clcsk_data_ready which
      now points to itself resulting in an infinite loop.
      
      This patch restores smc->clcsock->sk->sk_data_ready with the old value.
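
      The self-referencing callback and its remedy can be shown with a
      small userspace sketch. It is an illustration only, not the net/smc
      code: struct fake_sock, fake_listen() and its fail flag (standing in
      for a failing kernel_listen()) are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct fake_sock;
typedef void (*data_ready_fn)(struct fake_sock *sk);

/* Hypothetical socket with an installed callback and a saved one. */
struct fake_sock {
	data_ready_fn sk_data_ready;	/* currently installed callback */
	data_ready_fn saved_data_ready;	/* callback the wrapper chains to */
	int calls;			/* times the original callback ran */
};

static void fake_orig_data_ready(struct fake_sock *sk)
{
	sk->calls++;
}

/* Chains to the saved callback; would recurse forever if that were itself. */
static void fake_wrapper_data_ready(struct fake_sock *sk)
{
	sk->saved_data_ready(sk);
}

/* fail != 0 stands in for the underlying listen call failing. */
static int fake_listen(struct fake_sock *sk, int fail)
{
	assert(sk);
	sk->saved_data_ready = sk->sk_data_ready;
	sk->sk_data_ready = fake_wrapper_data_ready;
	if (fail) {
		/* Restore the old value so a retry does not save the wrapper. */
		sk->sk_data_ready = sk->saved_data_ready;
		return -1;
	}
	return 0;
}
```

      Without the restore in the failure branch, a second fake_listen()
      would save the wrapper as the "old" callback and the wrapper would
      call itself.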
      
      Fixes: a60a2b1e ("net/smc: reduce active tcp_listen workers")
      Signed-off-by: Guo DaXing <guodaxing@huawei.com>
      Acked-by: Tony Lu <tonylu@linux.alibaba.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net/smc: Fix NULL pointer dereferencing in smc_vlan_by_tcpsk() · 587acad4
      Karsten Graul authored
      Coverity reports a possible NULL dereferencing problem:
      
      in smc_vlan_by_tcpsk():
      6. returned_null: netdev_lower_get_next returns NULL (checked 29 out of 30 times).
      7. var_assigned: Assigning: ndev = NULL return value from netdev_lower_get_next.
      1623                ndev = (struct net_device *)netdev_lower_get_next(ndev, &lower);
      CID 1468509 (#1 of 1): Dereference null return value (NULL_RETURNS)
      8. dereference: Dereferencing a pointer that might be NULL ndev when calling is_vlan_dev.
      1624                if (is_vlan_dev(ndev)) {
      
      Remove the manual implementation and use netdev_walk_all_lower_dev() to
      iterate over the lower devices. While at it, remove an obsolete function
      parameter comment.
      
      Fixes: cb9d43f6 ("net/smc: determine vlan_id of stacked net_device")
      Suggested-by: Julian Wiedmann <jwi@linux.ibm.com>
      Signed-off-by: Karsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'phylink-resolve-fixes' · 06e5ba71
      Jakub Kicinski authored
      
      
      Marek Behún says:
      
      ====================
      phylink resolve fixes
      
      With information from me and my nagging, Russell has produced two fixes
      for phylink, which add code that triggers another phylink_resolve() from
      phylink_resolve(), if certain conditions are met:
        interface is being changed
      or
        link is down and previous link was up
      These are needed because sometimes the PCS callbacks may provide stale
      values if link / speed / ...
      ====================
      
      Link: https://lore.kernel.org/r/20211123154403.32051-1-kabel@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: phylink: Force retrigger in case of latched link-fail indicator · dbae3388
      Russell King (Oracle) authored
      On mv88e6xxx 1G/2.5G PCS, the SerDes register 4.2001.2 has the following
      description:
        This register bit indicates when link was lost since the last
        read. For the current link status, read this register
        back-to-back.
      
      Thus to get current link state, we need to read the register twice.
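
      The effect of a latched link-fail bit can be modelled in a few lines
      of userspace C. This is a sketch under stated assumptions, not the
      mv88e6xxx or phylink code: struct fake_pcs and both helpers are
      hypothetical, and the model assumes that a read clears the latch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a latched link-status bit. */
struct fake_pcs {
	bool link_now;		/* real link state at this instant */
	bool latched_fail;	/* link was lost since the last read */
};

/* A single read reports "down" if the link dropped since the last read. */
static bool fake_read_link(struct fake_pcs *p)
{
	bool up;

	assert(p);
	up = p->link_now && !p->latched_fail;
	p->latched_fail = false;	/* reading clears the latch */
	return up;
}

/* Back-to-back reads: the first clears the latch, the second is current. */
static bool fake_current_link(struct fake_pcs *p)
{
	(void)fake_read_link(p);
	return fake_read_link(p);
}
```

      A single read after a brief drop reports the link as down even
      though it is up again; only the second read reflects the current
      state.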
      
      But doing that in the link change interrupt handler would lead to
      potentially ignoring link down events, which we really want to avoid.
      
      Thus this needs to be solved in phylink's resolve, by retriggering
      another resolve in the event when PCS reports link down and previous
      link was up, and by re-reading PCS state if the previous link was down.
      
      The wrong value is read when phylink requests change from sgmii to
      2500base-x mode, and link won't come up. This fixes the bug.
      
      Fixes: 9525ae83 ("phylink: add phylink infrastructure")
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Marek Behún <kabel@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • net: phylink: Force link down and retrigger resolve on interface change · 80662f4f
      Russell King (Oracle) authored
      On PHY state change the phylink_resolve() function can read stale
      information from the MAC and report incorrect link speed and duplex to
      the kernel message log.
      
      Example with a Marvell 88X3310 PHY connected to a SerDes port on Marvell
      88E6393X switch:
      - PHY driver triggers state change due to PHY interface mode being
        changed from 10gbase-r to 2500base-x due to copper change in speed
        from 10Gbps to 2.5Gbps, but the PHY itself either hasn't yet changed
        its interface to the host, or the interrupt about loss of SerDes link
        hadn't arrived yet (there can be a delay of several milliseconds for
        this), so we still think that the 10gbase-r mode is up
      - phylink_resolve()
        - phylink_mac_pcs_get_state()
          - this fills in speed=10g link=up
        - interface mode is updated to 2500base-x but speed is left at 10Gbps
        - phylink_major_config()
          - interface is changed to 2500base-x
        - phylink_link_up()
          - mv88e6xxx_mac_link_up()
            - .port_set_speed_duplex()
              - speed is set to 10Gbps
          - reports "Link is Up - 10Gbps/Full" to dmesg
      
      Afterwards when the interrupt finally arrives for mv88e6xxx, another
      resolve is forced in which we get the correct speed from
      phylink_mac_pcs_get_state(), but since the interface is not being
      changed anymore, we don't call phylink_major_config() but only
      phylink_mac_config(), which does not set speed/duplex anymore.
      
      To fix this, we need to force the link down and trigger another resolve
      on PHY interface change event.
      
      Fixes: 9525ae83 ("phylink: add phylink infrastructure")
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Signed-off-by: Marek Behún <kabel@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • lan743x: fix deadlock in lan743x_phy_link_status_change() · ddb826c2
      Heiner Kallweit authored
      Usage of phy_ethtool_get_link_ksettings() in the link status change
      handler isn't needed, and in combination with the referenced change
      it results in a deadlock. Simply remove the call and replace it with
      direct access to phydev->speed. The duplex argument of
      lan743x_phy_update_flowcontrol() isn't used and can be removed.
      
      Fixes: c10a485c ("phy: phy_ethtool_ksettings_get: Lock the phy for consistency")
      Reported-by: Alessandro B Maurici <abmaurici@gmail.com>
      Tested-by: Alessandro B Maurici <abmaurici@gmail.com>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/40e27f76-0ba3-dcef-ee32-a78b9df38b0f@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows · 4e1fddc9
      Eric Dumazet authored
      While testing the BIG TCP patch series, I was expecting that TCP_RR
      workloads with 80KB requests/answers would send one 80KB TSO packet,
      which would then be received as a single GRO packet.
      
      It turns out this was not happening, and the root cause was that
      cubic Hystart ACK train was triggering after a few (2 or 3) rounds of RPC.
      
      Hystart was wrongly setting CWND/SSTHRESH to 30, while my RPC
      needed a budget of ~20 segments.
      
      Ideally these TCP_RR flows should not exit slow start.
      
      Cubic Hystart should reset itself at each round, instead of assuming
      every TCP flow is a bulk one.
      
      Note that even after this patch, Hystart can still trigger, depending
      on scheduling artifacts, but at a higher CWND/SSTHRESH threshold,
      keeping optimal TSO packet sizes.
      
      Tested:
      
      ip link set dev eth0 gro_ipv6_max_size 131072 gso_ipv6_max_size 131072
      nstat -n; netperf -H ... -t TCP_RR  -l 5  -- -r 80000,80000 -K cubic; nstat|egrep "Ip6InReceives|Hystart|Ip6OutRequests"
      
      Before:
      
         8605
      Ip6InReceives                   87541              0.0
      Ip6OutRequests                  129496             0.0
      TcpExtTCPHystartTrainDetect     1                  0.0
      TcpExtTCPHystartTrainCwnd       30                 0.0
      
      After:
      
        8760
      Ip6InReceives                   88514              0.0
      Ip6OutRequests                  87975              0.0
      
      Fixes: ae27e98a ("[TCP] CUBIC v2.3")
      Co-developed-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Soheil Hassas Yeganeh <soheil@google.com>
      Link: https://lore.kernel.org/r/20211123202535.1843771-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • MAINTAINERS: Update B53 section to cover SF2 switch driver · 550b8e1d
      Florian Fainelli authored
      
      
      Update the B53 Ethernet switch section to contain
      drivers/net/dsa/bcm_sf2*.
      
      Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      Link: https://lore.kernel.org/r/20211123222422.3745485-1-f.fainelli@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'ieee802154-for-net-2021-11-24' of... · 48a78f50
      Jakub Kicinski authored
      
      Merge tag 'ieee802154-for-net-2021-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan
      
      Stefan Schmidt says:
      
      ====================
      pull-request: ieee802154 for net 2021-11-24
      
      A fix from Alexander which has been brought up various times found by
      automated checkers. Make sure values are in u32 range.
      
      * tag 'ieee802154-for-net-2021-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/sschmidt/wpan:
        net: ieee802154: handle iftypes as u32
      ====================
      
      Link: https://lore.kernel.org/r/20211124150934.3670248-1-stefan@datenfreihafen.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  3. Nov 24, 2021
  4. Nov 23, 2021
    • net: marvell: mvpp2: increase MTU limit when XDP enabled · 7b1b62bc
      Marek Behún authored
      Currently mvpp2_xdp_setup won't allow attaching an XDP program if
        mtu > ETH_DATA_LEN (1500).
      
      The mvpp2_change_mtu on the other hand checks whether
        MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE.
      
      These two checks are semantically different.
      
      Moreover this limit can be increased to MVPP2_MAX_RX_BUF_SIZE, since in
      mvpp2_rx we have
        xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM;
        xdp.frame_sz = PAGE_SIZE;
      
      Change the checks to check whether
        mtu > MVPP2_MAX_RX_BUF_SIZE
      
      Fixes: 07dd0a7a ("mvpp2: add basic XDP support")
      Signed-off-by: Marek Behún <kabel@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ipa: kill ipa_cmd_pipeline_clear() · e4e9bfb7
      Alex Elder authored
      Calling ipa_cmd_pipeline_clear() after stopping the channel
      underlying the AP<-modem RX endpoint can lead to a deadlock.
      
      This occurs in the ->runtime_suspend device power operation for the
      IPA driver.  While this callback is in progress, any other requests
      for power will block until the callback returns.
      
      Stopping the AP<-modem RX channel does not prevent the modem from
      sending another packet to this endpoint.  If a packet arrives for an
      RX channel when the channel is stopped, a SUSPEND IPA interrupt
      condition will be pending.  Handling an IPA interrupt requires
      power, so ipa_isr_thread() calls pm_runtime_get_sync() first thing.
      
      The problem occurs because a "pipeline clear" command will not
      complete while such a SUSPEND interrupt condition exists.  So the
      SUSPEND IPA interrupt handler won't proceed until it gets power;
      that won't happen until the ->runtime_suspend callback (and its
      "pipeline clear" command) completes; and that can't happen while
      the SUSPEND interrupt condition exists.
      
      It turns out that in this case there is no need to use the "pipeline
      clear" command.  There are scenarios in which clearing the pipeline
      is required while suspending, but those are not (yet) supported
      upstream.  So a simple fix, avoiding the potential deadlock, is to
      stop calling ipa_cmd_pipeline_clear() in ipa_endpoint_suspend().
      This removes the only user of ipa_cmd_pipeline_clear(), so get rid
      of that function.  It can be restored again whenever it's needed.
      
      This is basically a manual revert along with an explanation for
      commit 6cb63ea6 ("net: ipa: introduce ipa_cmd_tag_process()").
      
      Fixes: 6cb63ea6 ("net: ipa: introduce ipa_cmd_tag_process()")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: usb: Correct PHY handling of smsc95xx · a049a30f
      Martyn Welch authored
      
      
      The smsc95xx driver is dropping phy speed settings and causing a stack
      trace at device unbind:
      
      [  536.379147] smsc95xx 2-1:1.0 eth1: unregister 'smsc95xx' usb-ci_hdrc.2-1, smsc95xx USB 2.0 Ethernet
      [  536.425029] ------------[ cut here ]------------
      [  536.429650] WARNING: CPU: 0 PID: 439 at fs/kernfs/dir.c:1535 kernfs_remove_by_name_ns+0xb8/0xc0
      [  536.438416] kernfs: can not remove 'attached_dev', no directory
      [  536.444363] Modules linked in: xts dm_crypt dm_mod atmel_mxt_ts smsc95xx usbnet
      [  536.451748] CPU: 0 PID: 439 Comm: sh Tainted: G        W         5.15.0 #1
      [  536.458636] Hardware name: Freescale i.MX53 (Device Tree Support)
      [  536.464735] Backtrace: 
      [  536.467190] [<80b1c904>] (dump_backtrace) from [<80b1cb48>] (show_stack+0x20/0x24)
      [  536.474787]  r7:000005ff r6:8035b294 r5:600f0013 r4:80d8af78
      [  536.480449] [<80b1cb28>] (show_stack) from [<80b1f764>] (dump_stack_lvl+0x48/0x54)
      [  536.488035] [<80b1f71c>] (dump_stack_lvl) from [<80b1f788>] (dump_stack+0x18/0x1c)
      [  536.495620]  r5:00000009 r4:80d9b820
      [  536.499198] [<80b1f770>] (dump_stack) from [<80124fac>] (__warn+0xfc/0x114)
      [  536.506187] [<80124eb0>] (__warn) from [<80b1d21c>] (warn_slowpath_fmt+0xa8/0xdc)
      [  536.513688]  r7:000005ff r6:80d9b820 r5:80d9b8e0 r4:83744000
      [  536.519349] [<80b1d178>] (warn_slowpath_fmt) from [<8035b294>] (kernfs_remove_by_name_ns+0xb8/0xc0)
      [  536.528416]  r9:00000001 r8:00000000 r7:824926dc r6:00000000 r5:80df6c2c r4:00000000
      [  536.536162] [<8035b1dc>] (kernfs_remove_by_name_ns) from [<80b1f56c>] (sysfs_remove_link+0x4c/0x50)
      [  536.545225]  r6:7f00f02c r5:80df6c2c r4:83306400
      [  536.549845] [<80b1f520>] (sysfs_remove_link) from [<806f9c8c>] (phy_detach+0xfc/0x11c)
      [  536.557780]  r5:82492000 r4:83306400
      [  536.561359] [<806f9b90>] (phy_detach) from [<806f9cf8>] (phy_disconnect+0x4c/0x58)
      [  536.568943]  r7:824926dc r6:7f00f02c r5:82492580 r4:83306400
      [  536.574604] [<806f9cac>] (phy_disconnect) from [<7f00a310>] (smsc95xx_disconnect_phy+0x30/0x38 [smsc95xx])
      [  536.584290]  r5:82492580 r4:82492580
      [  536.587868] [<7f00a2e0>] (smsc95xx_disconnect_phy [smsc95xx]) from [<7f001570>] (usbnet_stop+0x70/0x1a0 [usbnet])
      [  536.598161]  r5:82492580 r4:82492000
      [  536.601740] [<7f001500>] (usbnet_stop [usbnet]) from [<808baa70>] (__dev_close_many+0xb4/0x12c)
      [  536.610466]  r8:83744000 r7:00000000 r6:83744000 r5:83745b74 r4:82492000
      [  536.617170] [<808ba9bc>] (__dev_close_many) from [<808bab78>] (dev_close_many+0x90/0x120)
      [  536.625365]  r7:00000001 r6:83745b74 r5:83745b8c r4:82492000
      [  536.631026] [<808baae8>] (dev_close_many) from [<808bf408>] (unregister_netdevice_many+0x15c/0x704)
      [  536.640094]  r9:00000001 r8:81130b98 r7:83745b74 r6:83745bc4 r5:83745b8c r4:82492000
      [  536.647840] [<808bf2ac>] (unregister_netdevice_many) from [<808bfa50>] (unregister_netdevice_queue+0xa0/0xe8)
      [  536.657775]  r10:8112bcc0 r9:83306c00 r8:83306c80 r7:8291e420 r6:83744000 r5:00000000
      [  536.665608]  r4:82492000
      [  536.668143] [<808bf9b0>] (unregister_netdevice_queue) from [<808bfac0>] (unregister_netdev+0x28/0x30)
      [  536.677381]  r6:7f01003c r5:82492000 r4:82492000
      [  536.682000] [<808bfa98>] (unregister_netdev) from [<7f000b40>] (usbnet_disconnect+0x64/0xdc [usbnet])
      [  536.691241]  r5:82492000 r4:82492580
      [  536.694819] [<7f000adc>] (usbnet_disconnect [usbnet]) from [<8076b958>] (usb_unbind_interface+0x80/0x248)
      [  536.704406]  r5:7f01003c r4:83306c80
      [  536.707984] [<8076b8d8>] (usb_unbind_interface) from [<8061765c>] (device_release_driver_internal+0x1c4/0x1cc)
      [  536.718005]  r10:8112bcc0 r9:80dff1dc r8:83306c80 r7:83744000 r6:7f01003c r5:00000000
      [  536.725838]  r4:8291e420
      [  536.728373] [<80617498>] (device_release_driver_internal) from [<80617684>] (device_release_driver+0x20/0x24)
      [  536.738302]  r7:83744000 r6:810d4f4c r5:8291e420 r4:8176ae30
      [  536.743963] [<80617664>] (device_release_driver) from [<806156cc>] (bus_remove_device+0xf0/0x148)
      [  536.752858] [<806155dc>] (bus_remove_device) from [<80610018>] (device_del+0x198/0x41c)
      [  536.760880]  r7:83744000 r6:8116e2e4 r5:8291e464 r4:8291e420
      [  536.766542] [<8060fe80>] (device_del) from [<80768fe8>] (usb_disable_device+0xcc/0x1e0)
      [  536.774576]  r10:8112bcc0 r9:80dff1dc r8:00000001 r7:8112bc48 r6:8291e400 r5:00000001
      [  536.782410]  r4:83306c00
      [  536.784945] [<80768f1c>] (usb_disable_device) from [<80769c30>] (usb_set_configuration+0x514/0x8dc)
      [  536.794011]  r10:00000000 r9:00000000 r8:832c3600 r7:00000004 r6:810d5688 r5:00000000
      [  536.801844]  r4:83306c00
      [  536.804379] [<8076971c>] (usb_set_configuration) from [<80775fac>] (usb_generic_driver_disconnect+0x34/0x38)
      [  536.814236]  r10:832c3610 r9:83745ef8 r8:832c3600 r7:00000004 r6:810d5688 r5:83306c00
      [  536.822069]  r4:83306c00
      [  536.824605] [<80775f78>] (usb_generic_driver_disconnect) from [<8076b850>] (usb_unbind_device+0x30/0x70)
      [  536.834100]  r5:83306c00 r4:810d5688
      [  536.837678] [<8076b820>] (usb_unbind_device) from [<8061765c>] (device_release_driver_internal+0x1c4/0x1cc)
      [  536.847432]  r5:822fb480 r4:83306c80
      [  536.851009] [<80617498>] (device_release_driver_internal) from [<806176a8>] (device_driver_detach+0x20/0x24)
      [  536.860853]  r7:00000004 r6:810d4f4c r5:810d5688 r4:83306c80
      [  536.866515] [<80617688>] (device_driver_detach) from [<80614d98>] (unbind_store+0x70/0xe4)
      [  536.874793] [<80614d28>] (unbind_store) from [<80614118>] (drv_attr_store+0x30/0x3c)
      [  536.882554]  r7:00000000 r6:00000000 r5:83739200 r4:80614d28
      [  536.888217] [<806140e8>] (drv_attr_store) from [<8035cb68>] (sysfs_kf_write+0x48/0x54)
      [  536.896154]  r5:83739200 r4:806140e8
      [  536.899732] [<8035cb20>] (sysfs_kf_write) from [<8035be84>] (kernfs_fop_write_iter+0x11c/0x1d4)
      [  536.908446]  r5:83739200 r4:00000004
      [  536.912024] [<8035bd68>] (kernfs_fop_write_iter) from [<802b87fc>] (vfs_write+0x258/0x3e4)
      [  536.920317]  r10:00000000 r9:83745f58 r8:83744000 r7:00000000 r6:00000004 r5:00000000
      [  536.928151]  r4:82adacc0
      [  536.930687] [<802b85a4>] (vfs_write) from [<802b8b0c>] (ksys_write+0x74/0xf4)
      [  536.937842]  r10:00000004 r9:007767a0 r8:83744000 r7:00000000 r6:00000000 r5:82adacc0
      [  536.945676]  r4:82adacc0
      [  536.948213] [<802b8a98>] (ksys_write) from [<802b8ba4>] (sys_write+0x18/0x1c)
      [  536.955367]  r10:00000004 r9:83744000 r8:80100244 r7:00000004 r6:76f47b58 r5:76fc0350
      [  536.963200]  r4:00000004
      [  536.965735] [<802b8b8c>] (sys_write) from [<80100060>] (ret_fast_syscall+0x0/0x48)
      [  536.973320] Exception stack(0x83745fa8 to 0x83745ff0)
      [  536.978383] 5fa0:                   00000004 76fc0350 00000001 007767a0 00000004 00000000
      [  536.986569] 5fc0: 00000004 76fc0350 76f47b58 00000004 76f47c7c 76f48114 00000000 7e87991c
      [  536.994753] 5fe0: 00000498 7e879908 76e6dce8 76eca2e8
      [  536.999922] ---[ end trace 9b835d809816b435 ]---
      
      The driver should not connect and disconnect the PHY when the
      device is opened and closed; it should start and stop the PHY. The
      PHY should be connected as part of binding and disconnected during
      unbinding.
      
      As a result, the PHY is no longer reset during open, so link speed and
      other settings made before the link comes up are no longer lost.
      
      It is necessary for phy_stop() to only be called when the phydev still
      exists (resolving the above stack trace). When unbinding, ".unbind" will be
      called prior to ".stop", with phy_disconnect() already having called
      phy_stop() before the phydev becomes inaccessible.
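
      The lifecycle described above can be sketched as a small user-space
      model. This is only an illustration under stated assumptions: the
      fake_* types and functions are hypothetical stand-ins, not the
      smsc95xx, usbnet or phylib code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the phylib or usbnet types. */
struct fake_phy {
	bool running;
	int speed;              /* a setting that must survive close/open */
};

struct fake_dev {
	struct fake_phy phy_storage;
	struct fake_phy *phydev;        /* NULL once the PHY is disconnected */
};

/* bind: connect the PHY once for the lifetime of the binding */
static void fake_bind(struct fake_dev *dev)
{
	dev->phy_storage.running = false;
	dev->phy_storage.speed = 0;
	dev->phydev = &dev->phy_storage;
}

/* open: only start the PHY; nothing is reconnected, so settings are kept */
static void fake_open(struct fake_dev *dev)
{
	assert(dev->phydev != NULL);
	dev->phydev->running = true;
}

/* stop: only touch the PHY if it still exists */
static void fake_stop(struct fake_dev *dev)
{
	if (dev->phydev)
		dev->phydev->running = false;
}

/* unbind: disconnecting also stops the PHY, then it becomes inaccessible */
static void fake_unbind(struct fake_dev *dev)
{
	if (dev->phydev) {
		dev->phydev->running = false;
		dev->phydev = NULL;
	}
}

/* Returns the speed seen after a close/open cycle, then unbinds. */
static int speed_after_reopen(int speed)
{
	struct fake_dev dev;
	int seen;

	fake_bind(&dev);
	dev.phydev->speed = speed;
	fake_open(&dev);
	fake_stop(&dev);
	fake_open(&dev);
	seen = dev.phydev->speed;
	fake_unbind(&dev);
	fake_stop(&dev);        /* ".stop" after ".unbind": harmless no-op */
	return seen;
}
```

      In this model the speed set before the first open is still present
      after the device is closed and reopened, and the late stop after
      unbind does nothing because the phydev pointer is already gone.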
      
      Signed-off-by: Martyn Welch <martyn.welch@collabora.com>
      Cc: Steve Glendinning <steve.glendinning@shawell.net>
      Cc: UNGLinuxDriver@microchip.com
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: stable@kernel.org # v5.15
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a049a30f
    • Zheyu Ma's avatar
      net: chelsio: cxgb4vf: Fix an error code in cxgb4vf_pci_probe() · b82d71c0
      Zheyu Ma authored
      
      
      During driver probing, the probe function must return a negative
      value on failure; the kernel treats a return value of 0 as success.
      
      Therefore, set err to -EINVAL when adapter->registered_device_map
      is NULL. Otherwise the kernel will assume that the driver has been
      probed successfully, which will cause unexpected errors.
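
      A minimal user-space sketch of this error path follows. It is not
      the cxgb4vf code: fake_probe() and its parameters are hypothetical,
      and modelling the device map as a bitmask of registered ports is an
      assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical probe sketch. present_mask says which port slots exist;
 * absent slots are skipped without touching err, so err can still be 0
 * when nothing was registered.
 */
static int fake_probe(unsigned int present_mask, unsigned int nports)
{
	unsigned long registered_device_map = 0;  /* assumed: one bit per port */
	unsigned int i;
	int err = 0;

	assert(nports <= 8);

	for (i = 0; i < nports; i++) {
		if (!(present_mask & (1U << i)))
			continue;               /* nothing to register here */
		registered_device_map |= 1UL << i;
	}

	if (registered_device_map == 0) {
		/* Without this assignment the function would return 0. */
		err = -EINVAL;
		goto err_cleanup;
	}

	return 0;

err_cleanup:
	/* Resource cleanup would go here in a real driver. */
	return err;
}
```

      With the assignment in place, a probe that registers no device
      reports a negative error code instead of falling through with
      err == 0.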
      
      Signed-off-by: Zheyu Ma <zheyuma97@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b82d71c0
    • Heiner Kallweit's avatar
      r8169: fix incorrect mac address assignment · c75a9ad4
      Heiner Kallweit authored
      The original change breaks MAC address assignment on older chip
      versions (see bug report [0]), and it breaks random MAC assignment.
      
      is_valid_ether_addr() requires that its argument is word-aligned.
      Add the missing alignment to array mac_addr.
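
      The alignment requirement can be illustrated with a user-space
      sketch. The helpers below are hypothetical and only model a check
      that looks at the address in 16-bit units; they are not the kernel's
      is_valid_ether_addr(). The sketch uses C11 _Alignas where kernel
      code would use its own alignment annotation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Model of a validity check: rejects all-zero and multicast addresses. */
static bool mac_is_valid(const uint8_t *addr)
{
	uint16_t w[3];

	/* memcpy keeps the sketch strictly portable; a helper that reads
	 * the buffer through 16-bit pointers needs it to be aligned. */
	memcpy(w, addr, sizeof(w));

	if ((w[0] | w[1] | w[2]) == 0)
		return false;           /* all-zero address */
	if (addr[0] & 0x01)
		return false;           /* multicast bit set */
	return true;
}

/* Unpacks a 48-bit value into an explicitly aligned buffer and checks it. */
static bool mac48_is_valid(uint64_t mac48)
{
	_Alignas(uint16_t) uint8_t mac_addr[ETH_ALEN];
	int i;

	for (i = 0; i < ETH_ALEN; i++)
		mac_addr[i] = (uint8_t)(mac48 >> (8 * (ETH_ALEN - 1 - i)));

	/* The property the fix guarantees for the local array. */
	assert((uintptr_t)mac_addr % _Alignof(uint16_t) == 0);

	return mac_is_valid(mac_addr);
}
```

      A plain byte array carries no such guarantee, which is why the
      declaration itself has to request the alignment.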
      
      [0] https://bugzilla.kernel.org/show_bug.cgi?id=215087
      
      Fixes: 1c5d09d5 ("ethernet: r8169: use eth_hw_addr_set()")
      Reported-by: Richard Herbert <rherbert@sympatico.ca>
      Tested-by: Richard Herbert <rherbert@sympatico.ca>
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Acked-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c75a9ad4
    • David S. Miller's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 52911bb6
      David S. Miller authored
      
      
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2021-11-22
      
      Maciej Fijalkowski says:
      
      Here are the two fixes for issues around ethtool's set_channels()
      callback for ice driver. Both are related to XDP resources. First one
      corrects the size of vsi->txq_map that is used to track the usage of Tx
      resources and the second one prevents the wrong refcounting of bpf_prog.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      52911bb6
    • David S. Miller's avatar
      Merge branch 'ipa-fixes' · 60ebd673
      David S. Miller authored
      
      
      Alex Elder says:
      
      ====================
      net: ipa: prevent shutdown during setup
      
      The setup phase of the IPA driver occurs in one of two ways.
      Normally, it is done directly by the main driver probe function.
      But some systems (those having a "modem-init" DTS property) don't
      start setup until an SMP2P interrupt (sent by the modem) arrives.
      
      Because it isn't performed by the probe function, setup on
      "modem-init" systems could be underway at the time a driver
      remove (or shutdown) request arrives (or vice-versa).  This
      situation can lead to hardware state not being cleaned up
      properly.
      
      This series addresses this problem by having the driver remove
      function disable the setup interrupt.  A consequence of this is
      that setup will complete if it is underway when the remove function
      is called.
      
      So now, when removing the driver, setup:
        - will have already completed;
        - is underway, and will complete before proceeding; or
        - will not have begun (and will not occur).
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      60ebd673
    • Alex Elder's avatar
      net: ipa: separate disabling setup from modem stop · 8afc7e47
      Alex Elder authored
      The IPA setup_complete flag is set at the end of ipa_setup(), when
      the setup phase of initialization has completed successfully.  This
      occurs as part of driver probe processing, or (if "modem-init" is
      specified in the DTS file) it is triggered by the "ipa-setup-ready"
      SMP2P interrupt generated by the modem.
      
      In the latter case, it's possible for driver shutdown (or remove) to
      begin while setup processing is underway, and this can't be allowed.
      The problem is that the setup_complete flag is not adequate to signal
      that setup is underway.
      
      If setup_complete is set, it will never be un-set, so that case is
      not a problem.  But if setup_complete is false, there's a chance
      setup is underway.
      
      Because setup is triggered by an interrupt on a "modem-init" system,
      there is a simple way to ensure the value of setup_complete is safe
      to read.  The threaded handler--if it is executing--will complete as
      part of a request to disable the "ipa-setup-ready" interrupt.  This
      means that ipa_setup() (which is called from the handler) will run
      to completion if it was underway, or will never be called otherwise.
      
      The request to disable the "ipa-setup-ready" interrupt is currently
      made within ipa_modem_stop().  Instead, disable the interrupt
      outside that function in the two places it's called.  In the case of
      ipa_remove(), this ensures the setup_complete flag is safe to read
      before we read it.
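
      The ordering can be sketched with a single-threaded model. All
      fake_* names are hypothetical and this is not the IPA driver code.
      In the kernel, disabling the interrupt also waits for a running
      threaded handler to finish; the model cannot show that concurrency
      and only records the resulting state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical driver state, for illustration only. */
struct fake_ipa {
	bool setup_irq_enabled;
	bool setup_complete;
	bool torn_down;
};

static void fake_ipa_init(struct fake_ipa *ipa)
{
	ipa->setup_irq_enabled = true;
	ipa->setup_complete = false;
	ipa->torn_down = false;
}

/* Models the setup-ready handler: runs setup only if the IRQ is enabled. */
static void fake_setup_ready_irq(struct fake_ipa *ipa)
{
	if (!ipa->setup_irq_enabled)
		return;
	ipa->setup_complete = true;     /* stands in for setup succeeding */
}

static void fake_irq_disable_setup(struct fake_ipa *ipa)
{
	ipa->setup_irq_enabled = false;
}

static void fake_remove(struct fake_ipa *ipa)
{
	/* Disable first, so setup_complete can no longer change. */
	fake_irq_disable_setup(ipa);
	if (ipa->setup_complete)
		ipa->torn_down = true;  /* undo setup only if it happened */
}

/* Returns true if remove had setup state to tear down. */
static bool remove_tears_down(bool irq_before_remove)
{
	struct fake_ipa ipa;

	fake_ipa_init(&ipa);
	if (irq_before_remove)
		fake_setup_ready_irq(&ipa);
	fake_remove(&ipa);
	fake_setup_ready_irq(&ipa);     /* late interrupt: must be ignored */
	assert(ipa.setup_complete == irq_before_remove);
	return ipa.torn_down;
}
```

      Reading the flag only after the interrupt is disabled is what makes
      the two outcomes (setup finished, or setup never runs) the only
      ones the remove path has to handle.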
      
      Rename ipa_smp2p_disable() to be ipa_smp2p_irq_disable_setup(), to be
      more specific about its effect.
      
      Fixes: 530f9216 ("soc: qcom: ipa: AP/modem communications")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8afc7e47