  1. Sep 22, 2022
    • Michael Walle's avatar
      net: phy: micrel: fix shared interrupt on LAN8814 · 2002fbac
      Michael Walle authored
      Since commit ece19502 ("net: phy: micrel: 1588 support for LAN8814
      phy") the handler always returns IRQ_HANDLED, except in an error case.
      Before that commit, the interrupt status register was checked and if
      it was empty, IRQ_NONE was returned. Restore that behavior to play nice
      with the interrupt line being shared with others.
      
      Fixes: ece19502 ("net: phy: micrel: 1588 support for LAN8814 phy")
      Signed-off-by: Michael Walle <michael@walle.cc>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Reviewed-by: Divya Koppera <Divya.Koppera@microchip.com>
      Link: https://lore.kernel.org/r/20220920141619.808117-1-michael@walle.cc
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2002fbac
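The shared-interrupt behavior restored by the commit above can be sketched in user-space C. This is only an illustration, not the driver code: `struct fake_phy`, `fake_read_irq_status()` and `fake_handle_interrupt()` are hypothetical names, the enum is a stand-in for the kernel's `irqreturn_t`, and the read-to-clear status register is an assumption.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel's irqreturn_t values. */
enum irq_result { IRQ_NONE = 0, IRQ_HANDLED = 1 };

/* Hypothetical device model: 'status' plays the interrupt status register. */
struct fake_phy {
	uint16_t status;
	int events_handled;
};

/* Assumed read-to-clear semantics for the status register. */
static uint16_t fake_read_irq_status(struct fake_phy *phy)
{
	uint16_t s = phy->status;

	phy->status = 0;
	return s;
}

static enum irq_result fake_handle_interrupt(struct fake_phy *phy)
{
	uint16_t status = fake_read_irq_status(phy);

	/* Nothing pending: the interrupt came from another device on the line. */
	if (!status)
		return IRQ_NONE;

	phy->events_handled++;
	return IRQ_HANDLED;
}
```

Returning `IRQ_NONE` for an empty status lets the other handlers registered on the same line be tried, and lets spurious interrupts be detected.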
    • Wen Gu's avatar
      net/smc: Stop the CLC flow if no link to map buffers on · e738455b
      Wen Gu authored
      There might be a potential race between SMC-R buffer map and
      link group termination.
      
      smc_smcr_terminate_all()     | smc_connect_rdma()
      --------------------------------------------------------------
                                   | smc_conn_create()
      for links in smcibdev        |
              schedule links down  |
                                   | smc_buf_create()
                                   |  \- smcr_buf_map_usable_links()
                                   |      \- no usable links found,
                                   |         (rmb->mr = NULL)
                                   |
                                   | smc_clc_send_confirm()
                                   |  \- access conn->rmb_desc->mr[]->rkey
                                   |     (panic)
      
      During reboot and IB device module removal, all links are set
      down and no usable links remain in the link groups. In such a situation
      smcr_buf_map_usable_links() should return an error and stop the
      CLC flow from accessing the uninitialized mr.
      
      Fixes: b9247544 ("net/smc: convert static link ID instances to support multiple links")
      Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
      Link: https://lore.kernel.org/r/1663656189-32090-1-git-send-email-guwen@linux.alibaba.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      e738455b
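The idea of failing the mapping step when no link is usable might look roughly like the following user-space sketch. All names here (`struct link`, `struct buf_desc`, `map_usable_links()`, `MAX_LINKS`) are hypothetical, and the error code `-EINVAL` is a placeholder chosen for illustration; the source does not state which error the real function returns.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LINKS 3	/* illustrative value, not taken from the SMC code */

struct link { bool usable; };
struct buf_desc { bool mapped[MAX_LINKS]; };

/*
 * Map the buffer on every usable link. If not a single link was usable,
 * report an error so the caller stops before touching unmapped state.
 */
static int map_usable_links(const struct link *links, size_t n,
			    struct buf_desc *buf)
{
	size_t count = 0;

	for (size_t i = 0; i < n && i < MAX_LINKS; i++) {
		if (!links[i].usable)
			continue;
		buf->mapped[i] = true;
		count++;
	}
	return count ? 0 : -EINVAL;	/* placeholder error code */
}
```

A caller that checks the return value never reaches the step that dereferences the per-link memory region of a buffer that was mapped nowhere.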
    • Jakub Kicinski's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 624aea6b
      Jakub Kicinski authored
      
      
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-09-20 (ice)
      
      Michal re-sets TC configuration when changing number of queues.
      
      Mateusz moves the check and call for link-down-on-close to the specific
      path for downing/closing the interface.
      
      * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        ice: Fix interface being down after reset with link-down-on-close flag on
        ice: config netdev tc before setting queues number
      ====================
      
      Link: https://lore.kernel.org/r/20220920205344.1860934-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      624aea6b
    • Larysa Zaremba's avatar
      ice: Fix ice_xdp_xmit() when XDP TX queue number is not sufficient · 114f398d
      Larysa Zaremba authored
      The original patch added the static branch to handle the situation
      where assigning an XDP TX queue to every CPU is not possible,
      so the queues have to be shared.
      
      However, in the XDP transmit handler ice_xdp_xmit(), an error was
      returned in such cases even before the static branch condition was
      checked, thus making queue sharing still impossible.
      
      Fixes: 22bf877e ("ice: introduce XDP_TX fallback path")
      Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com>
      Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Link: https://lore.kernel.org/r/20220919134346.25030-1-larysa.zaremba@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      114f398d
    • Jakub Kicinski's avatar
      Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · f64780e3
      Jakub Kicinski authored
      
      
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-09-19 (iavf, i40e)
      
      Norbert adds checking of buffer size for Rx buffer checks in iavf.
      
      Michal corrects setting of max MTU in iavf to account for MTU data provided
      by PF, fixes i40e to set VF max MTU, and resolves lack of rate limiting
      when value was less than divisor for i40e.
      
      * '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
        i40e: Fix set max_tx_rate when it is lower than 1 Mbps
        i40e: Fix VF set max MTU size
        iavf: Fix set max MTU size with port VLAN and jumbo frames
        iavf: Fix bad page state
      ====================
      
      Link: https://lore.kernel.org/r/20220919223428.572091-1-anthony.l.nguyen@intel.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      f64780e3
  2. Sep 21, 2022
    • Jakub Kicinski's avatar
      Merge tag 'linux-can-fixes-for-6.0-20220921' of... · 375a6833
      Jakub Kicinski authored
      
      Merge tag 'linux-can-fixes-for-6.0-20220921' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2022-09-21
      
      The 1st patch is by me, targets the flexcan driver and fixes a
      potential system hang on single core systems under high CAN packet
      rate.
      
      The next 2 patches are also by me and target the gs_usb driver. A
      potential race condition during the ndo_open callback as well as the
      return value if the ethtool identify feature is not supported are
      fixed.
      
      * tag 'linux-can-fixes-for-6.0-20220921' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
        can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported
        can: gs_usb: gs_can_open(): fix race dev->can.state condition
        can: flexcan: flexcan_mailbox_read() fix return value for drop = true
      ====================
      
      Link: https://lore.kernel.org/r/20220921083609.419768-1-mkl@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      375a6833
    • Jianglei Nie's avatar
      net: atlantic: fix potential memory leak in aq_ndev_close() · 65e5d27d
      Jianglei Nie authored
      
      
      If aq_nic_stop() fails, aq_ndev_close() returns err without calling
      aq_nic_deinit() to release the relevant memory and resource, which
      will lead to a memory leak.
      
      Fix it by removing the error check and goto after aq_nic_stop(), so that
      aq_nic_deinit() is always called, even when aq_nic_stop() fails.
      
      Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      65e5d27d
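The close-path change described above can be pictured with a small user-space model. The `fake_*` names are hypothetical stand-ins for aq_nic_stop(), aq_nic_deinit() and aq_ndev_close(); the real functions take different arguments and do far more work.

```c
/* Hypothetical NIC model: stop_rc is what the stop step will return. */
struct fake_nic {
	int stop_rc;
	int deinit_calls;
};

static int fake_nic_stop(struct fake_nic *nic)
{
	return nic->stop_rc;
}

static void fake_nic_deinit(struct fake_nic *nic)
{
	nic->deinit_calls++;	/* stands for releasing memory and resources */
}

static int fake_ndev_close(struct fake_nic *nic)
{
	int err = fake_nic_stop(nic);

	/* Always release resources, even if the stop step failed. */
	fake_nic_deinit(nic);
	return err;
}
```

The error from the stop step is still propagated to the caller; only the early exit that skipped the cleanup is gone.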
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 79a392a3
      David S. Miller authored
      
      
      Florian Westphal says:
      
      ====================
      netfilter: bugfixes for net
      
      The following set contains netfilter fixes for the *net* tree.
      
      Regressions (rc only):
      recent ebtables crash fix was incomplete, it added a memory leak.
      
      The patch to fix possible buffer overrun for BIG TCP in ftp conntrack
      tried to be too clever, we cannot re-use ct->lock: NAT engine might
      grab it again -> deadlock.  Revert back to a global spinlock.
      Both from myself.
      
      Remove the documentation for the recently removed
      'nf_conntrack_helper' sysctl as well, from Pablo Neira.
      
      The static_branch_inc() that guards the 'chain stats enabled' path
      needs to be deferred further, until the entire transaction was created.
      From Tetsuo Handa.
      
      Older bugs:
      Since 5.3:
      nf_tables_addchain may leak pcpu memory in error path when
      offloading fails. Also from Tetsuo Handa.
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      79a392a3
    • Marc Kleine-Budde's avatar
      can: gs_usb: gs_usb_set_phys_id(): return with error if identify is not supported · 0f2211f1
      Marc Kleine-Budde authored
      Until commit 409c188c ("can: tree-wide: advertise software
      timestamping capabilities") the ethtool_ops was only assigned for
      devices which support the GS_CAN_FEATURE_IDENTIFY feature. That commit
      assigns ethtool_ops unconditionally.
      
      On controllers without GS_CAN_FEATURE_IDENTIFY support this results
      in the following ethtool error:
      
      | $ ethtool -p can0 1
      | Cannot identify NIC: Broken pipe
      
      Restore the correct error value by checking for
      GS_CAN_FEATURE_IDENTIFY in the gs_usb_set_phys_id() function.
      
      | $ ethtool -p can0 1
      | Cannot identify NIC: Operation not supported
      
      While there use the variable "netdev" for the "struct net_device"
      pointer and "dev" for the "struct gs_can" pointer as in the rest of
      the driver.
      
      Fixes: 409c188c ("can: tree-wide: advertise software timestamping capabilities")
      Link: http://lore.kernel.org/all/20220818143853.2671854-1-mkl@pengutronix.de
      Cc: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      0f2211f1
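A feature check of the kind described above could be sketched as follows. This is a user-space illustration only: `struct fake_can_dev`, `fake_set_phys_id()` and the bit position of `FAKE_FEATURE_IDENTIFY` are assumptions, not the gs_usb definitions. On Linux, `EOPNOTSUPP` is the errno that ethtool reports as "Operation not supported".

```c
#include <errno.h>
#include <stdint.h>

#define FAKE_FEATURE_IDENTIFY (1u << 5)	/* hypothetical bit position */

/* Hypothetical device model holding the advertised feature bits. */
struct fake_can_dev {
	uint32_t feature;
	int identify_on;
};

static int fake_set_phys_id(struct fake_can_dev *dev, int on)
{
	/* Reject the request up front when the device lacks the feature. */
	if (!(dev->feature & FAKE_FEATURE_IDENTIFY))
		return -EOPNOTSUPP;

	dev->identify_on = on;	/* stands for sending the identify request */
	return 0;
}
```

With the check inside the callback, the ops structure can stay assigned unconditionally while unsupported devices still get a meaningful error.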
    • Marc Kleine-Budde's avatar
      can: gs_usb: gs_can_open(): fix race dev->can.state condition · 5440428b
      Marc Kleine-Budde authored
      The dev->can.state is set to CAN_STATE_ERROR_ACTIVE after the device
      has been started. On busy networks the CAN controller might receive
      a CAN frame in between and go into an error state before the
      dev->can.state is assigned.
      
      Assign dev->can.state before starting the controller to close the race
      window.
      
      Fixes: d08e973a ("can: gs_usb: Added support for the GS_USB CAN devices")
      Link: https://lore.kernel.org/all/20220920195216.232481-1-mkl@pengutronix.de
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      5440428b
    • Marc Kleine-Budde's avatar
      can: flexcan: flexcan_mailbox_read() fix return value for drop = true · a09721dd
      Marc Kleine-Budde authored
      The following happened on an i.MX25 using flexcan with many packets on
      the bus:
      
      The rx-offload queue reached a length more than skb_queue_len_max. In
      can_rx_offload_offload_one() the drop variable was set to true which
      made the call to .mailbox_read() (here: flexcan_mailbox_read()) to
      _always_ return ERR_PTR(-ENOBUFS) and drop the rx'ed CAN frame. So
      can_rx_offload_offload_one() returned ERR_PTR(-ENOBUFS), too.
      
      can_rx_offload_irq_offload_fifo() looks as follows:
      
      | 	while (1) {
      | 		skb = can_rx_offload_offload_one(offload, 0);
      | 		if (IS_ERR(skb))
      | 			continue;
      | 		if (!skb)
      | 			break;
      | 		...
      | 	}
      
      The flexcan driver wrongly always returns ERR_PTR(-ENOBUFS) if drop is
      requested, even if there is no CAN frame pending. As the i.MX25 is a
      single core CPU, while the rx-offload processing is active, there is
      no thread to process packets from the offload queue. So the queue
      doesn't get any shorter and this results in a tight loop.
      
      Instead of always returning ERR_PTR(-ENOBUFS) if drop is requested,
      return NULL if no CAN frame is pending.
      
      Changes since v1: https://lore.kernel.org/all/20220810144536.389237-1-u.kleine-koenig@pengutronix.de
      - don't break in can_rx_offload_irq_offload_fifo() in case of an error,
        return NULL in flexcan_mailbox_read() in case of no pending CAN frame
        instead
      
      Fixes: 4e9c9484 ("can: rx-offload: Prepare for CAN FD support")
      Link: https://lore.kernel.org/all/20220811094254.1864367-1-mkl@pengutronix.de
      Cc: stable@vger.kernel.org # v5.5
      Suggested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Tested-by: Thorsten Scherer <t.scherer@eckelmann.de>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      a09721dd
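Why returning NULL for "no frame pending" ends the loop quoted in the commit message above can be shown with a small user-space model. Everything here is a hypothetical stand-in: `FRAME_DROPPED` replaces ERR_PTR(-ENOBUFS) with a sentinel pointer, and `fake_mailbox_read()` / `fake_irq_offload()` only mimic the shape of the real functions.

```c
#include <stdbool.h>
#include <stddef.h>

struct frame { int id; };

/* Stand-in for ERR_PTR(-ENOBUFS): a distinct non-NULL sentinel. */
static struct frame dropped_marker;
#define FRAME_DROPPED (&dropped_marker)

/* Hypothetical hardware model: number of frames waiting to be read. */
struct fake_mailbox { int pending; };

static struct frame *fake_mailbox_read(struct fake_mailbox *mb, bool drop)
{
	static struct frame f;

	if (mb->pending == 0)
		return NULL;	/* nothing to read, even when drop is requested */

	mb->pending--;		/* consume the frame from the hardware */
	if (drop)
		return FRAME_DROPPED;

	f.id = mb->pending;
	return &f;
}

/* Same loop shape as quoted above; returns the number of iterations. */
static int fake_irq_offload(struct fake_mailbox *mb, bool drop)
{
	int iterations = 0;

	for (;;) {
		struct frame *f = fake_mailbox_read(mb, drop);

		iterations++;
		if (f == FRAME_DROPPED)
			continue;
		if (!f)
			break;
	}
	return iterations;
}
```

In this model, three pending frames with `drop` set are discarded in three iterations and the fourth read returns NULL, so the loop exits. Had the read returned the sentinel unconditionally whenever `drop` was set, the `continue` branch would be taken forever.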
    • Geert Uytterhoeven's avatar
      net: sh_eth: Fix PHY state warning splat during system resume · 6a1dbfef
      Geert Uytterhoeven authored
      Since commit 744d23c7 ("net: phy: Warn about incorrect
      mdio_bus_phy_resume() state"), a warning splat is printed during system
      resume with Wake-on-LAN disabled:
      
      	WARNING: CPU: 0 PID: 626 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0xbc/0xe4
      
      As the Renesas SuperH Ethernet driver already calls phy_{stop,start}()
      in its suspend/resume callbacks, it is sufficient to just mark the MAC
      responsible for managing the power state of the PHY.
      
      Fixes: fba863b8 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Link: https://lore.kernel.org/r/c6e1331b9bef61225fa4c09db3ba3e2e7214ba2d.1663598886.git.geert+renesas@glider.be
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      6a1dbfef
    • Geert Uytterhoeven's avatar
      net: ravb: Fix PHY state warning splat during system resume · 4924c0cd
      Geert Uytterhoeven authored
      Since commit 744d23c7 ("net: phy: Warn about incorrect
      mdio_bus_phy_resume() state"), a warning splat is printed during system
      resume with Wake-on-LAN disabled:
      
              WARNING: CPU: 0 PID: 1197 at drivers/net/phy/phy_device.c:323 mdio_bus_phy_resume+0xbc/0xc8
      
      As the Renesas Ethernet AVB driver already calls phy_{stop,start}() in
      its suspend/resume callbacks, it is sufficient to just mark the MAC
      responsible for managing the power state of the PHY.
      
      Fixes: fba863b8 ("net: phy: make PHY PM ops a no-op if MAC driver manages PHY PM")
      Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
      Link: https://lore.kernel.org/r/8ec796f47620980fdd0403e21bd8b7200b4fa1d4.1663598796.git.geert+renesas@glider.be
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4924c0cd
    • Florian Westphal's avatar
      netfilter: nf_ct_ftp: fix deadlock when nat rewrite is needed · d2508893
      Florian Westphal authored
      We can't use ct->lock, this is already used by the seqadj internals.
      When using ftp helper + nat, seqadj will attempt to acquire ct->lock
      again.
      
      Revert back to a global lock for now.
      
      Fixes: c783a29c ("netfilter: nf_ct_ftp: prefer skb_linearize")
      Reported-by: Bruno de Paula Larini <bruno.larini@riosoft.com.br>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      d2508893
    • Florian Westphal's avatar
      netfilter: ebtables: fix memory leak when blob is malformed · 62ce44c4
      Florian Westphal authored
      The bug fix was incomplete, it "replaced" crash with a memory leak.
      The old code had an assignment to "ret" embedded into the conditional,
      restore this.
      
      Fixes: 7997eff8 ("netfilter: ebtables: reject blobs that don't provide all entry points")
      Reported-and-tested-by: <syzbot+a24c5252f3e3ab733464@syzkaller.appspotmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      62ce44c4
    • Tetsuo Handa's avatar
      netfilter: nf_tables: fix percpu memory leak at nf_tables_addchain() · 9a4d6dd5
      Tetsuo Handa authored
      It seems to me that percpu memory for chain stats started leaking since
      commit 3bc158f8 ("netfilter: nf_tables: map basechain priority to
      hardware priority") when nft_chain_offload_priority() returned an error.
      
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Fixes: 3bc158f8 ("netfilter: nf_tables: map basechain priority to hardware priority")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      9a4d6dd5
    • Tetsuo Handa's avatar
      netfilter: nf_tables: fix nft_counters_enabled underflow at nf_tables_addchain() · 921ebde3
      Tetsuo Handa authored
      syzbot is reporting underflow of nft_counters_enabled counter at
      nf_tables_addchain() [1], for commit 43eb8949 ("netfilter:
      nf_tables: do not leave chain stats enabled on error") missed that
      nf_tables_chain_destroy() after nft_basechain_init() in the error path of
      nf_tables_addchain() decrements the counter because nft_basechain_init()
      makes nft_is_base_chain() return true by setting NFT_CHAIN_BASE flag.
      
      Increment the counter immediately after returning from
      nft_basechain_init().
      
      Link: https://syzkaller.appspot.com/bug?extid=b5d82a651b71cd8a75ab [1]
      Reported-by: syzbot <syzbot+b5d82a651b71cd8a75ab@syzkaller.appspotmail.com>
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Tested-by: syzbot <syzbot+b5d82a651b71cd8a75ab@syzkaller.appspotmail.com>
      Fixes: 43eb8949 ("netfilter: nf_tables: do not leave chain stats enabled on error")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      921ebde3
    • Pablo Neira Ayuso's avatar
      netfilter: conntrack: remove nf_conntrack_helper documentation · 76b907ee
      Pablo Neira Ayuso authored
      This toggle has already been removed by commit b1185090 ("netfilter:
      remove nf_conntrack_helper sysctl and modparam toggles").
      
      Remove the documentation entry for this toggle too.
      
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      76b907ee
    • Bhupesh Sharma's avatar
      MAINTAINERS: Add myself as a reviewer for Qualcomm ETHQOS Ethernet driver · 603ccb3a
      Bhupesh Sharma authored
      
      
      As suggested by Vinod, adding myself as the reviewer
      for the Qualcomm ETHQOS Ethernet driver.
      
      Recently I have enabled this driver on a few Qualcomm
      SoCs / boards and hence trying to keep a close eye on
      it.
      
      Signed-off-by: Bhupesh Sharma <bhupesh.sharma@linaro.org>
      Acked-by: Vinod Koul <vkoul@kernel.org>
      Link: https://lore.kernel.org/r/20220915112804.3950680-1-bhupesh.sharma@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      603ccb3a
    • Mateusz Palczewski's avatar
      ice: Fix interface being down after reset with link-down-on-close flag on · 8ac71327
      Mateusz Palczewski authored
      When performing a reset on the ice driver with the link-down-on-close
      flag on, the interface would always stay down. Fix this by moving the
      check of this flag to ice_stop(), which is called only when the user
      wants to bring the interface down.
      
      Fixes: ab4ab73f ("ice: Add ethtool private flag to make forcing link down optional")
      Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
      Tested-by: Petr Oros <poros@redhat.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      8ac71327
    • Michal Swiatkowski's avatar
      ice: config netdev tc before setting queues number · 122045ca
      Michal Swiatkowski authored
      After lowering the number of TX queues, the following warning appears:
      "Number of in use tx queues changed invalidating tc mappings. Priority
      traffic classification disabled!"
      Example command to reproduce:
      ethtool -L enp24s0f0 tx 36 rx 36
      
      Fix this by setting correct tc mapping before setting real number of
      queues on netdev.
      
      Fixes: 0754d65b ("ice: Add infrastructure for mqprio support via ndo_setup_tc")
      Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      122045ca
    • Jakub Kicinski's avatar
      Merge branch 'fixes-for-tc-taprio-software-mode' · da847246
      Jakub Kicinski authored
      
      
      Vladimir Oltean says:
      
      ====================
      Fixes for tc-taprio software mode
      
      While working on some new features for tc-taprio, I found some strange
      behavior which looked like bugs. I was able to eventually trigger a NULL
      pointer dereference. This patch set fixes 2 issues I saw. Detailed
      explanation in patches.
      ====================
      
      Link: https://lore.kernel.org/r/20220915100802.2308279-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      da847246
    • Vladimir Oltean's avatar
      net/sched: taprio: make qdisc_leaf() see the per-netdev-queue pfifo child qdiscs · 1461d212
      Vladimir Oltean authored
      taprio can only operate as root qdisc, and to that end, there exists the
      following check in taprio_init(), just as in mqprio:
      
      	if (sch->parent != TC_H_ROOT)
      		return -EOPNOTSUPP;
      
      And indeed, when we try to attach taprio to an mqprio child, it fails as
      expected:
      
      $ tc qdisc add dev swp0 root handle 1: mqprio num_tc 8 \
      	map 0 1 2 3 4 5 6 7 \
      	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
      $ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
      	map 0 1 2 3 4 5 6 7 \
      	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
      	base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
      	flags 0x0 clockid CLOCK_TAI
      Error: sch_taprio: Can only be attached as root qdisc.
      
      (extack message added by me)
      
      But when we try to attach a taprio child to a taprio root qdisc,
      surprisingly it doesn't fail:
      
      $ tc qdisc replace dev swp0 root handle 1: taprio num_tc 8 \
      	map 0 1 2 3 4 5 6 7 queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
      	base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
      	flags 0x0 clockid CLOCK_TAI
      $ tc qdisc replace dev swp0 parent 1:2 taprio num_tc 8 \
      	map 0 1 2 3 4 5 6 7 \
      	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 \
      	base-time 0 sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
      	flags 0x0 clockid CLOCK_TAI
      
      This is because tc_modify_qdisc() behaves differently when mqprio is
      root, vs when taprio is root.
      
      In the mqprio case, it finds the parent qdisc through
      p = qdisc_lookup(dev, TC_H_MAJ(clid)), and then the child qdisc through
      q = qdisc_leaf(p, clid). This leaf qdisc q has handle 0, so it is
      ignored according to the comment right below ("It may be default qdisc,
      ignore it"). As a result, tc_modify_qdisc() goes through the
      qdisc_create() code path, and this gives taprio_init() a chance to check
      for sch->parent != TC_H_ROOT and error out.
      
      Whereas in the taprio case, the returned q = qdisc_leaf(p, clid) is
      different. It is not the default qdisc created for each netdev queue
      (both taprio and mqprio call qdisc_create_dflt() and keep them in
      a private q->qdiscs[], or priv->qdiscs[], respectively). Instead, taprio
      makes qdisc_leaf() return the _root_ qdisc, aka itself.
      
      When taprio does that, tc_modify_qdisc() goes through the qdisc_change()
      code path, because the qdisc layer never finds out about the child qdisc
      of the root. And through the ->change() ops, taprio has no reason to
      check whether its parent is root or not, just through ->init(), which is
      not called.
      
      The problem is the taprio_leaf() implementation. Even though code wise,
      it does the exact same thing as mqprio_leaf() which it is copied from,
      it works with different input data. This is because mqprio does not
      attach itself (the root) to each device TX queue, but one of the default
      qdiscs from its private array.
      
      In fact, since commit 13511704 ("net: taprio offload: enforce qdisc
      to netdev queue mapping"), taprio does this too, but just for the full
      offload case. So if we tried to attach a taprio child to a fully
      offloaded taprio root qdisc, it would properly fail too; just not to a
      software root taprio.
      
      To fix the problem, stop looking at the Qdisc that's attached to the TX
      queue, and instead, always return the default qdiscs that we've
      allocated (and to which we privately enqueue and dequeue, in software
      scheduling mode).
      
      Since Qdisc_class_ops :: leaf  is only called from tc_modify_qdisc(),
      the risk of unforeseen side effects introduced by this change is
      minimal.
      
      Fixes: 5a781ccb ("tc: Add support for configuring the taprio scheduler")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1461d212
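The difference between the two leaf lookups discussed in the commit above can be modelled in a few lines of user-space C. This is a sketch under assumptions, not the sch_taprio code: the `fake_*` types, `NUM_TXQ`, and the mapping of class id `cl` to array index `cl - 1` are all illustrative choices.

```c
#include <stddef.h>

#define NUM_TXQ 8	/* illustrative queue count */

struct fake_qdisc { unsigned int handle; };

struct fake_taprio {
	struct fake_qdisc root;
	/* Per-queue default qdiscs, held privately by the root. */
	struct fake_qdisc children[NUM_TXQ];
	/* What each device TX queue currently points at. */
	struct fake_qdisc *txq_attached[NUM_TXQ];
};

/* Behavior as described before the fix: report what the TX queue holds. */
static struct fake_qdisc *leaf_from_txq(struct fake_taprio *q, unsigned long cl)
{
	if (cl == 0 || cl > NUM_TXQ)
		return NULL;
	return q->txq_attached[cl - 1];
}

/* Behavior after the fix: always report the privately held child. */
static struct fake_qdisc *leaf_from_children(struct fake_taprio *q,
					     unsigned long cl)
{
	if (cl == 0 || cl > NUM_TXQ)
		return NULL;
	return &q->children[cl - 1];
}
```

In software mode, where the model attaches the root itself to every TX queue, `leaf_from_txq()` hands back the root (non-zero handle), while `leaf_from_children()` hands back a default child with handle 0, which is the case the qdisc layer ignores before going down the create path.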
    • Vladimir Oltean's avatar
      net/sched: taprio: avoid disabling offload when it was never enabled · db46e3a8
      Vladimir Oltean authored
      In an incredibly strange API design decision, qdisc->destroy() gets
      called even if qdisc->init() never succeeded, not exclusively since
      commit 87b60cfa ("net_sched: fix error recovery at qdisc creation"),
      but apparently also earlier (in the case of qdisc_create_dflt()).
      
      The taprio qdisc does not fully acknowledge this when it attempts full
      offload, because it starts off with q->flags = TAPRIO_FLAGS_INVALID in
      taprio_init(), then it replaces q->flags with TCA_TAPRIO_ATTR_FLAGS
      parsed from netlink (in taprio_change(), tail called from taprio_init()).
      
      But in taprio_destroy(), we call taprio_disable_offload(), and this
      determines what to do based on FULL_OFFLOAD_IS_ENABLED(q->flags).
      
      But looking at the implementation of FULL_OFFLOAD_IS_ENABLED()
      (a bitwise check of bit 1 in q->flags), it is invalid to call this macro
      on q->flags when it contains TAPRIO_FLAGS_INVALID, because that is set
      to U32_MAX, and therefore FULL_OFFLOAD_IS_ENABLED() will return true on
      an invalid set of flags.
      
      As a result, it is possible to crash the kernel if user space forces an
      error between setting q->flags = TAPRIO_FLAGS_INVALID, and the calling
      of taprio_enable_offload(). This is because drivers do not expect the
      offload to be disabled when it was never enabled.
      
      The error that we force here is to attach taprio not as the root qdisc,
      but as a child of an mqprio root qdisc:
      
      $ tc qdisc add dev swp0 root handle 1: \
      	mqprio num_tc 8 map 0 1 2 3 4 5 6 7 \
      	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 hw 0
      $ tc qdisc replace dev swp0 parent 1:1 \
      	taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
      	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
      	sched-entry S 0x7f 990000 sched-entry S 0x80 100000 \
      	flags 0x0 clockid CLOCK_TAI
      Unable to handle kernel paging request at virtual address fffffffffffffff8
      [fffffffffffffff8] pgd=0000000000000000, p4d=0000000000000000
      Internal error: Oops: 96000004 [#1] PREEMPT SMP
      Call trace:
       taprio_dump+0x27c/0x310
       vsc9959_port_setup_tc+0x1f4/0x460
       felix_port_setup_tc+0x24/0x3c
       dsa_slave_setup_tc+0x54/0x27c
       taprio_disable_offload.isra.0+0x58/0xe0
       taprio_destroy+0x80/0x104
       qdisc_create+0x240/0x470
       tc_modify_qdisc+0x1fc/0x6b0
       rtnetlink_rcv_msg+0x12c/0x390
       netlink_rcv_skb+0x5c/0x130
       rtnetlink_rcv+0x1c/0x2c
      
      Fix this by keeping track of the operations we made, and undo the
      offload only if we actually did it.
      
      I've added "bool offloaded" inside a 4 byte hole between "int clockid"
      and "atomic64_t picos_per_byte". Now the first cache line looks like
      below:
      
      $ pahole -C taprio_sched net/sched/sch_taprio.o
      struct taprio_sched {
              struct Qdisc * *           qdiscs;               /*     0     8 */
              struct Qdisc *             root;                 /*     8     8 */
              u32                        flags;                /*    16     4 */
              enum tk_offsets            tk_offset;            /*    20     4 */
              int                        clockid;              /*    24     4 */
              bool                       offloaded;            /*    28     1 */
      
              /* XXX 3 bytes hole, try to pack */
      
              atomic64_t                 picos_per_byte;       /*    32     0 */
      
              /* XXX 8 bytes hole, try to pack */
      
              spinlock_t                 current_entry_lock;   /*    40     0 */
      
              /* XXX 8 bytes hole, try to pack */
      
              struct sched_entry *       current_entry;        /*    48     8 */
              struct sched_gate_list *   oper_sched;           /*    56     8 */
              /* --- cacheline 1 boundary (64 bytes) --- */
      
      Fixes: 9c66d156 ("taprio: Add support for hardware offloading")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      db46e3a8
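The flag problem and the "track what we actually did" fix described above can be reproduced in a small user-space model. The macro and struct names below are hypothetical (prefixed `FAKE_`/`fake_`); the only facts taken from the commit message are that the invalid marker is all-ones (U32_MAX) and that the full-offload test is a bitwise check of bit 1.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAKE_FLAGS_INVALID	UINT32_MAX
#define FAKE_FLAG_FULL_OFFLOAD	(1u << 1)
#define FAKE_FULL_OFFLOAD_IS_ENABLED(flags) \
	(((flags) & FAKE_FLAG_FULL_OFFLOAD) != 0)

struct fake_sched {
	uint32_t flags;
	bool offloaded;		/* set only after offload really succeeded */
	int disable_calls;
};

static void fake_init(struct fake_sched *q)
{
	q->flags = FAKE_FLAGS_INVALID;
	q->offloaded = false;
	q->disable_calls = 0;
}

static void fake_enable_offload(struct fake_sched *q)
{
	q->offloaded = true;
}

static void fake_destroy(struct fake_sched *q)
{
	/* Decide based on what was done, not on possibly invalid flags. */
	if (!q->offloaded)
		return;

	q->disable_calls++;	/* stands for telling the driver to disable */
	q->offloaded = false;
}
```

Right after `fake_init()`, `FAKE_FULL_OFFLOAD_IS_ENABLED(q->flags)` evaluates to true because every bit of the invalid marker is set, which is why a destroy path keyed on the flags would try to disable an offload that was never enabled. Keying on the `offloaded` boolean avoids that.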
    • Ido Schimmel's avatar
      ipv6: Fix crash when IPv6 is administratively disabled · 76dd0728
      Ido Schimmel authored
      The global 'raw_v6_hashinfo' variable can be accessed even when IPv6 is
      administratively disabled via the 'ipv6.disable=1' kernel command line
      option, leading to a crash [1].
      
      Fix by restoring the original behavior and always initializing the
      variable, regardless of IPv6 support being administratively disabled or
      not.
      
      [1]
       BUG: unable to handle page fault for address: ffffffffffffffc8
       #PF: supervisor read access in kernel mode
       #PF: error_code(0x0000) - not-present page
       PGD 173e18067 P4D 173e18067 PUD 173e1a067 PMD 0
       Oops: 0000 [#1] PREEMPT SMP KASAN
       CPU: 3 PID: 271 Comm: ss Not tainted 6.0.0-rc4-custom-00136-g0727a9a5fbc1 #1396
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
       RIP: 0010:raw_diag_dump+0x310/0x7f0
       [...]
       Call Trace:
        <TASK>
        __inet_diag_dump+0x10f/0x2e0
        netlink_dump+0x575/0xfd0
        __netlink_dump_start+0x67b/0x940
        inet_diag_handler_cmd+0x273/0x2d0
        sock_diag_rcv_msg+0x317/0x440
        netlink_rcv_skb+0x15e/0x430
        sock_diag_rcv+0x2b/0x40
        netlink_unicast+0x53b/0x800
        netlink_sendmsg+0x945/0xe60
        ____sys_sendmsg+0x747/0x960
        ___sys_sendmsg+0x13a/0x1e0
        __sys_sendmsg+0x118/0x1e0
        do_syscall_64+0x34/0x80
        entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Fixes: 0daf07e5 ("raw: convert raw sockets to RCU")
      Reported-by: Roberto Ricci <rroberto2r@gmail.com>
      Tested-by: Roberto Ricci <rroberto2r@gmail.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220916084821.229287-1-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      76dd0728
    • Vladimir Oltean's avatar
      net: enetc: deny offload of tc-based TSN features on VF interfaces · 5641c751
      Vladimir Oltean authored
      TSN features on the ENETC (taprio, cbs, gate, police) are configured
      through a mix of command BD ring messages and port registers:
      enetc_port_rd(), enetc_port_wr().
      
      Port registers are a region of the ENETC memory map which is only
      accessible from the PCIe Physical Function. They are not accessible from
      the Virtual Functions.
      
      Moreover, attempting to access these registers crashes the kernel:
      
      $ echo 1 > /sys/bus/pci/devices/0000\:00\:00.0/sriov_numvfs
      pci 0000:00:01.0: [1957:ef00] type 00 class 0x020001
      fsl_enetc_vf 0000:00:01.0: Adding to iommu group 15
      fsl_enetc_vf 0000:00:01.0: enabling device (0000 -> 0002)
      fsl_enetc_vf 0000:00:01.0 eno0vf0: renamed from eth0
      $ tc qdisc replace dev eno0vf0 root taprio num_tc 8 map 0 1 2 3 4 5 6 7 \
      	queues 1@0 1@1 1@2 1@3 1@4 1@5 1@6 1@7 base-time 0 \
      	sched-entry S 0x7f 900000 sched-entry S 0x80 100000 flags 0x2
      Unable to handle kernel paging request at virtual address ffff800009551a08
      Internal error: Oops: 96000007 [#1] PREEMPT SMP
      pc : enetc_setup_tc_taprio+0x170/0x47c
      lr : enetc_setup_tc_taprio+0x16c/0x47c
      Call trace:
       enetc_setup_tc_taprio+0x170/0x47c
       enetc_setup_tc+0x38/0x2dc
       taprio_change+0x43c/0x970
       taprio_init+0x188/0x1e0
       qdisc_create+0x114/0x470
       tc_modify_qdisc+0x1fc/0x6c0
       rtnetlink_rcv_msg+0x12c/0x390
      
      Split enetc_setup_tc() into separate functions for the PF and for the
      VF drivers. Also stop including enetc_qos.o in enetc-vf.ko, since it
      serves absolutely no purpose there.
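
      As a rough illustration of the split described above, the following
      userspace sketch gives the PF and the VF their own setup_tc handler, so
      the VF rejects TSN offloads before any port register would be touched.
      The enum, the ops structure, every function name, and the assumption
      that the VF still accepts mqprio are hypothetical and are not taken from
      the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical subset of offload types, for illustration only. */
enum tc_type { TC_MQPRIO, TC_TAPRIO, TC_CBS };

/* Hypothetical per-driver ops table, loosely modelled on ndo_setup_tc. */
struct dev_ops {
	int (*setup_tc)(enum tc_type type);
};

/* PF handler: may configure TSN features, since it can reach port registers. */
static int pf_setup_tc(enum tc_type type)
{
	switch (type) {
	case TC_MQPRIO:
	case TC_TAPRIO:
	case TC_CBS:
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}

/* VF handler: refuses everything that would need port register access. */
static int vf_setup_tc(enum tc_type type)
{
	switch (type) {
	case TC_MQPRIO:	/* assumption: needs no port registers */
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}

static const struct dev_ops pf_ops = { .setup_tc = pf_setup_tc };
static const struct dev_ops vf_ops = { .setup_tc = vf_setup_tc };

/* Dispatch through whichever ops table the device was registered with. */
static int dev_setup_tc(const struct dev_ops *ops, enum tc_type type)
{
	assert(ops != NULL && ops->setup_tc != NULL);
	return ops->setup_tc(type);
}
```

      With this shape, a taprio request on the VF fails with -EOPNOTSUPP
      instead of reaching code that dereferences PF-only registers.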
      
      Fixes: 34c6adf1 ("enetc: Configure the Time-Aware Scheduler via tc-taprio offload")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220916133209.3351399-2-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      5641c751
    • Vladimir Oltean's avatar
      net: enetc: move enetc_set_psfp() out of the common enetc_set_features() · fed38e64
      Vladimir Oltean authored
      The VF netdev driver shouldn't respond to changes in the NETIF_F_HW_TC
      flag; only PFs should. Moreover, TSN-specific code should go to
      enetc_qos.c, which should not be included in the VF driver.
      
      Fixes: 79e49982 ("net: enetc: add hw tc hw offload features for PSPF capability")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Link: https://lore.kernel.org/r/20220916133209.3351399-1-vladimir.oltean@nxp.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fed38e64
    • Jakub Kicinski's avatar
      Merge branch 'wireguard-patches-for-6-0-rc6' · 0507246d
      Jakub Kicinski authored
      
      
      Jason A. Donenfeld says:
      
      ====================
      wireguard patches for 6.0-rc6
      
      1) The ratelimiter timing test doesn't help outside of development, yet
         it is currently preventing the module from being inserted on some
         kernels when it flakes at insertion time. So we disable it.
      
      2) A fix for a build error on UML, caused by a recent change in a
         different tree.
      
      3) A WARN_ON() is triggered by Kees' new fortified memcpy() patch, due
         to memcpy()ing over a sockaddr pointer with the size of a
         sockaddr_in[6]. The type safe fix is pretty simple. Given how classic
         of a thing sockaddr punning is, I suspect this may be the first in a
         few patches like this throughout the net tree, once Kees' fortify
         series is more widely deployed (currently it's just in next).
      ====================
      
      Link: https://lore.kernel.org/r/20220916143740.831881-1-Jason@zx2c4.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      0507246d
    • Jason A. Donenfeld's avatar
      wireguard: netlink: avoid variable-sized memcpy on sockaddr · 26c01310
      Jason A. Donenfeld authored
      Doing a variable-sized memcpy is slower, and the compiler isn't smart
      enough to turn this into a constant-size assignment.
      
      Further, Kees' latest fortified memcpy will actually bark, because the
      destination pointer is type sockaddr, not explicitly sockaddr_in or
      sockaddr_in6, so it thinks there's an overflow:
      
          memcpy: detected field-spanning write (size 28) of single field
          "&endpoint.addr" at drivers/net/wireguard/netlink.c:446 (size 16)
      
      Fix this by assigning directly, using explicit casts for each checked
      case.
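
      A minimal userspace sketch of that idea follows: each address family
      gets its own fixed-size, correctly typed struct assignment instead of a
      single memcpy() with a runtime length. The struct endpoint layout and
      the endpoint_set() helper are hypothetical stand-ins, not the WireGuard
      code.

```c
#include <assert.h>
#include <stddef.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical endpoint: one storage area viewed through three types. */
struct endpoint {
	union {
		struct sockaddr addr;
		struct sockaddr_in addr4;
		struct sockaddr_in6 addr6;
	};
};

/*
 * Store src into ep with a constant-size assignment per address family.
 * Returns 0 on success, -1 when the length and family do not match a
 * supported combination.
 */
static int endpoint_set(struct endpoint *ep, const struct sockaddr *src,
			size_t len)
{
	assert(ep != NULL && src != NULL);

	if (len == sizeof(struct sockaddr_in) && src->sa_family == AF_INET) {
		/* The destination is typed sockaddr_in, so the size is explicit. */
		ep->addr4 = *(const struct sockaddr_in *)src;
		return 0;
	}
	if (len == sizeof(struct sockaddr_in6) && src->sa_family == AF_INET6) {
		ep->addr6 = *(const struct sockaddr_in6 *)src;
		return 0;
	}
	return -1;
}
```

      Because the destination in each branch has the full type being written,
      a bounds checker sees a write that fits the field rather than one that
      spans past a smaller struct sockaddr.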
      
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reported-by: <syzbot+a448cda4dba2dac50de5@syzkaller.appspotmail.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      26c01310
    • Jason A. Donenfeld's avatar
      wireguard: selftests: do not install headers on UML · 8e25c02b
      Jason A. Donenfeld authored
      Since 1b620d53 ("kbuild: disable header exports for UML in a
      straightforward way"), installing headers fails on UML, so just disable
      installing them, since they're not needed anyway on the architecture.
      
      Fixes: b438b3b8 ("wireguard: selftests: support UML")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      8e25c02b
    • Jason A. Donenfeld's avatar
      wireguard: ratelimiter: disable timings test by default · 684dec3c
      Jason A. Donenfeld authored
      A previous commit tried to make the ratelimiter timings test more
      reliable but in the process made it less reliable on other
      configurations. This is an impossible problem to solve without
      increasingly ridiculous heuristics. And it's not even a problem that
      actually needs to be solved in any comprehensive way, since this is only
      ever used during development. So just cordon this off with a DEBUG_
      ifdef, just like we do for the trie's randomized tests, so it can be
      enabled while hacking on the code, and otherwise disabled in CI. In the
      process we also revert 151c8e49.
      
      Fixes: 151c8e49 ("wireguard: ratelimiter: use hrtimer in selftest")
      Fixes: e7096c13 ("net: WireGuard secure network tunnel")
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      684dec3c
    • Íñigo Huguet's avatar
      sfc/siena: fix null pointer dereference in efx_hard_start_xmit · 589c6ede
      Íñigo Huguet authored
      As in the previous patch for sfc, prevent a potential (but unlikely)
      NULL pointer dereference.
      
      Fixes: 12804793 ("sfc: decouple TXQ type from label")
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Link: https://lore.kernel.org/r/20220915141958.16458-1-ihuguet@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      589c6ede
    • Íñigo Huguet's avatar
      sfc/siena: fix TX channel offset when using legacy interrupts · 974bb793
      Íñigo Huguet authored
      As in the previous commit for sfc, fix the TX channel offset when
      efx_siena_separate_tx_channels is false (the default).
      
      Fixes: 25bde571 ("sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels")
      Reported-by: Tianhao Zhao <tizhao@redhat.com>
      Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
      Link: https://lore.kernel.org/r/20220915141653.15504-1-ihuguet@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      974bb793
  3. Sep 20, 2022
    • Tetsuo Handa's avatar
      net: clear msg_get_inq in __get_compat_msghdr() · d547c1b7
      Tetsuo Handa authored
      syzbot is still complaining about an uninit-value in tcp_recvmsg(),
      because commit 1228b34c ("net: clear msg_get_inq in __sys_recvfrom() and
      __copy_msghdr_from_user()") missed that __get_compat_msghdr() is called
      instead of copy_msghdr_from_user() when MSG_CMSG_COMPAT is specified.
      
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Fixes: 1228b34c ("net: clear msg_get_inq in __sys_recvfrom() and __copy_msghdr_from_user()")
      Reviewed-by: Jens Axboe <axboe@kernel.dk>
      Link: https://lore.kernel.org/r/d06d0f7f-696c-83b4-b2d5-70b5f2730a37@I-love.SAKURA.ne.jp
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      d547c1b7
    • Jakub Kicinski's avatar
      Merge branch 'ipmr-always-call-ip-6-_mr_forward-from-rcu-read-side-critical-section' · 68fe503c
      Jakub Kicinski authored
      
      
      Ido Schimmel says:
      
      ====================
      ipmr: Always call ip{,6}_mr_forward() from RCU read-side critical section
      
      Patch #1 fixes a bug in ipmr code.
      
      Patch #2 adds corresponding test cases.
      ====================
      
      Link: https://lore.kernel.org/r/20220914075339.4074096-1-idosch@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      68fe503c
    • Ido Schimmel's avatar
      selftests: forwarding: Add test cases for unresolved multicast routes · 2b5a8c8f
      Ido Schimmel authored
      
      
      Add IPv4 and IPv6 test cases for unresolved multicast routes, testing
      that queued packets are forwarded after installing a matching (S, G)
      route.
      
      The test cases can be used to reproduce the bugs fixed in "ipmr: Always
      call ip{,6}_mr_forward() from RCU read-side critical section".
      
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2b5a8c8f
    • Ido Schimmel's avatar
      ipmr: Always call ip{,6}_mr_forward() from RCU read-side critical section · b07a9b26
      Ido Schimmel authored
      These functions expect to be called from RCU read-side critical section,
      but this only happens when invoked from the data path via
      ip{,6}_mr_input(). They can also be invoked from process context in
      response to user space adding a multicast route which resolves a cache
      entry with queued packets [1][2].
      
      Fix by adding missing rcu_read_lock() / rcu_read_unlock() in these call
      paths.
      
      [1]
      WARNING: suspicious RCU usage
      6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Not tainted
      -----------------------------
      net/ipv4/ipmr.c:84 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by smcrouted/246:
       #0: ffffffff862389b0 (rtnl_mutex){+.+.}-{3:3}, at: ip_mroute_setsockopt+0x11c/0x1420
      
      stack backtrace:
      CPU: 0 PID: 246 Comm: smcrouted Not tainted 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x91/0xb9
       vif_dev_read+0xbf/0xd0
       ipmr_queue_xmit+0x135/0x1ab0
       ip_mr_forward+0xe7b/0x13d0
       ipmr_mfc_add+0x1a06/0x2ad0
       ip_mroute_setsockopt+0x5c1/0x1420
       do_ip_setsockopt+0x23d/0x37f0
       ip_setsockopt+0x56/0x80
       raw_setsockopt+0x219/0x290
       __sys_setsockopt+0x236/0x4d0
       __x64_sys_setsockopt+0xbe/0x160
       do_syscall_64+0x34/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      [2]
      WARNING: suspicious RCU usage
      6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387 Not tainted
      -----------------------------
      net/ipv6/ip6mr.c:69 suspicious rcu_dereference_check() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      1 lock held by smcrouted/246:
       #0: ffffffff862389b0 (rtnl_mutex){+.+.}-{3:3}, at: ip6_mroute_setsockopt+0x6b9/0x2630
      
      stack backtrace:
      CPU: 1 PID: 246 Comm: smcrouted Not tainted 6.0.0-rc3-custom-15969-g049d233c8bcc-dirty #1387
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x91/0xb9
       vif_dev_read+0xbf/0xd0
       ip6mr_forward2.isra.0+0xc9/0x1160
       ip6_mr_forward+0xef0/0x13f0
       ip6mr_mfc_add+0x1ff2/0x31f0
       ip6_mroute_setsockopt+0x1825/0x2630
       do_ipv6_setsockopt+0x462/0x4440
       ipv6_setsockopt+0x105/0x140
       rawv6_setsockopt+0xd8/0x690
       __sys_setsockopt+0x236/0x4d0
       __x64_sys_setsockopt+0xbe/0x160
       do_syscall_64+0x34/0x80
       entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Fixes: ebc31979 ("ipmr: add rcu protection over (struct vif_device)->dev")
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b07a9b26
    • Alex Elder's avatar
      net: ipa: properly limit modem routing table use · cf412ec3
      Alex Elder authored
      IPA can route packets between IPA-connected entities.  The AP and
      modem are currently the only such entities supported, and no routing
      is required to transfer packets between them.
      
      The number of entries in each routing table is fixed, and defined at
      initialization time.  Some of these entries are designated for use
      by the modem, and the rest are available for the AP to use.  The AP
      sends a QMI message to the modem which describes (among other
      things) information about routing table memory available for the
      modem to use.
      
      Currently the QMI initialization packet gives wrong information in
      its description of routing tables.  What *should* be supplied is the
      maximum index that the modem can use for the routing table memory
      located at a given location.  The current code instead supplies the
      total *number* of routing table entries.  Furthermore, the modem is
      granted the entire table, not just the subset it's supposed to use.
      
      This patch fixes this.  First, the ipa_mem_bounds structure is
      generalized so its "end" field can be interpreted either as a final
      byte offset, or a final array index.  Second, the IPv4 and IPv6
      (non-hashed and hashed) table information fields in the QMI
      ipa_init_modem_driver_req structure are changed to be ipa_mem_bounds
      rather than ipa_mem_array structures.  Third, we set the "end" value
      for each routing table to be the last index, rather than setting the
      "count" to be the number of indices.  Finally, instead of allowing
      the modem to use all of a routing table's memory, it is limited to
      just the portion meant to be used by the modem.  In all versions of
      IPA currently supported, that is IPA_ROUTE_MODEM_COUNT (8) entries.
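
      The difference between a count and a final index can be sketched in a
      few lines. The struct bounds type and the modem_route_bounds() helper
      below are hypothetical; only the value 8 for the modem's share of the
      table comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors IPA_ROUTE_MODEM_COUNT (8) as stated in the commit message. */
#define ROUTE_MODEM_COUNT 8u

/* Hypothetical bounds pair; "end" is an inclusive final index here. */
struct bounds {
	uint32_t start;
	uint32_t end;
};

/*
 * Describe the modem's portion of a routing table that begins at
 * first_index. The final index is first_index + count - 1, not the count.
 */
static struct bounds modem_route_bounds(uint32_t first_index)
{
	struct bounds b;

	/* Guard against wrap-around when computing the inclusive end. */
	assert(first_index <= UINT32_MAX - (ROUTE_MODEM_COUNT - 1u));

	b.start = first_index;
	b.end = first_index + ROUTE_MODEM_COUNT - 1u;
	return b;
}
```

      Supplying the count where a final index is expected overstates the
      range by one entry, and supplying the whole table's count grants far
      more than the modem's share.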
      
      Update a few comments for clarity.
      
      Fixes: 530f9216 ("soc: qcom: ipa: AP/modem communications")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Link: https://lore.kernel.org/r/20220913204602.1803004-1-elder@linaro.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      cf412ec3
    • Liang He's avatar
      of: mdio: Add of_node_put() when breaking out of for_each_xx · 1c48709e
      Liang He authored
      In of_mdiobus_register(), we should call of_node_put() on 'child'
      when breaking out of for_each_available_child_of_node().
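
      The reference-counting rule behind this can be modelled in userspace.
      In the sketch below, the iterator takes a reference on the node it
      returns and drops the reference on the previous one, which is how the
      kernel's child-node iterators behave. Leaving the loop early therefore
      leaves one reference that the caller must drop. The struct node type
      and all helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical refcounted node, standing in for struct device_node. */
struct node {
	int refcount;
	int wanted;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	if (n) {
		assert(n->refcount > 0);
		n->refcount--;
	}
}

/* Return the child after prev with a reference held; release prev. */
static struct node *next_child(struct node *children, size_t count,
			       struct node *prev)
{
	size_t idx = prev ? (size_t)(prev - children) + 1 : 0;

	node_put(prev);
	return idx < count ? node_get(&children[idx]) : NULL;
}

/* Return the index of the first wanted child, or -1 if there is none. */
static int find_wanted(struct node *children, size_t count)
{
	struct node *child;

	for (child = next_child(children, count, NULL); child;
	     child = next_child(children, count, child)) {
		if (child->wanted) {
			int idx = (int)(child - children);

			/* Early exit: drop the reference the iterator took. */
			node_put(child);
			return idx;
		}
	}
	return -1;
}
```

      When the loop runs to completion, the iterator itself releases the last
      reference, so the explicit put is needed only on the early-exit path.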
      
      Fixes: 66bdede4 ("of_mdio: Fix broken PHY IRQ in case of probe deferral")
      Co-developed-by: Miaoqian Lin <linmq006@gmail.com>
      Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
      Signed-off-by: Liang He <windhl@126.com>
      Link: https://lore.kernel.org/r/20220913125659.3331969-1-windhl@126.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      1c48709e
    • Cong Wang's avatar
      tcp: read multiple skbs in tcp_read_skb() · db4192a7
      Cong Wang authored
      Before we switched to ->read_skb(), ->read_sock() was passed with
      desc.count=1, which technically indicates we only read one skb per
      ->sk_data_ready() call. However, for TCP, this is not true.
      
      TCP at least has sk_rcvlowat, which intentionally holds skbs in the
      receive queue until this watermark is reached. This means that when
      ->sk_data_ready() is invoked there could be multiple skbs in the
      queue, so we have to read multiple skbs in tcp_read_skb() instead
      of one.
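
      The change in reading strategy can be sketched as a drain loop over a
      queue: keep dequeuing and handing packets to the consumer until the
      queue is empty or the consumer reports an error. The struct pkt and
      struct queue types, the actor callback type, and the function names are
      hypothetical userspace stand-ins, not the kernel's skb API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical packet and queue, standing in for skbs on a receive queue. */
struct pkt {
	int len;
	struct pkt *next;
};

struct queue {
	struct pkt *head;
};

/* Consumer callback: returns bytes used, or <= 0 to stop. */
typedef int (*actor_fn)(struct pkt *p);

static struct pkt *dequeue(struct queue *q)
{
	struct pkt *p = q->head;

	if (p) {
		q->head = p->next;
		p->next = NULL;
	}
	return p;
}

/* Example consumer that accepts every packet in full. */
static int consume_len(struct pkt *p)
{
	return p->len;
}

/*
 * Drain every queued packet instead of stopping after the first one.
 * Returns the total bytes consumed; if the very first packet fails,
 * the consumer's return value is passed through.
 */
static int read_all(struct queue *q, actor_fn actor)
{
	struct pkt *p;
	int copied = 0;

	assert(q != NULL && actor != NULL);

	while ((p = dequeue(q)) != NULL) {
		int used = actor(p);

		if (used <= 0) {
			if (!copied)
				copied = used;
			break;
		}
		copied += used;
	}
	return copied;
}
```

      A single wakeup with three packets queued then consumes all three,
      rather than leaving two behind until the next notification.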
      
      Fixes: 965b57b4 ("net: Introduce a new proto_ops ->read_skb()")
      Reported-by: Peilin Ye <peilin.ye@bytedance.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jakub Sitnicki <jakub@cloudflare.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Link: https://lore.kernel.org/r/20220912173553.235838-1-xiyou.wangcong@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      db4192a7