  1. Jan 02, 2023
  2. Dec 22, 2022
  3. Dec 21, 2022
  4. Dec 20, 2022
    • Jakub Kicinski's avatar
      Merge tag 'linux-can-fixes-for-6.2-20221219' of... · 4be84df3
      Jakub Kicinski authored
      
      Merge tag 'linux-can-fixes-for-6.2-20221219' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
      
      Marc Kleine-Budde says:
      
      ====================
      pull-request: can 2022-12-19
      
      The first patch is by Vincent Mailhol and adds the etas_es58x
      devlink documentation to the index.
      
      Haibo Chen's patch for the flexcan driver fixes an unbalanced
      pm_runtime_enable warning.
      
      The last patch is by me, targets the kvaser_usb driver and fixes
      an error occurring with gcc-13.
      
      * tag 'linux-can-fixes-for-6.2-20221219' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
        can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len
        can: flexcan: avoid unbalanced pm_runtime_enable warning
        Documentation: devlink: add missing toc entry for etas_es58x devlink doc
      ====================
      
      Link: https://lore.kernel.org/r/20221219155210.1143439-1-mkl@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      4be84df3
    • Jakub Kicinski's avatar
      Merge branch 'stop-corrupting-socket-s-task_frag' · 918fb1aa
      Jakub Kicinski authored
      Benjamin Coddington says:
      
      ====================
      Stop corrupting socket's task_frag
      
      The networking code uses flags in sk_allocation to determine if it can use
      current->task_frag, however in-kernel users of sockets may stop setting
      sk_allocation when they convert to the preferred memalloc_nofs_save/restore,
      as SUNRPC has done in commit a1231fda ("SUNRPC: Set memalloc_nofs_save()
      on all rpciod/xprtiod jobs").
      
      This will cause corruption in current->task_frag when recursing into the
      network layer for those subsystems during page fault or reclaim.  The
      corruption is difficult to diagnose because stack traces may not contain the
      offending subsystem at all.  The corruption is unlikely to show up in
      testing because it requires memory pressure, and so subsystems that
      convert to memalloc_nofs_save/restore are likely to continue to run into
      this issue.
      
      Previous reports and proposed fixes:
      https://lore.kernel.org/netdev/96a18bd00cbc6cb554603cc0d6ef1c551965b078.1663762494.git.gnault@redhat.com/
      https://lore.kernel.org/netdev/b4d8cb09c913d3e34f853736f3f5628abfd7f4b6.1656699567.git.gnault@redhat.com/
      https://lore.kernel.org/linux-nfs/de6d99321d1dcaa2ad456b92b3680aa77c07a747.1665401788.git.gnault@redhat.com/
      
      Guillaume Nault has done all of the hard work tracking this problem down and
      finding the best fix for this issue.  I'm just taking a turn posting another
      fix.
      ====================
      
      Link: https://lore.kernel.org/r/cover.1671194454.git.bcodding@redhat.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      918fb1aa
    • Benjamin Coddington's avatar
      net: simplify sk_page_frag · 08f65892
      Benjamin Coddington authored
      
      
      Now that in-kernel socket users that may recurse during reclaim have been
      converted to sk_use_task_frag = false, we can have sk_page_frag() simply
      check that value.
      
      Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
      Reviewed-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      08f65892
    • Benjamin Coddington's avatar
      Treewide: Stop corrupting socket's task_frag · 98123866
      Benjamin Coddington authored
      
      
      Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the
      GFP_NOIO flag on sk_allocation which the networking system uses to decide
      when it is safe to use current->task_frag.  The result is unexpected
      corruption in task_frag when SUNRPC is involved in memory reclaim.
      
      The corruption can be seen in crashes, but the root cause is often
      difficult to ascertain as a crashing machine's stack trace will have no
      evidence of being near NFS or SUNRPC code.  I believe this problem to
      be much more pervasive than reports to the community may indicate.
      
      Fix this by having kernel users of sockets that may corrupt task_frag due
      to reclaim set sk_use_task_frag = false.  Preemptively correcting this
      situation for users that still set sk_allocation allows them to convert to
      memalloc_nofs_save/restore without the same unexpected corruptions that are
      sure to follow, unlikely to show up in testing, and difficult to bisect.
      
      CC: Philipp Reisner <philipp.reisner@linbit.com>
      CC: Lars Ellenberg <lars.ellenberg@linbit.com>
      CC: "Christoph Böhmwalder" <christoph.boehmwalder@linbit.com>
      CC: Jens Axboe <axboe@kernel.dk>
      CC: Josef Bacik <josef@toxicpanda.com>
      CC: Keith Busch <kbusch@kernel.org>
      CC: Christoph Hellwig <hch@lst.de>
      CC: Sagi Grimberg <sagi@grimberg.me>
      CC: Lee Duncan <lduncan@suse.com>
      CC: Chris Leech <cleech@redhat.com>
      CC: Mike Christie <michael.christie@oracle.com>
      CC: "James E.J. Bottomley" <jejb@linux.ibm.com>
      CC: "Martin K. Petersen" <martin.petersen@oracle.com>
      CC: Valentina Manea <valentina.manea.m@gmail.com>
      CC: Shuah Khan <shuah@kernel.org>
      CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      CC: David Howells <dhowells@redhat.com>
      CC: Marc Dionne <marc.dionne@auristor.com>
      CC: Steve French <sfrench@samba.org>
      CC: Christine Caulfield <ccaulfie@redhat.com>
      CC: David Teigland <teigland@redhat.com>
      CC: Mark Fasheh <mark@fasheh.com>
      CC: Joel Becker <jlbec@evilplan.org>
      CC: Joseph Qi <joseph.qi@linux.alibaba.com>
      CC: Eric Van Hensbergen <ericvh@gmail.com>
      CC: Latchesar Ionkov <lucho@ionkov.net>
      CC: Dominique Martinet <asmadeus@codewreck.org>
      CC: Ilya Dryomov <idryomov@gmail.com>
      CC: Xiubo Li <xiubli@redhat.com>
      CC: Chuck Lever <chuck.lever@oracle.com>
      CC: Jeff Layton <jlayton@kernel.org>
      CC: Trond Myklebust <trond.myklebust@hammerspace.com>
      CC: Anna Schumaker <anna@kernel.org>
      CC: Steffen Klassert <steffen.klassert@secunet.com>
      CC: Herbert Xu <herbert@gondor.apana.org.au>
      
      Suggested-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
      Reviewed-by: Guillaume Nault <gnault@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      98123866
    • Guillaume Nault's avatar
      net: Introduce sk_use_task_frag in struct sock. · fb87bd47
      Guillaume Nault authored
      
      
      Sockets that can be used while recursing into memory reclaim, like
      those used by network block devices and file systems, mustn't use
      current->task_frag: if the current process is already using it, then
      the inner memory reclaim call would corrupt the task_frag structure.
      
      To avoid this, sk_page_frag() uses ->sk_allocation to detect sockets
      that mustn't use current->task_frag, assuming that those used during
      memory reclaim had their allocation constraints reflected in
      ->sk_allocation.
      
      This unfortunately doesn't cover all cases: in an attempt to remove all
      usage of GFP_NOFS and GFP_NOIO, sunrpc stopped setting these flags in
      ->sk_allocation, and used memalloc_nofs critical sections instead.
      This breaks the sk_page_frag() heuristic since the allocation
      constraints are now stored in current->flags, which sk_page_frag()
      can't read without risking triggering a cache miss and slowing down
      TCP's fast path.
      
      This patch creates a new field in struct sock, named sk_use_task_frag,
      which sockets with memory reclaim constraints can set to false if they
      can't safely use current->task_frag. In such cases, sk_page_frag() now
      always returns the socket's page_frag (->sk_frag). The first user is
      sunrpc, which needs to avoid using current->task_frag but can keep
      ->sk_allocation set to GFP_KERNEL otherwise.
      
      Eventually, it might be possible to simplify sk_page_frag() by only
      testing ->sk_use_task_frag and avoid relying on the ->sk_allocation
      heuristic entirely (assuming other sockets will set ->sk_use_task_frag
      according to their constraints in the future).
      
      The new ->sk_use_task_frag field is placed in a hole in struct sock and
      belongs to a cache line shared with ->sk_shutdown. Therefore it should
      be hot and shouldn't have negative performance impacts on TCP's fast
      path (sk_shutdown is tested just before the while() loop in
      tcp_sendmsg_locked()).
      
      Link: https://lore.kernel.org/netdev/b4d8cb09c913d3e34f853736f3f5628abfd7f4b6.1656699567.git.gnault@redhat.com/
      Signed-off-by: Guillaume Nault <gnault@redhat.com>
      Reviewed-by: Benjamin Coddington <bcodding@redhat.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fb87bd47
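The selection logic described in the commit above can be sketched as a small userspace model. This is an illustration only, not the kernel code: `page_frag_model`, `task_model`, `sock_model` and `sock_model_init()` are hypothetical stand-ins, and the model shows only the flag test, leaving out the ->sk_allocation heuristic that the commit message says is still in use at this point.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel structures; only the fields
 * needed for the illustration are modelled. */
struct page_frag_model {
	int offset;
};

struct task_model {
	struct page_frag_model task_frag;   /* models current->task_frag */
};

struct sock_model {
	struct page_frag_model sk_frag;     /* models sk->sk_frag */
	bool sk_use_task_frag;              /* models the new field */
};

/* Hypothetical helper: assume ordinary sockets start out allowed to
 * use the per-task fragment. */
static void sock_model_init(struct sock_model *sk)
{
	sk->sk_frag.offset = 0;
	sk->sk_use_task_frag = true;
}

/* Pick the page_frag for this socket: the per-task one only when the
 * socket has not opted out, otherwise the socket's own. */
static struct page_frag_model *sk_page_frag_model(struct sock_model *sk,
						  struct task_model *cur)
{
	if (sk->sk_use_task_frag)
		return &cur->task_frag;
	return &sk->sk_frag;
}
```

In this model, a socket used under memory reclaim would clear `sk_use_task_frag` once after initialisation and from then on always receive its own `sk_frag`, so a nested call can never touch the fragment the interrupted task is using.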
    • Matt Johnston's avatar
      mctp: Remove device type check at unregister · b389a902
      Matt Johnston authored
      The unregister check could be incorrectly triggered if a netdev
      changes its type after register. That is possible for a tun device
      using TUNSETLINK ioctl, resulting in mctp unregister failing
      and the netdev unregister waiting forever.
      
      This was encountered by https://github.com/openthread/openthread/issues/8523
      
      Neither check at register or unregister is required. They were added in
      an attempt to track down mctp_ptr being set unexpectedly, which should
      not happen in normal operation.
      
      Fixes: 7b1871af ("mctp: Warn if pointer is set for a wrong dev type")
      Signed-off-by: Matt Johnston <matt@codeconstruct.com.au>
      Link: https://lore.kernel.org/r/20221215054933.2403401-1-matt@codeconstruct.com.au
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      b389a902
    • Arun Ramadoss's avatar
      net: dsa: microchip: remove IRQF_TRIGGER_FALLING in request_threaded_irq · 62e027fb
      Arun Ramadoss authored
      KSZ switches use interrupts to detect PHY link up and down.
      When registering the interrupt handler, the driver passed the
      IRQF_TRIGGER_FALLING flag. The trigger type has to come from the device
      tree instead of being hard-coded in the driver, so remove the flag.
      
      Fixes: ff319a64 ("net: dsa: microchip: move interrupt handling logic from lan937x to ksz_common")
      Reported-by: Christian Eggers <ceggers@arri.de>
      Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
      Link: https://lore.kernel.org/r/20221213101440.24667-1-arun.ramadoss@microchip.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      62e027fb
  5. Dec 19, 2022
    • Marc Kleine-Budde's avatar
      can: kvaser_usb: hydra: help gcc-13 to figure out cmd_len · f0062291
      Marc Kleine-Budde authored
      Debian's gcc-13 [1] throws the following error in
      kvaser_usb_hydra_cmd_size():
      
      [1] gcc version 13.0.0 20221214 (experimental) [master r13-4693-g512098a3316] (Debian 13-20221214-1)
      
      | drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c:502:65: error:
      | array subscript ‘struct kvaser_cmd_ext[0]’ is partly outside array
      | bounds of ‘unsigned char[32]’ [-Werror=array-bounds=]
      |   502 |                 ret = le16_to_cpu(((struct kvaser_cmd_ext *)cmd)->len);
      
      kvaser_usb_hydra_cmd_size() returns the size of given command. It
      depends on the command number (cmd->header.cmd_no). For extended
      commands (cmd->header.cmd_no == CMD_EXTENDED) the above shown code is
      executed.
      
      Help gcc to recognize that this code path is not taken in all cases,
      by calling kvaser_usb_hydra_cmd_size() directly after assigning the
      command number.
      
      Fixes: aec5fb22 ("can: kvaser_usb: Add support for Kvaser USB hydra family")
      Cc: Jimmy Assarsson <extja@kvaser.com>
      Cc: Anssi Hannula <anssi.hannula@bitwise.fi>
      Link: https://lore.kernel.org/all/20221219110104.1073881-1-mkl@pengutronix.de
      Tested-by: Jimmy Assarsson <extja@kvaser.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      f0062291
    • Haibo Chen's avatar
      can: flexcan: avoid unbalanced pm_runtime_enable warning · 3bc2afcb
      Haibo Chen authored
      Suspend/resume triggers the following warning message:
      [   30.028336] flexcan 425b0000.can: Unbalanced pm_runtime_enable!
      
      Balance the pm_runtime_force_suspend() and pm_runtime_force_resume().
      
      Fixes: 8cb53b48 ("can: flexcan: add auto stop mode for IMX93 to support wakeup")
      Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
      Link: https://lore.kernel.org/all/20221213094351.3023858-1-haibo.chen@nxp.com
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      3bc2afcb
    • Vincent Mailhol's avatar
      Documentation: devlink: add missing toc entry for etas_es58x devlink doc · 115dd546
      Vincent Mailhol authored
      The toc entry for the etas_es58x devlink doc is missing, which triggers this warning:
      
        Documentation/networking/devlink/etas_es58x.rst: WARNING: document isn't included in any toctree
      
      Add the missing toc entry.
      
      Fixes: 9f63f96a ("Documentation: devlink: add devlink documentation for the etas_es58x driver")
      Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
      Link: https://lore.kernel.org/all/20221213051136.721887-1-mailhol.vincent@wanadoo.fr
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      115dd546
    • Jeremy Kerr's avatar
      mctp: serial: Fix starting value for frame check sequence · 2856a627
      Jeremy Kerr authored
      RFC1662 defines the start state for the crc16 FCS to be 0xffff, but
      we're currently starting at zero.
      
      This change uses the correct start state. We're only early in the
      adoption for the serial binding, so there aren't yet any other users to
      interface to.
      
      Fixes: a0c2ccd9 ("mctp: Add MCTP-over-serial transport binding")
      Reported-by: Harsh Tyagi <harshtya@google.com>
      Tested-by: Harsh Tyagi <harshtya@google.com>
      Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2856a627
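To illustrate why the start state matters, here is a minimal bitwise FCS-16 sketch using the 0xffff start value that the commit above attributes to RFC 1662. The function name `fcs16_update()` and the macro `FCS16_INIT` are made up for this example, the reflected polynomial 0x8408 is the usual 16-bit CCITT form, and the code is a userspace illustration rather than the driver's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FCS16_INIT 0xffffu   /* start state, per the commit message */

/* Bitwise FCS-16 update using the reflected polynomial 0x8408
 * (x^16 + x^12 + x^5 + 1), consuming each byte least significant
 * bit first. Pass the previous return value to continue a frame. */
static uint16_t fcs16_update(uint16_t fcs, const uint8_t *buf, size_t len)
{
	while (len--) {
		fcs ^= *buf++;
		for (int bit = 0; bit < 8; bit++) {
			if (fcs & 1)
				fcs = (uint16_t)((fcs >> 1) ^ 0x8408);
			else
				fcs = (uint16_t)(fcs >> 1);
		}
	}
	return fcs;
}
```

Feeding the same bytes through `fcs16_update()` with a start value of 0 and with `FCS16_INIT` yields different check values, so two peers only agree on a frame if both use the same start state.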
    • Huanhuan Wang's avatar
      nfp: fix unaligned io read of capabilities word · 1b0c84a3
      Huanhuan Wang authored
      The address of the 32-bit extended capability word is not qword aligned,
      so reading it may cause an exception on some architectures.
      
      Fixes: 484963ce ("nfp: extend capability and control words")
      Signed-off-by: Huanhuan Wang <huanhuan.wang@corigine.com>
      Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      1b0c84a3
    • Eric Dumazet's avatar
      net: stream: purge sk_error_queue in sk_stream_kill_queues() · e0c8bccd
      Eric Dumazet authored
      Changheon Lee reported TCP socket leaks, with a nice repro.
      
      It seems we leak TCP sockets with the following sequence:
      
      1) SOF_TIMESTAMPING_TX_ACK is enabled on the socket.
      
         Each ACK will cook an skb put in error queue, from __skb_tstamp_tx().
         __skb_tstamp_tx() is using skb_clone(), unless
         SOF_TIMESTAMPING_OPT_TSONLY was also requested.
      
      2) If the application is also using MSG_ZEROCOPY, then we put in the
         error queue cloned skbs that had a struct ubuf_info attached to them.
      
         Whenever a struct ubuf_info is allocated, sock_zerocopy_alloc()
         does a sock_hold().
      
         As long as the cloned skbs are still in sk_error_queue,
         socket refcount is kept elevated.
      
      3) Application closes the socket, while error queue is not empty.
      
      Since tcp_close() no longer purges the socket error queue,
      we might end up with a TCP socket with at least one skb in
      error queue keeping the socket alive forever.
      
      This bug can be (ab)used to consume all kernel memory
      and freeze the host.
      
      We need to purge the error queue, with proper synchronization
      against concurrent writers.
      
      Fixes: 24bcbe1c ("net: stream: don't purge sk_error_queue in sk_stream_kill_queues()")
      Reported-by: Changheon Lee <darklight2357@icloud.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e0c8bccd
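The leak sequence described in the commit above can be followed with a toy reference-count model. Everything here is hypothetical (`sock_model`, `queue_error_skb()`, `purge_error_queue()`, `sock_model_close()`), each queued skb is assumed to pin exactly one socket reference, and the synchronization against concurrent writers that the real fix needs is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: a socket with a reference count and a count of skbs in
 * its error queue. Each queued skb is assumed to pin one reference,
 * standing in for the sock_hold() taken for a ubuf_info. */
struct sock_model {
	int refcnt;
	int error_queue_len;
};

static void sock_model_open(struct sock_model *sk)
{
	sk->refcnt = 1;            /* the application's own reference */
	sk->error_queue_len = 0;
}

/* Queue one zerocopy-tagged timestamp skb on the error queue. */
static void queue_error_skb(struct sock_model *sk)
{
	sk->refcnt++;
	sk->error_queue_len++;
}

/* Drop every queued skb together with the reference it pinned. */
static void purge_error_queue(struct sock_model *sk)
{
	sk->refcnt -= sk->error_queue_len;
	sk->error_queue_len = 0;
}

/* Close the socket, optionally purging the error queue first.
 * Returns true if the last reference went away. */
static bool sock_model_close(struct sock_model *sk, bool purge)
{
	if (purge)
		purge_error_queue(sk);
	sk->refcnt--;              /* drop the application's reference */
	return sk->refcnt == 0;
}
```

Closing without the purge leaves the count above zero for as long as the queue is non-empty, which is the "alive forever" case. Closing with the purge lets the count reach zero.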
    • Christophe JAILLET's avatar
      myri10ge: Fix an error handling path in myri10ge_probe() · d83b950d
      Christophe JAILLET authored
      Some memory allocated in myri10ge_probe_slices() is not released in the
      error handling path of myri10ge_probe().
      
      Add the corresponding kfree(), as already done in the remove function.
      
      Fixes: 0dcffac1 ("myri10ge: add multislices support")
      Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d83b950d
    • Horatiu Vultur's avatar
      net: microchip: vcap: Fix initialization of value and mask · 10073399
      Horatiu Vultur authored
      Fix the following smatch warning:
      
      smatch warnings:
      drivers/net/ethernet/microchip/vcap/vcap_api_debugfs.c:103 vcap_debugfs_show_rule_keyfield() error: uninitialized symbol 'value'.
      drivers/net/ethernet/microchip/vcap/vcap_api_debugfs.c:106 vcap_debugfs_show_rule_keyfield() error: uninitialized symbol 'mask'.
      
      In case the vcap field was VCAP_FIELD_U128 and the key was different
      than IP6_S/DIP then the value and mask were not initialized, therefore
      initialize them.
      
      Fixes: 610c32b2 ("net: microchip: vcap: Add vcap_get_rule")
      Reported-by: kernel test robot <lkp@intel.com>
      Reported-by: Dan Carpenter <error27@gmail.com>
      Reviewed-by: Saeed Mahameed <saeed@kernel.org>
      Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
      Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      10073399
    • David S. Miller's avatar
      Merge branch 'rxrpc-fixes' · 98dbec0a
      David S. Miller authored
      
      
      David Howells says:
      
      ====================
      rxrpc: Fixes for I/O thread conversion/SACK table expansion
      
      Here are some fixes for AF_RXRPC:
      
       (1) Fix missing unlock in rxrpc's sendmsg.
      
       (2) Fix (lack of) propagation of security settings to rxrpc_call.
      
       (3) Fix NULL ptr deref in rxrpc_unuse_local().
      
       (4) Fix problem with kthread_run() not invoking the I/O thread function if
           the kthread gets stopped first.  Possibly this should actually be
           fixed in the kthread code.
      
       (5) Fix locking problem as putting a peer (which may be done from RCU) may
           now invoke kthread_stop().
      
       (6) Fix switched parameters in a couple of trace calls.
      
       (7) Fix I/O thread's checking for kthread stop to make sure it completes
           all outstanding work before returning so that calls are cleaned up.
      
       (8) Fix an uninitialised var in the new rxperf test server.
      
       (9) Fix the return value of rxrpc_new_incoming_call() so that the checks
           on it work correctly.
      
      The patches fix at least one syzbot bug[1] and probably some others that
      don't have reproducers[2][3][4].  I think it also fixes another[5], but
      that showed another failure during testing that was different to the
      original.
      
      There's also an outstanding bug in rxrpc_put_peer()[6] that is fixed by a
      combination of several patches in my rxrpc-next branch, but I haven't
      included that here.
      ====================
      
      Tested-by: Marc Dionne <marc.dionne@auristor.com>
      Tested-by: <kafs-testing+fedora36_64checkkafs-build-164@auristor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      98dbec0a
    • David Howells's avatar
      rxrpc: Fix the return value of rxrpc_new_incoming_call() · 31d35a02
      David Howells authored
      Dan Carpenter sayeth[1]:
      
        The patch 5e6ef4f1: "rxrpc: Make the I/O thread take over the
        call and local processor work" from Jan 23, 2020, leads to the
        following Smatch static checker warning:
      
      	net/rxrpc/io_thread.c:283 rxrpc_input_packet()
      	warn: bool is not less than zero.
      
      Fix this (for now) by changing rxrpc_new_incoming_call() to return an int
      with 0 or error code rather than bool.  Note that the actual return value
      of rxrpc_input_packet() is currently ignored.  I have a separate patch to
      clean that up.
      
      Fixes: 5e6ef4f1 ("rxrpc: Make the I/O thread take over the call and local processor work")
      Reported-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: http://lists.infradead.org/pipermail/linux-afs/2022-December/006123.html [1]
      Signed-off-by: David S. Miller <davem@davemloft.net>
      31d35a02
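The "bool is not less than zero" warning in the commit above comes down to a C conversion rule: any nonzero value returned through a bool becomes 1, so a negative error code can never be seen by a `< 0` check. A minimal sketch follows, with hypothetical function names that only mirror the shape of the problem and are not the rxrpc code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical pre-fix shape: the bool return type squashes any
 * nonzero value, including a negative errno, down to true (1). */
static bool new_call_bool(bool fail)
{
	int err = fail ? -ENOMEM : 0;

	return err;   /* implicit int -> bool conversion loses the sign */
}

/* Hypothetical post-fix shape: 0 on success or a negative error code. */
static int new_call_int(bool fail)
{
	return fail ? -ENOMEM : 0;
}

/* The caller's check, written against a plain int. */
static bool is_error(int ret)
{
	return ret < 0;
}
```

With the bool variant, `is_error()` can never fire because the caller only ever receives 0 or 1. With the int variant, the error code survives and the check behaves as intended.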
    • David Howells's avatar
      rxrpc: rxperf: Fix uninitialised variable · 11e1706b
      David Howells authored
      Dan Carpenter sayeth[1]:
      
        The patch 75bfdbf2: "rxrpc: Implement an in-kernel rxperf server
        for testing purposes" from Nov 3, 2022, leads to the following Smatch
        static checker warning:
      
      	net/rxrpc/rxperf.c:337 rxperf_deliver_to_call()
      	error: uninitialized symbol 'ret'.
      
      Fix this by initialising ret to 0.  The value is only used for tracing
      purposes in the rxperf server.
      
      Fixes: 75bfdbf2 ("rxrpc: Implement an in-kernel rxperf server for testing purposes")
      Reported-by: Dan Carpenter <error27@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: http://lists.infradead.org/pipermail/linux-afs/2022-December/006124.html [1]
      Signed-off-by: David S. Miller <davem@davemloft.net>
      11e1706b
    • David Howells's avatar
      rxrpc: Fix I/O thread stop · 743d1768
      David Howells authored
      The rxrpc I/O thread checks to see if there's any work it needs to do, and
      if not, checks kthread_should_stop() before scheduling, and if it should
      stop, breaks out of the loop and tries to clean up and exit.
      
      This can, however, race with socket destruction, wherein outstanding calls
      are aborted and released from the socket and then the socket unuses the
      local endpoint, causing kthread_stop() to be issued.  The abort is deferred
      to the I/O thread and the event can be issued between the I/O thread
      checking if there's any work to be done (such as processing call aborts)
      and the stop being seen.
      
      This results in the I/O thread stopping processing of events whilst call
      cleanup events are still outstanding, leading to connections or other
      objects still being around and uncleaned up, which can result in assertions
      being triggered, e.g.:
      
          rxrpc: AF_RXRPC: Leaked client conn 00000000e8009865 {2}
          ------------[ cut here ]------------
          kernel BUG at net/rxrpc/conn_client.c:64!
      
      Fix this by retrieving the kthread_should_stop() indication, then checking
      to see if there's more work to do, and going back round the loop if there
      is, and breaking out of the loop only if there wasn't.
      
      This was triggered by a syzbot test that produced some other symptom[1].
      
      Fixes: a275da62 ("rxrpc: Create a per-local endpoint receive queue and I/O thread")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/0000000000002b4a9f05ef2b616f@google.com/ [1]
      Signed-off-by: David S. Miller <davem@davemloft.net>
      743d1768
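The ordering problem in the commit above can be reproduced in a single-threaded model by injecting the racing event at a fixed point in the loop. All names here are hypothetical (`io_model`, `race_window()`, `io_loop_old()`, `io_loop_new()`). The model spins instead of sleeping, so it only terminates when a stop is eventually requested, which the armed race guarantees.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded model of the I/O thread's state. */
struct io_model {
	int pending;          /* events still to process */
	int processed;        /* events handled so far */
	bool stop_requested;  /* stands in for kthread_should_stop() */
	bool race_armed;      /* fire the racing event exactly once */
};

/* Simulate socket destruction racing with the thread: a cleanup
 * event is queued and the stop is issued right after it. */
static void race_window(struct io_model *io)
{
	if (io->race_armed) {
		io->race_armed = false;
		io->pending++;
		io->stop_requested = true;
	}
}

/* Old ordering: look for work, then look at the stop flag. */
static void io_loop_old(struct io_model *io)
{
	for (;;) {
		if (io->pending > 0) {
			io->pending--;
			io->processed++;
			continue;
		}
		race_window(io);        /* event and stop land here */
		if (io->stop_requested)
			break;          /* exits with work outstanding */
	}
}

/* Fixed ordering: sample the stop flag first, then look for work,
 * and leave only if the sampled flag was set and no work was found. */
static void io_loop_new(struct io_model *io)
{
	for (;;) {
		bool should_stop = io->stop_requested;

		if (io->pending > 0) {
			io->pending--;
			io->processed++;
			continue;
		}
		race_window(io);        /* same race point as above */
		if (should_stop)
			break;
	}
}
```

In the old ordering the loop exits with the injected event still pending. In the fixed ordering the stop is only honoured on a pass where it was already visible before the work check, so the event is processed first.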
    • David Howells's avatar
      rxrpc: Fix switched parameters in peer tracing · c838f1a7
      David Howells authored
      Fix the switched parameters on rxrpc_alloc_peer() and rxrpc_get_peer().
      The ref argument and the why argument got mixed.
      
      Fixes: 47c810a7 ("rxrpc: trace: Don't use __builtin_return_address for rxrpc_peer tracing")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c838f1a7
    • David Howells's avatar
      rxrpc: Fix locking issues in rxrpc_put_peer_locked() · 608aecd1
      David Howells authored
      Now that rxrpc_put_local() may call kthread_stop(), it can't be called
      under spinlock as it might sleep.  This can cause a problem in the peer
      keepalive code in rxrpc as it tries to avoid dropping the peer_hash_lock
      from the point it needs to re-add peer->keepalive_link to going round the
      loop again in rxrpc_peer_keepalive_dispatch().
      
      Fix this by just dropping the lock when we don't need it and accepting that
      we'll have to take it again.  This code is only called about every 20s for
      each peer, so not very often.
      
      This allows rxrpc_put_peer_unlocked() to be removed also.
      
      If triggered, this bug produces an oops like the following, as reproduced
      by a syzbot reproducer for a different oops[1]:
      
      BUG: sleeping function called from invalid context at kernel/sched/completion.c:101
      ...
      RCU nest depth: 0, expected: 0
      3 locks held by kworker/u9:0/50:
       #0: ffff88810e74a138 ((wq_completion)krxrpcd){+.+.}-{0:0}, at: process_one_work+0x294/0x636
       #1: ffff8881013a7e20 ((work_completion)(&rxnet->peer_keepalive_work)){+.+.}-{0:0}, at: process_one_work+0x294/0x636
       #2: ffff88817d366390 (&rxnet->peer_hash_lock){+.+.}-{2:2}, at: rxrpc_peer_keepalive_dispatch+0x2bd/0x35f
      ...
      Call Trace:
       <TASK>
       dump_stack_lvl+0x4c/0x5f
       __might_resched+0x2cf/0x2f2
       __wait_for_common+0x87/0x1e8
       kthread_stop+0x14d/0x255
       rxrpc_peer_keepalive_dispatch+0x333/0x35f
       rxrpc_peer_keepalive_worker+0x2e9/0x449
       process_one_work+0x3c1/0x636
       worker_thread+0x25f/0x359
       kthread+0x1a6/0x1b5
       ret_from_fork+0x1f/0x30
      
      Fixes: a275da62 ("rxrpc: Create a per-local endpoint receive queue and I/O thread")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/0000000000002b4a9f05ef2b616f@google.com/ [1]
      Signed-off-by: David S. Miller <davem@davemloft.net>
      608aecd1
    • David Howells's avatar
      rxrpc: Fix I/O thread startup getting skipped · 8fbcc833
      David Howells authored
      When starting a kthread, the __kthread_create_on_node() function, as called
      from kthread_run(), waits for a completion to indicate that the task_struct
      (or failure state) of the new kernel thread is available before continuing.
      
      This does not wait, however, for the thread function to be invoked and,
      indeed, will skip it if kthread_stop() gets called before it gets there.
      
      If this happens, though, kthread_run() will have returned successfully,
      indicating that the thread was started and returning the task_struct
      pointer.  The actual error indication is returned by kthread_stop().
      
      Note that this is ambiguous, as the caller cannot tell whether the -EINTR
      error code came from kthread() or from the thread function.
      
      This was encountered in the new rxrpc I/O thread, where if the system is
      being pounded hard by, say, syzbot, the check of KTHREAD_SHOULD_STOP can be
      delayed long enough for kthread_stop() to get called when rxrpc releases a
      socket - and this causes an oops because the I/O thread function doesn't
      get started and thus doesn't remove the rxrpc_local struct from the
      local_endpoints list.
      
      Fix this by using a completion to wait for the thread to actually enter
      rxrpc_io_thread().  This makes sure the thread can't be prematurely
      stopped and makes sure the relied-upon cleanup is done.
      
      Fixes: a275da62 ("rxrpc: Create a per-local endpoint receive queue and I/O thread")
      Reported-by: <syzbot+3538a6a72efa8b059c38@syzkaller.appspotmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: Hillf Danton <hdanton@sina.com>
      Link: https://lore.kernel.org/r/000000000000229f1505ef2b6159@google.com/
      Signed-off-by: David S. Miller <davem@davemloft.net>
      8fbcc833
    • David Howells's avatar
      rxrpc: Fix NULL deref in rxrpc_unuse_local() · eaa02390
      David Howells authored
      Fix rxrpc_unuse_local() to get the debug_id *after* checking to see if
      local is NULL.
      
      Fixes: a2cf3264 ("rxrpc: Fold __rxrpc_unuse_local() into rxrpc_unuse_local()")
      Reported-by: <syzbot+3538a6a72efa8b059c38@syzkaller.appspotmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Tested-by: <syzbot+3538a6a72efa8b059c38@syzkaller.appspotmail.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      eaa02390
    • David Howells's avatar
      rxrpc: Fix security setting propagation · fdb99487
      David Howells authored
      Fix the propagation of the security settings from sendmsg to the rxrpc_call
      struct.
      
      Fixes: f3441d41 ("rxrpc: Copy client call parameters into rxrpc_call earlier")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fdb99487
    • rxrpc: Fix missing unlock in rxrpc_do_sendmsg() · 4feb2c44
      David Howells authored
      One of the error paths in rxrpc_do_sendmsg() doesn't unlock the call mutex
      before returning.  Fix it to do this.
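
      A common way to rule out this class of bug is to route every exit taken
      after the lock through a single unlock label. The sketch below is a
      user-space illustration under that assumption: toy_mutex, toy_call and
      toy_sendmsg() are hypothetical, and the error conditions are invented
      for the example rather than taken from rxrpc_do_sendmsg().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy lock that only records whether it is held; not a real mutex. */
struct toy_mutex { int held; };

static void toy_lock(struct toy_mutex *m)   { assert(!m->held); m->held = 1; }
static void toy_unlock(struct toy_mutex *m) { assert(m->held);  m->held = 0; }

/* Hypothetical call object; user_mutex mirrors the name in the commit text. */
struct toy_call {
	struct toy_mutex user_mutex;
	long tx_total_len;	/* bytes the caller promised to send, -1 if unset */
};

/*
 * Send len bytes on a call. Every exit taken after toy_lock() goes through
 * the out_unlock label, so no error path can return with the mutex held.
 */
int toy_sendmsg(struct toy_call *call, long len)
{
	int ret;

	toy_lock(&call->user_mutex);

	if (len < 0) {
		ret = -EINVAL;
		goto out_unlock;
	}
	if (call->tx_total_len >= 0 && len > call->tx_total_len) {
		ret = -EMSGSIZE;
		goto out_unlock;
	}

	if (call->tx_total_len >= 0)
		call->tx_total_len -= len;
	ret = 0;

out_unlock:
	toy_unlock(&call->user_mutex);
	return ret;
}
```

      A bare "return ret;" inside either error branch would leave user_mutex
      held, which is the kind of imbalance the commit fixes.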
      
      Note that this still doesn't get rid of the checker warning:
      
         ../net/rxrpc/sendmsg.c:617:5: warning: context imbalance in 'rxrpc_do_sendmsg' - wrong count at exit
      
      I think the interplay between the socket lock and the call's user_mutex may
      be too complicated for checker to analyse, especially as
      rxrpc_new_client_call_for_sendmsg(), which it calls, returns with the
      call's user_mutex held if successful but unconditionally drops the socket lock.
      
      Fixes: e754eba6 ("rxrpc: Provide a cmsg to specify the amount of Tx data for a call")
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4feb2c44
    • net_sched: reject TCF_EM_SIMPLE case for complex ematch module · 9cd3fd20
      Cong Wang authored
      When TCF_EM_SIMPLE was introduced, it was meant as a convenience
      for ematch implementations:
      
      https://lore.kernel.org/all/20050105110048.GO26856@postel.suug.ch/
      
      "You don't have to, providing a 32bit data chunk without TCF_EM_SIMPLE
      set will simply result in allocating & copy. It's an optimization,
      nothing more."
      
      So if an ematch module provides ops->datalen that means it wants a
      complex data structure (saved in its em->data) instead of a simple u32
      value. We should simply reject such a combination, otherwise this u32
      could be misinterpreted as a pointer.
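
      The rejected combination can be sketched as a small validation helper.
      This is an illustration only: EM_FLAG_SIMPLE, struct em_ops and
      em_validate() are hypothetical names, and the length checks are
      assumptions made for the example, not the kernel's ematch code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EM_FLAG_SIMPLE 0x1u	/* hypothetical bit standing in for TCF_EM_SIMPLE */

/* Hypothetical, trimmed-down ops table: datalen > 0 means "I want a struct". */
struct em_ops {
	size_t datalen;
};

/*
 * Validate one ematch request before any data is stored.
 * Returns 0 if the combination is acceptable, -EINVAL otherwise.
 */
int em_validate(const struct em_ops *ops, unsigned int flags, size_t data_len)
{
	assert(ops != NULL);

	if (flags & EM_FLAG_SIMPLE) {
		/*
		 * A module that declares datalen expects a pointer to its own
		 * structure; an inline u32 would later be read as a pointer.
		 */
		if (ops->datalen > 0)
			return -EINVAL;
		/* The simple form must carry at least a 32-bit value. */
		if (data_len < sizeof(uint32_t))
			return -EINVAL;
		return 0;
	}

	/* Complex form: the payload must cover what the module declared. */
	if (data_len < ops->datalen)
		return -EINVAL;
	return 0;
}
```

      With this check in place, a module that declares datalen never sees a
      bare u32 where it expects a pointer to its own data.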
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Reported-and-tested-by: syzbot+4caeae4c7103813598ae@syzkaller.appspotmail.com
      Reported-by: Jun Nie <jun.nie@linaro.org>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Cong Wang <cong.wang@bytedance.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9cd3fd20
    • Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 89529367
      David S. Miller authored
      
      
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-12-15 (igc)
      
      Muhammad Husaini Zulkifli says:
      
      This patch series fixes bugs in the Time-Sensitive Networking (TSN)
      Qbv scheduling features.

      An overview of each patch is given below:

      Patch 1: Use a first flag bit to schedule a packet to the next cycle if
      the packet cannot fit in the current Qbv cycle.
      Patch 2: Enable strict cycle for Qbv scheduling.
      Patch 3: Prevent the user from setting a basetime less than zero during
      tc config.
      Patch 4: Allow basetime enrollment with a zero value.
      Patch 5: Calculate the new end time value to exclude any time interval
      that exceeds the cycle time, as the user can specify the cycle time in
      tc config.
      Patch 6: Resolve the HW bugs where the gate is not fully closed.
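
      The basetime and end-time items in the list above can be pictured with
      a short sketch. It is written from the one-line summaries only: the
      gate_entry layout and both helper names are hypothetical, and the igc
      driver may compute these values differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical schedule entry: a gate stays open for interval_ns. */
struct gate_entry {
	uint32_t interval_ns;
};

/* Patches 3 and 4 in spirit: negative base times are refused, zero is fine. */
int check_base_time(int64_t base_time_ns)
{
	return base_time_ns < 0 ? -EINVAL : 0;
}

/*
 * Patch 5 in spirit: sum the intervals up to and including entry idx, then
 * clamp the result so that nothing extends past the cycle time.
 */
uint32_t gate_end_time(const struct gate_entry *entries, size_t n,
		       size_t idx, uint32_t cycle_time_ns)
{
	uint64_t end = 0;
	size_t i;

	assert(idx < n);
	for (i = 0; i <= idx; i++)
		end += entries[i].interval_ns;

	return end > cycle_time_ns ? cycle_time_ns : (uint32_t)end;
}
```

      For intervals of 300, 500 and 400 microseconds in a 1 ms cycle, the
      third entry would nominally end at 1.2 ms and is clamped to 1 ms.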
      ---
      This contains the net patches from this original pull request:
      https://lore.kernel.org/netdev/20221205212414.3197525-1-anthony.l.nguyen@intel.com/
      
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
      89529367
  6. Dec 17, 2022
    • skbuff: Account for tail adjustment during pull operations · 2d7afdcb
      Subash Abhinov Kasiviswanathan authored
      Extending the tail can have some unexpected side effects if a program uses
      a helper like BPF_FUNC_skb_pull_data to read partial content beyond the
      head skb headlen when all the skbs in the gso frag_list are linear with no
      head_frag -
      
        kernel BUG at net/core/skbuff.c:4219!
        pc : skb_segment+0xcf4/0xd2c
        lr : skb_segment+0x63c/0xd2c
        Call trace:
         skb_segment+0xcf4/0xd2c
         __udp_gso_segment+0xa4/0x544
         udp4_ufo_fragment+0x184/0x1c0
         inet_gso_segment+0x16c/0x3a4
         skb_mac_gso_segment+0xd4/0x1b0
         __skb_gso_segment+0xcc/0x12c
         udp_rcv_segment+0x54/0x16c
         udp_queue_rcv_skb+0x78/0x144
         udp_unicast_rcv_skb+0x8c/0xa4
         __udp4_lib_rcv+0x490/0x68c
         udp_rcv+0x20/0x30
         ip_protocol_deliver_rcu+0x1b0/0x33c
         ip_local_deliver+0xd8/0x1f0
         ip_rcv+0x98/0x1a4
         deliver_ptype_list_skb+0x98/0x1ec
         __netif_receive_skb_core+0x978/0xc60
      
      Fix this by marking these skbs as GSO_DODGY so segmentation can handle
      the tail updates accordingly.
      
      Fixes: 3dcbdb13 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list")
      Signed-off-by: Sean Tranchetti <quic_stranche@quicinc.com>
      Signed-off-by: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Link: https://lore.kernel.org/r/1671084718-24796-1-git-send-email-quic_subashab@quicinc.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2d7afdcb
    • devlink: protect devlink dump by the instance lock · 214964a1
      Jakub Kicinski authored
      Take the instance lock around devlink_nl_fill() when dumping;
      doit already takes it.

      We are only dumping basic info, so in the worst case we were risking
      data races around the reload statistics. Until the big devlink mutex
      was removed, all relevant code was protected by it, so the missing
      instance lock was not exposed.
      
      Fixes: d3efc2a6 ("net: devlink: remove devlink_mutex")
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Link: https://lore.kernel.org/r/20221216044122.1863550-1-kuba@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      214964a1
    • net: ethernet: ti: am65-cpsw: fix CONFIG_PM #ifdef · 078838f5
      Arnd Bergmann authored
      The #ifdef check is incorrect and leads to a warning:
      
      drivers/net/ethernet/ti/am65-cpsw-nuss.c:1679:13: error: 'am65_cpsw_nuss_remove_rx_chns' defined but not used [-Werror=unused-function]
       1679 | static void am65_cpsw_nuss_remove_rx_chns(void *data)
      
      It's better to remove the #ifdef here and use the modern
      SYSTEM_SLEEP_PM_OPS() macro instead.
      
      Fixes: 24bc19b0 ("net: ethernet: ti: am65-cpsw: Add suspend/resume support")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
      Link: https://lore.kernel.org/r/20221215163918.611609-1-arnd@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      078838f5
    • Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 13e3c779
      Jakub Kicinski authored
      
      
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2022-12-16
      
      We've added 7 non-merge commits during the last 2 day(s) which contain
      a total of 9 files changed, 119 insertions(+), 36 deletions(-).
      
      1) Fix for recent syzkaller XDP dispatcher update splat, from Jiri Olsa.
      
      2) Fix BPF program refcount leak in LSM attachment failure path,
         from Milan Landaverde.
      
      3) Fix BPF program type in map compatibility check for fext,
         from Toke Høiland-Jørgensen.
      
      4) Fix a BPF selftest compilation error under !CONFIG_SMP config,
         from Yonghong Song.
      
      5) Fix CI to enable CONFIG_FUNCTION_ERROR_INJECTION after it got changed
         to a prompt, from Song Liu.
      
      6) Various BPF documentation fixes for socket local storage,
         from Donald Hunter.
      
      * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
        selftests/bpf: Add a test for using a cpumap from an freplace-to-XDP program
        bpf: Resolve fext program type when checking map compatibility
        bpf: Synchronize dispatcher update with bpf_dispatcher_xdp_func
        bpf: prevent leak of lsm program after failed attach
        selftests/bpf: Select CONFIG_FUNCTION_ERROR_INJECTION
        selftests/bpf: Fix a selftest compilation error with CONFIG_SMP=n
        docs/bpf: Reword docs for BPF_MAP_TYPE_SK_STORAGE
      ====================
      
      Link: https://lore.kernel.org/r/20221216174540.16598-1-daniel@iogearbox.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      13e3c779
  7. Dec 16, 2022
    • openvswitch: Fix flow lookup to use unmasked key · 68bb1010
      Eelco Chaudron authored
      The commit mentioned below causes the ovs_flow_tbl_lookup() function
      to be called with the masked key. However, it is supposed to be called
      with the unmasked key, because the datapath supports installing wider
      flows and OVS relies on this behavior. For example, if
      ipv4(src=1.1.1.1/192.0.0.0,dst=1.1.1.2/192.0.0.0) exists, a wider
      flow (smaller mask) of
      ipv4(src=192.1.1.1/128.0.0.0,dst=192.1.1.2/128.0.0.0) is allowed to
      be added.
      
      However, if we try to add a wildcard rule, the installation fails:
      
      $ ovs-appctl dpctl/add-flow system@myDP "in_port(1),eth_type(0x0800), \
        ipv4(src=1.1.1.1/192.0.0.0,dst=1.1.1.2/192.0.0.0,frag=no)" 2
      $ ovs-appctl dpctl/add-flow system@myDP "in_port(1),eth_type(0x0800), \
        ipv4(src=192.1.1.1/0.0.0.0,dst=49.1.1.2/0.0.0.0,frag=no)" 2
      ovs-vswitchd: updating flow table (File exists)
      
      The reason is that the key used to determine if the flow is already
      present in the system uses the original key ANDed with the mask.
      This results in the IP address not being part of the (miniflow) key,
      i.e., being substituted with an all-zero value. When doing the actual
      lookup, this results in the key wrongfully matching the first flow,
      and therefore the flow does not get installed.
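
      The arithmetic behind the false match can be checked with a small
      sketch that models only the IPv4 source and destination fields.
      toy_flow, ip4(), make_flow() and flow_matches() are hypothetical
      helpers, not the openvswitch datapath structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow: only IPv4 source and destination, each with a mask. */
struct toy_flow {
	uint32_t src, dst;		/* values already ANDed with the masks */
	uint32_t src_mask, dst_mask;
};

/* Pack dotted-quad octets into a 32-bit value, most significant octet first. */
uint32_t ip4(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
	return ((uint32_t)a << 24) | ((uint32_t)b << 16) | ((uint32_t)c << 8) | d;
}

/* Build a flow entry, storing the key in its masked form. */
struct toy_flow make_flow(uint32_t src, uint32_t src_mask,
			  uint32_t dst, uint32_t dst_mask)
{
	struct toy_flow f = {
		.src = src & src_mask, .dst = dst & dst_mask,
		.src_mask = src_mask, .dst_mask = dst_mask,
	};
	return f;
}

/* A lookup key matches a flow if it equals the flow's key under the flow's mask. */
bool flow_matches(const struct toy_flow *f, uint32_t src, uint32_t dst)
{
	assert(f != NULL);
	return (src & f->src_mask) == f->src && (dst & f->dst_mask) == f->dst;
}
```

      With the first flow installed as 1.1.1.1/192.0.0.0, its stored source
      is 0.0.0.0. A lookup with the wildcard rule's masked key (also 0.0.0.0)
      matches it, while a lookup with the unmasked 192.1.1.1 leaves
      192.0.0.0 under the first flow's mask and does not.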
      
      This change reverses the commit below, but rather than having the key
      on the stack, it's allocated.
      
      Fixes: 190aa3e7 ("openvswitch: Fix Frame-size larger than 1024 bytes warning.")

      Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      68bb1010