  1. Dec 21, 2023
    • atm: solos-pci: Fix potential deadlock on &tx_queue_lock · 7cfbb8be
      Chengfeng Ye authored
      [ Upstream commit 15319a4e ]
      
      As &card->tx_queue_lock is acquired in softirq context along the
      following call chain from solos_bh(), any other acquisition of the
      same lock in process context should disable at least bottom halves
      (bh) to avoid a double lock.
      
      <deadlock #2>
      pclose()
      --> spin_lock(&card->tx_queue_lock)
      <interrupt>
         --> solos_bh()
         --> fpga_tx()
         --> spin_lock(&card->tx_queue_lock)
      
      This flaw was found by an experimental static analysis tool I am
      developing for irq-related deadlocks.
      
      To prevent the potential deadlock, the patch consistently uses
      spin_lock_bh() on &card->tx_queue_lock in process-context code.
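
The scenario above can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct model_cpu` and every `model_*` function are hypothetical stand-ins for a single CPU, the lock, and the softirq path (solos_bh() -> fpga_tx()). It assumes the simplified rule that a softirq runs right after an interrupt unless bottom halves are disabled, in which case it runs when they are re-enabled.

```c
#include <stdbool.h>

/* Hypothetical single-CPU model; none of these names exist in the kernel. */
struct model_cpu {
    bool lock_held;        /* stands in for card->tx_queue_lock */
    int  bh_disable_depth; /* > 0 means softirqs are deferred */
    bool softirq_pending;  /* softirq raised while bh was disabled */
    bool deadlocked;       /* softirq tried to take a lock held on this CPU */
    int  softirq_runs;     /* completed softirq handler runs */
};

/* Softirq side: needs the same lock as the process-context code. */
static void model_softirq(struct model_cpu *cpu)
{
    if (cpu->lock_held) {
        /* A real spinlock would spin forever here on the same CPU. */
        cpu->deadlocked = true;
        return;
    }
    cpu->softirq_runs++;
}

/* An interrupt arrives; its softirq runs at once unless bh is disabled. */
static void model_interrupt(struct model_cpu *cpu)
{
    if (cpu->bh_disable_depth > 0)
        cpu->softirq_pending = true;
    else
        model_softirq(cpu);
}

/* disable_bh == true models spin_lock_bh(), false models spin_lock(). */
static void model_lock(struct model_cpu *cpu, bool disable_bh)
{
    if (disable_bh)
        cpu->bh_disable_depth++;
    cpu->lock_held = true;
}

/* Releasing with bh re-enabled runs any softirq that was deferred. */
static void model_unlock(struct model_cpu *cpu, bool disable_bh)
{
    cpu->lock_held = false;
    if (disable_bh && --cpu->bh_disable_depth == 0 && cpu->softirq_pending) {
        cpu->softirq_pending = false;
        model_softirq(cpu);
    }
}

/* Process-context critical section (the pclose() side), interrupted midway. */
static bool model_pclose(struct model_cpu *cpu, bool disable_bh)
{
    model_lock(cpu, disable_bh);
    model_interrupt(cpu);
    model_unlock(cpu, disable_bh);
    return cpu->deadlocked;
}
```

In this model, `model_pclose(&cpu, false)` reports the double lock, while `model_pclose(&cpu, true)` defers the softirq until the lock has been released.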
      
      Fixes: 213e85d3 ("solos-pci: clean up pclose() function")
      Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • atm: solos-pci: Fix potential deadlock on &cli_queue_lock · 35c63d36
      Chengfeng Ye authored
      [ Upstream commit d5dba32b ]
      
      As &card->cli_queue_lock is acquired in softirq context along the
      following call chain from solos_bh(), any other acquisition of the
      same lock in process context should disable at least bottom halves
      (bh) to avoid a double lock.
      
      <deadlock #1>
      console_show()
      --> spin_lock(&card->cli_queue_lock)
      <interrupt>
         --> solos_bh()
         --> spin_lock(&card->cli_queue_lock)
      
      This flaw was found by an experimental static analysis tool I am
      developing for irq-related deadlocks.
      
      To prevent the potential deadlock, the patch consistently uses
      spin_lock_bh() on &card->cli_queue_lock in process-context code.
      
      Fixes: 9c54004e ("atm: Driver for Solos PCI ADSL2+ card.")
      Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic · 525904a1
      Michael Chan authored
      [ Upstream commit c13e268c ]
      
      When the chip is configured to timestamp all receive packets, the
      timestamp in the RX completion is only valid if the metadata
      present flag is not set for packets received on the wire.  In
      addition, internal loopback packets will never have a valid timestamp
      and the timestamp field will always be zero.  We must exclude
      any 0 value in the timestamp field because there is no way to
      determine if it is a loopback packet or not.
      
      Add a new function bnxt_rx_ts_valid() to check for all timestamp
      valid conditions.
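
The validity conditions described above can be sketched as a standalone predicate. This is a minimal illustration, not the driver's bnxt_rx_ts_valid(): the flag macro, its bit position, and the function name are hypothetical, and only the "timestamp all RX packets" case from the text is modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit; the real completion flag name and value are not given here. */
#define SKETCH_RX_FLAG_METADATA_PRESENT (1u << 0)

/*
 * Returns true only when the RX completion timestamp can be trusted:
 * the metadata-present flag must be clear, and the timestamp must be
 * non-zero because a zero value may come from an internal loopback packet.
 */
static bool sketch_rx_ts_valid(uint32_t cmpl_flags, uint32_t ts)
{
    if (cmpl_flags & SKETCH_RX_FLAG_METADATA_PRESENT)
        return false;
    return ts != 0;
}
```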
      
      Fixes: 66ed81dc ("bnxt_en: Enable packet timestamping for all RX packets")
      Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
      Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20231208001658.14230-5-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bnxt_en: Fix wrong return value check in bnxt_close_nic() · ac612517
      Kalesh AP authored
      [ Upstream commit bd6781c1 ]
      
      The wait_event_interruptible_timeout() function returns 0
      if the timeout elapsed, -ERESTARTSYS if it was interrupted
      by a signal, and the remaining jiffies otherwise if the
      condition evaluated to true before the timeout elapsed.
      
      The driver should have checked for a zero return value instead of
      a positive value.
      
      MChan: Print a warning for -ERESTARTSYS.  The close operation
      will proceed anyway when wait_event_interruptible_timeout()
      returns for any reason.  Since we do the close no matter what,
      we should not return this error code to the caller.  Change
      bnxt_close_nic() to a void function and remove all error
      handling from some of the callers.
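
The three return-value cases listed above can be sketched as a small classifier. This is an illustration only: the enum, the function name, and `SKETCH_ERESTARTSYS` are hypothetical (the value 512 is assumed to mirror the kernel-internal ERESTARTSYS), and the real driver acts on the result inline rather than through a helper like this.

```c
/* Assumed to mirror the kernel-internal ERESTARTSYS value. */
#define SKETCH_ERESTARTSYS 512

enum sketch_wait_result {
    SKETCH_WAIT_CONDITION_MET, /* positive: remaining jiffies */
    SKETCH_WAIT_TIMED_OUT,     /* zero: timeout elapsed */
    SKETCH_WAIT_INTERRUPTED    /* negative: interrupted by a signal */
};

/* Classify the return value of wait_event_interruptible_timeout(). */
static enum sketch_wait_result sketch_classify_wait(long rc)
{
    if (rc == 0)
        return SKETCH_WAIT_TIMED_OUT;
    if (rc < 0)
        return SKETCH_WAIT_INTERRUPTED;
    return SKETCH_WAIT_CONDITION_MET;
}
```

A caller following the commit message would warn on the timeout and interrupted cases and then continue with the close either way.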
      
      Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.")
      Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
      Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com>
      Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
      Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20231208001658.14230-4-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bnxt_en: Save ring error counters across reset · 8217f936
      Michael Chan authored
      [ Upstream commit 4c70dbe3 ]
      
      Currently, the ring counters are stored in the per-ring data structure.
      During reset, all the rings are freed together with the associated
      data structures.  As a result, all the ring error counters will be reset
      to zero.
      
      Add logic to keep track of the total error counts of all the rings
      and save them before reset (including ifdown).  The next patch will
      display these total ring error counters under ethtool -S.
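
The idea of saving totals before the rings are freed can be sketched as follows. This is a simplified model under stated assumptions: the struct layouts, field names, and function names are hypothetical and do not reflect the driver's actual counters.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-ring counters that are lost when the ring is freed. */
struct sketch_ring {
    uint64_t rx_discards;
    uint64_t tx_errors;
};

/* Hypothetical device-wide totals that survive a reset. */
struct sketch_saved_totals {
    uint64_t rx_discards;
    uint64_t tx_errors;
};

/* Call before the rings are freed (reset or ifdown): fold counters into the totals. */
static void sketch_save_ring_errors(struct sketch_saved_totals *saved,
                                    const struct sketch_ring *rings, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        saved->rx_discards += rings[i].rx_discards;
        saved->tx_errors += rings[i].tx_errors;
    }
}

/* Reported value = totals saved from earlier resets + counters of the live rings. */
static uint64_t sketch_total_rx_discards(const struct sketch_saved_totals *saved,
                                         const struct sketch_ring *rings, size_t n)
{
    uint64_t total = saved->rx_discards;

    for (size_t i = 0; i < n; i++)
        total += rings[i].rx_discards;
    return total;
}
```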
      
      Link: https://lore.kernel.org/netdev/CACKFLimD-bKmJ1tGZOLYRjWzEwxkri-Mw7iFme1x2Dr0twdCeg@mail.gmail.com/
      Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
      Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
      Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20230817231911.165035-5-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Stable-dep-of: bd6781c1 ("bnxt_en: Fix wrong return value check in bnxt_close_nic()")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bnxt_en: Clear resource reservation during resume · 53cacb8c
      Somnath Kotur authored
      [ Upstream commit 9ef7c58f ]
      
      We are issuing HWRM_FUNC_RESET cmd to reset the device including
      all reserved resources, but not clearing the reservations
      within the driver struct. As a result, when the driver re-initializes
      as part of resume, it believes that there is no need to do any
      resource reservation and goes ahead and tries to allocate rings
      which will eventually fail beyond a certain number pre-reserved by
      the firmware.
      
      Fixes: 674f50a5 ("bnxt_en: Implement new method to reserve rings.")
      Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
      Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
      Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
      Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
      Signed-off-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/20231208001658.14230-2-michael.chan@broadcom.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • qca_spi: Fix reset behavior · ab410db6
      Stefan Wahren authored
      [ Upstream commit 1057812d ]
      
      In case of a reset triggered by the QCA7000 itself, the behavior of the
      qca_spi driver was not quite correct:
      - if an RX frame decoding is pending, the drop counter must be
        incremented and the decoding state machine reset
      - the reset counter must always be incremented, regardless of sync
        state
      
      Fixes: 291ab06e ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
      Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
      Link: https://lore.kernel.org/r/20231206141222.52029-4-wahrenst@gmx.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • qca_debug: Fix ethtool -G iface tx behavior · 7e177e5a
      Stefan Wahren authored
      [ Upstream commit 96a7e861 ]
      
      After calling ethtool -g it was not possible to adjust the TX ring
      size again:
      
        # ethtool -g eth1
        Ring parameters for eth1:
        Pre-set maximums:
        RX:		4
        RX Mini:	n/a
        RX Jumbo:	n/a
        TX:		10
        Current hardware settings:
        RX:		4
        RX Mini:	n/a
        RX Jumbo:	n/a
        TX:		10
        # ethtool -G eth1 tx 8
        netlink error: Invalid argument
      
      The reason for this is that the read-only setting rx_pending gets
      initialized, after which the range check in qcaspi_set_ringparam()
      fails regardless of the provided parameter. So fix this by accepting
      the exposed RX defaults. Instead of adding another magic number,
      use a new define here.
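
The effect of the range check can be sketched in userspace as below. This is an assumption-laden illustration, not the driver source: the struct, the function names, and the macro `SKETCH_RX_MAX_FRAMES` are hypothetical (its value 4 is taken from the "RX: 4" line in the ethtool output above), and the old check is modelled as rejecting any non-zero RX value.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical define; the value matches "RX: 4" in the ethtool output above. */
#define SKETCH_RX_MAX_FRAMES 4

/* Hypothetical subset of the ring parameters passed in from ethtool. */
struct sketch_ringparam {
    uint32_t rx_pending;
    uint32_t rx_mini_pending;
    uint32_t rx_jumbo_pending;
    uint32_t tx_pending;
};

/* Modelled old behaviour: any non-zero RX field is rejected. */
static int sketch_old_rx_check(const struct sketch_ringparam *ring)
{
    if (ring->rx_pending || ring->rx_mini_pending || ring->rx_jumbo_pending)
        return -EINVAL;
    return 0;
}

/* Modelled fix: accept the exposed, read-only RX default. */
static int sketch_new_rx_check(const struct sketch_ringparam *ring)
{
    if (ring->rx_pending != SKETCH_RX_MAX_FRAMES ||
        ring->rx_mini_pending || ring->rx_jumbo_pending)
        return -EINVAL;
    return 0;
}
```

With a request such as `ethtool -G eth1 tx 8`, userspace echoes back the RX value it previously read (4), so the modelled old check fails while the new one passes.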
      
      Fixes: 291ab06e ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
      Suggested-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
      Link: https://lore.kernel.org/r/20231206141222.52029-3-wahrenst@gmx.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • qca_debug: Prevent crash on TX ring changes · 2127142c
      Stefan Wahren authored
      [ Upstream commit f4e6064c ]
      
      The qca_spi driver stops and restarts the SPI kernel thread
      (via ndo_stop & ndo_open) in case of TX ring changes. This is
      a big issue because it allows userspace to prevent the restart of
      the SPI kernel thread (via signals). A subsequent change of the
      TX ring wrongly assumes a valid spi_thread pointer, which results
      in a crash.
      
      So prevent this by stopping the network traffic handling and
      temporarily parking the SPI thread.
      
      Fixes: 291ab06e ("net: qualcomm: new Ethernet over SPI driver for QCA7000")
      Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
      Link: https://lore.kernel.org/r/20231206141222.52029-2-wahrenst@gmx.net
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: ipv6: support reporting otherwise unknown prefix flags in RTM_NEWPREFIX · 0da41ddf
      Maciej Żenczykowski authored
      [ Upstream commit bd4a8167 ]
      
      Lorenzo points out that we effectively clear all unknown
      flags from PIO when copying them to userspace in the netlink
      RTM_NEWPREFIX notification.
      
      We could fix this one at a time as new flags are defined,
      or in one fell swoop - I choose the latter.
      
      We could either define 6 new reserved flags (reserved1..6) and handle
      them individually (and rename them as new flags are defined), or we
      could simply copy the entire unmodified byte over - I choose the latter.
      
      This unfortunately requires some anonymous union/struct magic,
      so we add a static assert on the struct size for a little extra safety.
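
The "anonymous union/struct" approach with a size assertion can be sketched as follows. This is a simplified stand-in, not the kernel's prefix option struct: the type and field names are hypothetical, the bit order of bit-fields is implementation-defined (the sketch does not depend on it), and `uint8_t` bit-fields rely on common compiler support (GCC, Clang).

```c
#include <stdint.h>

/* Hypothetical one-byte flags field, viewable as a whole byte or as named bits. */
struct sketch_prefix_flags {
    union {
        uint8_t flags;            /* entire byte, copied unmodified */
        struct {
            uint8_t known_a  : 1; /* stand-ins for the two defined flags */
            uint8_t known_b  : 1;
            uint8_t reserved : 6; /* not yet defined, still preserved */
        };
    };
};

/* Extra safety: the overlay must not change the size of the field. */
_Static_assert(sizeof(struct sketch_prefix_flags) == 1,
               "flags overlay must stay exactly one byte");

/* Copying the whole byte keeps unknown/reserved bits intact. */
static uint8_t sketch_copy_flags(const struct sketch_prefix_flags *pio)
{
    return pio->flags;
}
```

Copying `flags` rather than the named bits is what lets flags defined in the future reach userspace without further kernel changes.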
      
      Cc: David Ahern <dsahern@kernel.org>
      Cc: Lorenzo Colitti <lorenzo@google.com>
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Maciej Żenczykowski <maze@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/mlx5e: Fix possible deadlock on mlx5e_tx_timeout_work · 51423249
      Moshe Shemesh authored
      [ Upstream commit eab0da38 ]
      
      Due to the cited patch, devlink health commands take devlink lock and
      this may result in deadlock for mlx5e_tx_reporter as it takes local
      state_lock before calling devlink health report and on the other hand
      devlink health commands such as diagnose for same reporter take local
      state_lock after taking devlink lock (see kernel log below).
      
      To fix it, remove local state_lock from mlx5e_tx_timeout_work() before
      calling devlink_health_report() and take care to cancel the work before
      any call to close channels, which may free the SQs that should be
      handled by the work. Before cancel_work_sync(), use current_work() to
      check we are not calling it from within the work, as
      mlx5e_tx_timeout_work() itself may close the channels and reopen as part
      of recovery flow.
      
      While removing state_lock from mlx5e_tx_timeout_work() keep rtnl_lock to
      ensure no change in netdev->real_num_tx_queues, but use rtnl_trylock()
      and a flag to avoid deadlock by calling cancel_work_sync() before
      closing the channels while holding rtnl_lock too.
      
      Kernel log:
      ======================================================
      WARNING: possible circular locking dependency detected
      6.0.0-rc3_for_upstream_debug_2022_08_30_13_10 #1 Not tainted
      ------------------------------------------------------
      kworker/u16:2/65 is trying to acquire lock:
      ffff888122f6c2f8 (&devlink->lock_key#2){+.+.}-{3:3}, at: devlink_health_report+0x2f1/0x7e0
      
      but task is already holding lock:
      ffff888121d20be0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_tx_timeout_work+0x70/0x280 [mlx5_core]
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (&priv->state_lock){+.+.}-{3:3}:
             __mutex_lock+0x12c/0x14b0
             mlx5e_rx_reporter_diagnose+0x71/0x700 [mlx5_core]
             devlink_nl_cmd_health_reporter_diagnose_doit+0x212/0xa50
             genl_family_rcv_msg_doit+0x1e9/0x2f0
             genl_rcv_msg+0x2e9/0x530
             netlink_rcv_skb+0x11d/0x340
             genl_rcv+0x24/0x40
             netlink_unicast+0x438/0x710
             netlink_sendmsg+0x788/0xc40
             sock_sendmsg+0xb0/0xe0
             __sys_sendto+0x1c1/0x290
             __x64_sys_sendto+0xdd/0x1b0
             do_syscall_64+0x3d/0x90
             entry_SYSCALL_64_after_hwframe+0x46/0xb0
      
      -> #0 (&devlink->lock_key#2){+.+.}-{3:3}:
             __lock_acquire+0x2c8a/0x6200
             lock_acquire+0x1c1/0x550
             __mutex_lock+0x12c/0x14b0
             devlink_health_report+0x2f1/0x7e0
             mlx5e_health_report+0xc9/0xd7 [mlx5_core]
             mlx5e_reporter_tx_timeout+0x2ab/0x3d0 [mlx5_core]
             mlx5e_tx_timeout_work+0x1c1/0x280 [mlx5_core]
             process_one_work+0x7c2/0x1340
             worker_thread+0x59d/0xec0
             kthread+0x28f/0x330
             ret_from_fork+0x1f/0x30
      
      other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
             CPU0                    CPU1
             ----                    ----
        lock(&priv->state_lock);
                                     lock(&devlink->lock_key#2);
                                     lock(&priv->state_lock);
        lock(&devlink->lock_key#2);
      
       *** DEADLOCK ***
      
      4 locks held by kworker/u16:2/65:
       #0: ffff88811a55b138 ((wq_completion)mlx5e#2){+.+.}-{0:0}, at: process_one_work+0x6e2/0x1340
       #1: ffff888101de7db8 ((work_completion)(&priv->tx_timeout_work)){+.+.}-{0:0}, at: process_one_work+0x70f/0x1340
       #2: ffffffff84ce8328 (rtnl_mutex){+.+.}-{3:3}, at: mlx5e_tx_timeout_work+0x53/0x280 [mlx5_core]
       #3: ffff888121d20be0 (&priv->state_lock){+.+.}-{3:3}, at: mlx5e_tx_timeout_work+0x70/0x280 [mlx5_core]
      
      stack backtrace:
      CPU: 1 PID: 65 Comm: kworker/u16:2 Not tainted 6.0.0-rc3_for_upstream_debug_2022_08_30_13_10 #1
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      Workqueue: mlx5e mlx5e_tx_timeout_work [mlx5_core]
      Call Trace:
       <TASK>
       dump_stack_lvl+0x57/0x7d
       check_noncircular+0x278/0x300
       ? print_circular_bug+0x460/0x460
       ? find_held_lock+0x2d/0x110
       ? __stack_depot_save+0x24c/0x520
       ? alloc_chain_hlocks+0x228/0x700
       __lock_acquire+0x2c8a/0x6200
       ? register_lock_class+0x1860/0x1860
       ? kasan_save_stack+0x1e/0x40
       ? kasan_set_free_info+0x20/0x30
       ? ____kasan_slab_free+0x11d/0x1b0
       ? kfree+0x1ba/0x520
       ? devlink_health_do_dump.part.0+0x171/0x3a0
       ? devlink_health_report+0x3d5/0x7e0
       lock_acquire+0x1c1/0x550
       ? devlink_health_report+0x2f1/0x7e0
       ? lockdep_hardirqs_on_prepare+0x400/0x400
       ? find_held_lock+0x2d/0x110
       __mutex_lock+0x12c/0x14b0
       ? devlink_health_report+0x2f1/0x7e0
       ? devlink_health_report+0x2f1/0x7e0
       ? mutex_lock_io_nested+0x1320/0x1320
       ? trace_hardirqs_on+0x2d/0x100
       ? bit_wait_io_timeout+0x170/0x170
       ? devlink_health_do_dump.part.0+0x171/0x3a0
       ? kfree+0x1ba/0x520
       ? devlink_health_do_dump.part.0+0x171/0x3a0
       devlink_health_report+0x2f1/0x7e0
       mlx5e_health_report+0xc9/0xd7 [mlx5_core]
       mlx5e_reporter_tx_timeout+0x2ab/0x3d0 [mlx5_core]
       ? lockdep_hardirqs_on_prepare+0x400/0x400
       ? mlx5e_reporter_tx_err_cqe+0x1b0/0x1b0 [mlx5_core]
       ? mlx5e_tx_reporter_timeout_dump+0x70/0x70 [mlx5_core]
       ? mlx5e_tx_reporter_dump_sq+0x320/0x320 [mlx5_core]
       ? mlx5e_tx_timeout_work+0x70/0x280 [mlx5_core]
       ? mutex_lock_io_nested+0x1320/0x1320
       ? process_one_work+0x70f/0x1340
       ? lockdep_hardirqs_on_prepare+0x400/0x400
       ? lock_downgrade+0x6e0/0x6e0
       mlx5e_tx_timeout_work+0x1c1/0x280 [mlx5_core]
       process_one_work+0x7c2/0x1340
       ? lockdep_hardirqs_on_prepare+0x400/0x400
       ? pwq_dec_nr_in_flight+0x230/0x230
       ? rwlock_bug.part.0+0x90/0x90
       worker_thread+0x59d/0xec0
       ? process_one_work+0x1340/0x1340
       kthread+0x28f/0x330
       ? kthread_complete_and_exit+0x20/0x20
       ret_from_fork+0x1f/0x30
       </TASK>
      
      Fixes: c90005b5 ("devlink: Hold the instance lock in health callbacks")
      Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • HID: lenovo: Restrict detection of patched firmware only to USB cptkbd · 1e8396aa
      Mikhail Khvainitski authored
      [ Upstream commit 43527a00 ]
      
      Commit 46a0a2c9 ("HID: lenovo: Detect quirk-free fw on cptkbd and
      stop applying workaround") introduced a regression for the ThinkPad
      TrackPoint Keyboard II, which has quirks similar to cptkbd (so it uses
      the same workarounds) but slightly different, so that there are
      false positives when detecting well-behaving firmware. This commit
      restricts detection of well-behaving firmware to the only model which
      is known to have it and whose quirks are stable enough not to cause
      false positives.
      
      Fixes: 46a0a2c9 ("HID: lenovo: Detect quirk-free fw on cptkbd and stop applying workaround")
      Link: https://lore.kernel.org/linux-input/ZXRiiPsBKNasioqH@jekhomev/
      Link: https://bbs.archlinux.org/viewtopic.php?pid=2135468#p2135468
      Signed-off-by: Mikhail Khvainitski <me@khvoinitsky.org>
      Tested-by: Yauhen Kharuzhy <jekhor@gmail.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • afs: Fix refcount underflow from error handling race · e0cda159
      David Howells authored
      [ Upstream commit 52bf9f6c ]
      
      If an AFS cell has an unreachable (e.g. ENETUNREACH) server listed (VL
      server or fileserver), an asynchronous probe to one of its addresses may
      fail immediately because sendmsg() returns an error.  When this happens, a
      refcount underflow can happen if certain events hit a very small window.
      
      The way this occurs is:
      
       (1) There are two levels of "call" object, the afs_call and the
           rxrpc_call.  Each of them can be transitioned to a "completed" state
           in the event of success or failure.
      
       (2) Asynchronous afs_calls are self-referential whilst they are active to
           prevent them from evaporating when they're not being processed.  This
           reference is disposed of when the afs_call is completed.
      
           Note that an afs_call may only be completed once; once completed
           completing it again will do nothing.
      
       (3) When a call transmission is made, the app-side rxrpc code queues a Tx
           buffer for the rxrpc I/O thread to transmit.  The I/O thread invokes
           sendmsg() to transmit it - and in the case of failure, it transitions
           the rxrpc_call to the completed state.
      
       (4) When an rxrpc_call is completed, the app layer is notified.  In this
           case, the app is kafs and it schedules a work item to process events
           pertaining to an afs_call.
      
       (5) When the afs_call event processor is run, it goes down through the
           RPC-specific handler to afs_extract_data() to retrieve data from rxrpc
           - and, in this case, it picks up the error from the rxrpc_call and
           returns it.
      
           The error is then propagated to the afs_call and that is completed
           too.  At this point the self-reference is released.
      
       (6) If the rxrpc I/O thread manages to complete the rxrpc_call within the
           window between rxrpc_send_data() queuing the request packet and
           checking for call completion on the way out, then
           rxrpc_kernel_send_data() will return the error from sendmsg() to the
           app.
      
       (7) Then afs_make_call() will see an error and will jump to the error
           handling path which will attempt to clean up the afs_call.
      
       (8) The problem comes when the error handling path in afs_make_call()
           tries to unconditionally drop an async afs_call's self-reference.
           This self-reference, however, may already have been dropped by
           afs_extract_data() completing the afs_call.
      
       (9) The refcount underflows when we return to afs_do_probe_vlserver() and
           that tries to drop its reference on the afs_call.
      
      Fix this by making afs_make_call() attempt to complete the afs_call rather
      than unconditionally putting it.  That way, if afs_extract_data() manages
      to complete the call first, afs_make_call() won't do anything.
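
The difference between "unconditionally put" and "attempt to complete" can be sketched with a single-threaded model. This is only an illustration: the struct and all `sketch_*` names are hypothetical, the reference count is a plain int rather than the kernel's refcount_t, and the race is replayed as a fixed sequence of steps instead of real concurrency.

```c
#include <stdbool.h>

/* Hypothetical model of an async call holding a self-reference. */
struct sketch_call {
    int  refs;      /* caller's reference + self-reference */
    bool async;
    bool completed;
};

static void sketch_put(struct sketch_call *call)
{
    call->refs--;
}

/* Completion happens at most once; only the first one drops the self-reference. */
static bool sketch_complete(struct sketch_call *call)
{
    if (call->completed)
        return false;
    call->completed = true;
    if (call->async)
        sketch_put(call);
    return true;
}

/*
 * Replays steps (5) to (9): the event processor completes the call first,
 * then the send path's error handling runs, then the caller drops its own
 * reference. Returns the final reference count.
 */
static int sketch_error_path(bool fixed)
{
    struct sketch_call call = { .refs = 2, .async = true, .completed = false };

    sketch_complete(&call);      /* event processor picks up the error */
    if (fixed)
        sketch_complete(&call);  /* fix: a second completion is a no-op */
    else
        sketch_put(&call);       /* bug: self-reference dropped twice */
    sketch_put(&call);           /* caller drops its reference */
    return call.refs;
}
```

In this model the fixed path ends at zero references, while the unfixed path goes negative, which corresponds to the underflow warning shown below.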
      
      The bug can be forced by making do_udp_sendmsg() return -ENETUNREACH and
      sticking an msleep() in rxrpc_send_data() after the 'success:' label to
      widen the race window.
      
      The error message looks something like:
      
          refcount_t: underflow; use-after-free.
          WARNING: CPU: 3 PID: 720 at lib/refcount.c:28 refcount_warn_saturate+0xba/0x110
          ...
          RIP: 0010:refcount_warn_saturate+0xba/0x110
          ...
          afs_put_call+0x1dc/0x1f0 [kafs]
          afs_fs_get_capabilities+0x8b/0xe0 [kafs]
          afs_fs_probe_fileserver+0x188/0x1e0 [kafs]
          afs_lookup_server+0x3bf/0x3f0 [kafs]
          afs_alloc_server_list+0x130/0x2e0 [kafs]
          afs_create_volume+0x162/0x400 [kafs]
          afs_get_tree+0x266/0x410 [kafs]
          vfs_get_tree+0x25/0xc0
          fc_mount+0xe/0x40
          afs_d_automount+0x1b3/0x390 [kafs]
          __traverse_mounts+0x8f/0x210
          step_into+0x340/0x760
          path_openat+0x13a/0x1260
          do_filp_open+0xaf/0x160
          do_sys_openat2+0xaf/0x170
      
      or something like:
      
          refcount_t: underflow; use-after-free.
          ...
          RIP: 0010:refcount_warn_saturate+0x99/0xda
          ...
          afs_put_call+0x4a/0x175
          afs_send_vl_probes+0x108/0x172
          afs_select_vlserver+0xd6/0x311
          afs_do_cell_detect_alias+0x5e/0x1e9
          afs_cell_detect_alias+0x44/0x92
          afs_validate_fc+0x9d/0x134
          afs_get_tree+0x20/0x2e6
          vfs_get_tree+0x1d/0xc9
          fc_mount+0xe/0x33
          afs_d_automount+0x48/0x9d
          __traverse_mounts+0xe0/0x166
          step_into+0x140/0x274
          open_last_lookups+0x1c1/0x1df
          path_openat+0x138/0x1c3
          do_filp_open+0x55/0xb4
          do_sys_openat2+0x6c/0xb6
      
      Fixes: 34fa4761 ("afs: Fix race in async call refcounting")
      Reported-by: Bill MacAllister <bill@ca-zephyr.org>
      Closes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1052304
      Suggested-by: Jeffrey E Altman <jaltman@auristor.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Jeffrey Altman <jaltman@auristor.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/2633992.1702073229@warthog.procyon.org.uk/ # v1
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ksmbd: fix memory leak in smb2_lock() · a7e6477c
      Zizhi Wo authored
      [ Upstream commit 8f175272 ]
      
      In smb2_lock(), if setup_async_work() executes successfully,
      work->cancel_argv will be bound to the argv allocated by kmalloc(), and
      release_async_work() is called in ksmbd_conn_try_dequeue_request() or
      smb2_lock() to release argv.
      However, when setup_async_work() fails, work->cancel_argv has not
      been bound to the argv, so the previously allocated argv is never
      released. Call kfree() to fix it.
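
The ownership rule behind this fix can be sketched in userspace with malloc()/free(). This is a minimal model, not ksmbd code: every `sketch_*` name is hypothetical, and a live-allocation counter stands in for leak detection.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical work item; cancel_argv owns the buffer once it is bound. */
struct sketch_work {
    void **cancel_argv;
};

static int sketch_live_allocs; /* allocations not yet freed */

static void **sketch_alloc_argv(void)
{
    void **argv = malloc(sizeof(*argv));

    if (argv)
        sketch_live_allocs++;
    return argv;
}

static void sketch_free_argv(void **argv)
{
    if (argv) {
        free(argv);
        sketch_live_allocs--;
    }
}

/* Stand-in for the setup step: binds argv to the work item only on success. */
static int sketch_setup_async(struct sketch_work *work, void **argv, bool fail)
{
    if (fail)
        return -1;
    work->cancel_argv = argv;
    return 0;
}

/* Stand-in for the release step: frees whatever was bound. */
static void sketch_release_async(struct sketch_work *work)
{
    sketch_free_argv(work->cancel_argv);
    work->cancel_argv = NULL;
}

static int sketch_lock_request(bool setup_fails)
{
    struct sketch_work work = { NULL };
    void **argv = sketch_alloc_argv();

    if (!argv)
        return -1;
    if (sketch_setup_async(&work, argv, setup_fails)) {
        sketch_free_argv(argv); /* the fix: argv was never bound, free it here */
        return -1;
    }
    sketch_release_async(&work);
    return 0;
}
```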
      
      Fixes: e2f34481 ("cifsd: add server-side procedures for SMB3")
      Signed-off-by: Zizhi Wo <wozizhi@huawei.com>
      Acked-by: Namjae Jeon <linkinjeon@kernel.org>
      Signed-off-by: Steve French <stfrench@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ext4: fix warning in ext4_dio_write_end_io() · 8925ab33
      Jan Kara authored
      [ Upstream commit 619f75da ]
      
      The syzbot has reported that it can hit the warning in
      ext4_dio_write_end_io() because i_size < i_disksize. Indeed the
      reproducer creates a race between DIO IO completion and truncate
      expanding the file and thus ext4_dio_write_end_io() sees an inconsistent
      inode state where i_disksize is already updated but i_size is not
      updated yet. Since we are careful when setting up DIO write and consider
      it extending (and thus performing the IO synchronously with i_rwsem held
      exclusively) whenever it goes past either of i_size or i_disksize, we
      can use the same test during IO completion without risking entering
      ext4_handle_inode_extension() without i_rwsem held. This way we make it
      obvious both i_size and i_disksize are large enough when we report DIO
      completion without relying on unreliable WARN_ON.
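
The shared "is this write extending?" test can be sketched as a small predicate. This is an illustration under assumptions: the struct and function names are hypothetical, plain integers replace the kernel's i_size_read() and locked accesses, and the same predicate is meant to be evaluated both at write setup and at I/O completion.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two sizes tracked per inode. */
struct sketch_inode {
    int64_t i_size;     /* in-memory file size */
    int64_t i_disksize; /* size recorded on disk */
};

/*
 * A write counts as extending when it ends past either size. Using the
 * same test at setup and at completion keeps the two decisions consistent
 * even if a concurrent truncate has updated only one of the sizes.
 */
static bool sketch_write_is_extending(const struct sketch_inode *inode,
                                      int64_t pos, int64_t len)
{
    int64_t end = pos + len;

    return end > inode->i_size || end > inode->i_disksize;
}
```

For example, with i_disksize already raised to 8192 but i_size still 4096, a write ending at 4096 is not extending, while one ending at 5120 is.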
      
      Reported-by: <syzbot+47479b71cdfc78f56d30@syzkaller.appspotmail.com>
      Fixes: 91562895 ("ext4: properly sync file size update after O_SYNC direct IO")
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
      Link: https://lore.kernel.org/r/20231130095653.22679-1-jack@suse.cz
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • powerpc/ftrace: Fix stack teardown in ftrace_no_trace · 1c077acf
      Naveen N Rao authored
      [ Upstream commit 4b3338aa ]
      
      Commit 41a506ef ("powerpc/ftrace: Create a dummy stackframe to fix
      stack unwind") added use of a new stack frame on ftrace entry to fix
      stack unwind. However, the commit missed updating the offset used while
      tearing down the ftrace stack when ftrace is disabled. Fix the same.
      
      In addition, the commit missed saving the correct stack pointer in
      pt_regs. Update the same.
      
      Fixes: 41a506ef ("powerpc/ftrace: Create a dummy stackframe to fix stack unwind")
      Cc: stable@vger.kernel.org # v6.5+
      Signed-off-by: Naveen N Rao <naveen@kernel.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://msgid.link/20231130065947.2188860-1-naveen@kernel.org
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • r8152: add vendor/device ID pair for ASUS USB-C2500 · 34ae53cc
      Kelly Kane authored
      [ Upstream commit 7037d95a ]
      
      The ASUS USB-C2500 is an RTL8156 based 2.5G Ethernet controller.
      
      Add the vendor and product ID values to the driver. This makes Ethernet
      work with the adapter.
      
      Signed-off-by: Kelly Kane <kelly@hawknetworks.com>
      Link: https://lore.kernel.org/r/20231203011712.6314-1-kelly@hawknetworks.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • r8152: add vendor/device ID pair for D-Link DUB-E250 · cac1218b
      Antonio Napolitano authored
      [ Upstream commit 72f93a31 ]
      
      The D-Link DUB-E250 is an RTL8156 based 2.5G Ethernet controller.
      
      Add the vendor and product ID values to the driver. This makes Ethernet
      work with the adapter.
      
      Signed-off-by: Antonio Napolitano <anton@polit.no>
      Link: https://lore.kernel.org/r/CV200KJEEUPC.WPKAHXCQJ05I@mercurius
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Stable-dep-of: 7037d95a ("r8152: add vendor/device ID pair for ASUS USB-C2500")
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • r8152: add USB device driver for config selection · 893597cb
      Bjørn Mork authored
      [ Upstream commit ec51fbd1 ]
      
      Subclass the generic USB device driver to override the default
      configuration selection, regardless of matching interface
      drivers.
      
      The r815x family devices expose a vendor specific function which
      the r8152 interface driver wants to handle.  This is the preferred
      device mode. Additionally one or more USB class functions are
      usually supported for hosts lacking a vendor specific driver. The
      choice is USB configuration based, with one alternate function per
      configuration.
      
      Example device with both NCM and ECM alternate cfgs:
      
      T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  4 Spd=5000 MxCh= 0
      D:  Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs=  3
      P:  Vendor=0bda ProdID=8156 Rev=31.00
      S:  Manufacturer=Realtek
      S:  Product=USB 10/100/1G/2.5G LAN
      S:  SerialNumber=001000001
      C:* #Ifs= 1 Cfg#= 1 Atr=a0 MxPwr=256mA
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=00 Driver=r8152
      E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=83(I) Atr=03(Int.) MxPS=   2 Ivl=128ms
      C:  #Ifs= 2 Cfg#= 2 Atr=a0 MxPwr=256mA
      I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=0d Prot=00 Driver=
      E:  Ad=83(I) Atr=03(Int.) MxPS=  16 Ivl=128ms
      I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=01 Driver=
      I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=
      E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      C:  #Ifs= 2 Cfg#= 3 Atr=a0 MxPwr=256mA
      I:  If#= 0 Alt= 0 #EPs= 1 Cls=02(comm.) Sub=06 Prot=00 Driver=
      E:  Ad=83(I) Atr=03(Int.) MxPS=  16 Ivl=128ms
      I:  If#= 1 Alt= 0 #EPs= 0 Cls=0a(data ) Sub=00 Prot=00 Driver=
      I:  If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=
      E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      
      A problem with this is that Linux will prefer class functions over
      vendor specific functions. Using the above example, Linux defaults
      to cfg #2, running the device in a sub-optimal NCM mode.
      
      Previously we've attempted to work around the problem by
      blacklisting the devices in the ECM class driver "cdc_ether", and
      matching on the ECM class function in the vendor specific interface
      driver. The latter has been used to switch back to the vendor
      specific configuration when the driver is probed for a class
      function.
      
      This workaround has several issues:
      - class driver blacklists are additional maintenance cruft in an
        unrelated driver
      - class driver blacklists prevent users from optionally running
        the devices in class mode
      - each device needs double match entries in the vendor driver
      - the initial probing as a class function slows down device
        discovery
      
      Now these issues have become even worse with the introduction of
      firmware supporting both NCM and ECM, where NCM ends up as the
      default mode in Linux. To use the same workaround, we now have
      to blacklist the devices in two different class drivers and
      add yet another match entry to the vendor specific driver.
      
      This patch implements an alternative workaround strategy -
      independent of the interface drivers.  It avoids adding a
      blacklist to the cdc_ncm driver and will let us remove the
      existing blacklist from the cdc_ether driver.
      
      As an additional bonus, removing the blacklists allows users to
      select one of the other device modes if wanted.
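
      The selection problem and the override can be modelled in a few lines
      of userspace C. This is a minimal sketch under stated assumptions, not
      the driver code: `choose_generic()` is a simplified model of the
      default preference for class functions (the first configuration whose
      first interface is not vendor specific), and `choose_vendor_first()`
      illustrates overriding it. All names are hypothetical. The table
      mirrors the Cfg# and interface 0 Cls= values of the example device
      listed in this commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CLASS_COMM_SKETCH        0x02 /* communications class (NCM, ECM) */
#define CLASS_VENDOR_SPEC_SKETCH 0xff /* vendor specific class */

/* One USB configuration, reduced to what the selection needs. */
struct cfg_sketch {
    uint8_t cfg_value;      /* Cfg# in the listing */
    uint8_t first_if_class; /* Cls= of interface 0 */
};

/* The example device: cfg 1 vendor specific, cfg 2 NCM, cfg 3 ECM. */
const struct cfg_sketch example_cfgs[3] = {
    { 1, CLASS_VENDOR_SPEC_SKETCH },
    { 2, CLASS_COMM_SKETCH },
    { 3, CLASS_COMM_SKETCH },
};

/*
 * Simplified model of the generic default: take the first configuration
 * whose first interface is NOT vendor specific, else the first one.
 */
uint8_t choose_generic(const struct cfg_sketch *cfgs, size_t n)
{
    assert(cfgs != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        if (cfgs[i].first_if_class != CLASS_VENDOR_SPEC_SKETCH)
            return cfgs[i].cfg_value;
    }
    return cfgs[0].cfg_value;
}

/*
 * Override: prefer the vendor specific configuration when one exists,
 * otherwise fall back to the generic choice.
 */
uint8_t choose_vendor_first(const struct cfg_sketch *cfgs, size_t n)
{
    assert(cfgs != NULL && n > 0);
    for (size_t i = 0; i < n; i++) {
        if (cfgs[i].first_if_class == CLASS_VENDOR_SPEC_SKETCH)
            return cfgs[i].cfg_value;
    }
    return choose_generic(cfgs, n);
}
```

      With the example table, the generic model lands on cfg #2 (NCM) and
      the override lands on cfg #1. Because the decision is made at the
      device level, no class driver needs a blacklist entry for it.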
      
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf/x86/uncore: Don't WARN_ON_ONCE() for a broken discovery table · b80d0c6e
      Kan Liang authored
      commit 5d515ee4 upstream.
      
      The following kernel warning is triggered when SPR MCC is used:
      
      [   17.945331] ------------[ cut here ]------------
      [   17.946305] WARNING: CPU: 65 PID: 1 at
      arch/x86/events/intel/uncore_discovery.c:184
      intel_uncore_has_discovery_tables+0x4c0/0x65c
      [   17.946305] Modules linked in:
      [   17.946305] CPU: 65 PID: 1 Comm: swapper/0 Not tainted
      5.4.17-2136.313.1-X10-2c+ #4
      
      It's caused by the broken discovery table of UPI.
      
      The discovery tables come from hardware. Apart from dropping the broken
      information, there is nothing Linux can do. Using WARN_ON_ONCE() is
      overkill.

      Use pr_info() instead of WARN_ON_ONCE(), and report which uncore unit
      is dropped and why.
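
      The pattern can be sketched in userspace C: validate the entry, print
      an informational line naming the unit and the reason, and skip it,
      rather than raising a warning with a stack trace. This is only an
      illustration. The struct, its fields, and the zero-address check are
      hypothetical stand-ins for whatever the discovery code validates, and
      `fprintf()` stands in for `pr_info()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, reduced view of one discovery table entry. */
struct unit_sketch {
    int type;     /* uncore type id */
    int box_id;   /* unit instance within that type */
    uint64_t ctl; /* control register address; 0 is treated as broken */
};

/*
 * Return true when the unit can be used. A broken entry is reported
 * with an informational message and dropped; the caller carries on
 * with the remaining units.
 */
bool unit_sketch_usable(const struct unit_sketch *u)
{
    assert(u != NULL);
    if (u->ctl == 0) {
        fprintf(stderr,
                "uncore type %d box %d: invalid control address, unit dropped\n",
                u->type, u->box_id);
        return false;
    }
    return true;
}
```

      The message names the unit and the reason, so the log still records
      what was dropped without implying a kernel bug.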
      
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: Michael Petlan <mpetlan@redhat.com>
      Link: https://lore.kernel.org/r/20230112200105.733466-6-kan.liang@linux.intel.com
      Cc: Mahmoud Adam <mngyadam@amazon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. Dec 14, 2023