  1. Dec 01, 2021
    • drm/amd/display: Fix DPIA outbox timeout after GPU reset · 4da56400
      Nicholas Kazlauskas authored
      [ Upstream commit 6eff272d ]
      
      [Why]
      The HW interrupt gets disabled after GPU reset so we don't receive
      notifications for HPD or AUX from DMUB - leading to timeout and
      black screen with (or without) DPIA links connected.
      
      [How]
      Re-enable the interrupt after GPU reset like we do for the other
      DC interrupts.
      
      Fixes: 81927e28 ("drm/amd/display: Support for DMUB AUX")
      Reviewed-by: Jude Shih <Jude.Shih@amd.com>
      Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
      Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
      Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • PM: hibernate: use correct mode for swsusp_close() · c83f2757
      Thomas Zeitlhofer authored
      [ Upstream commit cefcf24b ]
      
      Commit 39fbef4b ("PM: hibernate: Get block device exclusively in
      swsusp_check()") changed the opening mode of the block device to
      (FMODE_READ | FMODE_EXCL).
      
      In the corresponding calls to swsusp_close(), the mode is still just
      FMODE_READ which triggers the warning in blkdev_flush_mapping() on
      resume from hibernate.
      
      So, use the mode (FMODE_READ | FMODE_EXCL) also when closing the
      device.
      
      Fixes: 39fbef4b ("PM: hibernate: Get block device exclusively in swsusp_check()")
      Signed-off-by: Thomas Zeitlhofer <thomas.zeitlhofer+lkml@ze-it.at>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/ncsi : Add payload to be 32-bit aligned to fix dropped packets · fd49f1f5
      Kumar Thangavel authored
      [ Upstream commit ac132852 ]
      
      Update the NC-SI command handler (both standard and OEM) to take
      payload padding into account when allocating the skb (in case the
      payload size is not 32-bit aligned).
      
      The checksum field follows the payload field; not taking payload
      padding into account can cause the checksum to be truncated, leading
      to dropped packets.
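
A minimal sketch of the length calculation described above, not the driver's actual code. The 16-byte header size, the macro names and both helper names are assumptions made for illustration; only the 32-bit alignment and the checksum following the payload come from the text.

```c
#include <stddef.h>

/* Sizes and names below are assumptions for illustration only. */
#define NCSI_HDR_LEN      16u  /* assumed NC-SI packet header size */
#define NCSI_CHECKSUM_LEN 4u   /* 32-bit checksum that follows the payload */

/* Round a payload length up to the next multiple of 4 bytes. */
static size_t ncsi_padded_payload(size_t payload)
{
    return (payload + 3u) & ~(size_t)3u;
}

/* Bytes needed for header, padded payload and checksum. */
static size_t ncsi_cmd_len(size_t payload)
{
    return NCSI_HDR_LEN + ncsi_padded_payload(payload) + NCSI_CHECKSUM_LEN;
}
```

With a 5-byte payload, sizing the buffer from the unpadded length would leave it 3 bytes short, which is exactly where the tail of the checksum would have been.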
      
      Fixes: fb4ee675 ("net/ncsi: Add NCSI OEM command support")
      Signed-off-by: Kumar Thangavel <thangavel.k@hcl.com>
      Acked-by: Samuel Mendoza-Jonas <sam@mendozajonas.com>
      Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • arm64: uaccess: avoid blocking within critical sections · ff1a3074
      Mark Rutland authored
      [ Upstream commit 94902d84 ]
      
      As Vincent reports in:
      
        https://lore.kernel.org/r/20211118163417.21617-1-vincent.whitchurch@axis.com
      
      The put_user() in schedule_tail() can get stuck in a livelock, similar
      to a problem recently fixed on riscv in commit:
      
        285a76bb ("riscv: evaluate put_user() arg before enabling user access")
      
      In __raw_put_user() we have a critical section between
      uaccess_ttbr0_enable() and uaccess_ttbr0_disable() where we cannot
      safely call into the scheduler without having taken an exception, as
      schedule() and other scheduling functions will not save/restore the
      TTBR0 state. If either of the `x` or `ptr` arguments to __raw_put_user()
      contain a blocking call, we may call into the scheduler within the
      critical section. This can result in two problems:
      
      1) The access within the critical section will occur without the
         required TTBR0 tables installed. This will fault, and where the
         required tables permit access, the access will be retried without the
         required tables, resulting in a livelock.
      
      2) When TTBR0 SW PAN is in use, check_and_switch_context() does not
         modify TTBR0, leaving a stale value installed. The mappings of the
         blocked task will erroneously be accessible to regular accesses in
         the context of the new task. Additionally, if the tables are
         subsequently freed, local TLB maintenance required to reuse the ASID
         may be lost, potentially resulting in TLB corruption (e.g. in the
         presence of CnP).
      
      The same issue exists for __raw_get_user() in the critical section
      between uaccess_ttbr0_enable() and uaccess_ttbr0_disable().
      
      A similar issue exists for __get_kernel_nofault() and
      __put_kernel_nofault() for the critical section between
      __uaccess_enable_tco_async() and __uaccess_disable_tco_async(), as the
      TCO state is not context-switched by direct calls into the scheduler.
      Here the TCO state may be lost from the context of the current task,
      resulting in unexpected asynchronous tag check faults. It may also be
      leaked to another task, suppressing expected tag check faults.
      
      To fix all of these cases, we must ensure that we do not directly call
      into the scheduler in their respective critical sections. This patch
      reworks __raw_put_user(), __raw_get_user(), __get_kernel_nofault(), and
      __put_kernel_nofault(), ensuring that parameters are evaluated outside
      of the critical sections. To make this requirement clear, comments are
      added describing the problem, and line spaces added to separate the
      critical sections from other portions of the macros.
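
The difference in evaluation time can be modelled in plain userspace C. This is only a sketch: critical_enter(), critical_exit(), may_block(), put_unsafe() and put_safe() are hypothetical stand-ins and do not reproduce the arm64 macros.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the enable/disable pair around the access. */
static bool in_critical;
static bool evaluated_in_critical;

static void critical_enter(void) { in_critical = true; }
static void critical_exit(void)  { in_critical = false; }

/* An argument expression that might block; it records where it ran. */
static int may_block(int v)
{
    if (in_critical)
        evaluated_in_critical = true;
    return v;
}

/* Unsafe shape: `x` is expanded, and so evaluated, inside the section. */
#define put_unsafe(x, ptr)            \
do {                                  \
    critical_enter();                 \
    *(ptr) = (x);                     \
    critical_exit();                  \
} while (0)

/* Safe shape: both arguments are evaluated into locals first. */
#define put_safe(x, ptr)              \
do {                                  \
    int *p_ = (ptr);                  \
    int v_ = (x);                     \
                                      \
    critical_enter();                 \
    *p_ = v_;                         \
    critical_exit();                  \
} while (0)
```

Calling put_safe(may_block(7), &dst) runs may_block() before critical_enter(), whereas put_unsafe() runs it inside the section.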
      
      For __raw_get_user() and __raw_put_user() the `err` parameter is
      conditionally assigned to, and we must currently evaluate this in the
      critical section. This behaviour is relied upon by the signal code,
      which uses chains of put_user_error() and get_user_error(), checking the
      return value at the end. In all cases, the `err` parameter is a plain
      int rather than a more complex expression with a blocking call, so this
      is safe.
      
      In future we should try to clean up the `err` usage to remove the
      potential for this to be a problem.
      
      Aside from the changes to time of evaluation, there should be no
      functional change as a result of this patch.
      
      Reported-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Link: https://lore.kernel.org/r/20211118163417.21617-1-vincent.whitchurch@axis.com
      Fixes: f253d827 ("arm64: uaccess: refactor __{get,put}_user")
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Link: https://lore.kernel.org/r/20211122125820.55286-1-mark.rutland@arm.com
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • drm/hyperv: Fix device removal on Gen1 VMs · 85851d9f
      Mohammed Gamal authored
      [ Upstream commit e048834c ]
      
      The Hyper-V DRM driver tries to free the MMIO region when removing
      the device regardless of VM type. Gen1 VMs don't use MMIO, so this
      crashes the kernel with a NULL pointer dereference.
      
      Fix this by deallocating MMIO only on Gen2 machines, and implement
      removal for Gen1.
      
      Fixes: 76c56a5a ("drm/hyperv: Add DRM driver for hyperv synthetic video device")
      Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
      Reviewed-by: Deepak Rawat <drawat.floss@gmail.com>
      Signed-off-by: Deepak Rawat <drawat.floss@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211119112900.300537-1-mgamal@redhat.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • nvmet-tcp: fix incomplete data digest send · 63a68f37
      Varun Prakash authored
      [ Upstream commit 102110ef ]
      
      The current nvmet_try_send_ddgst() code does not check whether
      all data digest bytes have been transmitted. Fix this by returning
      -EAGAIN if not all data digest bytes have been transmitted.
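
The partial-send check can be sketched as below. The struct and function names are hypothetical and this is a userspace model of the idea, not the nvmet-tcp code; the return value of 1 for a finished digest is also an assumption of the sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical command state: how many digest bytes have gone out so far. */
struct ddgst_state {
    size_t offset;   /* bytes of the digest already sent */
    size_t len;      /* total digest length */
};

/*
 * Account for `sent` bytes reported by one send attempt.
 * Returns 1 when the whole digest is out, -EAGAIN when bytes remain,
 * or the negative error that was passed in.
 */
static int ddgst_send_done(struct ddgst_state *st, long sent)
{
    if (sent < 0)
        return (int)sent;

    st->offset += (size_t)sent;
    if (st->offset < st->len)
        return -EAGAIN;   /* partial send: caller must retry later */

    return 1;
}
```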
      
      Fixes: 872d26a3 ("nvmet-tcp: add NVMe over TCP target driver")
      Signed-off-by: Varun Prakash <varun@chelsio.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • cpufreq: intel_pstate: Add Ice Lake server to out-of-band IDs · d10ecfd9
      Adamos Ttofari authored
      [ Upstream commit cd23f02f ]
      
      Commit fbdc21e9 ("cpufreq: intel_pstate: Add Icelake servers
      support in no-HWP mode") enabled the use of Intel P-State driver
      for Ice Lake servers.
      
      But it doesn't cover the case when the OS can't control P-States.
      
      Therefore, for Ice Lake server, if MSR_MISC_PWR_MGMT bit 8 or 18
      is set, then the Intel P-State driver should exit, as the OS can't
      control P-States.
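
The bit test itself is small. In this sketch the bit positions come from the text, while the macro name, the function name and passing the register value as a parameter are assumptions made so the check can run outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bits 8 and 18 come from the text; the names here are hypothetical. */
#define OOB_BITMASK ((UINT64_C(1) << 8) | (UINT64_C(1) << 18))

/* True when either bit is set, i.e. the OS cannot control P-States. */
static bool pstates_out_of_band(uint64_t misc_pwr_mgmt)
{
    return (misc_pwr_mgmt & OOB_BITMASK) != 0;
}
```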
      
      Fixes: fbdc21e9 ("cpufreq: intel_pstate: Add Icelake servers support in no-HWP mode")
      Signed-off-by: Adamos Ttofari <attofari@amazon.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: marvell: mvpp2: increase MTU limit when XDP enabled · 57e91396
      Marek Behún authored
      [ Upstream commit 7b1b62bc ]
      
      Currently mvpp2_xdp_setup won't allow attaching an XDP program if
        mtu > ETH_DATA_LEN (1500).
      
      The mvpp2_change_mtu on the other hand checks whether
        MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE.
      
      These two checks are semantically different.
      
      Moreover this limit can be increased to MVPP2_MAX_RX_BUF_SIZE, since in
      mvpp2_rx we have
        xdp.data = data + MVPP2_MH_SIZE + MVPP2_SKB_HEADROOM;
        xdp.frame_sz = PAGE_SIZE;
      
      Change the checks to check whether
        mtu > MVPP2_MAX_RX_BUF_SIZE
      
      Fixes: 07dd0a7a ("mvpp2: add basic XDP support")
      Signed-off-by: Marek Behún <kabel@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: ipa: kill ipa_cmd_pipeline_clear() · d815f7ca
      Alex Elder authored
      [ Upstream commit e4e9bfb7 ]
      
      Calling ipa_cmd_pipeline_clear() after stopping the channel
      underlying the AP<-modem RX endpoint can lead to a deadlock.
      
      This occurs in the ->runtime_suspend device power operation for the
      IPA driver.  While this callback is in progress, any other requests
      for power will block until the callback returns.
      
      Stopping the AP<-modem RX channel does not prevent the modem from
      sending another packet to this endpoint.  If a packet arrives for an
      RX channel when the channel is stopped, a SUSPEND IPA interrupt
      condition will be pending.  Handling an IPA interrupt requires
      power, so ipa_isr_thread() calls pm_runtime_get_sync() first thing.
      
      The problem occurs because a "pipeline clear" command will not
      complete while such a SUSPEND interrupt condition exists.  So the
      SUSPEND IPA interrupt handler won't proceed until it gets power;
      that won't happen until the ->runtime_suspend callback (and its
      "pipeline clear" command) completes; and that can't happen while
      the SUSPEND interrupt condition exists.
      
      It turns out that in this case there is no need to use the "pipeline
      clear" command.  There are scenarios in which clearing the pipeline
      is required while suspending, but those are not (yet) supported
      upstream.  So a simple fix, avoiding the potential deadlock, is to
      stop calling ipa_cmd_pipeline_clear() in ipa_endpoint_suspend().
      This removes the only user of ipa_cmd_pipeline_clear(), so get rid
      of that function.  It can be restored again whenever it's needed.
      
      This is basically a manual revert along with an explanation for
      commit 6cb63ea6 ("net: ipa: introduce ipa_cmd_tag_process()").
      
      Fixes: 6cb63ea6 ("net: ipa: introduce ipa_cmd_tag_process()")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: ipa: separate disabling setup from modem stop · 740c461a
      Alex Elder authored
      [ Upstream commit 8afc7e47 ]
      
      The IPA setup_complete flag is set at the end of ipa_setup(), when
      the setup phase of initialization has completed successfully.  This
      occurs as part of driver probe processing, or (if "modem-init" is
      specified in the DTS file) it is triggered by the "ipa-setup-ready"
      SMP2P interrupt generated by the modem.
      
      In the latter case, it's possible for driver shutdown (or remove) to
      begin while setup processing is underway, and this can't be allowed.
      The problem is that the setup_complete flag is not adequate to signal
      that setup is underway.
      
      If setup_complete is set, it will never be un-set, so that case is
      not a problem.  But if setup_complete is false, there's a chance
      setup is underway.
      
      Because setup is triggered by an interrupt on a "modem-init" system,
      there is a simple way to ensure the value of setup_complete is safe
      to read.  The threaded handler--if it is executing--will complete as
      part of a request to disable the "ipa-setup-ready" interrupt.  This
      means that ipa_setup() (which is called from the handler) will run
      to completion if it was underway, or will never be called otherwise.
      
      The request to disable the "ipa-setup-ready" interrupt is currently
      made within ipa_modem_stop().  Instead, disable the interrupt
      outside that function in the two places it's called.  In the case of
      ipa_remove(), this ensures the setup_complete flag is safe to read
      before we read it.
      
      Rename ipa_smp2p_disable() to be ipa_smp2p_irq_disable_setup(), to be
      more specific about its effect.
      
      Fixes: 530f9216 ("soc: qcom: ipa: AP/modem communications")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net: ipa: directly disable ipa-setup-ready interrupt · f38aa5cf
      Alex Elder authored
      [ Upstream commit 33a15310 ]
      
      We currently maintain a "disabled" Boolean flag to determine whether
      the "ipa-setup-ready" SMP2P IRQ handler does anything.  That flag
      must be accessed under protection of a mutex.
      
      Instead, disable the SMP2P interrupt when requested, which prevents
      the interrupt handler from ever being called.  More importantly, it
      synchronizes a thread disabling the interrupt with the completion of
      the interrupt handler in case they run concurrently.
      
      Use the IPA setup_complete flag rather than the disabled flag in the
      handler to determine whether to ignore any interrupts arriving after
      the first.
      
      Rename the "disabled" flag to be "setup_disabled", to be specific
      about its purpose.
      
      Fixes: 530f9216 ("soc: qcom: ipa: AP/modem communications")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • mlxsw: spectrum: Protect driver from buggy firmware · da4d7019
      Amit Cohen authored
      [ Upstream commit 63b08b1f ]
      
      When processing port up/down events generated by the device's firmware,
      the driver protects itself from events reported for non-existent local
      ports, but not the CPU port (local port 0), which exists, but lacks a
      netdev.
      
      This can result in a NULL pointer dereference when calling
      netif_carrier_{on,off}().
      
      Fix this by bailing early when processing an event reported for the CPU
      port. Problem was only observed when running on top of a buggy emulator.
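
A userspace sketch of the early bail-out, assuming a simplified port table. All type, macro and function names are hypothetical; only the facts that local port 0 is the CPU port and that it lacks a netdev come from the text.

```c
#include <stdbool.h>
#include <stddef.h>

#define CPU_PORT  0u   /* local port 0 is the CPU port, per the text */
#define MAX_PORTS 4u   /* arbitrary table size for this sketch */

struct netdev { bool carrier; };
struct port { struct netdev *dev; };   /* dev is NULL for the CPU port */

static struct port *ports[MAX_PORTS];

/* Returns true if the event was applied, false if it was ignored. */
static bool handle_port_event(unsigned int local_port, bool up)
{
    struct port *port;

    if (local_port >= MAX_PORTS)
        return false;

    port = ports[local_port];
    if (!port)
        return false;            /* non-existent local port */
    if (local_port == CPU_PORT)
        return false;            /* exists, but port->dev is NULL */

    port->dev->carrier = up;
    return true;
}
```

Without the CPU_PORT check, an event for port 0 would reach port->dev->carrier and dereference NULL.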
      
      Fixes: 28b1987e ("mlxsw: spectrum: Register CPU port with devlink")
      Signed-off-by: Amit Cohen <amcohen@nvidia.com>
      Signed-off-by: Ido Schimmel <idosch@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • net/smc: Ensure the active closing peer first closes clcsock · 12dea26c
      Tony Lu authored
      [ Upstream commit 606a63c9 ]
      
      The clcsock of the side that actively closes the socket doesn't enter
      the TIME_WAIT state, but that of the passive side does. It should show
      the same behavior as TCP sockets.
      
      Consider a client that actively closes the socket: the clcsock on the
      server enters the TIME_WAIT state, which means the address is occupied
      and won't be reused until TIME_WAIT expires. If we restarted the
      server, the service would be unavailable for a long time.
      
      To solve this issue, shut down the clcsock at [A], performing the TCP
      active close first, before the passively closing side closes it, so
      that the actively closing side enters TIME_WAIT, not the passive one.
      
      Client                                            |  Server
      close() // client actively close                  |
        smc_release()                                   |
            smc_close_active() // PEERCLOSEWAIT1        |
                smc_close_final() // abort or closed = 1|
                    smc_cdc_get_slot_and_msg_send()     |
                [A]                                     |
                                                        |smc_cdc_msg_recv_action() // ACTIVE
                                                        |  queue_work(smc_close_wq, &conn->close_work)
                                                        |    smc_close_passive_work() // PROCESSABORT or APPCLOSEWAIT1
                                                        |      smc_close_passive_abort_received() // only in abort
                                                        |
                                                        |close() // server recv zero, close
                                                        |  smc_release() // PROCESSABORT or APPCLOSEWAIT1
                                                        |    smc_close_active()
                                                        |      smc_close_abort() or smc_close_final() // CLOSED
                                                        |        smc_cdc_get_slot_and_msg_send() // abort or closed = 1
      smc_cdc_msg_recv_action()                         |    smc_clcsock_release()
        queue_work(smc_close_wq, &conn->close_work)     |      sock_release(tcp) // actively close clc, enter TIME_WAIT
          smc_close_passive_work() // PEERCLOSEWAIT1    |    smc_conn_free()
            smc_close_passive_abort_received() // CLOSED|
            smc_conn_free()                             |
            smc_clcsock_release()                       |
              sock_release(tcp) // passive close clc    |
      
      Link: https://www.spinics.net/lists/netdev/msg780407.html
      Fixes: b38d7324 ("smc: socket closing and linkgroup cleanup")
      Signed-off-by: Tony Lu <tonylu@linux.alibaba.com>
      Reviewed-by: Wen Gu <guwen@linux.alibaba.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i2c: virtio: disable timeout handling · cc432b07
      Vincent Whitchurch authored
      [ Upstream commit 84e1d0bf ]
      
      If a timeout is hit, it can result in incorrect data on the I2C bus
      and/or memory corruption in the guest, since the device can still be
      operating on the buffers it was given while the guest has freed them.
      
      Here is, for example, the start of a slub_debug splat which was
      triggered on the next transfer after one transfer was forced to timeout
      by setting a breakpoint in the backend (rust-vmm/vhost-device):
      
       BUG kmalloc-1k (Not tainted): Poison overwritten
       First byte 0x1 instead of 0x6b
       Allocated in virtio_i2c_xfer+0x65/0x35c age=350 cpu=0 pid=29
       	__kmalloc+0xc2/0x1c9
       	virtio_i2c_xfer+0x65/0x35c
       	__i2c_transfer+0x429/0x57d
       	i2c_transfer+0x115/0x134
       	i2cdev_ioctl_rdwr+0x16a/0x1de
       	i2cdev_ioctl+0x247/0x2ed
       	vfs_ioctl+0x21/0x30
       	sys_ioctl+0xb18/0xb41
       Freed in virtio_i2c_xfer+0x32e/0x35c age=244 cpu=0 pid=29
       	kfree+0x1bd/0x1cc
       	virtio_i2c_xfer+0x32e/0x35c
       	__i2c_transfer+0x429/0x57d
       	i2c_transfer+0x115/0x134
       	i2cdev_ioctl_rdwr+0x16a/0x1de
       	i2cdev_ioctl+0x247/0x2ed
       	vfs_ioctl+0x21/0x30
       	sys_ioctl+0xb18/0xb41
      
      There is no simple fix for this (the driver would have to always create
      bounce buffers and hold on to them until the device eventually returns
      the buffers), so just disable the timeout support for now.
      
      Fixes: 3cfc8838 ("i2c: virtio: add a virtio i2c frontend driver")
      Acked-by: Jie Deng <jie.deng@intel.com>
      Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Wolfram Sang <wsa@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • erofs: fix deadlock when shrink erofs slab · 4339cd08
      Huang Jianan authored
      [ Upstream commit 57bbeacd ]
      
      We observed the following deadlock in a stress test under a
      low-memory scenario:
      
      Thread A                               Thread B
      - erofs_shrink_scan
       - erofs_try_to_release_workgroup
        - erofs_workgroup_try_to_freeze -- A
                                             - z_erofs_do_read_page
                                              - z_erofs_collection_begin
                                               - z_erofs_register_collection
                                                - erofs_insert_workgroup
                                                 - xa_lock(&sbi->managed_pslots) -- B
                                                 - erofs_workgroup_get
                                                  - erofs_wait_on_workgroup_freezed -- A
        - xa_erase
         - xa_lock(&sbi->managed_pslots) -- B
      
      To fix this, xa_lock must be held before freezing the workgroup,
      since the xarray will be touched then. So hold the lock before
      accessing each workgroup, just as we did with the radix tree
      before.
      
      [ Gao Xiang: Jianhua Hao also reports this issue at
        https://lore.kernel.org/r/b10b85df30694bac8aadfe43537c897a@xiaomi.com ]
      
      Link: https://lore.kernel.org/r/20211118135844.3559-1-huangjianan@oppo.com
      Fixes: 64094a04 ("erofs: convert workstn to XArray")
      Reviewed-by: Chao Yu <chao@kernel.org>
      Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
      Signed-off-by: Huang Jianan <huangjianan@oppo.com>
      Reported-by: Jianhua Hao <haojianhua1@xiaomi.com>
      Signed-off-by: Gao Xiang <xiang@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: scsi_debug: Zero clear zones at reset write pointer · 8b3b9aaa
      Shin'ichiro Kawasaki authored
      [ Upstream commit 2d62253e ]
      
      When a reset is requested the position of the write pointer is updated but
      the data in the corresponding zone is not cleared. Instead scsi_debug
      returns any data written before the write pointer was reset. This is an
      error and prevents using scsi_debug for stale page cache testing of the
      BLKRESETZONE ioctl.
      
      Zero written data in the zone when resetting the write pointer.
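
A minimal model of the reset behaviour, assuming a flat byte array as the backing store. The struct layout and the function name are hypothetical and are not taken from scsi_debug.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical zone model; field names are not taken from scsi_debug. */
struct zone {
    size_t start;  /* first byte of the zone in the backing store */
    size_t wp;     /* write pointer: next byte to be written */
};

/* Reset the write pointer and zero everything written since the last reset. */
static void zone_reset_wp(unsigned char *store, struct zone *z)
{
    if (z->wp > z->start)
        memset(store + z->start, 0, z->wp - z->start);
    z->wp = z->start;
}
```

Only the range between the zone start and the old write pointer is cleared, so data in neighbouring zones is left untouched.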
      
      Link: https://lore.kernel.org/r/20211122061223.298890-1-shinichiro.kawasaki@wdc.com
      Fixes: f0d1cf93 ("scsi: scsi_debug: Add ZBC zone commands")
      Reviewed-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
      Acked-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: core: sysfs: Fix setting device state to SDEV_RUNNING · a67c045b
      Mike Christie authored
      [ Upstream commit eb97545d ]
      
      This fixes an issue added in commit 4edd8cd4 ("scsi: core: sysfs: Fix
      hang when device state is set via sysfs") where, if userspace requests
      to set the device state to SDEV_RUNNING when the state is already
      SDEV_RUNNING, we return -EINVAL instead of count. The commit above set ret
      to count for this case, when it should have set it to 0.
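
Why `ret` must be 0 rather than count can be seen in a small model of a sysfs-style store handler. The names are hypothetical, and the final `ret == 0 ? count : -EINVAL` mapping is inferred from the description above rather than copied from the kernel source.

```c
#include <errno.h>
#include <stddef.h>

enum state { ST_RUNNING, ST_OFFLINE };

/* Hypothetical store handler: returns `count` on success, -EINVAL on failure. */
static long store_state(enum state *cur, int req, size_t count)
{
    int ret;

    if (req != ST_RUNNING && req != ST_OFFLINE) {
        ret = -EINVAL;            /* unknown state requested */
    } else if (*cur == ST_RUNNING && req == ST_RUNNING) {
        ret = 0;                  /* nothing to do; 0 marks success here */
    } else {
        *cur = (enum state)req;   /* stand-in for the real transition */
        ret = 0;
    }

    /* A non-zero ret, including a stray `count`, would map to -EINVAL. */
    return ret == 0 ? (long)count : -EINVAL;
}
```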
      
      Link: https://lore.kernel.org/r/20211120164917.4924-1-michael.christie@oracle.com
      Fixes: 4edd8cd4 ("scsi: core: sysfs: Fix hang when device state is set via sysfs")
      Reviewed-by: Lee Duncan <lduncan@suse.com>
      Signed-off-by: Mike Christie <michael.christie@oracle.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • ice: avoid bpf_prog refcount underflow · 1f10b09c
      Marta Plantykow authored
      [ Upstream commit f65ee535 ]
      
      The ice driver has routines for managing XDP resources that are shared
      between the ndo_bpf op and the VSI rebuild flow. The latter takes place,
      for example, when the user changes the queue count on an interface via
      ethtool's set_channels().
      
      There is an issue with bpf_prog refcounting when the VSI is being
      rebuilt: since ice_prepare_xdp_rings() is called with vsi->xdp_prog as
      an argument that is later used by ice_vsi_assign_bpf_prog(), the same
      bpf_prog pointer is swapped with itself. It is then also treated as
      the 'old_prog', which in turn causes us to call bpf_prog_put() on it,
      decrementing its refcount.
      
      The splat below can be read as follows: because its refcount dropped
      to zero, the bpf_prog was wiped from the system while the kernel still
      tries to refer to it:
      
      [  481.069429] BUG: unable to handle page fault for address: ffffc9000640f038
      [  481.077390] #PF: supervisor read access in kernel mode
      [  481.083335] #PF: error_code(0x0000) - not-present page
      [  481.089276] PGD 100000067 P4D 100000067 PUD 1001cb067 PMD 106d2b067 PTE 0
      [  481.097141] Oops: 0000 [#1] PREEMPT SMP PTI
      [  481.101980] CPU: 12 PID: 3339 Comm: sudo Tainted: G           OE     5.15.0-rc5+ #1
      [  481.110840] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
      [  481.122021] RIP: 0010:dev_xdp_prog_id+0x25/0x40
      [  481.127265] Code: 80 00 00 00 00 0f 1f 44 00 00 89 f6 48 c1 e6 04 48 01 fe 48 8b 86 98 08 00 00 48 85 c0 74 13 48 8b 50 18 31 c0 48 85 d2 74 07 <48> 8b 42 38 8b 40 20 c3 48 8b 96 90 08 00 00 eb e8 66 2e 0f 1f 84
      [  481.148991] RSP: 0018:ffffc90007b63868 EFLAGS: 00010286
      [  481.155034] RAX: 0000000000000000 RBX: ffff889080824000 RCX: 0000000000000000
      [  481.163278] RDX: ffffc9000640f000 RSI: ffff889080824010 RDI: ffff889080824000
      [  481.171527] RBP: ffff888107af7d00 R08: 0000000000000000 R09: ffff88810db5f6e0
      [  481.179776] R10: 0000000000000000 R11: ffff8890885b9988 R12: ffff88810db5f4bc
      [  481.188026] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      [  481.196276] FS:  00007f5466d5bec0(0000) GS:ffff88903fb00000(0000) knlGS:0000000000000000
      [  481.205633] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  481.212279] CR2: ffffc9000640f038 CR3: 000000014429c006 CR4: 00000000003706e0
      [  481.220530] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  481.228771] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  481.237029] Call Trace:
      [  481.239856]  rtnl_fill_ifinfo+0x768/0x12e0
      [  481.244602]  rtnl_dump_ifinfo+0x525/0x650
      [  481.249246]  ? __alloc_skb+0xa5/0x280
      [  481.253484]  netlink_dump+0x168/0x3c0
      [  481.257725]  netlink_recvmsg+0x21e/0x3e0
      [  481.262263]  ____sys_recvmsg+0x87/0x170
      [  481.266707]  ? __might_fault+0x20/0x30
      [  481.271046]  ? _copy_from_user+0x66/0xa0
      [  481.275591]  ? iovec_from_user+0xf6/0x1c0
      [  481.280226]  ___sys_recvmsg+0x82/0x100
      [  481.284566]  ? sock_sendmsg+0x5e/0x60
      [  481.288791]  ? __sys_sendto+0xee/0x150
      [  481.293129]  __sys_recvmsg+0x56/0xa0
      [  481.297267]  do_syscall_64+0x3b/0xc0
      [  481.301395]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  481.307238] RIP: 0033:0x7f5466f39617
      [  481.311373] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb bd 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2f 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 89 54 24 1c 48 89 74 24 10
      [  481.342944] RSP: 002b:00007ffedc7f4308 EFLAGS: 00000246 ORIG_RAX: 000000000000002f
      [  481.361783] RAX: ffffffffffffffda RBX: 00007ffedc7f5460 RCX: 00007f5466f39617
      [  481.380278] RDX: 0000000000000000 RSI: 00007ffedc7f5360 RDI: 0000000000000003
      [  481.398500] RBP: 00007ffedc7f53f0 R08: 0000000000000000 R09: 000055d556f04d50
      [  481.416463] R10: 0000000000000077 R11: 0000000000000246 R12: 00007ffedc7f5360
      [  481.434131] R13: 00007ffedc7f5350 R14: 00007ffedc7f5344 R15: 0000000000000e98
      [  481.451520] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mxm_wmi mei_me coretemp mei ipmi_si ipmi_msghandler wmi acpi_pad acpi_power_meter ip_tables x_tables autofs4 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel ahci crypto_simd cryptd libahci lpc_ich [last unloaded: ice]
      [  481.528558] CR2: ffffc9000640f038
      [  481.542041] ---[ end trace d1f24c9ecf5b61c1 ]---
      
      Fix this by only calling ice_vsi_assign_bpf_prog() inside
      ice_prepare_xdp_rings() when current vsi->xdp_prog pointer is NULL.
      This way set_channels() flow will not attempt to swap the vsi->xdp_prog
      pointers with itself.
      
      Also, sprinkle around some comments that provide a reasoning about
      correlation between driver and kernel in terms of bpf_prog refcount.
      
      Fixes: efc2214b ("ice: Add support for XDP")
      Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Signed-off-by: Marta Plantykow <marta.a.plantykow@intel.com>
      Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1f10b09c
    • Maciej Fijalkowski's avatar
      ice: fix vsi->txq_map sizing · 992ba40a
      Maciej Fijalkowski authored
      [ Upstream commit 792b2086 ]
      
      The approach of having an XDP queue per CPU regardless of the user's
      setting exposed a hidden bug that could occur when the Rx queue count
      differs from the Tx queue count. Currently vsi->txq_map's size is equal
      to twice vsi->alloc_txq, which is not correct because XDP rings were
      previously based on the Rx queue count. The splat below can be seen
      when ethtool -L is used and XDP rings are configured:
      
      [  682.875339] BUG: kernel NULL pointer dereference, address: 000000000000000f
      [  682.883403] #PF: supervisor read access in kernel mode
      [  682.889345] #PF: error_code(0x0000) - not-present page
      [  682.895289] PGD 0 P4D 0
      [  682.898218] Oops: 0000 [#1] PREEMPT SMP PTI
      [  682.903055] CPU: 42 PID: 2878 Comm: ethtool Tainted: G           OE     5.15.0-rc5+ #1
      [  682.912214] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
      [  682.923380] RIP: 0010:devres_remove+0x44/0x130
      [  682.928527] Code: 49 89 f4 55 48 89 fd 4c 89 ff 53 48 83 ec 10 e8 92 b9 49 00 48 8b 9d a8 02 00 00 48 8d 8d a0 02 00 00 49 89 c2 48 39 cb 74 0f <4c> 3b 63 10 74 25 48 8b 5b 08 48 39 cb 75 f1 4c 89 ff 4c 89 d6 e8
      [  682.950237] RSP: 0018:ffffc90006a679f0 EFLAGS: 00010002
      [  682.956285] RAX: 0000000000000286 RBX: ffffffffffffffff RCX: ffff88908343a370
      [  682.964538] RDX: 0000000000000001 RSI: ffffffff81690d60 RDI: 0000000000000000
      [  682.972789] RBP: ffff88908343a0d0 R08: 0000000000000000 R09: 0000000000000000
      [  682.981040] R10: 0000000000000286 R11: 3fffffffffffffff R12: ffffffff81690d60
      [  682.989282] R13: ffffffff81690a00 R14: ffff8890819807a8 R15: ffff88908343a36c
      [  682.997535] FS:  00007f08c7bfa740(0000) GS:ffff88a03fd00000(0000) knlGS:0000000000000000
      [  683.006910] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  683.013557] CR2: 000000000000000f CR3: 0000001080a66003 CR4: 00000000003706e0
      [  683.021819] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  683.030075] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  683.038336] Call Trace:
      [  683.041167]  devm_kfree+0x33/0x50
      [  683.045004]  ice_vsi_free_arrays+0x5e/0xc0 [ice]
      [  683.050380]  ice_vsi_rebuild+0x4c8/0x750 [ice]
      [  683.055543]  ice_vsi_recfg_qs+0x9a/0x110 [ice]
      [  683.060697]  ice_set_channels+0x14f/0x290 [ice]
      [  683.065962]  ethnl_set_channels+0x333/0x3f0
      [  683.070807]  genl_family_rcv_msg_doit+0xea/0x150
      [  683.076152]  genl_rcv_msg+0xde/0x1d0
      [  683.080289]  ? channels_prepare_data+0x60/0x60
      [  683.085432]  ? genl_get_cmd+0xd0/0xd0
      [  683.089667]  netlink_rcv_skb+0x50/0xf0
      [  683.094006]  genl_rcv+0x24/0x40
      [  683.097638]  netlink_unicast+0x239/0x340
      [  683.102177]  netlink_sendmsg+0x22e/0x470
      [  683.106717]  sock_sendmsg+0x5e/0x60
      [  683.110756]  __sys_sendto+0xee/0x150
      [  683.114894]  ? handle_mm_fault+0xd0/0x2a0
      [  683.119535]  ? do_user_addr_fault+0x1f3/0x690
      [  683.134173]  __x64_sys_sendto+0x25/0x30
      [  683.148231]  do_syscall_64+0x3b/0xc0
      [  683.161992]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      Fix this by taking into account the value that num_possible_cpus()
      yields in addition to vsi->alloc_txq instead of doubling the latter.
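
The sizing change described above can be sketched in plain C. This is only an illustration under stated assumptions: `struct vsi_sketch`, `txq_map_len_old()` and `txq_map_len_new()` are hypothetical names, and the CPU count is passed in as a parameter instead of calling the kernel's num_possible_cpus().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the one VSI field that matters here. */
struct vsi_sketch {
	unsigned int alloc_txq; /* Tx queues allocated for the stack */
};

/* Old sizing: twice the Tx queue count, regardless of how many CPUs exist. */
static size_t txq_map_len_old(const struct vsi_sketch *vsi)
{
	return 2 * (size_t)vsi->alloc_txq;
}

/* New sizing: the regular Tx queues plus one XDP ring per possible CPU. */
static size_t txq_map_len_new(const struct vsi_sketch *vsi,
			      unsigned int possible_cpus)
{
	assert(possible_cpus > 0);
	return (size_t)vsi->alloc_txq + possible_cpus;
}
```

With 4 Tx queues on a machine with 56 possible CPUs, the old formula yields 8 entries while one XDP ring per CPU needs 60, which is the kind of mismatch the commit message describes.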
      
      Fixes: efc2214b ("ice: Add support for XDP")
      Fixes: 22bf877e ("ice: introduce XDP_TX fallback path")
      Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com>
      Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
      Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      992ba40a
    • Nikolay Aleksandrov's avatar
      net: nexthop: release IPv6 per-cpu dsts when replacing a nexthop group · 66521011
      Nikolay Aleksandrov authored
      [ Upstream commit 1005f19b ]
      
      When replacing a nexthop group, we must release the IPv6 per-cpu dsts of
      the removed nexthop entries after an RCU grace period because they
      contain references to the nexthop's net device and to the fib6 info.
      With specific series of events[1] we can reach net device refcount
      imbalance which is unrecoverable. IPv4 is not affected because dsts
      don't take a refcount on the route.
      
      [1]
       $ ip nexthop list
        id 200 via 2002:db8::2 dev bridge.10 scope link onlink
        id 201 via 2002:db8::3 dev bridge scope link onlink
        id 203 group 201/200
       $ ip -6 route
        2001:db8::10 nhid 203 metric 1024 pref medium
           nexthop via 2002:db8::3 dev bridge weight 1 onlink
           nexthop via 2002:db8::2 dev bridge.10 weight 1 onlink
      
      Create rt6_info through one of the multipath legs, e.g.:
       $ taskset -a -c 1  ./pkt_inj 24 bridge.10 2001:db8::10
       (pkt_inj is just a custom packet generator, nothing special)
      
      Then remove that leg from the group by replace (let's assume it is id
      200 in this case):
       $ ip nexthop replace id 203 group 201
      
      Now remove the IPv6 route:
       $ ip -6 route del 2001:db8::10/128
      
      The route won't be really deleted due to the stale rt6_info holding 1
      refcnt in nexthop id 200.
      At this point we have the following reference count dependency:
       (deleted) IPv6 route holds 1 reference over nhid 203
       nh 203 holds 1 ref over id 201
       nh 200 holds 1 ref over the net device and the route due to the stale
       rt6_info
      
      Now to create circular dependency between nh 200 and the IPv6 route, and
      also to get a reference over nh 200, restore nhid 200 in the group:
       $ ip nexthop replace id 203 group 201/200
      
      And now we have a permanent circular dependency because nhid 203 holds a
      reference over nh 200 and 201, but the route holds a ref over nh 203 and
      is deleted.
      
      To trigger the bug just delete the group (nhid 203):
       $ ip nexthop del id 203
      
      It won't really be deleted due to the IPv6 route dependency, and now we
      have 2 unlinked and deleted objects that reference each other: the group
      and the IPv6 route. The group drops the references it holds over its
      entries only at free time (i.e. when its own refcount drops to 0), which
      will never happen, so we get a permanent ref on them. Since one of the
      entries holds a reference over the IPv6 route, the route will also never
      be released.
      
      At this point the dependencies are:
       (deleted, only unlinked) IPv6 route holds reference over group nh 203
       (deleted, only unlinked) group nh 203 holds reference over nh 201 and 200
       nh 200 holds 1 ref over the net device and the route due to the stale
       rt6_info
      
      This is the last point where it can be fixed by running traffic through
      nh 200, and specifically through the same CPU so the rt6_info (dst) will
      get released due to the IPv6 genid, that in turn will free the IPv6
      route, which in turn will free the ref count over the group nh 203.
      
      If nh 200 is deleted at this point, it will never be released due to the
      ref from the unlinked group 203, it will only be unlinked:
       $ ip nexthop del id 200
       $ ip nexthop
       $
      
      Now we can never release that stale rt6_info, we have IPv6 route with ref
      over group nh 203, group nh 203 with ref over nh 200 and 201, nh 200 with
      rt6_info (dst) with ref over the net device and the IPv6 route. All of
      these objects are only unlinked, and cannot be released, thus they can't
      release their ref counts.
      
       Message from syslogd@dev at Nov 19 14:04:10 ...
        kernel:[73501.828730] unregister_netdevice: waiting for bridge.10 to become free. Usage count = 3
       Message from syslogd@dev at Nov 19 14:04:20 ...
        kernel:[73512.068811] unregister_netdevice: waiting for bridge.10 to become free. Usage count = 3
      
      Fixes: 7bf4796d ("nexthops: add support for replace")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      66521011
    • Nikolay Aleksandrov's avatar
      net: ipv6: add fib6_nh_release_dsts stub · e085ae66
      Nikolay Aleksandrov authored
      [ Upstream commit 8837cbbf ]
      
      We need a way to release a fib6_nh's per-cpu dsts when replacing
      nexthops otherwise we can end up with stale per-cpu dsts which hold net
      device references, so add a new IPv6 stub called fib6_nh_release_dsts.
      It must be used after an RCU grace period, so no new dsts can be created
      through a group's nexthop entry.
      Similar to fib6_nh_release it shouldn't be used if fib6_nh_init has failed
      so it doesn't need a dummy stub when IPv6 is not enabled.
      
      Fixes: 7bf4796d ("nexthops: add support for replace")
      Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e085ae66
    • Holger Assmann's avatar
      net: stmmac: retain PTP clock time during SIOCSHWTSTAMP ioctls · 8d196fa5
      Holger Assmann authored
      [ Upstream commit a6da2bbb ]
      
      Currently, when user space emits SIOCSHWTSTAMP ioctl calls such as
      enabling/disabling timestamping or changing filter settings, the driver
      reads the current CLOCK_REALTIME value and programs it into the
      NIC's hardware clock. This might be necessary during system
      initialization, but at runtime, when the PTP clock has already been
      synchronized to a grandmaster, a reset of the timestamp settings might
      result in a clock jump. Furthermore, if the clock is also controlled by
      phc2sys in automatic mode (where the UTC offset is queried from ptp4l),
      that UTC-to-TAI offset (currently 37 seconds in 2021) would be
      temporarily reset to 0, and it would take a long time for phc2sys to
      readjust so that CLOCK_REALTIME and the PHC are apart by 37 seconds
      again.
      
      To address the issue, we introduce a new function called
      stmmac_init_tstamp_counter(), which gets called during ndo_open().
      It contains the code snippet moved from stmmac_hwtstamp_set() that
      manages the time synchronization. Besides, the sub second increment
      configuration is also moved here since the related values are hardware
      dependent and runtime invariant.
      
      Furthermore, the hardware clock must be kept running even when no time
      stamping mode is selected in order to retain the synchronized time base.
      That way, timestamping can be enabled again at any time only with the
      need to compensate the clock's natural drifting.
      
      As a side effect, this patch fixes the issue that ptp_clock_info::enable
      can be called before SIOCSHWTSTAMP and the driver (which looks at
      priv->systime_flags) was not prepared to handle that ordering.
      
      Fixes: 92ba6888 ("stmmac: add the support for PTP hw clock driver")
      Reported-by: Michael Olbrich <m.olbrich@pengutronix.de>
      Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
      Signed-off-by: Holger Assmann <h.assmann@pengutronix.de>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8d196fa5
    • Diana Wang's avatar
      nfp: checking parameter process for rx-usecs/tx-usecs is invalid · f6cd5768
      Diana Wang authored
      [ Upstream commit 3bd6b2a8 ]
      
      Use nn->tlv_caps.me_freq_mhz instead of nn->me_freq_mhz to check whether
      rx-usecs/tx-usecs is valid.
      
      This is because nn->tlv_caps.me_freq_mhz represents the clock_freq (MHz) of
      the flow processing cores (FPC) on the NIC, whereas nn->me_freq_mhz is
      not set.
      
      Fixes: ce991ab6 ("nfp: read ME frequency from vNIC ctrl memory")
      Signed-off-by: Diana Wang <na.wang@corigine.com>
      Signed-off-by: Simon Horman <simon.horman@corigine.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f6cd5768
    • Eric Dumazet's avatar
      ipv6: fix typos in __ip6_finish_output() · f1f243c0
      Eric Dumazet authored
      [ Upstream commit 19d36c5f ]
      
      We deal with IPv6 packets, so we need to use IP6CB(skb)->flags and
      IP6SKB_REROUTED, instead of IPCB(skb)->flags and IPSKB_REROUTED.
      
      Found by code inspection, please double check that fixing this bug
      does not surface other bugs.
      
      Fixes: 09ee9dba ("ipv6: Reinject IPv6 packets if IPsec policy matches after SNAT")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Tobias Brunner <tobias@strongswan.org>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: David Ahern <dsahern@kernel.org>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Tested-by: Tobias Brunner <tobias@strongswan.org>
      Acked-by: Tobias Brunner <tobias@strongswan.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f1f243c0
    • Michael Kelley's avatar
      firmware: smccc: Fix check for ARCH_SOC_ID not implemented · 88f6b5f1
      Michael Kelley authored
      [ Upstream commit e95d8eae ]
      
      The ARCH_FEATURES function ID is a 32-bit SMC call, which returns
      a 32-bit result per the SMCCC spec.  Current code is doing a 64-bit
      comparison against -1 (SMCCC_RET_NOT_SUPPORTED) to detect that the
      feature is unimplemented.  That check doesn't work in a Hyper-V VM,
      where the upper 32-bits are zero as allowed by the spec.
      
      Cast the result as an 'int' so the comparison works. The change also
      makes the code consistent with other similar checks in this file.
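
The width problem described above can be reproduced in user space. The sketch below is not the kernel code: `NOT_SUPPORTED`, `not_supported_64()` and `not_supported_32()` are hypothetical names, and the register value is modelled as a plain `uint64_t` whose upper 32 bits may be zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NOT_SUPPORTED (-1) /* stands in for SMCCC_RET_NOT_SUPPORTED */

/* 64-bit comparison: matches only when all 64 bits of the result are set. */
static bool not_supported_64(uint64_t a0)
{
	return (int64_t)a0 == NOT_SUPPORTED;
}

/*
 * 32-bit comparison: narrow to the 32-bit result first, in the spirit of the
 * 'int' cast from the commit message. Narrowing an out-of-range value is
 * implementation-defined in C; GCC and Clang wrap modulo 2^32.
 */
static bool not_supported_32(uint64_t a0)
{
	return (int32_t)a0 == NOT_SUPPORTED;
}

/* Usage: a result with zeroed upper bits is only caught by the 32-bit check. */
static void smccc_cast_demo(void)
{
	uint64_t zero_extended = 0x00000000ffffffffULL;

	assert(!not_supported_64(zero_extended));
	assert(not_supported_32(zero_extended));
}
```

A sign-extended result (all 64 bits set) passes both checks, so the narrower comparison should not change behaviour on platforms where the old check already worked.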
      
      Fixes: 821b67fa ("firmware: smccc: Add ARCH_SOC_ID support")
      Signed-off-by: Michael Kelley <mikelley@microsoft.com>
      Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      88f6b5f1
    • Vincent Whitchurch's avatar
      af_unix: fix regression in read after shutdown · 80d70987
      Vincent Whitchurch authored
      [ Upstream commit f9390b24 ]
      
      On kernels before v5.15, calling read() on a unix socket after
      shutdown(SHUT_RD) or shutdown(SHUT_RDWR) would return the data
      previously written or EOF.  But now, while read() after
      shutdown(SHUT_RD) still behaves the same way, read() after
      shutdown(SHUT_RDWR) always fails with -EINVAL.
      
      This behaviour change was apparently inadvertently introduced as part of
      a bug fix for a different regression caused by the commit adding sockmap
      support to af_unix, commit 94531cfc ("af_unix: Add
      unix_stream_proto for sockmap").  Those commits, for unclear reasons,
      started setting the socket state to TCP_CLOSE on shutdown(SHUT_RDWR),
      while this state change had previously only been done in
      unix_release_sock().
      
      Restore the original behaviour.  The sockmap tests in
      tests/selftests/bpf continue to pass after this patch.
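
A small user-space probe for the behaviour described above might look as follows. It is a sketch, not the reproducer from the linked thread: `read_after_shutdown_rdwr()` is a hypothetical helper, and it uses socketpair() for brevity on the assumption that the shutdown path is the same as for connected sockets.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Queue `len` bytes on one end of an AF_UNIX stream socketpair, call
 * shutdown(SHUT_RDWR) on the receiving end, then read() from that end.
 * Returns the number of bytes read, or a negative errno value.
 */
static ssize_t read_after_shutdown_rdwr(const char *msg, size_t len,
					char *buf, size_t buflen)
{
	int sv[2];
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -errno;

	if (write(sv[1], msg, len) != (ssize_t)len) {
		n = -EIO;
		goto out;
	}
	if (shutdown(sv[0], SHUT_RDWR) < 0) {
		n = -errno;
		goto out;
	}
	n = read(sv[0], buf, buflen);
	if (n < 0)
		n = -errno;
out:
	close(sv[0]);
	close(sv[1]);
	return n;
}
```

Going by the commit message, a kernel with the original behaviour should return the queued data, while a kernel carrying the regression should return -EINVAL.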
      
      Fixes: d0c6416b ("unix: Fix an issue in unix_shutdown causing the other end read/write failures")
      Link: https://lore.kernel.org/lkml/20211111140000.GA10779@axis.com/
      Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
      Tested-by: Casey Schaufler <casey@schaufler-ca.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      80d70987
    • Paolo Abeni's avatar
      mptcp: use delegate action to schedule 3rd ack retrans · 97e5d850
      Paolo Abeni authored
      [ Upstream commit bcd97734 ]
      
      Scheduling a delack in mptcp_established_options_mp() is
      not a good idea: such function is called by tcp_send_ack() and
      the pending delayed ack will be cleared shortly after by the
      tcp_event_ack_sent() call in __tcp_transmit_skb().
      
      Instead use the mptcp delegated action infrastructure to
      schedule the delayed ack after the current bh processing completes.
      
      Additionally, move the schedule_3rdack_retransmission() helper
      into protocol.c to avoid making it visible in a different compilation
      unit.
      
      Fixes: ec3edaa7 ("mptcp: Add handling of outgoing MP_JOIN requests")
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      97e5d850
    • Eric Dumazet's avatar
      mptcp: fix delack timer · 10ef3a1c
      Eric Dumazet authored
      [ Upstream commit ee50e67b ]
      
      To compute the rtx timeout schedule_3rdack_retransmission() does multiple
      things in the wrong way: srtt_us is measured in usec/8 and the timeout
      itself is an absolute value.
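
The two unit issues named above (a smoothed RTT stored in eighths of a microsecond, and a timer that expects an absolute expiry) can be sketched as follows. This is not the actual fix: `TICKS_PER_SEC`, `usecs_to_ticks()` and `rtx_expiry()` are hypothetical user-space stand-ins for HZ, usecs_to_jiffies() and the kernel helper.

```c
#include <assert.h>
#include <stdint.h>

#define TICKS_PER_SEC 1000u /* hypothetical tick rate, standing in for HZ */

/* Convert microseconds to timer ticks, rounding up. */
static uint64_t usecs_to_ticks(uint64_t usecs)
{
	const uint64_t usecs_per_tick = 1000000u / TICKS_PER_SEC;

	return (usecs + usecs_per_tick - 1) / usecs_per_tick;
}

/*
 * srtt_us_x8 holds the smoothed RTT in units of usec/8, so shift right by 3
 * to get microseconds. The result is added to the current time because the
 * timer takes an absolute expiry, not a relative delay.
 */
static uint64_t rtx_expiry(uint64_t now_ticks, uint32_t srtt_us_x8)
{
	uint64_t rtt_us = srtt_us_x8 >> 3;

	return now_ticks + usecs_to_ticks(rtt_us);
}
```

For a stored value of 80000 (10 ms of smoothed RTT) and a current time of 5000 ticks, the expiry is 5010 ticks. Skipping the shift would give 5080, and skipping the addition would give 10, a time already in the past.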
      
      Fixes: ec3edaa7 ("mptcp: Add handling of outgoing MP_JOIN requests")
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      10ef3a1c
    • Pierre-Louis Bossart's avatar
      ALSA: intel-dsp-config: add quirk for JSL devices based on ES8336 codec · 26c3603a
      Pierre-Louis Bossart authored
      [ Upstream commit fa9730b4 ]
      
      These devices are based on an I2C/I2S device, we need to force the use
      of the SOF driver otherwise the legacy HDaudio driver will be loaded -
      only HDMI will be supported.
      
      We previously added support for other Intel platforms but missed
      JasperLake.
      
      BugLink: https://github.com/thesofproject/linux/issues/3210
      Fixes: 9d36ceab ('ALSA: intel-dsp-config: add quirk for APL/GLK/TGL devices based on ES8336 codec')
      Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
      Reviewed-by: Kai Vehmanen <kai.vehmanen@intel.com>
      Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com>
      Link: https://lore.kernel.org/r/20211027023254.24955-1-yung-chuan.liao@linux.intel.com
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      26c3603a
    • Juergen Gross's avatar
      xen/pvh: add missing prototype to header · c6db0b15
      Juergen Gross authored
      [ Upstream commit 2a099192 ]
      
      The prototype of mem_map_via_hcall() is missing in its header, so add
      it.
      
      Reported-by: kernel test robot <lkp@intel.com>
      Fixes: a43fb7da ("xen/pvh: Move Xen code for getting mem map via hcall out of common file")
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Link: https://lore.kernel.org/r/20211119153913.21678-1-jgross@suse.com
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      c6db0b15
    • Juergen Gross's avatar
      x86/pvh: add prototype for xen_pvh_init() · 7c7cfc9d
      Juergen Gross authored
      [ Upstream commit 76721679 ]
      
      xen_pvh_init() is lacking a prototype in a header, add it.
      
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Link: https://lore.kernel.org/r/20211006061950.9227-1-jgross@suse.com
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      7c7cfc9d
    • Brett Creeley's avatar
      iavf: Fix VLAN feature flags after VFR · 229e70bf
      Brett Creeley authored
      [ Upstream commit 5951a2b9 ]
      
      When a VF goes through a reset, it's possible for the VF's feature set
      to change. For example it may lose the VIRTCHNL_VF_OFFLOAD_VLAN
      capability after VF reset. Unfortunately, the driver doesn't correctly
      deal with this situation and errors are seen from downing/upping the
      interface and/or moving the interface in/out of a network namespace.
      
      When setting the interface down/up we see the following errors after the
      VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from the VF:
      
      ice 0000:51:00.1: VF 1 failed opcode 12, retval: -64
      iavf 0000:51:09.1: Failed to add VLAN filter, error IAVF_NOT_SUPPORTED
      ice 0000:51:00.1: VF 1 failed opcode 13, retval: -64
      iavf 0000:51:09.1: Failed to delete VLAN filter, error IAVF_NOT_SUPPORTED
      
      These add/delete errors are happening because the VLAN filters are
      tracked internally to the driver and regardless of the VLAN_ALLOWED()
      setting the driver tries to delete/re-add them over virtchnl.
      
      Fix the delete failure by making sure to delete any VLAN filter tracking
      in the driver when a removal request is made, while preventing the
      virtchnl request.  This makes it so the driver's VLAN list is up to date
      and the errors are
      
      Fix the add failure by making sure the check for VLAN_ALLOWED() during
      reset is done after the VF receives its capability list from the PF via
      VIRTCHNL_OP_GET_VF_RESOURCES. If VLAN functionality is not allowed, then
      prevent requesting re-adding the filters over virtchnl.
      
      When moving the interface into a network namespace we see the following
      errors after the VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from
      the VF:
      
      iavf 0000:51:09.1 enp81s0f1v1: NIC Link is Up Speed is 25 Gbps Full Duplex
      iavf 0000:51:09.1 temp_27: renamed from enp81s0f1v1
      iavf 0000:51:09.1 mgmt: renamed from temp_27
      iavf 0000:51:09.1 dev27: set_features() failed (-22); wanted 0x020190001fd54833, left 0x020190001fd54bb3
      
      These errors are happening because we aren't correctly updating the
      netdev capabilities and dealing with ndo_fix_features() and
      ndo_set_features() correctly.
      
      Fix this by only reporting errors in the driver's ndo_set_features()
      callback when VIRTCHNL_VF_OFFLOAD_VLAN is not allowed and any attempt to
      enable the VLAN features is made. Also, make sure to disable VLAN
      insertion, filtering, and stripping since the VIRTCHNL_VF_OFFLOAD_VLAN
      flag applies to all of them and not just VLAN stripping.
      
      Also, after we process the capabilities in the VF reset path, make sure
      to call netdev_update_features() in case the capabilities have changed
      in order to update the netdev's feature set to match the VF's actual
      capabilities.
      
      Lastly, make sure to always report success on VLAN filter delete when
      VIRTCHNL_VF_OFFLOAD_VLAN is not supported. The changed flow in
      iavf_del_vlans() allows the stack to delete previously existing VLAN
      filters even if VLAN filtering is not allowed. This makes it so the VLAN
      filter list is up to date.
      
      Fixes: 8774370d ("i40e/i40evf: support for VF VLAN tag stripping control")
      Signed-off-by: Brett Creeley <brett.creeley@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      229e70bf
    • Jedrzej Jagielski's avatar
      iavf: Fix refreshing iavf adapter stats on ethtool request · 8d4b4e0f
      Jedrzej Jagielski authored
      [ Upstream commit 3b5bdd18 ]
      
      Currently iavf adapter statistics are refreshed only in a
      watchdog task, triggered approximately every two seconds,
      which causes some ethtool requests to return outdated values.
      
      Add explicit statistics refresh when requested by ethtool -S.
      
      Fixes: b476b003 ("iavf: Move commands processing to the separate function")
      Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
      Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
      Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8d4b4e0f
    • Nitesh B Venkatesh's avatar
      iavf: Prevent changing static ITR values if adaptive moderation is on · e4031c04
      Nitesh B Venkatesh authored
      [ Upstream commit e792779e ]
      
      Prevent static ITR values from being changed on a VF while adaptive
      interrupt moderation is enabled.

      This is done by rejecting interrupt settings that combine a change of
      a static value with adaptive interrupt moderation turned on.
      
      Without this fix, the user would be able to change static values
      on VF with adaptive moderation enabled.
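
The check described above can be sketched as a small validation helper. This is an assumption-laden illustration, not the driver code: `struct itr_sketch` and `check_itr_request()` are hypothetical, and the request is reduced to one adaptive flag and one static value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-queue interrupt moderation state. */
struct itr_sketch {
	bool adaptive;      /* adaptive moderation currently enabled */
	unsigned int usecs; /* current static ITR value in microseconds */
};

/*
 * Reject a request that changes the static value while also asking for
 * adaptive moderation. Returns 0 when the request is acceptable and
 * -EINVAL otherwise.
 */
static int check_itr_request(const struct itr_sketch *cur,
			     bool want_adaptive, unsigned int want_usecs)
{
	if (want_adaptive && want_usecs != cur->usecs)
		return -EINVAL;
	return 0;
}

/* Usage: the static value may only change once adaptive mode is off. */
static void itr_check_demo(void)
{
	struct itr_sketch cur = { .adaptive = true, .usecs = 50 };

	assert(check_itr_request(&cur, true, 100) == -EINVAL);
	assert(check_itr_request(&cur, false, 100) == 0);
}
```

Keying the check on the requested adaptive flag, not the current one, lets a single request turn adaptive moderation off and set a new static value at the same time; whether the driver makes the same choice is an assumption here.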
      
      Fixes: 65e87c03 ("i40evf: support queue-specific settings for interrupt moderation")
      Signed-off-by: Nitesh B Venkatesh <nitesh.b.venkatesh@intel.com>
      Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      e4031c04
    • Claudia Pellegrino's avatar
      HID: magicmouse: prevent division by 0 on scroll · 25bbaa3a
      Claudia Pellegrino authored
      [ Upstream commit a1091118 ]
      
      In hid_magicmouse, if the user has set scroll_speed to a value between
      55 and 63 and scrolls seven times in quick succession, the
      step_hr variable in the magicmouse_emit_touch function becomes 0.
      
      That causes a division by zero further down in the function when
      it does `step_x_hr /= step_hr`.
      
      To reproduce, create `/etc/modprobe.d/hid_magicmouse.conf` with the
      following content:
      
      ```
      options hid_magicmouse scroll_acceleration=1 scroll_speed=55
      ```
      
      Then reboot, connect a Magic Mouse and scroll seven times quickly.
      The system will freeze for a minute, and after that `dmesg` will
      confirm that a division by zero occurred.
      
      Enforce a minimum of 1 for the variable so the high resolution
      step count can never reach 0 even at maximum scroll acceleration.
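
The clamp described above can be illustrated with a short sketch. The formula and the `HR_STEPS` divisor are assumptions chosen so that a scroll_speed of 55 to 63 yields 0 at the lowest acceleration value, as the commit message reports; the function names are hypothetical and the real driver code may differ.

```c
#include <assert.h>

#define HR_STEPS 10 /* assumed divisor, not stated in the commit message */

/* Assumed shape of the high-resolution step computation before the fix. */
static int step_hr_unclamped(int scroll_speed, int scroll_accel)
{
	return ((64 - scroll_speed) * scroll_accel) / HR_STEPS;
}

/* Same computation with a floor of 1, so it is always a safe divisor. */
static int step_hr_clamped(int scroll_speed, int scroll_accel)
{
	int step_hr = step_hr_unclamped(scroll_speed, scroll_accel);

	return step_hr < 1 ? 1 : step_hr;
}

/* Scale a high-resolution scroll delta; never divides by zero. */
static int scale_step(int step_x_hr, int scroll_speed, int scroll_accel)
{
	return step_x_hr / step_hr_clamped(scroll_speed, scroll_accel);
}
```

Under these assumptions, scroll_speed=55 with the acceleration counter at 1 gives (64 - 55) * 1 / 10 = 0 before the clamp and 1 after it.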
      
      Fixes: d4b9f10a ("HID: magicmouse: enable high-resolution scroll")

      Signed-off-by: Claudia Pellegrino <linux@cpellegrino.de>
      Tested-by: José Expósito <jose.exposito89@gmail.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      25bbaa3a
    • Thomas Weißschuh's avatar
      HID: input: set usage type to key on keycode remap · 6341c9cc
      Thomas Weißschuh authored
      [ Upstream commit 3e6a950d ]
      
      When a scancode is manually remapped that previously was not handled as
      key, then the old usage type was incorrectly reused.
      
      This caused issues on a "04b3:301b IBM Corp. SK-8815 Keyboard" which has
      marked some of its keys with an invalid HID usage.  These invalid usage
      keys are being ignored since support for USB programmable buttons was
      added.
      
      The scancodes are however remapped explicitly by the systemd hwdb to the
      keycodes that are printed on the physical buttons.  During this mapping
      step the existing usage is retrieved which will be found with a default
      type of 0 (EV_SYN) instead of EV_KEY.
      
      The events with the correct code but EV_SYN type are not forwarded to
      userspace.
      
      This also leads to a kernel oops when trying to print the report descriptor
      via debugfs.  hid_resolv_event() tries to resolve a EV_SYN event with an
      EV_KEY code which leads to an out-of-bounds access in the EV_SYN names
      array.
      
      Fixes: bcfa8d14 ("HID: input: Add support for Programmable Buttons")
      Fixes: f5854fad ("Input: hid-input - allow mapping unknown usages")
      Reported-by: Brent Roman <brent@mbari.org>
      Tested-by: Brent Roman <brent@mbari.org>
      Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
      Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6341c9cc
    • Hans de Goede's avatar
      HID: input: Fix parsing of HID_CP_CONSUMER_CONTROL fields · 740dd842
      Hans de Goede authored
      [ Upstream commit 7fc48fd6 ]
      
      Fix parsing of HID_CP_CONSUMER_CONTROL fields which are not in
      the HID_CP_PROGRAMMABLEBUTTONS collection.
      
      Fixes: bcfa8d14 ("HID: input: Add support for Programmable Buttons")
      BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=2018096
      Cc: Thomas Weißschuh <linux@weissschuh.net>
      Suggested-by: Benjamin Tissoires <btissoir@redhat.com>
      Signed-off-by: Hans de Goede <hdegoede@redhat.com>
      Reviewed-By: Thomas Weißschuh <linux@weissschuh.net>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      740dd842
    • Volodymyr Mytnyk's avatar
      net: marvell: prestera: fix double free issue on err path · 03e5203d
      Volodymyr Mytnyk authored
      [ Upstream commit e8d03250 ]
      
      Fix error path handling in prestera_bridge_port_join() that
      causes the prestera driver to crash (see below).
      
       Trace:
         Internal error: Oops: 96000044 [#1] SMP
         Modules linked in: prestera_pci prestera uio_pdrv_genirq
         CPU: 1 PID: 881 Comm: ip Not tainted 5.15.0 #1
         pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
         pc : prestera_bridge_destroy+0x2c/0xb0 [prestera]
         lr : prestera_bridge_port_join+0x2cc/0x350 [prestera]
         sp : ffff800011a1b0f0
         ...
         x2 : ffff000109ca6c80 x1 : dead000000000100 x0 : dead000000000122
          Call trace:
         prestera_bridge_destroy+0x2c/0xb0 [prestera]
         prestera_bridge_port_join+0x2cc/0x350 [prestera]
         prestera_netdev_port_event.constprop.0+0x3c4/0x450 [prestera]
         prestera_netdev_event_handler+0xf4/0x110 [prestera]
         raw_notifier_call_chain+0x54/0x80
         call_netdevice_notifiers_info+0x54/0xa0
         __netdev_upper_dev_link+0x19c/0x380
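
      The commit message does not spell out the exact sequence, so the
      following is only a generic sketch of how an error path can tear the
      same object down twice. All demo_* names are hypothetical and the
      teardown is counted rather than performed, so the model is safe to run.
      The assumption here is that dropping the last port already destroys
      the bridge, which makes a further explicit destroy on the error path
      redundant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bridge model: counts teardowns instead of freeing memory. */
struct demo_bridge {
    int ports;
    int destroy_calls;
};

static void demo_bridge_destroy(struct demo_bridge *br)
{
    br->destroy_calls++;
}

/* Assumption: releasing the last port also tears down the bridge. */
static void demo_bridge_port_put(struct demo_bridge *br)
{
    assert(br != NULL && br->ports > 0);
    if (--br->ports == 0)
        demo_bridge_destroy(br);
}

/* Buggy unwind: the bridge is torn down by the put and then once more. */
static int demo_port_join_buggy(struct demo_bridge *br, int fail)
{
    br->ports++;
    if (fail)
        goto err_offload;
    return 0;

err_offload:
    demo_bridge_port_put(br);
    demo_bridge_destroy(br);   /* second teardown of the same object */
    return -1;
}

/* Fixed unwind: each resource is released exactly once. */
static int demo_port_join_fixed(struct demo_bridge *br, int fail)
{
    br->ports++;
    if (fail)
        goto err_offload;
    return 0;

err_offload:
    demo_bridge_port_put(br);
    return -1;
}
```

      In the real driver a second teardown would operate on memory that has
      already been released, which is consistent with the list-poison values
      visible in registers x0 and x1 of the trace.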
      
      Fixes: e1189d9a ("net: marvell: prestera: Add Switchdev driver implementation")
      Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      03e5203d
    • Volodymyr Mytnyk's avatar
      net: marvell: prestera: fix brige port operation · 8599e15e
      Volodymyr Mytnyk authored
      [ Upstream commit 253e9b4d ]
      
      Return NOTIFY_DONE (don't care) for switchdev notifications
      that the prestera driver does not know how to handle.
      
      With the introduction of the SWITCHDEV_BRPORT_[UN]OFFLOADED switchdev
      events, the driver rejects the operation of adding a swport to a
      bridge, which is handled by prestera_bridge_port_join(). The root
      cause is that the prestera driver returns an error (EOPNOTSUPP)
      from the prestera_switchdev_blk_event() handler for unknown swdev
      events. This causes switchdev_bridge_port_offload() to fail
      when adding a port to a bridge in prestera_bridge_port_join().
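
      The effect of the return value can be sketched with a small userspace
      model. The NOTIFY_* values and the errno encoding below are modeled on
      include/linux/notifier.h; the demo_* names and the two-event enum are
      hypothetical and stand in for the driver's handler. The point is that
      an encoded error from one listener is propagated to the caller as a
      failure, while NOTIFY_DONE lets an event the listener does not
      recognize pass through.

```c
#include <assert.h>
#include <errno.h>

/* Modeled on include/linux/notifier.h. */
#define NOTIFY_DONE      0x0000  /* don't care */
#define NOTIFY_OK        0x0001  /* handled fine */
#define NOTIFY_STOP_MASK 0x8000  /* stop calling further listeners */

enum demo_event { DEMO_EVENT_KNOWN, DEMO_EVENT_UNKNOWN };

/* Encode a negative errno into a notifier return value. */
static int demo_from_errno(int err)
{
    assert(err <= 0);
    return err ? (NOTIFY_STOP_MASK | (NOTIFY_OK - err)) : NOTIFY_OK;
}

/* Decode a notifier return value back into 0 or a negative errno. */
static int demo_to_errno(int ret)
{
    ret &= ~NOTIFY_STOP_MASK;
    return ret > NOTIFY_OK ? NOTIFY_OK - ret : 0;
}

/* Old behaviour: unknown events are reported as errors. */
static int demo_handler_buggy(enum demo_event ev)
{
    switch (ev) {
    case DEMO_EVENT_KNOWN:
        return NOTIFY_OK;
    default:
        return demo_from_errno(-EOPNOTSUPP);
    }
}

/* Fixed behaviour: unknown events are ignored. */
static int demo_handler_fixed(enum demo_event ev)
{
    switch (ev) {
    case DEMO_EVENT_KNOWN:
        return NOTIFY_OK;
    default:
        return NOTIFY_DONE;
    }
}

/* What the caller of the notifier chain ends up seeing. */
static int demo_call(int (*handler)(enum demo_event), enum demo_event ev)
{
    return demo_to_errno(handler(ev));
}
```

      Here demo_call() yields -EOPNOTSUPP for an unknown event with the old
      handler and 0 with the fixed one.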
      
      Fixes: 957e2235 ("net: make switchdev_bridge_port_{,unoffload} loosely coupled with the bridge")
      Signed-off-by: Volodymyr Mytnyk <vmytnyk@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8599e15e
    • Joel Stanley's avatar
      drm/aspeed: Fix vga_pw sysfs output · 94850e2d
      Joel Stanley authored
      [ Upstream commit b4a6aaea ]
      
      Before the drm driver had support for this file there was a driver that
      exposed the contents of the vga password register to userspace. It would
      present the entire register instead of interpreting it.
      
      The drm implementation chose to mask off the lower bit, without explaining
      why. This breaks the existing userspace, which is looking for 0xa8 in
      the lower byte.
      
      Change our implementation to expose the entire register.
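
      A small userspace model illustrates the breakage. The demo_* functions,
      the decimal output format and the exact mask (reg & 1) are assumptions
      made for this sketch; the commit message only states that the lower bit
      was masked and that userspace looks for 0xa8 in the lower byte.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed old behaviour: only the lowest bit of the register is shown. */
static int demo_show_masked(char *buf, size_t len, uint32_t reg)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%u\n", (unsigned int)(reg & 1u));
}

/* Behaviour after the change: the entire register is shown. */
static int demo_show_full(char *buf, size_t len, uint32_t reg)
{
    assert(buf != NULL);
    return snprintf(buf, len, "%u\n", (unsigned int)reg);
}

/* Hypothetical userspace check: is 0xa8 present in the lower byte? */
static int demo_userspace_accepts(const char *buf)
{
    unsigned long value = strtoul(buf, NULL, 10);

    return (value & 0xff) == 0xa8;
}
```

      With a register value of 0xa8, the masked output is rejected by the
      userspace check while the full output is accepted.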
      
      Fixes: 696029eb ("drm/aspeed: Add sysfs for output settings")
      Reported-by: Oskar Senft <osk@google.com>
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Reviewed-by: Jeremy Kerr <jk@codeconstruct.com.au>
      Tested-by: Oskar Senft <osk@google.com>
      Signed-off-by: Maxime Ripard <maxime@cerno.tech>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211117010145.297253-1-joel@jms.id.au
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      94850e2d