  1. Aug 27, 2020
  2. Aug 26, 2020
    • Greg Kroah-Hartman's avatar
      v5.8.4
      47dcb7fc
    • Alex Deucher's avatar
      Revert "drm/amd/display: Improve DisplayPort monitor interop" · 920ebff4
      Alex Deucher authored
      This reverts commit 1adb2ff1.
      
      This breaks display wake up in stable kernels (5.7.x and 5.8.x).
      
      Note that there is no upstream equivalent to this
      revert. This patch was targeted for stable by Sasha's stable
      patch process. Presumably there are some other changes necessary
      for this patch to work properly on stable kernels.
      
      Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1266
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org # 5.7.x, 5.8.x
      Cc: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      920ebff4
    • Will Deacon's avatar
      KVM: arm64: Only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is not set · d0a3a013
      Will Deacon authored
      commit b5331379 upstream.
      
      When an MMU notifier call results in unmapping a range that spans multiple
      PGDs, we end up calling into cond_resched_lock() when crossing a PGD boundary,
      since this avoids running into RCU stalls during VM teardown. Unfortunately,
      if the VM is destroyed as a result of OOM, then blocking is not permitted
      and the call to the scheduler triggers the following BUG():
      
       | BUG: sleeping function called from invalid context at arch/arm64/kvm/mmu.c:394
       | in_atomic(): 1, irqs_disabled(): 0, non_block: 1, pid: 36, name: oom_reaper
       | INFO: lockdep is turned off.
       | CPU: 3 PID: 36 Comm: oom_reaper Not tainted 5.8.0 #1
       | Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
       | Call trace:
       |  dump_backtrace+0x0/0x284
       |  show_stack+0x1c/0x28
       |  dump_stack+0xf0/0x1a4
       |  ___might_sleep+0x2bc/0x2cc
       |  unmap_stage2_range+0x160/0x1ac
       |  kvm_unmap_hva_range+0x1a0/0x1c8
       |  kvm_mmu_notifier_invalidate_range_start+0x8c/0xf8
       |  __mmu_notifier_invalidate_range_start+0x218/0x31c
       |  mmu_notifier_invalidate_range_start_nonblock+0x78/0xb0
       |  __oom_reap_task_mm+0x128/0x268
       |  oom_reap_task+0xac/0x298
       |  oom_reaper+0x178/0x17c
       |  kthread+0x1e4/0x1fc
       |  ret_from_fork+0x10/0x30
      
      Use the new 'flags' argument to kvm_unmap_hva_range() to ensure that we
      only reschedule if MMU_NOTIFIER_RANGE_BLOCKABLE is set in the notifier
      flags.
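
As a rough illustration of the check described above (not the actual arm64 KVM code), the sketch below models the unmap loop in plain user-space C. `RANGE_BLOCKABLE`, `may_resched()` and `unmap_range()` are hypothetical stand-ins; the real code tests MMU_NOTIFIER_RANGE_BLOCKABLE and calls cond_resched_lock().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's MMU_NOTIFIER_RANGE_BLOCKABLE bit. */
#define RANGE_BLOCKABLE (1u << 0)

/* The loop may yield the CPU only when the notifier said blocking is allowed. */
static bool may_resched(unsigned int flags)
{
    return (flags & RANGE_BLOCKABLE) != 0;
}

/*
 * Walk npgds top-level entries and count how often the loop would yield
 * when crossing a boundary between two entries. The real code would call
 * cond_resched_lock() at that point instead of counting.
 */
static unsigned int unmap_range(unsigned int npgds, unsigned int flags)
{
    unsigned int yields = 0;

    for (unsigned int i = 0; i < npgds; i++) {
        /* ... unmap entry i here ... */
        if (i + 1 < npgds && may_resched(flags))
            yields++;
    }
    return yields;
}
```

With the flag clear, as in the oom_reaper path from the trace, the loop never yields; with it set, it yields once per boundary crossed.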
      
      Cc: <stable@vger.kernel.org>
      Fixes: 8b3405e3 ("kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd")
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Message-Id: <20200811102725.7121-3-will@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d0a3a013
    • Will Deacon's avatar
      KVM: Pass MMU notifier range flags to kvm_unmap_hva_range() · af309331
      Will Deacon authored
      commit fdfe7cbd upstream.
      
      The 'flags' field of 'struct mmu_notifier_range' is used to indicate
      whether invalidate_range_{start,end}() are permitted to block. In the
      case of kvm_mmu_notifier_invalidate_range_start(), this field is not
      forwarded on to the architecture-specific implementation of
      kvm_unmap_hva_range() and therefore the backend cannot sensibly decide
      whether or not to block.
      
      Add an extra 'flags' parameter to kvm_unmap_hva_range() so that
      architectures are aware as to whether or not they are permitted to block.
      
      Cc: <stable@vger.kernel.org>
      Cc: Marc Zyngier <maz@kernel.org>
      Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
      Cc: James Morse <james.morse@arm.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Message-Id: <20200811102725.7121-2-will@kernel.org>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Will Deacon <will@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      af309331
    • Al Viro's avatar
      do_epoll_ctl(): clean the failure exits up a bit · d9903e8c
      Al Viro authored
      commit 52c47969 upstream.
      
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9903e8c
    • Arvind Sankar's avatar
      efi/libstub: Handle unterminated cmdline · 1f802ace
      Arvind Sankar authored
      commit 8a8a3237 upstream.
      
      Make the command line parsing more robust by handling the case where
      it is not NUL-terminated.
      
      Use strnlen instead of strlen, and make sure that the temporary copy is
      NUL-terminated before parsing.
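
A minimal user-space sketch of that idea, assuming a caller-supplied upper bound on the buffer size. `copy_cmdline()` is a hypothetical helper, not the libstub function, and malloc() stands in for whatever allocator the stub uses.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen() */
#include <stdlib.h>
#include <string.h>

/*
 * Copy a command line that may lack a terminating NUL. strnlen() never
 * reads past max_len bytes, and the copy is terminated explicitly, so
 * later parsing can rely on finding a NUL.
 */
static char *copy_cmdline(const char *cmdline, size_t max_len)
{
    size_t len = strnlen(cmdline, max_len);
    char *buf = malloc(len + 1);

    if (!buf)
        return NULL;
    memcpy(buf, cmdline, len);
    buf[len] = '\0';
    return buf;
}
```

The caller owns the returned buffer and frees it once parsing is done.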
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Link: https://lore.kernel.org/r/20200813185811.554051-4-nivedita@alum.mit.edu
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1f802ace
    • Arvind Sankar's avatar
      efi/libstub: Handle NULL cmdline · ca60a5eb
      Arvind Sankar authored
      commit a37ca6a2 upstream.
      
      Treat a NULL cmdline the same as empty. Although this is unlikely to
      happen in practice, the x86 kernel entry does check for NULL cmdline and
      handles it, so do it here as well.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Link: https://lore.kernel.org/r/20200729193300.598448-1-nivedita@alum.mit.edu
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca60a5eb
    • Arvind Sankar's avatar
      efi/libstub: Stop parsing arguments at "--" · 3bff856b
      Arvind Sankar authored
      commit 1fd9717d upstream.
      
      Arguments after "--" are arguments for init, not for the kernel.
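
The rule can be sketched as follows. This is a simplified illustration that splits on spaces only and ignores quoting, which a real kernel command line parser has to handle; `count_kernel_args()` is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count the space-separated arguments meant for the kernel. Parsing
 * stops at a standalone "--" token; everything after it belongs to init.
 */
static int count_kernel_args(const char *cmdline)
{
    const char *p = cmdline;
    int count = 0;

    while (*p) {
        while (*p == ' ')          /* skip separators */
            p++;
        if (!*p)
            break;

        size_t len = strcspn(p, " ");
        if (len == 2 && p[0] == '-' && p[1] == '-')
            break;                 /* the rest is for init */

        count++;
        p += len;
    }
    return count;
}
```

Note that only a token that is exactly "--" ends parsing; an argument that merely starts with two dashes is still counted.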
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Link: https://lore.kernel.org/r/20200725155916.1376773-1-nivedita@alum.mit.edu
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3bff856b
    • Li Heng's avatar
      efi: add missed destroy_workqueue when efisubsys_init fails · e6584124
      Li Heng authored
      commit 98086df8 upstream.
      
      destroy_workqueue() should be called to destroy efi_rts_wq
      when resource initialization in efisubsys_init() fails.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Li Heng <liheng40@huawei.com>
      Link: https://lore.kernel.org/r/1595229738-10087-1-git-send-email-liheng40@huawei.com
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6584124
    • Arvind Sankar's avatar
      efi/x86: Mark kernel rodata non-executable for mixed mode · 09a30705
      Arvind Sankar authored
      commit c8502eb2 upstream.
      
      When remapping the kernel rodata section RO in the EFI pagetables, the
      protection flags that were used for the text section are being reused,
      but the rodata section should not be marked executable.
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Arvind Sankar <nivedita@alum.mit.edu>
      Link: https://lore.kernel.org/r/20200717194526.3452089-1-nivedita@alum.mit.edu
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      09a30705
    • Tony Luck's avatar
      EDAC/{i7core,sb,pnd2,skx}: Fix error event severity · 3d9ed544
      Tony Luck authored
      commit 45bc6098 upstream.
      
      IA32_MCG_STATUS.RIPV indicates whether the return RIP value pushed onto
      the stack as part of machine check delivery is valid or not.
      
      Various drivers copied a code fragment that uses the RIPV bit to
      determine the severity of the error as either HW_EVENT_ERR_UNCORRECTED
      or HW_EVENT_ERR_FATAL, but this check is reversed (marking errors where
      RIPV is set as "FATAL").
      
      Reverse the tests so that the error is marked fatal when RIPV is not set.
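
The corrected test can be sketched like this. It is an illustration only: `enum severity` and `classify()` are hypothetical stand-ins for the drivers' HW_EVENT_ERR_* handling, and the sketch covers just the uncorrected-error branch described above.

```c
#include <stdint.h>

/* IA32_MCG_STATUS bit 0: the return RIP pushed on the stack is valid. */
#define MCG_STATUS_RIPV (1ull << 0)

/* Hypothetical stand-ins for HW_EVENT_ERR_UNCORRECTED / HW_EVENT_ERR_FATAL. */
enum severity { SEV_UNCORRECTED, SEV_FATAL };

/*
 * With RIPV set, execution can resume at the saved RIP, so the error is
 * only "uncorrected". With RIPV clear there is nowhere valid to return
 * to, so the error is fatal.
 */
static enum severity classify(uint64_t mcgstatus)
{
    return (mcgstatus & MCG_STATUS_RIPV) ? SEV_UNCORRECTED : SEV_FATAL;
}
```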
      
      Reported-by: Gabriele Paoloni <gabriele.paoloni@intel.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: <stable@vger.kernel.org>
      Link: https://lkml.kernel.org/r/20200707194324.14884-1-tony.luck@intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d9ed544
    • Vasant Hegde's avatar
      powerpc/pseries: Do not initiate shutdown when system is running on UPS · 979a9c00
      Vasant Hegde authored
      commit 90a9b102 upstream.
      
      As per PAPR we have to look for both EPOW sensor value and event
      modifier to identify the type of event and take appropriate action.
      
      LoPAPR v1.1 section 10.2.2 includes table 136 "EPOW Action Codes":
      
        SYSTEM_SHUTDOWN 3
      
        The system must be shut down. An EPOW-aware OS logs the EPOW error
        log information, then schedules the system to be shut down to begin
        after an OS defined delay interval (default is 10 minutes.)
      
      Then in section 10.3.2.2.8 there is table 146 "Platform Event Log
      Format, Version 6, EPOW Section", which includes the "EPOW Event
      Modifier":
      
        For EPOW sensor value = 3
        0x01 = Normal system shutdown with no additional delay
        0x02 = Loss of utility power, system is running on UPS/Battery
        0x03 = Loss of system critical functions, system should be shutdown
        0x04 = Ambient temperature too high
        All other values = reserved
      
      We have a user space tool (rtas_errd) on LPAR to monitor for
      EPOW_SHUTDOWN_ON_UPS. Once it gets an event it initiates shutdown
      after a predefined time. It also starts monitoring for any new EPOW
      events. If it receives a "Power restored" event before the predefined
      time elapses, it cancels the shutdown. Otherwise, after the predefined
      time, it shuts down the system.
      
      Commit 79872e35 ("powerpc/pseries: All events of
      EPOW_SYSTEM_SHUTDOWN must initiate shutdown") changed our handling of
      the "on UPS/Battery" case, to immediately shut down the system. This
      breaks existing setups that rely on the userspace tool to delay
      shutdown and let the system run on the UPS.
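
The intended decision can be sketched as below, using the sensor and modifier values from the tables quoted above. `kernel_initiates_shutdown()` is a hypothetical helper, not the pseries code, and treating reserved modifiers as "no action" is an assumption of this sketch.

```c
#include <stdbool.h>

#define EPOW_SYSTEM_SHUTDOWN 3     /* EPOW sensor value */
#define EPOW_SHUTDOWN_ON_UPS 0x02  /* event modifier: running on UPS/battery */

/*
 * Decide whether the kernel itself should start a shutdown. The
 * "on UPS" case is left to the user space tool, which can still cancel
 * if power is restored in time.
 */
static bool kernel_initiates_shutdown(int sensor, int modifier)
{
    if (sensor != EPOW_SYSTEM_SHUTDOWN)
        return false;              /* other sensor values: outside this sketch */

    switch (modifier) {
    case 0x01:                     /* normal system shutdown */
    case 0x03:                     /* loss of system critical functions */
    case 0x04:                     /* ambient temperature too high */
        return true;
    case EPOW_SHUTDOWN_ON_UPS:     /* defer to user space */
    default:                       /* reserved values (assumption) */
        return false;
    }
}
```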
      
      Fixes: 79872e35 ("powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown")
      Cc: stable@vger.kernel.org # v4.0+
      Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      [mpe: Massage change log and add PAPR references]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200820061844.306460-1-hegdevasant@linux.vnet.ibm.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      979a9c00
    • Michael Neuling's avatar
      powerpc: Fix P10 PVR revision in /proc/cpuinfo for SMT4 cores · d9b227a0
      Michael Neuling authored
      commit 030a2c68 upstream.
      
      On POWER10 bit 12 in the PVR indicates if the core is SMT4 or SMT8.
      Bit 12 is set for SMT4.
      
      Without this patch, /proc/cpuinfo on a SMT4 DD1 POWER10 looks like
      this:
        cpu             : POWER10, altivec supported
        revision        : 17.0 (pvr 0080 1100)
      
      Fixes: a3ea40d5 ("powerpc: Add POWER10 architected mode")
      Cc: stable@vger.kernel.org # v5.8
      Signed-off-by: Michael Neuling <mikey@neuling.org>
      Reviewed-by: Vaidyanathan Srinivasan <svaidy@linux.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200803035600.1820371-1-mikey@neuling.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9b227a0
    • Marc Zyngier's avatar
      epoll: Keep a reference on files added to the check list · c09886c1
      Marc Zyngier authored
      commit a9ed4a65 upstream.
      
      When adding a new fd to an epoll instance, and this new fd is itself
      an epoll fd, we recursively scan the fds attached to it
      to detect cycles, and add non-epoll files to a "check list"
      that gets subsequently parsed.
      
      However, this check list isn't completely safe when deletions
      can happen concurrently. To sidestep the issue, make sure that
      a struct file placed on the check list sees its f_count increased,
      ensuring that a concurrent deletion won't result in the file
      disappearing from under our feet.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c09886c1
    • Tom Rix's avatar
      net: dsa: b53: check for timeout · 3489cea0
      Tom Rix authored
      [ Upstream commit 774d977a ]
      
      clang static analysis reports this problem
      
      b53_common.c:1583:13: warning: The left expression of the compound
        assignment is an uninitialized value. The computed value will
        also be garbage
              ent.port &= ~BIT(port);
              ~~~~~~~~ ^
      
      ent is set by a successful call to b53_arl_read().  Unsuccessful
      calls are caught by a switch statement handling specific returns.
      b53_arl_read() calls b53_arl_op_wait() which fails with the
      unhandled -ETIMEDOUT.
      
      So add -ETIMEDOUT to the switch statement.  Because
      b53_arl_op_wait() already prints out a message, do not add another
      one.
      
      Fixes: 1da6df85 ("net: dsa: b53: Implement ARL add/del/dump operations")
      Signed-off-by: Tom Rix <trix@redhat.com>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3489cea0
    • Haiyang Zhang's avatar
      hv_netvsc: Fix the queue_mapping in netvsc_vf_xmit() · 0c831e9d
      Haiyang Zhang authored
      [ Upstream commit c3d897e0 ]
      
      netvsc_vf_xmit() / dev_queue_xmit() will call VF NIC’s ndo_select_queue
      or netdev_pick_tx() again. They will use skb_get_rx_queue() to get the
      queue number, so the “skb->queue_mapping - 1” will be used. This may
      cause the last queue of the VF to go unused.
      
      Use skb_record_rx_queue() here, so that the skb_get_rx_queue() called
      later will get the correct queue number, and VF will be able to use
      all queues.
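
The off-by-one can be seen in a small user-space model. `struct pkt` and the three helpers below are simplified stand-ins for the skb accessors named above, and the sketch assumes the earlier code stored the raw queue index in queue_mapping.

```c
#include <stdint.h>

/* Minimal stand-in for the one sk_buff field that matters here. */
struct pkt {
    uint16_t queue_mapping;
};

/* Stores the raw index, as a plain queue-mapping setter would. */
static void set_queue_mapping(struct pkt *p, uint16_t q)
{
    p->queue_mapping = q;
}

/* Stores index + 1, so that 0 can mean "no rx queue recorded". */
static void record_rx_queue(struct pkt *p, uint16_t q)
{
    p->queue_mapping = (uint16_t)(q + 1);
}

/* Reads the value back assuming it was stored by record_rx_queue(). */
static uint16_t get_rx_queue(const struct pkt *p)
{
    return (uint16_t)(p->queue_mapping - 1);
}
```

Pairing set_queue_mapping() with get_rx_queue() yields an index one lower than intended, so the highest queue is never selected; pairing record_rx_queue() with get_rx_queue() round-trips the index.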
      
      Fixes: b3bf5666 ("hv_netvsc: defer queue selection to VF")
      Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      0c831e9d
    • Veronika Kabatova's avatar
      selftests/bpf: Remove test_align leftovers · d446604e
      Veronika Kabatova authored
      [ Upstream commit 5597432d ]
      
      Calling generic selftests "make install" fails as rsync expects all
      files from TEST_GEN_PROGS to be present. The binary is not generated
      anymore (commit 3b09d27c) so we can safely remove it from there
      and also from gitignore.
      
      Fixes: 3b09d27c ("selftests/bpf: Move test_align under test_progs")
      Signed-off-by: Veronika Kabatova <vkabatov@redhat.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Link: https://lore.kernel.org/bpf/20200819160710.1345956-1-vkabatov@redhat.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      d446604e
    • Wang Hai's avatar
      net: gemini: Fix missing free_netdev() in error path of gemini_ethernet_port_probe() · 9500db54
      Wang Hai authored
      [ Upstream commit cf96d977 ]
      
      Replace alloc_etherdev_mq with devm_alloc_etherdev_mqs. In this way,
      when probe fails, netdev can be freed automatically.
      
      Fixes: 4d5ae32f ("net: ethernet: Add a driver for Gemini gigabit ethernet")
      Reported-by: Hulk Robot <hulkci@huawei.com>
      Signed-off-by: Wang Hai <wanghai38@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      9500db54
    • Shay Agroskin's avatar
      net: ena: Change WARN_ON expression in ena_del_napi_in_range() · af4a5647
      Shay Agroskin authored
      [ Upstream commit 8b147f6f ]
      
      The ena_del_napi_in_range() function unregisters the napi handler for
      rings in a given range.
      This function had the following WARN_ON macro:
      
          WARN_ON(ENA_IS_XDP_INDEX(adapter, i) &&
      	    adapter->ena_napi[i].xdp_ring);
      
      This macro prints the call stack if the expression inside of it is
      true [1], but the expression inside of it is the wanted situation.
      The expression checks whether the ring has an XDP queue and its index
      corresponds to an XDP one.
      
      This patch changes the expression to
          !ENA_IS_XDP_INDEX(adapter, i) && adapter->ena_napi[i].xdp_ring
      which indicates an unwanted situation.
      
      Also, change the structure of the function. The napi handler is
      unregistered for all rings, and so there's no need to check whether the
      index is an XDP index or not. By removing this check the code becomes
      much more readable.
      
      Fixes: 548c4940 ("net: ena: Implement XDP_TX action")
      Signed-off-by: Shay Agroskin <shayagr@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      af4a5647
    • Shay Agroskin's avatar
      net: ena: Prevent reset after device destruction · 8c01a77d
      Shay Agroskin authored
      [ Upstream commit 63d4a4c1 ]
      
      The reset work is scheduled by the timer routine whenever it
      detects that a device reset is required (e.g. when a keep_alive signal
      is missing).
      When releasing device resources in ena_destroy_device() the driver
      cancels the scheduling of the timer routine without destroying the reset
      work explicitly.
      
      This creates the following bug:
          The driver is suspended and the ena_suspend() function is called
      	-> This function calls ena_destroy_device() to free the net device
      	   resources
      	    -> The driver waits for the timer routine to finish
      	    its execution and then cancels it, thus preventing it
      	    from being called again.
      
          If, in its final execution, the timer routine schedules a reset,
          the reset routine might be called afterwards, and a redundant call to
          ena_restore_device() would be made.
      
      By changing the reset routine we allow it to read the device's state
      accurately.
      This is achieved by checking whether ENA_FLAG_TRIGGER_RESET flag is set
      before resetting the device and making sure that both the destruction
      function and the flag check run under the rtnl lock.
      The ENA_FLAG_TRIGGER_RESET is cleared at the end of the destruction
      routine. Also surround the flag check with 'likely' because
      we expect that the reset routine would be called only when
      ENA_FLAG_TRIGGER_RESET flag is set.
      
      The destruction of the timer and reset services in __ena_shutoff() has to
      stay, even though the timer routine is destroyed in ena_destroy_device().
      This is to avoid a case in which the reset routine is scheduled after
      free_netdev() in __ena_shutoff(), which would create an access to freed
      memory in adapter->flags.
      
      Fixes: 8c5c7abd ("net: ena: add power management ops to the ENA driver")
      Signed-off-by: Shay Agroskin <shayagr@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8c01a77d
    • Jiri Wiesner's avatar
      bonding: fix active-backup failover for current ARP slave · 3c1d705f
      Jiri Wiesner authored
      [ Upstream commit 0410d071 ]
      
      When the ARP monitor is used for link detection, ARP replies are
      validated for all slaves (arp_validate=3) and fail_over_mac is set to
      active, two slaves of an active-backup bond may get stuck in a state
      where both of them are active and pass packets that they receive to
      the bond. This state makes IPv6 duplicate address detection fail. The
      state is reached thus:
      1. The current active slave goes down because the ARP target
         is not reachable.
      2. The current ARP slave is chosen and made active.
      3. A new slave is enslaved. This new slave becomes the current active
         slave and can reach the ARP target.
      As a result, the current ARP slave stays active after the enslave
      action has finished and the log is littered with "PROBE BAD" messages:
      > bond0: PROBE: c_arp ens10 && cas ens11 BAD
      The workaround is to remove the slave with "going back" status from
      the bond and re-enslave it. This issue was encountered when DPDK PMD
      interfaces were being enslaved to an active-backup bond.
      
      It would be possible to fix the issue in bond_enslave() or
      bond_change_active_slave() but the ARP monitor was fixed instead to
      keep most of the actions changing the current ARP slave in the ARP
      monitor code. The current ARP slave is set as inactive and backup
      during the commit phase. A new state, BOND_LINK_FAIL, has been
      introduced for slaves in the context of the ARP monitor. This allows
      administrators to see how slaves are rotated for sending ARP requests
      and attempts are made to find a new active slave.
      
      Fixes: b2220cad ("bonding: refactor ARP active-backup monitor")
      Signed-off-by: Jiri Wiesner <jwiesner@suse.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3c1d705f
    • Michael Roth's avatar
      powerpc/pseries/hotplug-cpu: wait indefinitely for vCPU death · f6c6b312
      Michael Roth authored
      [ Upstream commit 801980f6 ]
      
      For a power9 KVM guest with XIVE enabled, running a test loop
      where we hotplug 384 vcpus and then unplug them, the following traces
      can be seen (generally within a few loops) either from the unplugged
      vcpu:
      
        cpu 65 (hwid 65) Ready to die...
        Querying DEAD? cpu 66 (66) shows 2
        list_del corruption. next->prev should be c00a000002470208, but was c00a000002470048
        ------------[ cut here ]------------
        kernel BUG at lib/list_debug.c:56!
        Oops: Exception in kernel mode, sig: 5 [#1]
        LE SMP NR_CPUS=2048 NUMA pSeries
        Modules linked in: fuse nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 ...
        CPU: 66 PID: 0 Comm: swapper/66 Kdump: loaded Not tainted 4.18.0-221.el8.ppc64le #1
        NIP:  c0000000007ab50c LR: c0000000007ab508 CTR: 00000000000003ac
        REGS: c0000009e5a17840 TRAP: 0700   Not tainted  (4.18.0-221.el8.ppc64le)
        MSR:  800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 28000842  XER: 20040000
        ...
        NIP __list_del_entry_valid+0xac/0x100
        LR  __list_del_entry_valid+0xa8/0x100
        Call Trace:
          __list_del_entry_valid+0xa8/0x100 (unreliable)
          free_pcppages_bulk+0x1f8/0x940
          free_unref_page+0xd0/0x100
          xive_spapr_cleanup_queue+0x148/0x1b0
          xive_teardown_cpu+0x1bc/0x240
          pseries_mach_cpu_die+0x78/0x2f0
          cpu_die+0x48/0x70
          arch_cpu_idle_dead+0x20/0x40
          do_idle+0x2f4/0x4c0
          cpu_startup_entry+0x38/0x40
          start_secondary+0x7bc/0x8f0
          start_secondary_prolog+0x10/0x14
      
      or on the worker thread handling the unplug:
      
        pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
        Querying DEAD? cpu 314 (314) shows 2
        BUG: Bad page state in process kworker/u768:3  pfn:95de1
        cpu 314 (hwid 314) Ready to die...
        page:c00a000002577840 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0
        flags: 0x5ffffc00000000()
        raw: 005ffffc00000000 5deadbeef0000100 5deadbeef0000200 0000000000000000
        raw: 0000000000000000 0000000000000000 00000000ffffff7f 0000000000000000
        page dumped because: nonzero mapcount
        Modules linked in: kvm xt_CHECKSUM ipt_MASQUERADE xt_conntrack ...
        CPU: 0 PID: 548 Comm: kworker/u768:3 Kdump: loaded Not tainted 4.18.0-224.el8.bz1856588.ppc64le #1
        Workqueue: pseries hotplug workque pseries_hp_work_fn
        Call Trace:
          dump_stack+0xb0/0xf4 (unreliable)
          bad_page+0x12c/0x1b0
          free_pcppages_bulk+0x5bc/0x940
          page_alloc_cpu_dead+0x118/0x120
          cpuhp_invoke_callback.constprop.5+0xb8/0x760
          _cpu_down+0x188/0x340
          cpu_down+0x5c/0xa0
          cpu_subsys_offline+0x24/0x40
          device_offline+0xf0/0x130
          dlpar_offline_cpu+0x1c4/0x2a0
          dlpar_cpu_remove+0xb8/0x190
          dlpar_cpu_remove_by_index+0x12c/0x150
          dlpar_cpu+0x94/0x800
          pseries_hp_work_fn+0x128/0x1e0
          process_one_work+0x304/0x5d0
          worker_thread+0xcc/0x7a0
          kthread+0x1ac/0x1c0
          ret_from_kernel_thread+0x5c/0x80
      
      The latter trace is due to the following sequence:
      
        page_alloc_cpu_dead
          drain_pages
            drain_pages_zone
              free_pcppages_bulk
      
      where drain_pages() in this case is called under the assumption that
      the unplugged cpu is no longer executing. To ensure that is the case,
      an early call is made to __cpu_die()->pseries_cpu_die(), which runs a
      loop that waits for the cpu to reach a halted state by polling its
      status via query-cpu-stopped-state RTAS calls. It only polls for 25
      iterations before giving up, however, and in the trace above this
      results in the following being printed only .1 seconds after the
      hotplug worker thread begins processing the unplug request:
      
        pseries-hotplug-cpu: Attempting to remove CPU <NULL>, drc index: 1000013a
        Querying DEAD? cpu 314 (314) shows 2
      
      At that point the worker thread assumes the unplugged CPU is in some
      unknown/dead state and proceeds with the cleanup, causing the race
      with the XIVE cleanup code executed by the unplugged CPU.
      
      Fix this by waiting indefinitely, but also making an effort to avoid
      spurious lockup messages by allowing for rescheduling after polling
      the CPU status and printing a warning if we wait for longer than 120s.
      
      Fixes: eac1e731 ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
      Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
      Tested-by: Greg Kurz <groug@kaod.org>
      Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
      Reviewed-by: Greg Kurz <groug@kaod.org>
      [mpe: Trim oopses in change log slightly for readability]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20200811161544.10513-1-mdroth@linux.vnet.ibm.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f6c6b312
    • Yonghong Song's avatar
      bpf: Use get_file_rcu() instead of get_file() for task_file iterator · 63f10205
      Yonghong Song authored
      [ Upstream commit cf28f3bb ]
      
      With latest `bpftool prog` command, we observed the following kernel
      panic.
          BUG: kernel NULL pointer dereference, address: 0000000000000000
          #PF: supervisor instruction fetch in kernel mode
          #PF: error_code(0x0010) - not-present page
          PGD dfe894067 P4D dfe894067 PUD deb663067 PMD 0
          Oops: 0010 [#1] SMP
          CPU: 9 PID: 6023 ...
          RIP: 0010:0x0
          Code: Bad RIP value.
          RSP: 0000:ffffc900002b8f18 EFLAGS: 00010286
          RAX: ffff8883a405f400 RBX: ffff888e46a6bf00 RCX: 000000008020000c
          RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8883a405f400
          RBP: ffff888e46a6bf50 R08: 0000000000000000 R09: ffffffff81129600
          R10: ffff8883a405f300 R11: 0000160000000000 R12: 0000000000002710
          R13: 000000e9494b690c R14: 0000000000000202 R15: 0000000000000009
          FS:  00007fd9187fe700(0000) GS:ffff888e46a40000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: ffffffffffffffd6 CR3: 0000000de5d33002 CR4: 0000000000360ee0
          DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
          Call Trace:
           <IRQ>
           rcu_core+0x1a4/0x440
           __do_softirq+0xd3/0x2c8
           irq_exit+0x9d/0xa0
           smp_apic_timer_interrupt+0x68/0x120
           apic_timer_interrupt+0xf/0x20
           </IRQ>
          RIP: 0033:0x47ce80
          Code: Bad RIP value.
          RSP: 002b:00007fd9187fba40 EFLAGS: 00000206 ORIG_RAX: ffffffffffffff13
          RAX: 0000000000000002 RBX: 00007fd931789160 RCX: 000000000000010c
          RDX: 00007fd9308cdfb4 RSI: 00007fd9308cdfb4 RDI: 00007ffedd1ea0a8
          RBP: 00007fd9187fbab0 R08: 000000000000000e R09: 000000000000002a
          R10: 0000000000480210 R11: 00007fd9187fc570 R12: 00007fd9316cc400
          R13: 0000000000000118 R14: 00007fd9308cdfb4 R15: 00007fd9317a9380
      
      After further analysis, the bug is triggered by
      commit eaaacd23 ("bpf: Add task and task/file iterator targets"),
      which introduced the task_file bpf iterator, which traverses all open file
      descriptors for all tasks in the current namespace.
      The latest `bpftool prog` calls a task_file bpf program to traverse
      all files in the system in order to associate processes with progs/maps, etc.
      When traversing files for a given task, the rcu read lock is taken to
      access all files in a file_struct. But it used get_file() to grab
      a file, which is not right. It is possible that file->f_count is 0 and
      get_file() will unconditionally increase it.
      A later put_file() may cause all kinds of issues, with the above panic
      as one of the symptoms.
      
      The failure can be reproduced with the following steps in a few seconds:
          $ cat t.c
          #include <stdio.h>
          #include <sys/types.h>
          #include <sys/stat.h>
          #include <fcntl.h>
          #include <unistd.h>
      
          #define N 10000
          int fd[N];
          int main() {
            int i;
      
            for (i = 0; i < N; i++) {
              fd[i] = open("./note.txt", O_RDONLY | O_CREAT, 0644);
              if (fd[i] < 0) {
                 fprintf(stderr, "failed\n");
                 return -1;
              }
            }
            for (i = 0; i < N; i++)
              close(fd[i]);
      
            return 0;
          }
          $ gcc -O2 t.c
          $ cat run.sh
          #!/bin/bash
          for i in {1..100}
          do
            while true; do ./a.out; done &
          done
          $ ./run.sh
          $ while true; do bpftool prog >& /dev/null; done
      
      This patch uses get_file_rcu(), which only grabs a file if
      file->f_count is not zero. This ensures that the file pointer
      is always valid. With the patch applied, the above reproducer
      ran for more than 30 minutes without failing.
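
      The difference between the two helpers can be sketched in userspace
      with C11 atomics. This is only an illustration, not the kernel code:
      struct demo_file and both demo_* functions are hypothetical stand-ins,
      and only the reference count is modelled.

      ```c
      #include <assert.h>
      #include <stdatomic.h>
      #include <stdbool.h>

      /* Hypothetical stand-in for struct file; only the refcount is modelled. */
      struct demo_file {
          atomic_long f_count;
      };

      /* Unconditional grab, in the spirit of get_file(): the count is bumped
       * even if it has already dropped to 0. */
      void demo_get_file(struct demo_file *f)
      {
          atomic_fetch_add(&f->f_count, 1);
      }

      /* Conditional grab, in the spirit of get_file_rcu(): the count is only
       * incremented while it is still non-zero. */
      bool demo_get_file_rcu(struct demo_file *f)
      {
          long old = atomic_load(&f->f_count);

          while (old != 0) {
              /* On failure, old is refreshed with the current value; retry. */
              if (atomic_compare_exchange_weak(&f->f_count, &old, old + 1))
                  return true;
          }
          return false; /* the file is already on its way to being freed */
      }
      ```

      A caller that gets false back from demo_get_file_rcu() simply skips
      that file, so a count that has reached 0 is never resurrected.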
      
      Fixes: eaaacd23 ("bpf: Add task and task/file iterator targets")
      Suggested-by: Josef Bacik <josef@toxicpanda.com>
      Signed-off-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Link: https://lore.kernel.org/bpf/20200817174214.252601-1-yhs@fb.com
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      63f10205
    • Christophe Leroy's avatar
      powerpc/fixmap: Fix the size of the early debug area · 2fe8be1a
      Christophe Leroy authored
      [ Upstream commit fdc6edbb ]
      
      Commit 03fd42d4 ("powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when
      page size is 256k") reworked the setup of the early debug area and
      mistakenly replaced 128 * 1024 by SZ_128.
      
      Change to SZ_128K to restore the original 128 Kbyte size of the area.
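
      To make the size difference concrete, here is a small standalone
      check. The DEMO_ macros are hypothetical names that mirror the values
      of SZ_128 and SZ_128K in include/linux/sizes.h.

      ```c
      #include <assert.h>

      /* Same values as SZ_128 and SZ_128K in include/linux/sizes.h;
       * the DEMO_ prefix marks them as illustration-only names. */
      #define DEMO_SZ_128   0x00000080
      #define DEMO_SZ_128K  0x00020000

      /* SZ_128 is 128 bytes, not 128 Kbytes. */
      _Static_assert(DEMO_SZ_128 == 128, "SZ_128 is 128 bytes");

      /* SZ_128K matches the original 128 * 1024 expression. */
      _Static_assert(DEMO_SZ_128K == 128 * 1024, "SZ_128K is 128 Kbytes");

      /* How many times too small the area was with SZ_128. */
      unsigned long demo_shrink_factor(void)
      {
          return DEMO_SZ_128K / DEMO_SZ_128;
      }
      ```

      With SZ_128 the area is 1024 times smaller than intended.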
      
      Fixes: 03fd42d4 ("powerpc/fixmap: Fix FIX_EARLY_DEBUG_BASE when page size is 256k")
      Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/996184974d674ff984643778cf1cdd7fe58cc065.1597644194.git.christophe.leroy@csgroup.eu
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2fe8be1a
    • Stephen Boyd's avatar
      ARM64: vdso32: Install vdso32 from vdso_install · 6f1d3ac2
      Stephen Boyd authored
      [ Upstream commit 8d75785a ]
      
      Add the 32-bit vdso Makefile to the vdso_install rule so that 'make
      vdso_install' installs the 32-bit compat vdso when it is compiled.
      
      Fixes: a7f71a2c ("arm64: compat: Add vDSO")
      Signed-off-by: Stephen Boyd <swboyd@chromium.org>
      Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Acked-by: Will Deacon <will@kernel.org>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Link: https://lore.kernel.org/r/20200818014950.42492-1-swboyd@chromium.org
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      6f1d3ac2
    • David Howells's avatar
      afs: Fix NULL deref in afs_dynroot_depopulate() · 88d78fa3
      David Howells authored
      [ Upstream commit 5e0b17b0 ]
      
      During the construction of an afs superblock, an error may occur
      after the superblock has been created but before the root dentry
      has been created.  If the superblock has a dynamic root
      (i.e. what's normally mounted on /afs), afs_kill_super() will call
      afs_dynroot_depopulate() to unpin any created dentries, but this will
      oops if the root hasn't been created yet.
      
      Fix this by skipping that bit of code if there is no root dentry.
      
      This leads to an oops looking like:
      
      	general protection fault, ...
      	KASAN: null-ptr-deref in range [0x0000000000000068-0x000000000000006f]
      	...
      	RIP: 0010:afs_dynroot_depopulate+0x25f/0x529 fs/afs/dynroot.c:385
      	...
      	Call Trace:
      	 afs_kill_super+0x13b/0x180 fs/afs/super.c:535
      	 deactivate_locked_super+0x94/0x160 fs/super.c:335
      	 afs_get_tree+0x1124/0x1460 fs/afs/super.c:598
      	 vfs_get_tree+0x89/0x2f0 fs/super.c:1547
      	 do_new_mount fs/namespace.c:2875 [inline]
      	 path_mount+0x1387/0x2070 fs/namespace.c:3192
      	 do_mount fs/namespace.c:3205 [inline]
      	 __do_sys_mount fs/namespace.c:3413 [inline]
      	 __se_sys_mount fs/namespace.c:3390 [inline]
      	 __x64_sys_mount+0x27f/0x300 fs/namespace.c:3390
      	 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
      	 entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      which is oopsing on this line:
      
      	inode_lock(root->d_inode);
      
      presumably because sb->s_root was NULL.
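
      The shape of the guard can be sketched as follows. This is a minimal
      userspace illustration, not the actual patch: every demo_* name is a
      hypothetical stand-in, and locking is reduced to a counter.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical, simplified stand-ins for the kernel structures. */
      struct demo_inode { int lock_depth; };
      struct demo_dentry { struct demo_inode *d_inode; int pinned; };
      struct demo_super { struct demo_dentry *s_root; };

      /* Returns 1 if the root was depopulated, 0 if there was no root. */
      int demo_depopulate(struct demo_super *sb)
      {
          struct demo_dentry *root = sb->s_root;

          /* Mount failed before the root dentry existed: nothing to unpin. */
          if (!root)
              return 0;

          root->d_inode->lock_depth++;   /* stands in for inode_lock() */
          root->pinned = 0;              /* stands in for unpinning dentries */
          root->d_inode->lock_depth--;   /* stands in for inode_unlock() */
          assert(root->d_inode->lock_depth == 0);
          return 1;
      }
      ```

      Without the early return, the first access through root would
      dereference a NULL pointer, which is the failure shown in the oops.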
      
      Fixes: 0da0b7fd ("afs: Display manually added cells in dynamic root mount")
      Reported-by: <syzbot+c1eff8205244ae7e11a6@syzkaller.appspotmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      88d78fa3
    • Masahiro Yamada's avatar
      kconfig: qconf: remove qInfo() to get back Qt4 support · f09a790a
      Masahiro Yamada authored
      [ Upstream commit 53efe2e7 ]
      
      qconf is supposed to work with Qt4 and Qt5, but since commit
      c4f7398b ("kconfig: qconf: make debug links work again"),
      building with Qt4 fails as follows:
      
        HOSTCXX scripts/kconfig/qconf.o
      scripts/kconfig/qconf.cc: In member function ‘void ConfigInfoView::clicked(const QUrl&)’:
      scripts/kconfig/qconf.cc:1241:3: error: ‘qInfo’ was not declared in this scope; did you mean ‘setInfo’?
       1241 |   qInfo() << "Clicked link is empty";
            |   ^~~~~
            |   setInfo
      scripts/kconfig/qconf.cc:1254:3: error: ‘qInfo’ was not declared in this scope; did you mean ‘setInfo’?
       1254 |   qInfo() << "Clicked symbol is invalid:" << data;
            |   ^~~~~
            |   setInfo
      make[1]: *** [scripts/Makefile.host:129: scripts/kconfig/qconf.o] Error 1
      make: *** [Makefile:606: xconfig] Error 2
      
      qInfo() does not exist in Qt4. As far as I understand, these call sites
      should be unreachable. Perhaps qWarning(), an assertion, or something
      similar would be better, but qInfo() is not the right one to use here.
      
      Fixes: c4f7398b ("kconfig: qconf: make debug links work again")
      Reported-by: Ronald Warsow <rwarsow@gmx.de>
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f09a790a
    • David Howells's avatar
      afs: Fix key ref leak in afs_put_operation() · 19881eba
      David Howells authored
      [ Upstream commit ba8e4207 ]
      
      The afs_put_operation() function needs to put the reference to the key
      that's authenticating the operation.
      
      Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept")
      Reported-by: Dave Botsch <botsch@cnf.cornell.edu>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      19881eba
    • Weihang Li's avatar
      Revert "RDMA/hns: Reserve one sge in order to avoid local length error" · f35bb842
      Weihang Li authored
      [ Upstream commit 6da06c62 ]
      
      This patch caused some issues with SEND operations, and it should be
      reverted so that the driver works correctly. A better, carefully tested
      solution to the original problem will follow.
      
      This reverts commit 711195e5.
      
      Fixes: 711195e5 ("RDMA/hns: Reserve one sge in order to avoid local length error")
      Link: https://lore.kernel.org/r/1597829984-20223-1-git-send-email-liweihang@huawei.com
      Signed-off-by: Weihang Li <liweihang@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      f35bb842
    • Selvin Xavier's avatar
      RDMA/bnxt_re: Do not add user qps to flushlist · 8facd0c4
      Selvin Xavier authored
      [ Upstream commit a812f2d6 ]
      
      The driver should add only kernel qps to the flush list for cleanup.
      During async error events from the HW, the driver adds qps to this list
      without checking whether the qp is a kernel qp.
      
      Add a check to avoid adding user qps to the flush list.
      
      Fixes: 942c9b6c ("RDMA/bnxt_re: Avoid Hard lockup during error CQE processing")
      Fixes: c50866e2 ("bnxt_re: fix the regression due to changes in alloc_pbl")
      Link: https://lore.kernel.org/r/1596689148-4023-1-git-send-email-selvin.xavier@broadcom.com
      Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
      Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8facd0c4
    • Randy Dunlap's avatar
      Fix build error when CONFIG_ACPI is not set/enabled: · 1f43cb1c
      Randy Dunlap authored
      [ Upstream commit ee87e155 ]
      
      ../arch/x86/pci/xen.c: In function ‘pci_xen_init’:
      ../arch/x86/pci/xen.c:410:2: error: implicit declaration of function ‘acpi_noirq_set’; did you mean ‘acpi_irq_get’? [-Werror=implicit-function-declaration]
        acpi_noirq_set();
      
      Fixes: 88e9ca16 ("xen/pci: Use acpi_noirq_set() helper to avoid #ifdef")
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
      Cc: Bjorn Helgaas <bhelgaas@google.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: xen-devel@lists.xenproject.org
      Cc: linux-pci@vger.kernel.org
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      1f43cb1c
    • Juergen Gross's avatar
      efi: avoid error message when booting under Xen · 15f8decf
      Juergen Gross authored
      [ Upstream commit 6163a985 ]
      
      efifb_probe() will issue an error message in case the kernel is booted
      as Xen dom0 from UEFI as EFI_MEMMAP won't be set in this case. Avoid
      that message by calling efi_mem_desc_lookup() only if EFI_MEMMAP is set.
      
      Fixes: 38ac0287 ("fbdev/efifb: Honour UEFI memory map attributes when mapping the FB")
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Acked-by: Ard Biesheuvel <ardb@kernel.org>
      Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      15f8decf
    • Masahiro Yamada's avatar
      kconfig: qconf: fix signal connection to invalid slots · 80876bf7
      Masahiro Yamada authored
      [ Upstream commit d85de339 ]
      
      If you right-click in the ConfigList window, you will see the following
      messages in the console:
      
      QObject::connect: No such slot QAction::setOn(bool) in scripts/kconfig/qconf.cc:888
      QObject::connect:  (sender name:   'config')
      QObject::connect: No such slot QAction::setOn(bool) in scripts/kconfig/qconf.cc:897
      QObject::connect:  (sender name:   'config')
      QObject::connect: No such slot QAction::setOn(bool) in scripts/kconfig/qconf.cc:906
      QObject::connect:  (sender name:   'config')
      
      Right, there is no such slot in QAction. I think this is a typo of
      setChecked.
      
      Due to this bug, toggling the menu "Option->Show Name/Range/Data"
      did not update the state of the context menu. Fix this.
      
      Fixes: d5d973c3 ("Port xconfig to Qt5 - Put back some of the old implementation(part 2)")
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      80876bf7
    • Masahiro Yamada's avatar
      kconfig: qconf: do not limit the pop-up menu to the first row · a00ac434
      Masahiro Yamada authored
      [ Upstream commit fa8de0a3 ]
      
      If you right-click the first row in the option tree, the pop-up menu
      shows up, but if you right-click the second row or below, the event
      is ignored due to the following check:
      
        if (e->y() <= header()->geometry().bottom()) {
      
      Perhaps the intention was to show the pop-up menu only when the tree
      header was right-clicked, but this handler is not called in that case.
      
      Since the origin of e->y() is at the bottom of the header,
      this check is odd.
      
      Going forward, you can right-click anywhere in the tree to get the
      pop-up menu.
      
      Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a00ac434
    • Quinn Tran's avatar
      Revert "scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe" · a4d53396
      Quinn Tran authored
      [ Upstream commit dca93232 ]
      
      FCP T10-PI and NVMe features are independent of each other. This patch
      allows both features to co-exist.
      
      This reverts commit 5da05a26.
      
      Link: https://lore.kernel.org/r/20200806111014.28434-12-njavali@marvell.com
      Fixes: 5da05a26 ("scsi: qla2xxx: Disable T10-DIF feature with FC-NVMe during probe")
      Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
      Signed-off-by: Quinn Tran <qutran@marvell.com>
      Signed-off-by: Nilesh Javali <njavali@marvell.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      a4d53396