Skip to content
  1. Jan 18, 2020
  2. Sep 24, 2019
  3. Sep 23, 2019
  4. Sep 21, 2019
    • Paul Gortmaker's avatar
    • Gustavo Romero's avatar
      powerpc/tm: Fix restoring FP/VMX facility incorrectly on interrupts · d37c9ce7
      Gustavo Romero authored
      
      
      commit a8318c13 upstream.
      
      When in userspace and MSR FP=0 the hardware FP state is unrelated to
      the current process. This is extended for transactions where if tbegin
      is run with FP=0, the hardware checkpoint FP state will also be
      unrelated to the current process. Due to this, we need to ensure this
      hardware checkpoint is updated with the correct state before we enable
      FP for this process.
      
      Unfortunately we get this wrong when returning to a process from a
      hardware interrupt. A process that starts a transaction with FP=0 can
      take an interrupt. When the kernel returns back to that process, we
      change to FP=1 but with hardware checkpoint FP state not updated. If
      this transaction is then rolled back, the FP registers now contain the
      wrong state.
      
      The process looks like this:
         Userspace:                      Kernel
      
                     Start userspace
                      with MSR FP=0 TM=1
                        < -----
         ...
         tbegin
         bne
                     Hardware interrupt
                         ---- >
                                          <do_IRQ...>
                                          ....
                                          ret_from_except
                                            restore_math()
      				        /* sees FP=0 */
                                              restore_fp()
                                                tm_active_with_fp()
      					    /* sees FP=1 (Incorrect) */
                                                load_fp_state()
                                              FP = 0 -> 1
                        < -----
                     Return to userspace
                       with MSR TM=1 FP=1
                       with junk in the FP TM checkpoint
         TM rollback
         reads FP junk
      
      When returning from the hardware exception, tm_active_with_fp() is
      incorrectly making restore_fp() call load_fp_state() which is setting
      FP=1.
      
      The fix is to remove tm_active_with_fp().
      
      tm_active_with_fp() is attempting to handle the case where FP state
      has been changed inside a transaction. In this case the checkpointed
      and transactional FP state is different and hence we must restore the
      FP state (ie. we can't do lazy FP restore inside a transaction that's
      used FP). It's safe to remove tm_active_with_fp() as this case is
      handled by restore_tm_state(). restore_tm_state() detects if FP has
      been using inside a transaction and will set load_fp and call
      restore_math() to ensure the FP state (checkpoint and transaction) is
      restored.
      
      This is a data integrity problem for the current process as the FP
      registers are corrupted. It's also a security problem as the FP
      registers from one process may be leaked to another.
      
      Similarly for VMX.
      
      A simple testcase to replicate this will be posted to
      tools/testing/selftests/powerpc/tm/tm-poison.c
      
      This fixes CVE-2019-15031.
      
      Fixes: a7771176 ("powerpc: Don't enable FP/Altivec if not checkpointed")
      Cc: stable@vger.kernel.org # 4.15+
      Signed-off-by: default avatarGustavo Romero <gromero@linux.ibm.com>
      Signed-off-by: default avatarMichael Neuling <mikey@neuling.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190904045529.23002-2-gromero@linux.vnet.ibm.com
      
      
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      d37c9ce7
    • Gustavo Romero's avatar
      powerpc/tm: Fix FP/VMX unavailable exceptions inside a transaction · 337cbcfb
      Gustavo Romero authored
      
      
      commit 8205d5d9 upstream.
      
      When we take an FP unavailable exception in a transaction we have to
      account for the hardware FP TM checkpointed registers being
      incorrect. In this case for this process we know the current and
      checkpointed FP registers must be the same (since FP wasn't used
      inside the transaction) hence in the thread_struct we copy the current
      FP registers to the checkpointed ones.
      
      This copy is done in tm_reclaim_thread(). We use thread->ckpt_regs.msr
      to determine if FP was on when in userspace. thread->ckpt_regs.msr
      represents the state of the MSR when exiting userspace. This is setup
      by check_if_tm_restore_required().
      
      Unfortunatley there is an optimisation in giveup_all() which returns
      early if tsk->thread.regs->msr (via local variable `usermsr`) has
      FP=VEC=VSX=SPE=0. This optimisation means that
      check_if_tm_restore_required() is not called and hence
      thread->ckpt_regs.msr is not updated and will contain an old value.
      
      This can happen if due to load_fp=255 we start a userspace process
      with MSR FP=1 and then we are context switched out. In this case
      thread->ckpt_regs.msr will contain FP=1. If that same process is then
      context switched in and load_fp overflows, MSR will have FP=0. If that
      process now enters a transaction and does an FP instruction, the FP
      unavailable will not update thread->ckpt_regs.msr (the bug) and MSR
      FP=1 will be retained in thread->ckpt_regs.msr.  tm_reclaim_thread()
      will then not perform the required memcpy and the checkpointed FP regs
      in the thread struct will contain the wrong values.
      
      The code path for this happening is:
      
             Userspace:                      Kernel
                         Start userspace
                          with MSR FP/VEC/VSX/SPE=0 TM=1
                            < -----
             ...
             tbegin
             bne
             fp instruction
                         FP unavailable
                             ---- >
                                              fp_unavailable_tm()
      					  tm_reclaim_current()
      					    tm_reclaim_thread()
      					      giveup_all()
      					        return early since FP/VMX/VSX=0
      						/* ckpt MSR not updated (Incorrect) */
      					      tm_reclaim()
      					        /* thread_struct ckpt FP regs contain junk (OK) */
                                                    /* Sees ckpt MSR FP=1 (Incorrect) */
      					      no memcpy() performed
      					        /* thread_struct ckpt FP regs not fixed (Incorrect) */
      					  tm_recheckpoint()
      					     /* Put junk in hardware checkpoint FP regs */
                                               ....
                            < -----
                         Return to userspace
                           with MSR TM=1 FP=1
                           with junk in the FP TM checkpoint
             TM rollback
             reads FP junk
      
      This is a data integrity problem for the current process as the FP
      registers are corrupted. It's also a security problem as the FP
      registers from one process may be leaked to another.
      
      This patch moves up check_if_tm_restore_required() in giveup_all() to
      ensure thread->ckpt_regs.msr is updated correctly.
      
      A simple testcase to replicate this will be posted to
      tools/testing/selftests/powerpc/tm/tm-poison.c
      
      Similarly for VMX.
      
      This fixes CVE-2019-15030.
      
      Fixes: f48e91e8 ("powerpc/tm: Fix FP and VMX register corruption")
      Cc: stable@vger.kernel.org # 4.12+
      Signed-off-by: default avatarGustavo Romero <gromero@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Neuling <mikey@neuling.org>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Link: https://lore.kernel.org/r/20190904045529.23002-1-gromero@linux.vnet.ibm.com
      
      
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      337cbcfb
    • Arnaldo Carvalho de Melo's avatar
      tools arch x86: Sync asm/cpufeatures.h with the with the kernel · 59cb34e1
      Arnaldo Carvalho de Melo authored
      commit 0ac10d87 upstream.
      
      To pick up the changes in:
      
        f36cf386 ("x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS")
        18ec54fd ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
      
      That don't affect anything in tools/.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
        diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/n/tip-860dq1qie2cpnfghlpcnxrzr@git.kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      59cb34e1
    • Arnaldo Carvalho de Melo's avatar
      tools arch x86: Sync asm/cpufeatures.h with the with the kernel · 0325926c
      Arnaldo Carvalho de Melo authored
      commit b979540a upstream.
      
      To pick up the changes in:
      
        ed5194c2 ("x86/speculation/mds: Add basic bug infrastructure for MDS")
        e261f209 ("x86/speculation/mds: Add BUG_MSBDS_ONLY")
      
      That don't affect anything in tools/.
      
      This silences this perf build warning:
      
        Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
        diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/n/tip-jp1afecx3ql1jkuirpgkqfad@git.kernel.org
      
      
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      0325926c
    • Josh Poimboeuf's avatar
      Documentation: Add swapgs description to the Spectre v1 documentation · ff09129b
      Josh Poimboeuf authored
      
      
      commit 4c920576 upstream.
      
      Add documentation to the Spectre document about the new swapgs variant of
      Spectre v1.
      
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      ff09129b
    • Thomas Gleixner's avatar
      x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS · c1b8a3e4
      Thomas Gleixner authored
      
      
      commit f36cf386 upstream.
      
      Intel provided the following information:
      
       On all current Atom processors, instructions that use a segment register
       value (e.g. a load or store) will not speculatively execute before the
       last writer of that segment retires. Thus they will not use a
       speculatively written segment value.
      
      That means on ATOMs there is no speculation through SWAPGS, so the SWAPGS
      entry paths can be excluded from the extra LFENCE if PTI is disabled.
      
      Create a separate bug flag for the through SWAPGS speculation and mark all
      out-of-order ATOMs and AMD/HYGON CPUs as not affected. The in-order ATOMs
      are excluded from the whole mitigation mess anyway.
      
      Reported-by: default avatarAndrew Cooper <andrew.cooper3@citrix.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Reviewed-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      [PG: drop HYGON chunk - doesn't exist in 4.18.x kernels.]
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      c1b8a3e4
    • Josh Poimboeuf's avatar
      x86/entry/64: Use JMP instead of JMPQ · 80d91a69
      Josh Poimboeuf authored
      
      
      commit 64dbc122 upstream.
      
      Somehow the swapgs mitigation entry code patch ended up with a JMPQ
      instruction instead of JMP, where only the short jump is needed.  Some
      assembler versions apparently fail to optimize JMPQ into a two-byte JMP
      when possible, instead always using a 7-byte JMP with relocation.  For
      some reason that makes the entry code explode with a #GP during boot.
      
      Change it back to "JMP" as originally intended.
      
      Fixes: 18ec54fd ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      80d91a69
    • Josh Poimboeuf's avatar
      x86/speculation: Enable Spectre v1 swapgs mitigations · eeec7198
      Josh Poimboeuf authored
      
      
      commit a2059825 upstream.
      
      The previous commit added macro calls in the entry code which mitigate the
      Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are
      enabled.  Enable those features where applicable.
      
      The mitigations may be disabled with "nospectre_v1" or "mitigations=off".
      
      There are different features which can affect the risk of attack:
      
      - When FSGSBASE is enabled, unprivileged users are able to place any
        value in GS, using the wrgsbase instruction.  This means they can
        write a GS value which points to any value in kernel space, which can
        be useful with the following gadget in an interrupt/exception/NMI
        handler:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg1
      	// dependent load or store based on the value of %reg
      	// for example: mov %(reg1), %reg2
      
        If an interrupt is coming from user space, and the entry code
        speculatively skips the swapgs (due to user branch mistraining), it
        may speculatively execute the GS-based load and a subsequent dependent
        load or store, exposing the kernel data to an L1 side channel leak.
      
        Note that, on Intel, a similar attack exists in the above gadget when
        coming from kernel space, if the swapgs gets speculatively executed to
        switch back to the user GS.  On AMD, this variant isn't possible
        because swapgs is serializing with respect to future GS-based
        accesses.
      
        NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case
      	doesn't exist quite yet.
      
      - When FSGSBASE is disabled, the issue is mitigated somewhat because
        unprivileged users must use prctl(ARCH_SET_GS) to set GS, which
        restricts GS values to user space addresses only.  That means the
        gadget would need an additional step, since the target kernel address
        needs to be read from user space first.  Something like:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg1
      	mov (%reg1), %reg2
      	// dependent load or store based on the value of %reg2
      	// for example: mov %(reg2), %reg3
      
        It's difficult to audit for this gadget in all the handlers, so while
        there are no known instances of it, it's entirely possible that it
        exists somewhere (or could be introduced in the future).  Without
        tooling to analyze all such code paths, consider it vulnerable.
      
        Effects of SMAP on the !FSGSBASE case:
      
        - If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not
          susceptible to Meltdown), the kernel is prevented from speculatively
          reading user space memory, even L1 cached values.  This effectively
          disables the !FSGSBASE attack vector.
      
        - If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP
          still prevents the kernel from speculatively reading user space
          memory.  But it does *not* prevent the kernel from reading the
          user value from L1, if it has already been cached.  This is probably
          only a small hurdle for an attacker to overcome.
      
      Thanks to Dave Hansen for contributing the speculative_smap() function.
      
      Thanks to Andrew Cooper for providing the inside scoop on whether swapgs
      is serializing on AMD.
      
      [ tglx: Fixed the USER fence decision and polished the comment as suggested
        	by Dave Hansen ]
      
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarDave Hansen <dave.hansen@intel.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      eeec7198
    • Josh Poimboeuf's avatar
      x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations · ebc32005
      Josh Poimboeuf authored
      
      
      commit 18ec54fd upstream.
      
      Spectre v1 isn't only about array bounds checks.  It can affect any
      conditional checks.  The kernel entry code interrupt, exception, and NMI
      handlers all have conditional swapgs checks.  Those may be problematic in
      the context of Spectre v1, as kernel code can speculatively run with a user
      GS.
      
      For example:
      
      	if (coming from user space)
      		swapgs
      	mov %gs:<percpu_offset>, %reg
      	mov (%reg), %reg1
      
      When coming from user space, the CPU can speculatively skip the swapgs, and
      then do a speculative percpu load using the user GS value.  So the user can
      speculatively force a read of any kernel value.  If a gadget exists which
      uses the percpu value as an address in another load/store, then the
      contents of the kernel value may become visible via an L1 side channel
      attack.
      
      A similar attack exists when coming from kernel space.  The CPU can
      speculatively do the swapgs, causing the user GS to get used for the rest
      of the speculative window.
      
      The mitigation is similar to a traditional Spectre v1 mitigation, except:
      
        a) index masking isn't possible; because the index (percpu offset)
           isn't user-controlled; and
      
        b) an lfence is needed in both the "from user" swapgs path and the
           "from kernel" non-swapgs path (because of the two attacks described
           above).
      
      The user entry swapgs paths already have SWITCH_TO_KERNEL_CR3, which has a
      CR3 write when PTI is enabled.  Since CR3 writes are serializing, the
      lfences can be skipped in those cases.
      
      On the other hand, the kernel entry swapgs paths don't depend on PTI.
      
      To avoid unnecessary lfences for the user entry case, create two separate
      features for alternative patching:
      
        X86_FEATURE_FENCE_SWAPGS_USER
        X86_FEATURE_FENCE_SWAPGS_KERNEL
      
      Use these features in entry code to patch in lfences where needed.
      
      The features aren't enabled yet, so there's no functional change.
      
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarDave Hansen <dave.hansen@intel.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      ebc32005
    • Fenghua Yu's avatar
      x86/cpufeatures: Combine word 11 and 12 into a new scattered features word · 6e620390
      Fenghua Yu authored
      
      
      commit acec0ce0 upstream.
      
      It's a waste for the four X86_FEATURE_CQM_* feature bits to occupy two
      whole feature bits words. To better utilize feature words, re-define
      word 11 to host scattered features and move the four X86_FEATURE_CQM_*
      features into Linux defined word 11. More scattered features can be
      added in word 11 in the future.
      
      Rename leaf 11 in cpuid_leafs to CPUID_LNX_4 to reflect it's a
      Linux-defined leaf.
      
      Rename leaf 12 as CPUID_DUMMY which will be replaced by a meaningful
      name in the next patch when CPUID.7.1:EAX occupies world 12.
      
      Maximum number of RMID and cache occupancy scale are retrieved from
      CPUID.0xf.1 after scattered CQM features are enumerated. Carve out the
      code into a separate function.
      
      KVM doesn't support resctrl now. So it's safe to move the
      X86_FEATURE_CQM_* features to scattered features word 11 for KVM.
      
      Signed-off-by: default avatarFenghua Yu <fenghua.yu@intel.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Aaron Lewis <aaronlewis@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Babu Moger <babu.moger@amd.com>
      Cc: "Chang S. Bae" <chang.seok.bae@intel.com>
      Cc: "Sean J Christopherson" <sean.j.christopherson@intel.com>
      Cc: Frederic Weisbecker <frederic@kernel.org>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: kvm ML <kvm@vger.kernel.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Nadav Amit <namit@vmware.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
      Cc: Peter Feiner <pfeiner@google.com>
      Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
      Cc: "Radim Krčmář" <rkrcmar@redhat.com>
      Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
      Cc: Ravi V Shankar <ravi.v.shankar@intel.com>
      Cc: Sherry Hurwitz <sherry.hurwitz@amd.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Thomas Lendacky <Thomas.Lendacky@amd.com>
      Cc: x86 <x86@kernel.org>
      Link: https://lkml.kernel.org/r/1560794416-217638-2-git-send-email-fenghua.yu@intel.com
      
      
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      6e620390
    • Borislav Petkov's avatar
      x86/cpufeatures: Carve out CQM features retrieval · 18ddd656
      Borislav Petkov authored
      
      
      commit 45fc56e6 upstream.
      
      ... into a separate function for better readability. Split out from a
      patch from Fenghua Yu <fenghua.yu@intel.com> to keep the mechanical,
      sole code movement separate for easy review.
      
      No functional changes.
      
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Fenghua Yu <fenghua.yu@intel.com>
      Cc: x86@kernel.org
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      18ddd656
    • Tim Chen's avatar
      Documentation: Add section about CPU vulnerabilities for Spectre · 4981f758
      Tim Chen authored
      
      
      commit 6e885594 upstream.
      
      Add documentation for Spectre vulnerability and the mitigation mechanisms:
      
      - Explain the problem and risks
      - Document the mitigation mechanisms
      - Document the command line controls
      - Document the sysfs files
      
      Co-developed-by: default avatarAndi Kleen <ak@linux.intel.com>
      Signed-off-by: default avatarAndi Kleen <ak@linux.intel.com>
      Co-developed-by: default avatarTim Chen <tim.c.chen@linux.intel.com>
      Signed-off-by: default avatarTim Chen <tim.c.chen@linux.intel.com>
      Reviewed-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Reviewed-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJonathan Corbet <corbet@lwn.net>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      4981f758
    • Diana Craciun's avatar
      Documentation: Add nospectre_v1 parameter · d91192d8
      Diana Craciun authored
      
      
      commit 26cb1f36 upstream.
      
      Currently only supported on powerpc.
      
      Signed-off-by: default avatarDiana Craciun <diana.craciun@nxp.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      d91192d8
  5. Sep 17, 2019
    • Paul Gortmaker's avatar
    • Thomas Gleixner's avatar
      x86/microcode: Fix the microcode load on CPU hotplug for real · 50d4c659
      Thomas Gleixner authored
      
      
      commit 5423f5ce upstream.
      
      A recent change moved the microcode loader hotplug callback into the early
      startup phase which is running with interrupts disabled. It missed that
      the callbacks invoke sysfs functions which might sleep causing nice 'might
      sleep' splats with proper debugging enabled.
      
      Split the callbacks and only load the microcode in the early startup phase
      and move the sysfs handling back into the later threaded and preemptible
      bringup phase where it was before.
      
      Fixes: 78f4e932 ("x86/microcode, cpuhotplug: Add a microcode loader CPU hotplug callback")
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: stable@vger.kernel.org
      Cc: x86-ml <x86@kernel.org>
      Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1906182228350.1766@nanos.tec.linutronix.de
      50d4c659
    • Jason A. Donenfeld's avatar
      timekeeping: Use proper ktime_add when adding nsecs in coarse offset · 6129763b
      Jason A. Donenfeld authored
      
      
      commit 0354c1a3 upstream.
      
      While this doesn't actually amount to a real difference, since the macro
      evaluates to the same thing, every place else operates on ktime_t using
      these functions, so let's not break the pattern.
      
      Fixes: e3ff9c36 ("timekeeping: Repair ktime_get_coarse*() granularity")
      Signed-off-by: default avatarJason A. Donenfeld <Jason@zx2c4.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarArnd Bergmann <arnd@arndb.de>
      Link: https://lkml.kernel.org/r/20190621203249.3909-1-Jason@zx2c4.com
      6129763b
    • Hangbin Liu's avatar
      selftests: fib_rule_tests: use pre-defined DEV_ADDR · 67eaa743
      Hangbin Liu authored
      
      
      commit 34632975 upstream.
      
      DEV_ADDR is defined but not used. Use it in address setting.
      Do the same with IPv6 for consistency.
      
      Reported-by: default avatarDavid Ahern <dsahern@gmail.com>
      Fixes: fc82d93e ("selftests: fib_rule_tests: fix local IPv4 address typo")
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      67eaa743
    • Julian Anastasov's avatar
      ipvs: defer hook registration to avoid leaks · 86968937
      Julian Anastasov authored
      
      
      commit cf47a0b8 upstream.
      
      syzkaller reports for memory leak when registering hooks [1]
      
      As we moved the nf_unregister_net_hooks() call into
      __ip_vs_dev_cleanup(), defer the nf_register_net_hooks()
      call, so that hooks are allocated and freed from same
      pernet_operations (ipvs_core_dev_ops).
      
      [1]
      BUG: memory leak
      unreferenced object 0xffff88810acd8a80 (size 96):
       comm "syz-executor073", pid 7254, jiffies 4294950560 (age 22.250s)
       hex dump (first 32 bytes):
         02 00 00 00 00 00 00 00 50 8b bb 82 ff ff ff ff  ........P.......
         00 00 00 00 00 00 00 00 00 77 bb 82 ff ff ff ff  .........w......
       backtrace:
         [<0000000013db61f1>] kmemleak_alloc_recursive include/linux/kmemleak.h:55 [inline]
         [<0000000013db61f1>] slab_post_alloc_hook mm/slab.h:439 [inline]
         [<0000000013db61f1>] slab_alloc_node mm/slab.c:3269 [inline]
         [<0000000013db61f1>] kmem_cache_alloc_node_trace+0x15b/0x2a0 mm/slab.c:3597
         [<000000001a27307d>] __do_kmalloc_node mm/slab.c:3619 [inline]
         [<000000001a27307d>] __kmalloc_node+0x38/0x50 mm/slab.c:3627
         [<0000000025054add>] kmalloc_node include/linux/slab.h:590 [inline]
         [<0000000025054add>] kvmalloc_node+0x4a/0xd0 mm/util.c:431
         [<0000000050d1bc00>] kvmalloc include/linux/mm.h:637 [inline]
         [<0000000050d1bc00>] kvzalloc include/linux/mm.h:645 [inline]
         [<0000000050d1bc00>] allocate_hook_entries_size+0x3b/0x60 net/netfilter/core.c:61
         [<00000000e8abe142>] nf_hook_entries_grow+0xae/0x270 net/netfilter/core.c:128
         [<000000004b94797c>] __nf_register_net_hook+0x9a/0x170 net/netfilter/core.c:337
         [<00000000d1545cbc>] nf_register_net_hook+0x34/0xc0 net/netfilter/core.c:464
         [<00000000876c9b55>] nf_register_net_hooks+0x53/0xc0 net/netfilter/core.c:480
         [<000000002ea868e0>] __ip_vs_init+0xe8/0x170 net/netfilter/ipvs/ip_vs_core.c:2280
         [<000000002eb2d451>] ops_init+0x4c/0x140 net/core/net_namespace.c:130
         [<000000000284ec48>] setup_net+0xde/0x230 net/core/net_namespace.c:316
         [<00000000a70600fa>] copy_net_ns+0xf0/0x1e0 net/core/net_namespace.c:439
         [<00000000ff26c15e>] create_new_namespaces+0x141/0x2a0 kernel/nsproxy.c:107
         [<00000000b103dc79>] copy_namespaces+0xa1/0xe0 kernel/nsproxy.c:165
         [<000000007cc008a2>] copy_process.part.0+0x11fd/0x2150 kernel/fork.c:2035
         [<00000000c344af7c>] copy_process kernel/fork.c:1800 [inline]
         [<00000000c344af7c>] _do_fork+0x121/0x4f0 kernel/fork.c:2369
      
      Reported-by: default avatar <syzbot+722da59ccb264bc19910@syzkaller.appspotmail.com>
      Fixes: 719c7d56 ("ipvs: Fix use-after-free in ip_vs_in")
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Acked-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      86968937
    • Steven Price's avatar
      initramfs: don't free a non-existent initrd · 98d726f8
      Steven Price authored
      commit 5d59aa8f upstream.
      
      Since commit 54c7a891 ("initramfs: free initrd memory if opening
      /initrd.image fails"), the kernel has unconditionally attempted to free
      the initrd even if it doesn't exist.
      
      In the non-existent case this causes a boot-time splat if
      CONFIG_DEBUG_VIRTUAL is enabled due to a call to virt_to_phys() with a
      NULL address.
      
      Instead we should check that the initrd actually exists and only attempt
      to free it if it does.
      
      Link: http://lkml.kernel.org/r/20190516143125.48948-1-steven.price@arm.com
      
      
      Fixes: 54c7a891 ("initramfs: free initrd memory if opening /initrd.image fails")
      Signed-off-by: default avatarSteven Price <steven.price@arm.com>
      Reported-by: default avatarMark Rutland <mark.rutland@arm.com>
      Tested-by: default avatarMark Rutland <mark.rutland@arm.com>
      Reviewed-by: default avatarMike Rapoport <rppt@linux.ibm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [PG: adapt for older 4.18.x codebase with kexec if/else chunk.]
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      98d726f8
    • Steven Rostedt (VMware)'s avatar
      x86/ftrace: Fix warning and considate ftrace_jmp_replace() and ftrace_call_replace() · 499b8347
      Steven Rostedt (VMware) authored
      
      
      commit 745cfeaa upstream.
      
      Arnd reported the following compiler warning:
      
      arch/x86/kernel/ftrace.c:669:23: error: 'ftrace_jmp_replace' defined but not used [-Werror=unused-function]
      
      The ftrace_jmp_replace() function now only has a single user and should be
      simply moved by that user. But looking at the code, it shows that
      ftrace_jmp_replace() is similar to ftrace_call_replace() except that instead
      of using the opcode of 0xe8 it uses 0xe9. It makes more sense to consolidate
      that function into one implementation that both ftrace_jmp_replace() and
      ftrace_call_replace() use by passing in the op code separate.
      
      The structure in ftrace_code_union is also modified to replace the "e8"
      field with the more appropriate name "op".
      
      Cc: stable@vger.kernel.org
      Reported-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarArnd Bergmann <arnd@arndb.de>
      Link: http://lkml.kernel.org/r/20190304200748.1418790-1-arnd@arndb.de
      
      
      Fixes: d2a68c4e ("x86/ftrace: Do not call function graph from dynamic trampolines")
      Signed-off-by: default avatarSteven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      499b8347
    • Hauke Mehrtens's avatar
      MIPS: Fix bounds check virt_addr_valid · f10361c9
      Hauke Mehrtens authored
      
      
      commit d6ed083f upstream.
      
      The bounds check used the uninitialized variable vaddr, it should use
      the given parameter kaddr instead. When using the uninitialized value
      the compiler assumed it to be 0 and optimized this function to just
      return 0 in all cases.
      
      This should make the function check the range of the given address and
      only do the page map check in case it is in the expected range of
      virtual addresses.
      
      Fixes: 074a1e11 ("MIPS: Bounds check virt_addr_valid")
      Cc: stable@vger.kernel.org # v4.12+
      Cc: Paul Burton <paul.burton@mips.com>
      Signed-off-by: default avatarHauke Mehrtens <hauke@hauke-m.de>
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Cc: ralf@linux-mips.org
      Cc: jhogan@kernel.org
      Cc: f4bug@amsat.org
      Cc: linux-mips@vger.kernel.org
      Cc: ysu@wavecomp.com
      Cc: jcristau@debian.org
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      f10361c9
    • Lyude Paul's avatar
      drm/nouveau/i2c: Enable i2c pads & busses during preinit · 221d3e88
      Lyude Paul authored
      
      
      commit 7cb95eee upstream.
      
      It turns out that while disabling i2c bus access from software when the
      GPU is suspended was a step in the right direction with:
      
      commit 342406e4 ("drm/nouveau/i2c: Disable i2c bus access after
      ->fini()")
      
      We also ended up accidentally breaking the vbios init scripts on some
      older Tesla GPUs, as apparently said scripts can actually use the i2c
      bus. Since these scripts are executed before initializing any
      subdevices, we end up failing to acquire access to the i2c bus which has
      left a number of cards with their fan controllers uninitialized. Luckily
      this doesn't break hardware - it just means the fan gets stuck at 100%.
      
      This also means that we've always been using our i2c busses before
      initializing them during the init scripts for older GPUs, we just didn't
      notice it until we started preventing them from being used until init.
      It's pretty impressive this never caused us any issues before!
      
      So, fix this by initializing our i2c pad and busses during subdev
      pre-init. We skip initializing aux busses during pre-init, as those are
      guaranteed to only ever be used by nouveau for DP aux transactions.
      
      Signed-off-by: default avatarLyude Paul <lyude@redhat.com>
      Tested-by: default avatarMarc Meledandri <m.meledandri@gmail.com>
      Fixes: 342406e4 ("drm/nouveau/i2c: Disable i2c bus access after ->fini()")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      221d3e88
    • Lyude Paul's avatar
      drm/nouveau: Don't retry infinitely when receiving no data on i2c over AUX · 67c16baf
      Lyude Paul authored
      
      
      commit c358ebf5 upstream.
      
      While I had thought I had fixed this issue in:
      
      commit 342406e4 ("drm/nouveau/i2c: Disable i2c bus access after
      ->fini()")
      
      It turns out that while I did fix the error messages I was seeing on my
      P50 when trying to access i2c busses with the GPU in runtime suspend, I
      accidentally had missed one important detail that was mentioned on the
      bug report this commit was supposed to fix: that the CPU would only lock
      up when trying to access i2c busses _on connected devices_ _while the
      GPU is not in runtime suspend_. Whoops. That definitely explains why I
      was not able to get my machine to hang with i2c bus interactions until
      now, as plugging my P50 into it's dock with an HDMI monitor connected
      allowed me to finally reproduce this locally.
      
      Now that I have managed to reproduce this issue properly, it looks like
      the problem is much simpler then it looks. It turns out that some
      connected devices, such as MST laptop docks, will actually ACK i2c reads
      even if no data was actually read:
      
      [  275.063043] nouveau 0000:01:00.0: i2c: aux 000a: 1: 0000004c 1
      [  275.063447] nouveau 0000:01:00.0: i2c: aux 000a: 00 01101000 10040000
      [  275.063759] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000001
      [  275.064024] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
      [  275.064285] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
      [  275.064594] nouveau 0000:01:00.0: i2c: aux 000a: rd 00000000
      
      Because we don't handle the situation of i2c ack without any data, we
      end up entering an infinite loop in nvkm_i2c_aux_i2c_xfer() since the
      value of cnt always remains at 0. This finally properly explains how
      this could result in a CPU hang like the ones observed in the
      aforementioned commit.
      
      So, fix this by retrying transactions if no data is written or received,
      and give up and fail the transaction if we continue to not write or
      receive any data after 32 retries.
      
      Signed-off-by: default avatarLyude Paul <lyude@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBen Skeggs <bskeggs@redhat.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      67c16baf
    • Alaa Hleihel's avatar
      net/mlx5: Avoid reloading already removed devices · e217f8d4
      Alaa Hleihel authored
      
      
      commit dd80857b upstream.
      
      Prior to reloading a device we must first verify that it was not already
      removed. Otherwise, the attempt to remove the device will do nothing, and
      in that case we will end up proceeding with adding an new device that no
      one was expecting to remove, leaving behind used resources such as EQs that
      causes a failure to destroy comp EQs and syndrome (0x30f433).
      
      Fix that by making sure that we try to remove and add a device (based on a
      protocol) only if the device is already added.
      
      Fixes: c5447c70 ("net/mlx5: E-Switch, Reload IB interface when switching devlink modes")
      Signed-off-by: default avatarAlaa Hleihel <alaa@mellanox.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      e217f8d4
    • Alexander Lochmann's avatar
      Abort file_remove_privs() for non-reg. files · e688ac75
      Alexander Lochmann authored
      
      
      commit f69e749a upstream.
      
      file_remove_privs() might be called for non-regular files, e.g.
      blkdev inode. There is no reason to do its job on things
      like blkdev inodes, pipes, or cdevs. Hence, abort if
      file does not refer to a regular inode.
      
      AV: more to the point, for devices there might be any number of
      inodes refering to given device.  Which one to strip the permissions
      from, even if that made any sense in the first place?  All of them
      will be observed with contents modified, after all.
      
      Found by LockDoc (Alexander Lochmann, Horst Schirmeier and Olaf
      Spinczyk)
      
      Reviewed-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarAlexander Lochmann <alexander.lochmann@tu-dortmund.de>
      Signed-off-by: default avatarHorst Schirmeier <horst.schirmeier@tu-dortmund.de>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      e688ac75
    • Andrea Arcangeli's avatar
      coredump: fix race condition between collapse_huge_page() and core dumping · a94b1275
      Andrea Arcangeli authored
      commit 59ea6d06 upstream.
      
      When fixing the race conditions between the coredump and the mmap_sem
      holders outside the context of the process, we focused on
      mmget_not_zero()/get_task_mm() callers in 04f5866e ("coredump: fix
      race condition between mmget_not_zero()/get_task_mm() and core
      dumping"), but those aren't the only cases where the mmap_sem can be
      taken outside of the context of the process as Michal Hocko noticed
      while backporting that commit to older -stable kernels.
      
      If mmgrab() is called in the context of the process, but then the
      mm_count reference is transferred outside the context of the process,
      that can also be a problem if the mmap_sem has to be taken for writing
      through that mm_count reference.
      
      khugepaged registration calls mmgrab() in the context of the process,
      but the mmap_sem for writing is taken later in the context of the
      khugepaged kernel thread.
      
      collapse_huge_page() after taking the mmap_sem for writing doesn't
      modify any vma, so it's not obvious that it could cause a problem to the
      coredump, but it happens to modify the pmd in a way that breaks an
      invariant that pmd_trans_huge_lock() relies upon.  collapse_huge_page()
      needs the mmap_sem for writing just to block concurrent page faults that
      call pmd_trans_huge_lock().
      
      Specifically the invariant that "!pmd_trans_huge()" cannot become a
      "pmd_trans_huge()" doesn't hold while collapse_huge_page() runs.
      
      The coredump will call __get_user_pages() without mmap_sem for reading,
      which eventually can invoke a lockless page fault which will need a
      functional pmd_trans_huge_lock().
      
      So collapse_huge_page() needs to use mmget_still_valid() to check it's
      not running concurrently with the coredump...  as long as the coredump
      can invoke page faults without holding the mmap_sem for reading.
      
      This has "Fixes: khugepaged" to facilitate backporting, but in my view
      it's more a bug in the coredump code that will eventually have to be
      rewritten to stop invoking page faults without the mmap_sem for reading.
      So the long term plan is still to drop all mmget_still_valid().
      
      Link: http://lkml.kernel.org/r/20190607161558.32104-1-aarcange@redhat.com
      
      
      Fixes: ba76149f ("thp: khugepaged")
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Reported-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      a94b1275
    • Tobin C. Harding's avatar
      ocfs2: fix error path kobject memory leak · 66a72b9a
      Tobin C. Harding authored
      commit b9fba67b upstream.
      
      If a call to kobject_init_and_add() fails we should call kobject_put()
      otherwise we leak memory.
      
      Add call to kobject_put() in the error path of call to
      kobject_init_and_add().  Please note, this has the side effect that the
      release method is called if kobject_init_and_add() fails.
      
      Link: http://lkml.kernel.org/r/20190513033458.2824-1-tobin@kernel.org
      
      
      Signed-off-by: default avatarTobin C. Harding <tobin@kernel.org>
      Reviewed-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: default avatarJoseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      66a72b9a
    • Amit Cohen's avatar
      mlxsw: spectrum: Prevent force of 56G · 409347c4
      Amit Cohen authored
      
      
      commit 275e928f upstream.
      
      Force of 56G is not supported by hardware in Ethernet devices. This
      configuration fails with a bad parameter error from firmware.
      
      Add check of this case. Instead of trying to set 56G with autoneg off,
      return a meaningful error.
      
      Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
      Signed-off-by: default avatarAmit Cohen <amitc@mellanox.com>
      Acked-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      409347c4
    • Jason Yan's avatar
      scsi: libsas: delete sas port if expander discover failed · caa79423
      Jason Yan authored
      
      
      commit 3b054179 upstream.
      
      The sas_port(phy->port) allocated in sas_ex_discover_expander() will not be
      deleted when the expander failed to discover. This will cause resource leak
      and a further issue of kernel BUG like below:
      
      [159785.843156]  port-2:17:29: trying to add phy phy-2:17:29 fails: it's
      already part of another port
      [159785.852144] ------------[ cut here  ]------------
      [159785.856833] kernel BUG at drivers/scsi/scsi_transport_sas.c:1086!
      [159785.863000] Internal error: Oops - BUG: 0 [#1] SMP
      [159785.867866] CPU: 39 PID: 16993 Comm: kworker/u96:2 Tainted: G
      W  OE     4.19.25-vhulk1901.1.0.h111.aarch64 #1
      [159785.878458] Hardware name: Huawei Technologies Co., Ltd.
      Hi1620EVBCS/Hi1620EVBCS, BIOS Hi1620 CS B070 1P TA 03/21/2019
      [159785.889231] Workqueue: 0000:74:02.0_disco_q sas_discover_domain
      [159785.895224] pstate: 40c00009 (nZcv daif +PAN +UAO)
      [159785.900094] pc : sas_port_add_phy+0x188/0x1b8
      [159785.904524] lr : sas_port_add_phy+0x188/0x1b8
      [159785.908952] sp : ffff0001120e3b80
      [159785.912341] x29: ffff0001120e3b80 x28: 0000000000000000
      [159785.917727] x27: ffff802ade8f5400 x26: ffff0000681b7560
      [159785.923111] x25: ffff802adf11a800 x24: ffff0000680e8000
      [159785.928496] x23: ffff802ade8f5728 x22: ffff802ade8f5708
      [159785.933880] x21: ffff802adea2db40 x20: ffff802ade8f5400
      [159785.939264] x19: ffff802adea2d800 x18: 0000000000000010
      [159785.944649] x17: 00000000821bf734 x16: ffff00006714faa0
      [159785.950033] x15: ffff0000e8ab4ecf x14: 7261702079646165
      [159785.955417] x13: 726c612073277469 x12: ffff00006887b830
      [159785.960802] x11: ffff00006773eaa0 x10: 7968702079687020
      [159785.966186] x9 : 0000000000002453 x8 : 726f702072656874
      [159785.971570] x7 : 6f6e6120666f2074 x6 : ffff802bcfb21290
      [159785.976955] x5 : ffff802bcfb21290 x4 : 0000000000000000
      [159785.982339] x3 : ffff802bcfb298c8 x2 : 337752b234c2ab00
      [159785.987723] x1 : 337752b234c2ab00 x0 : 0000000000000000
      [159785.993108] Process kworker/u96:2 (pid: 16993, stack limit =
      0x0000000072dae094)
      [159786.000576] Call trace:
      [159786.003097]  sas_port_add_phy+0x188/0x1b8
      [159786.007179]  sas_ex_get_linkrate.isra.5+0x134/0x140
      [159786.012130]  sas_ex_discover_expander+0x128/0x408
      [159786.016906]  sas_ex_discover_dev+0x218/0x4c8
      [159786.021249]  sas_ex_discover_devices+0x9c/0x1a8
      [159786.025852]  sas_discover_root_expander+0x134/0x160
      [159786.030802]  sas_discover_domain+0x1b8/0x1e8
      [159786.035148]  process_one_work+0x1b4/0x3f8
      [159786.039230]  worker_thread+0x54/0x470
      [159786.042967]  kthread+0x134/0x138
      [159786.046269]  ret_from_fork+0x10/0x18
      [159786.049918] Code: 91322300 f0004402 91178042 97fe4c9b (d4210000)
      [159786.056083] Modules linked in: hns3_enet_ut(OE) hclge(OE) hnae3(OE)
      hisi_sas_test_hw(OE) hisi_sas_test_main(OE) serdes(OE)
      [159786.067202] ---[ end trace 03622b9e2d99e196  ]---
      [159786.071893] Kernel panic - not syncing: Fatal exception
      [159786.077190] SMP: stopping secondary CPUs
      [159786.081192] Kernel Offset: disabled
      [159786.084753] CPU features: 0x2,a2a00a38
      
      Fixes: 2908d778 ("[SCSI] aic94xx: new driver")
      Reported-by: default avatarJian Luo <luojian5@huawei.com>
      Signed-off-by: default avatarJason Yan <yanaijie@huawei.com>
      CC: John Garry <john.garry@huawei.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      caa79423
    • YueHaibing's avatar
      scsi: scsi_dh_alua: Fix possible null-ptr-deref · 87ab1372
      YueHaibing authored
      
      
      commit 12e750bc upstream.
      
      If alloc_workqueue fails in alua_init, it should return -ENOMEM, otherwise
      it will trigger null-ptr-deref while unloading module which calls
      destroy_workqueue dereference
      wq->lock like this:
      
      BUG: KASAN: null-ptr-deref in __lock_acquire+0x6b4/0x1ee0
      Read of size 8 at addr 0000000000000080 by task syz-executor.0/7045
      
      CPU: 0 PID: 7045 Comm: syz-executor.0 Tainted: G         C        5.1.0+ #28
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1
      Call Trace:
       dump_stack+0xa9/0x10e
       __kasan_report+0x171/0x18d
       ? __lock_acquire+0x6b4/0x1ee0
       kasan_report+0xe/0x20
       __lock_acquire+0x6b4/0x1ee0
       lock_acquire+0xb4/0x1b0
       __mutex_lock+0xd8/0xb90
       drain_workqueue+0x25/0x290
       destroy_workqueue+0x1f/0x3f0
       __x64_sys_delete_module+0x244/0x330
       do_syscall_64+0x72/0x2a0
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Reported-by: default avatarHulk Robot <hulkci@huawei.com>
      Fixes: 03197b61 ("scsi_dh_alua: Use workqueue for RTPG")
      Signed-off-by: default avatarYueHaibing <yuehaibing@huawei.com>
      Reviewed-by: default avatarBart Van Assche <bvanassche@acm.org>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      87ab1372
    • Lianbo Jiang's avatar
      scsi: smartpqi: properly set both the DMA mask and the coherent DMA mask · 7eca591c
      Lianbo Jiang authored
      
      
      commit 1d94f06e upstream.
      
      When SME is enabled, the smartpqi driver won't work on the HP DL385 G10
      machine, which causes the failure of kernel boot because it fails to
      allocate pqi error buffer. Please refer to the kernel log:
      ....
      [    9.431749] usbcore: registered new interface driver uas
      [    9.441524] Microsemi PQI Driver (v1.1.4-130)
      [    9.442956] i40e 0000:04:00.0: fw 6.70.48768 api 1.7 nvm 10.2.5
      [    9.447237] smartpqi 0000:23:00.0: Microsemi Smart Family Controller found
               Starting dracut initqueue hook...
      [  OK  ] Started Show Plymouth Boot Scre[    9.471654] Broadcom NetXtreme-C/E driver bnxt_en v1.9.1
      en.
      [  OK  ] Started Forward Password Requests to Plymouth Directory Watch.
      [[0;[    9.487108] smartpqi 0000:23:00.0: failed to allocate PQI error buffer
      ....
      [  139.050544] dracut-initqueue[949]: Warning: dracut-initqueue timeout - starting timeout scripts
      [  139.589779] dracut-initqueue[949]: Warning: dracut-initqueue timeout - starting timeout scripts
      
      Basically, the fact that the coherent DMA mask value wasn't set caused the
      driver to fall back to SWIOTLB when SME is active.
      
      For correct operation, lets call the dma_set_mask_and_coherent() to
      properly set the mask for both streaming and coherent, in order to inform
      the kernel about the devices DMA addressing capabilities.
      
      Signed-off-by: default avatarLianbo Jiang <lijiang@redhat.com>
      Acked-by: default avatarDon Brace <don.brace@microsemi.com>
      Tested-by: default avatarDon Brace <don.brace@microsemi.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      7eca591c
    • Varun Prakash's avatar
      scsi: libcxgbi: add a check for NULL pointer in cxgbi_check_route() · d00ca2d9
      Varun Prakash authored
      
      
      commit cc555759 upstream.
      
      ip_dev_find() can return NULL so add a check for NULL pointer.
      
      Signed-off-by: default avatarVarun Prakash <varun@chelsio.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      d00ca2d9
    • Max Uvarov's avatar
      net: phy: dp83867: Set up RGMII TX delay · 9336b513
      Max Uvarov authored
      
      
      commit 2b892649 upstream.
      
      PHY_INTERFACE_MODE_RGMII_RXID is less then TXID
      so code to set tx delay is never called.
      
      Fixes: 2a10154a ("net: phy: dp83867: Add TI dp83867 phy")
      Signed-off-by: default avatarMax Uvarov <muvarov@gmail.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      9336b513
    • Russell King's avatar
      net: phylink: ensure consistent phy interface mode · c0de0ca4
      Russell King authored
      
      
      commit c6787263 upstream.
      
      Ensure that we supply the same phy interface mode to mac_link_down() as
      we did for the corresponding mac_link_up() call.  This ensures that MAC
      drivers that use the phy interface mode in these methods can depend on
      mac_link_down() always corresponding to a mac_link_up() call for the
      same interface mode.
      
      Signed-off-by: default avatarRussell King <rmk+kernel@armlinux.org.uk>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      c0de0ca4
    • Yoshihiro Shimoda's avatar
      net: sh_eth: fix mdio access in sh_eth_close() for R-Car Gen2 and RZ/A1 SoCs · c9dd47f4
      Yoshihiro Shimoda authored
      
      
      commit 315ca92d upstream.
      
      The sh_eth_close() resets the MAC and then calls phy_stop()
      so that mdio read access result is incorrect without any error
      according to kernel trace like below:
      
      ifconfig-216   [003] .n..   109.133124: mdio_access: ee700000.ethernet-ffffffff read  phy:0x01 reg:0x00 val:0xffff
      
      According to the hardware manual, the RMII mode should be set to 1
      before operation the Ethernet MAC. However, the previous code was not
      set to 1 after the driver issued the soft_reset in sh_eth_dev_exit()
      so that the mdio read access result seemed incorrect. To fix the issue,
      this patch adds a condition and set the RMII mode register in
      sh_eth_dev_exit() for R-Car Gen2 and RZ/A1 SoCs.
      
      Note that when I have tried to move the sh_eth_dev_exit() calling
      after phy_stop() on sh_eth_close(), but it gets worse (kernel panic
      happened and it seems that a register is accessed while the clock is
      off).
      
      Signed-off-by: default avatarYoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      c9dd47f4
    • Paul Mackerras's avatar
      KVM: PPC: Book3S HV: Don't take kvm->lock around kvm_for_each_vcpu · c91a76e3
      Paul Mackerras authored
      
      
      commit 5a3f4936 upstream.
      
      Currently the HV KVM code takes the kvm->lock around calls to
      kvm_for_each_vcpu() and kvm_get_vcpu_by_id() (which can call
      kvm_for_each_vcpu() internally).  However, that leads to a lock
      order inversion problem, because these are called in contexts where
      the vcpu mutex is held, but the vcpu mutexes nest within kvm->lock
      according to Documentation/virtual/kvm/locking.txt.  Hence there
      is a possibility of deadlock.
      
      To fix this, we simply don't take the kvm->lock mutex around these
      calls.  This is safe because the implementations of kvm_for_each_vcpu()
      and kvm_get_vcpu_by_id() have been designed to be able to be called
      locklessly.
      
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Reviewed-by: default avatarCédric Le Goater <clg@kaod.org>
      Signed-off-by: default avatarPaul Mackerras <paulus@ozlabs.org>
      Signed-off-by: default avatarPaul Gortmaker <paul.gortmaker@windriver.com>
      c91a76e3
Loading