  1. Jan 31, 2018
    • x86: Introduce barrier_nospec · b3d7ad85
      Dan Williams authored
      
      
      Rename the open coded form of this instruction sequence from
      rdtsc_ordered() into a generic barrier primitive, barrier_nospec().
      
      One of the mitigations for Spectre variant1 vulnerabilities is to fence
      speculative execution after successfully validating a bounds check. I.e.
      force the result of a bounds check to resolve in the instruction pipeline
      to ensure speculative execution honors that result before potentially
      operating on out-of-bounds data.
      
      No functional changes.
      
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Suggested-by: Andi Kleen <ak@linux.intel.com>
      Suggested-by: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: kernel-hardening@lists.openwall.com
      Cc: gregkh@linuxfoundation.org
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: alan@linux.intel.com
      Link: https://lkml.kernel.org/r/151727415361.33451.9049453007262764675.stgit@dwillia2-desk3.amr.corp.intel.com
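
A minimal userspace sketch of the fence-after-bounds-check pattern described in this commit. It is not the kernel implementation: `demo_barrier_nospec()`, `demo_table` and `demo_load()` are hypothetical names, and a bare LFENCE on x86-64 (with a compiler-only barrier elsewhere) is assumed as a stand-in for the kernel primitive.

```c
#include <stddef.h>

/*
 * Assumption: on x86-64 a plain LFENCE stands in for the kernel's
 * barrier_nospec(). On other targets this sketch only stops the
 * compiler from reordering and gives no hardware guarantee.
 */
#if defined(__x86_64__)
#define demo_barrier_nospec() __asm__ __volatile__("lfence" ::: "memory")
#else
#define demo_barrier_nospec() __asm__ __volatile__("" ::: "memory")
#endif

static const int demo_table[8] = {10, 11, 12, 13, 14, 15, 16, 17};

/* Returns 0 and stores the element in *out, or -1 if idx is out of range. */
static int demo_load(size_t idx, int *out)
{
	if (idx >= sizeof(demo_table) / sizeof(demo_table[0]))
		return -1;

	/* Let the bounds check resolve before the dependent load is issued. */
	demo_barrier_nospec();

	*out = demo_table[idx];
	return 0;
}
```

The fence sits between the successful check and the array access, so the load is only issued once the comparison has resolved.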
    • x86: Implement array_index_mask_nospec · babdde26
      Dan Williams authored
      
      
      array_index_nospec() uses a mask to sanitize user controllable array
      indexes, i.e. generate a 0 mask if 'index' >= 'size', and a ~0 mask
      otherwise. The default array_index_mask_nospec() handles the
      carry-bit from the (index - size) result in software.
      
      The x86 array_index_mask_nospec() does the same, but the carry-bit is
      handled in the processor CF flag without conditional instructions in the
      control flow.
      
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: kernel-hardening@lists.openwall.com
      Cc: gregkh@linuxfoundation.org
      Cc: alan@linux.intel.com
      Link: https://lkml.kernel.org/r/151727414808.33451.1873237130672785331.stgit@dwillia2-desk3.amr.corp.intel.com
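
A rough userspace sketch of the branch-free software mask described above, i.e. ~0 when 'index' < 'size' and 0 otherwise. The name `demo_index_mask()` is hypothetical. The sketch assumes 'size' is no larger than LONG_MAX and that right-shifting a negative long is an arithmetic shift, as GCC and Clang implement it.

```c
#include <limits.h>

#define DEMO_BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * If index >= size, either index itself or (size - 1 - index) has its
 * top bit set, so the OR has the top bit set. Inverting and shifting
 * the sign bit across the word yields 0 in that case and ~0 otherwise.
 * No conditional branch is involved.
 */
static unsigned long demo_index_mask(unsigned long index, unsigned long size)
{
	long top = ~(long)(index | (size - 1UL - index));

	return (unsigned long)(top >> (DEMO_BITS_PER_LONG - 1));
}
```

The x86 variant in this commit produces the same mask, but takes the borrow from the processor carry flag instead of computing it with shifts.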
    • array_index_nospec: Sanitize speculative array de-references · f3804203
      Dan Williams authored
      
      
      array_index_nospec() is proposed as a generic mechanism to mitigate
      against Spectre-variant-1 attacks, i.e. an attack that bypasses boundary
      checks via speculative execution. The array_index_nospec()
      implementation is expected to be safe for current generation CPUs across
      multiple architectures (ARM, x86).
      
      Based on an original implementation by Linus Torvalds, tweaked to remove
      speculative flows by Alexei Starovoitov, and tweaked again by Linus to
      introduce an x86 assembly implementation for the mask generation.
      
      Co-developed-by: Linus Torvalds <torvalds@linux-foundation.org>
      Co-developed-by: Alexei Starovoitov <ast@kernel.org>
      Suggested-by: Cyril Novikov <cnovikov@lynx.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arch@vger.kernel.org
      Cc: kernel-hardening@lists.openwall.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: gregkh@linuxfoundation.org
      Cc: torvalds@linux-foundation.org
      Cc: alan@linux.intel.com
      Link: https://lkml.kernel.org/r/151727414229.33451.18411580953862676575.stgit@dwillia2-desk3.amr.corp.intel.com
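
A hedged sketch of how a caller might use such a helper: the index is bounds-checked as usual, then clamped with the mask before the dereference, so that even a mispredicted check can only reach element 0. All `sketch_*` names are hypothetical userspace stand-ins, not the kernel API, and the mask relies on arithmetic right shift of negative values (GCC and Clang behaviour).

```c
#include <limits.h>

/* ~0UL when index < size, 0 otherwise (assumes size <= LONG_MAX). */
static unsigned long sketch_mask(unsigned long index, unsigned long size)
{
	long top = ~(long)(index | (size - 1UL - index));

	return (unsigned long)(top >> (sizeof(long) * CHAR_BIT - 1));
}

/* In-range indexes pass through unchanged, out-of-range ones become 0. */
static unsigned long sketch_index_nospec(unsigned long index,
					 unsigned long size)
{
	return index & sketch_mask(index, size);
}

/* Returns 0 and stores array[index] in *out, or -1 if out of bounds. */
static int sketch_lookup(const int *array, unsigned long size,
			 unsigned long index, int *out)
{
	if (index >= size)
		return -1;

	/* Clamp after the check, before the dependent load. */
	index = sketch_index_nospec(index, size);
	*out = array[index];
	return 0;
}
```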
    • Documentation: Document array_index_nospec · f84a56f7
      Mark Rutland authored
      
      
      Document the rationale and usage of the new array_index_nospec() helper.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Cc: linux-arch@vger.kernel.org
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: gregkh@linuxfoundation.org
      Cc: kernel-hardening@lists.openwall.com
      Cc: torvalds@linux-foundation.org
      Cc: alan@linux.intel.com
      Link: https://lkml.kernel.org/r/151727413645.33451.15878817161436755393.stgit@dwillia2-desk3.amr.corp.intel.com
  2. Jan 30, 2018
    • x86/asm: Move 'status' from thread_struct to thread_info · 37a8f7c3
      Andy Lutomirski authored
      
      
      The TS_COMPAT bit is very hot and is accessed from code paths that mostly
      also touch thread_info::flags.  Move it into struct thread_info to improve
      cache locality.
      
      The only reason it was in thread_struct is that there was a brief period
      during which arch-specific fields were not allowed in struct thread_info.
      
      Linus suggested further changing:
      
        ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
      
      to:
      
        if (unlikely(ti->status & (TS_COMPAT|TS_I386_REGS_POKED)))
                ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
      
      on the theory that frequently dirtying the cacheline even in pure 64-bit
      code that never needs to modify status hurts performance.  That could be a
      reasonable followup patch, but I suspect it matters less on top of this
      patch.
      
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Ingo Molnar <mingo@kernel.org>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
      Link: https://lkml.kernel.org/r/03148bcc1b217100e6e8ecf6a5468c45cf4304b6.1517164461.git.luto@kernel.org
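
The conditional clear suggested in this commit message can be sketched in userspace as follows. The struct layout, the `DEMO_*` flag values and the `demo_*` names are illustrative assumptions, not the kernel definitions. The point is only that the store, and with it the dirtying of the cacheline, happens when one of the bits is actually set.

```c
#include <stdbool.h>

/* Illustrative flag values, assumed for this sketch only. */
#define DEMO_TS_COMPAT          0x0002u
#define DEMO_TS_I386_REGS_POKED 0x0004u

#if defined(__GNUC__)
#define demo_unlikely(x) __builtin_expect(!!(x), 0)
#else
#define demo_unlikely(x) (x)
#endif

struct demo_thread_info {
	unsigned long flags;	/* hot field that shares the cacheline */
	unsigned int status;
};

/* Returns true if a store to ti->status was issued. */
static bool demo_clear_compat(struct demo_thread_info *ti)
{
	const unsigned int mask = DEMO_TS_COMPAT | DEMO_TS_I386_REGS_POKED;

	/* Read-only test first: pure 64-bit tasks never write the line. */
	if (demo_unlikely(ti->status & mask)) {
		ti->status &= ~mask;
		return true;
	}
	return false;
}
```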
    • x86/entry/64: Push extra regs right away · d1f77320
      Andy Lutomirski authored
      
      
      With the fast path removed there is no point in splitting the push of the
      normal and the extra register set. Just push the extra regs right away.
      
      [ tglx: Split out from 'x86/entry/64: Remove the SYSCALL64 fast path' ]
      
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
      Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org
    • x86/entry/64: Remove the SYSCALL64 fast path · 21d375b6
      Andy Lutomirski authored
      
      
      The SYSCALL64 fast path was a nice, if small, optimization back in the good
      old days when syscalls were actually reasonably fast.  Now there is PTI to
      slow everything down, and indirect branches are verboten, making everything
      messier.  The retpoline code in the fast path is particularly nasty.
      
      Just get rid of the fast path. The slow path is barely slower.
      
      [ tglx: Split out the 'push all extra regs' part ]
      
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Kernel Hardening <kernel-hardening@lists.openwall.com>
      Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org
    • x86/spectre: Check CONFIG_RETPOLINE in command line parser · 9471eee9
      Dou Liyang authored
      The spectre_v2 option 'auto' does not check whether CONFIG_RETPOLINE is
      enabled. As a consequence it fails to emit the appropriate warning and sets
      feature flags which have no effect at all.
      
      Add the missing IS_ENABLED() check.
      
      Fixes: da285121 ("x86/spectre: Add boot time option to select Spectre v2 mitigation")
      Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: ak@linux.intel.com
      Cc: peterz@infradead.org
      Cc: Tomohiro <misono.tomohiro@jp.fujitsu.com>
      Cc: dave.hansen@intel.com
      Cc: bp@alien8.de
      Cc: arjan@linux.intel.com
      Cc: dwmw@amazon.co.uk
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/f5892721-7528-3647-08fb-f8d10e65ad87@cn.fujitsu.com
    • x86/mm: Fix overlap of i386 CPU_ENTRY_AREA with FIX_BTMAP · 55f49fcb
      William Grant authored
      Since commit 92a0f81d ("x86/cpu_entry_area: Move it out of the
      fixmap"), i386's CPU_ENTRY_AREA has been mapped to the memory area just
      below FIXADDR_START. But already immediately before FIXADDR_START is the
      FIX_BTMAP area, which means that early_ioremap can collide with the entry
      area.
      
      It's especially bad on PAE where FIX_BTMAP_BEGIN gets aligned to exactly
      match CPU_ENTRY_AREA_BASE, so the first early_ioremap slot clobbers the
      IDT and causes interrupts during early boot to reset the system.
      
      The overlap wasn't a problem before the CPU entry area was introduced,
      as the fixmap has classically been preceded by the pkmap or vmalloc
      areas, neither of which is used until early_ioremap is out of the
      picture.
      
      Relocate CPU_ENTRY_AREA to below FIX_BTMAP, not just below the permanent
      fixmap area.
      
      Fixes: commit 92a0f81d ("x86/cpu_entry_area: Move it out of the fixmap")
      Signed-off-by: William Grant <william.grant@canonical.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/7041d181-a019-e8b9-4e4e-48215f841e2c@canonical.com
    • objtool: Warn on stripped section symbol · 830c1e3d
      Josh Poimboeuf authored
      With the following fix:
      
        2a0098d7 ("objtool: Fix seg fault with gold linker")
      
      ... a seg fault was avoided, but the original seg fault condition in
      objtool wasn't fixed.  Replace the seg fault with an error message.
      
      Suggested-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/dc4585a70d6b975c99fc51d1957ccdde7bd52f3a.1517284349.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • objtool: Add support for alternatives at the end of a section · 17bc3391
      Josh Poimboeuf authored
      
      
      Now that the previous patch gave objtool the ability to read retpoline
      alternatives, it shows a new warning:
      
        arch/x86/entry/entry_64.o: warning: objtool: .entry_trampoline: don't know how to handle alternatives at end of section
      
      This is due to the JMP_NOSPEC in entry_SYSCALL_64_trampoline().
      
      Previously, objtool ignored this situation because it wasn't needed, and
      it would have required a bit of extra code.  Now that this case exists,
      add proper support for it.
      
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guenter Roeck <linux@roeck-us.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/2a30a3c2158af47d891a76e69bb1ef347e0443fd.1517284349.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • objtool: Improve retpoline alternative handling · a845c7cf
      Josh Poimboeuf authored
      
      
      Currently objtool requires all retpolines to be:
      
        a) patched in with alternatives; and
      
        b) annotated with ANNOTATE_NOSPEC_ALTERNATIVE.
      
      If you forget to do either of the above, objtool segfaults trying to
      dereference a NULL 'insn->call_dest' pointer.
      
      Avoid that situation and print a more helpful error message:
      
        quirks.o: warning: objtool: efi_delete_dummy_variable()+0x99: unsupported intra-function call
        quirks.o: warning: objtool: If this is a retpoline, please patch it in with alternatives and annotate it with ANNOTATE_NOSPEC_ALTERNATIVE.
      
      Future improvements can be made to make objtool smarter with respect to
      retpolines, but this is a good incremental improvement for now.
      
      Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/819e50b6d9c2e1a22e34c1a636c0b2057cc8c6e5.1517284349.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'v4.15' into x86/pti, to be able to merge dependent changes · 7e86548e
      Ingo Molnar authored
      
      
      Time has come to switch PTI development over to a v4.15 base - we'll still
      try to make sure that all PTI fixes backport cleanly to v4.14 and earlier.
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  3. Jan 29, 2018
    • Linux 4.15 · d8a5b805
      Linus Torvalds authored
    • Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 24b1cccf
      Linus Torvalds authored
      Pull x86 retpoline fixlet from Thomas Gleixner:
       "Remove the ESP/RSP thunks for retpoline as they cannot ever work.
      
        Get rid of them before they show up in a release"
      
      * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/retpoline: Remove the esp/rsp thunk
    • Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 32c6cdf7
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A set of small fixes for 4.15:
      
         - Fix vmapped stack synchronization on systems with 4-level paging
           and a large amount of memory. A missing 5-level folding made the
           pgd synchronization logic fail and caused double faults.
      
         - Add a missing sanity check in the vmalloc_fault() logic on 5-level
           paging systems.
      
         - Bring back protection against accessing a freed initrd in the
           microcode loader which was lost by a wrong merge conflict
           resolution.
      
         - Extend the Broadwell microcode loading sanity check.
      
         - Add a missing ENDPROC annotation in ftrace assembly code which
           makes ORC unhappy.
      
         - Prevent loading the AMD power module on !AMD platforms. The load
           itself is uncritical, but an unload attempt results in a kernel
           crash.
      
         - Update Peter Anvin's role in the MAINTAINERS file"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/ftrace: Add one more ENDPROC annotation
        x86: Mark hpa as a "Designated Reviewer" for the time being
        x86/mm/64: Tighten up vmalloc_fault() sanity checks on 5-level kernels
        x86/mm/64: Fix vmapped stack syncing on very-large-memory 4-level systems
        x86/microcode: Fix again accessing initrd after having been freed
        x86/microcode/intel: Extend BDW late-loading further with LLC size check
        perf/x86/amd/power: Do not load AMD power module on !AMD platforms
    • Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 07b0137c
      Linus Torvalds authored
      Pull timer fix from Thomas Gleixner:
       "A single fix for a ~10 year old problem which causes high resolution
        timers to stop after a CPU unplug/plug cycle due to a stale flag in
        the per CPU hrtimer base struct.
      
        Paul McKenney was hunting this for about a year, but the heisenbug
        nature made it resistant against debug attempts for quite some time"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        hrtimer: Reset hrtimer cpu base proper on CPU hotplug
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 62444192
      Linus Torvalds authored
      Pull scheduler fix from Thomas Gleixner:
       "A single bug fix to prevent a subtle deadlock in the scheduler core
        code vs cpu hotplug"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/core: Fix cpu.max vs. cpuhotplug deadlock
    • Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 39e38362
      Linus Torvalds authored
      Pull perf fixes from Thomas Gleixner:
       "Four patches which all address lock inversions and deadlocks in the
        perf core code and the Intel debug store"
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf/x86: Fix perf,x86,cpuhp deadlock
        perf/core: Fix ctx::mutex deadlock
        perf/core: Fix another perf,trace,cpuhp lock inversion
        perf/core: Fix lock inversion between perf,trace,cpuhp
    • Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 8c76e31a
      Linus Torvalds authored
      Pull locking fixes from Thomas Gleixner:
       "Two final locking fixes for 4.15:
      
         - Repair the OWNER_DIED logic in the futex code which was broken
           by the recent fix for a subtle race condition.
      
         - Prevent the hard lockup detector from triggering when dumping all
           held locks in the system"
      
      * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        locking/lockdep: Avoid triggering hardlockup from debug_show_all_locks()
        futex: Fix OWNER_DEAD fixup
  4. Jan 28, 2018
    • x86/ftrace: Add one more ENDPROC annotation · dd085168
      Josh Poimboeuf authored
      When ORC support was added for the ftrace_64.S code, an ENDPROC
      for function_hook() was missed. This results in the following warning:
      
        arch/x86/kernel/ftrace_64.o: warning: objtool: .entry.text+0x0: unreachable instruction
      
      Fixes: e2ac83d7 ("x86/ftrace: Fix ORC unwinding from ftrace handlers")
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: https://lkml.kernel.org/r/20180128022150.dqierscqmt3uwwsr@treble
    • x86/speculation: Simplify indirect_branch_prediction_barrier() · 64e16720
      Borislav Petkov authored
      
      
      Make it all a function which does the WRMSR instead of having a hairy
      inline asm.
      
      [dwmw2: export it, fix CONFIG_RETPOLINE issues]
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: ak@linux.intel.com
      Cc: dave.hansen@intel.com
      Cc: karahmed@amazon.de
      Cc: arjan@linux.intel.com
      Cc: torvalds@linux-foundation.org
      Cc: peterz@infradead.org
      Cc: bp@alien8.de
      Cc: pbonzini@redhat.com
      Cc: tim.c.chen@linux.intel.com
      Cc: gregkh@linux-foundation.org
      Link: https://lkml.kernel.org/r/1517070274-12128-4-git-send-email-dwmw@amazon.co.uk
    • x86/retpoline: Simplify vmexit_fill_RSB() · 1dde7415
      Borislav Petkov authored
      
      
      Simplify it to call an asm-function instead of pasting 41 insn bytes at
      every call site. Also, add alignment to the macro as suggested here:
      
        https://support.google.com/faqs/answer/7625886
      
      [dwmw2: Clean up comments, let it clobber %ebx and just tell the compiler]
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: ak@linux.intel.com
      Cc: dave.hansen@intel.com
      Cc: karahmed@amazon.de
      Cc: arjan@linux.intel.com
      Cc: torvalds@linux-foundation.org
      Cc: peterz@infradead.org
      Cc: bp@alien8.de
      Cc: pbonzini@redhat.com
      Cc: tim.c.chen@linux.intel.com
      Cc: gregkh@linux-foundation.org
      Link: https://lkml.kernel.org/r/1517070274-12128-3-git-send-email-dwmw@amazon.co.uk
    • x86/cpufeatures: Clean up Spectre v2 related CPUID flags · 2961298e
      David Woodhouse authored
      
      
      We want to expose the hardware features simply in /proc/cpuinfo as "ibrs",
      "ibpb" and "stibp". Since AMD has separate CPUID bits for those, use them
      as the user-visible bits.
      
      When the Intel SPEC_CTRL bit is set which indicates both IBRS and IBPB
      capability, set those (AMD) bits accordingly. Likewise if the Intel STIBP
      bit is set, set the AMD STIBP that's used for the generic hardware
      capability.
      
      Hide the rest from /proc/cpuinfo by putting "" in the comments, including
      RETPOLINE and RETPOLINE_AMD, which shouldn't be visible there. There are
      patches to make the sysfs vulnerabilities information non-readable by
      non-root, and the same should apply to all information about which
      mitigations are actually in use. Those *shouldn't* appear in /proc/cpuinfo.
      
      The feature bit for whether IBPB is actually used, which is needed for
      ALTERNATIVEs, is renamed to X86_FEATURE_USE_IBPB.
      
      Originally-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: ak@linux.intel.com
      Cc: dave.hansen@intel.com
      Cc: karahmed@amazon.de
      Cc: arjan@linux.intel.com
      Cc: torvalds@linux-foundation.org
      Cc: peterz@infradead.org
      Cc: bp@alien8.de
      Cc: pbonzini@redhat.com
      Cc: tim.c.chen@linux.intel.com
      Cc: gregkh@linux-foundation.org
      Link: https://lkml.kernel.org/r/1517070274-12128-2-git-send-email-dwmw@amazon.co.uk
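
The flag propagation described in this commit can be illustrated with a small sketch. The `DEMO_*` bit positions and `demo_normalize_caps()` are made up for the example and do not correspond to the kernel's X86_FEATURE_* numbering. Only the mapping itself follows the commit text: Intel SPEC_CTRL implies IBRS and IBPB, and Intel STIBP implies the generic STIBP bit.

```c
/* Hypothetical capability bits, for illustration only. */
enum {
	DEMO_INTEL_SPEC_CTRL = 1u << 0,	/* vendor bit: IBRS + IBPB */
	DEMO_INTEL_STIBP     = 1u << 1,	/* vendor bit: STIBP */
	DEMO_IBRS            = 1u << 2,	/* user-visible "ibrs" */
	DEMO_IBPB            = 1u << 3,	/* user-visible "ibpb" */
	DEMO_STIBP           = 1u << 4	/* user-visible "stibp" */
};

/* Fold the vendor-specific bits into the generic, user-visible ones. */
static unsigned int demo_normalize_caps(unsigned int caps)
{
	if (caps & DEMO_INTEL_SPEC_CTRL)
		caps |= DEMO_IBRS | DEMO_IBPB;
	if (caps & DEMO_INTEL_STIBP)
		caps |= DEMO_STIBP;
	return caps;
}
```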
  5. Jan 27, 2018
    • x86/cpu/bugs: Make retpoline module warning conditional · e383095c
      Thomas Gleixner authored
      If sysfs is disabled and RETPOLINE not defined:
      
      arch/x86/kernel/cpu/bugs.c:97:13: warning: ‘spectre_v2_bad_module’ defined but not used
      [-Wunused-variable]
       static bool spectre_v2_bad_module;
      
      Hide it.
      
      Fixes: caf7501a ("module/retpoline: Warn about missing retpoline in module")
      Reported-by: Borislav Petkov <bp@alien8.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
    • hrtimer: Reset hrtimer cpu base proper on CPU hotplug · d5421ea4
      Thomas Gleixner authored
      The hrtimer interrupt code contains a hang detection and mitigation
      mechanism, which prevents a long delayed hrtimer interrupt from causing a
      continuous retriggering of interrupts which prevents the system from making
      progress. If a hang is detected then the timer hardware is programmed with
      a certain delay into the future and a flag is set in the hrtimer cpu base
      which prevents newly enqueued timers from reprogramming the timer hardware
      prior to the chosen delay. The subsequent hrtimer interrupt after the delay
      clears the flag and resumes normal operation.
      
      If such a hang happens in the last hrtimer interrupt before a CPU is
      unplugged then the hang_detected flag is set and stays that way when the
      CPU is plugged in again. At that point the timer hardware is not armed and
      it cannot be armed because the hang_detected flag is still active, so
      nothing clears that flag. As a consequence the CPU does not receive hrtimer
      interrupts and no timers expire on that CPU which results in RCU stalls and
      other malfunctions.
      
      Clear the flag along with some other less critical members of the hrtimer
      cpu base to ensure starting from a clean state when a CPU is plugged in.
      
      Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
      root cause of that hard to reproduce heisenbug. Once understood it's
      trivial and certainly justifies a brown paperbag.
      
      Fixes: 41d2e494 ("hrtimer: Tune hrtimer_interrupt hang logic")
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Sewior <bigeasy@linutronix.de>
      Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
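
The stale-flag problem described in this commit can be modelled with a toy state machine. Everything here is a simplified assumption for illustration: `struct demo_cpu_base` and the `demo_*` functions are hypothetical and much smaller than the real hrtimer cpu base. The `apply_fix` parameter exists only so that the behaviour before and after the fix can be compared.

```c
#include <stdbool.h>

struct demo_cpu_base {
	bool hang_detected;		/* set by the hang mitigation */
	bool hw_armed;			/* is the timer hardware programmed? */
	unsigned long long expires_next;
};

/* Enqueue path: while the hang flag is set, do not touch the hardware. */
static bool demo_reprogram(struct demo_cpu_base *base,
			   unsigned long long expires)
{
	if (base->hang_detected)
		return false;
	base->expires_next = expires;
	base->hw_armed = true;
	return true;
}

/* CPU unplug: the hardware is shut down, the software state survives. */
static void demo_cpu_down(struct demo_cpu_base *base)
{
	base->hw_armed = false;
}

/* CPU plug: with the fix applied, start again from a clean state. */
static void demo_cpu_up(struct demo_cpu_base *base, bool apply_fix)
{
	if (apply_fix) {
		base->hang_detected = false;
		base->expires_next = ~0ULL;
	}
}
```

Without the reset, a base that went down with `hang_detected` set refuses every reprogram request after coming back up, and since the hardware is not armed, no interrupt ever arrives to clear the flag.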
    • x86: Mark hpa as a "Designated Reviewer" for the time being · 8a95b74d
      H. Peter Anvin authored
      
      
      Due to some unfortunate events, I have not been directly involved in
      the x86 kernel patch flow for a while now.  I have also not been able
      to ramp back up by now like I had hoped to, and after reviewing what I
      will need to work on both internally at Intel and elsewhere in the near
      term, it is clear that I am not going to be able to ramp back up until
      late 2018 at the very earliest.
      
      It is not acceptable to not recognize that this load is currently
      taken by Ingo and Thomas without my direct participation, so I mark
      myself as R: (designated reviewer) rather than M: (maintainer) until
      further notice.  This is in fact recognizing the de facto situation
      for the past few years.
      
      I have obviously no intention of going away, and I will do everything
      within my power to improve Linux on x86 and x86 for Linux.  This,
      however, puts credit where it is due and reflects a change of focus.
      
      This patch also removes stale entries for portions of the x86
      architecture which have not been maintained separately from arch/x86
      for a long time.  If there is a reason to re-introduce them then that
      can happen later.
      
      Signed-off-by: H. Peter Anvin <h.peter.anvin@intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Bruce Schlobohm <bruce.schlobohm@intel.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20180125195934.5253-1-hpa@zytor.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • Merge tag 'riscv-for-linus-4.15-maintainers' of... · c4e0ca7f
      Linus Torvalds authored
      Merge tag 'riscv-for-linus-4.15-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux
      
      Pull RISC-V update from Palmer Dabbelt:
       "RISC-V: We have a new mailing list and git repo!
      
        Sorry to send something essentially as late as possible (Friday after
        an rc9), but we managed to get a mailing list for the RISC-V Linux
        port. We've been using patches@groups.riscv.org for a while, but that
        list has some problems (it's Google Groups and it's shared over all
        RISC-V software projects). The new infradead.org list is much better.
        We just got it on Wednesday but I used it a bit on Thursday to shake
        out all the configuration problems and it appears to be in working
        order.
      
        When I updated the mailing list I noticed that the MAINTAINERS file
        was pointing to our github repo, but now that we have a kernel.org
        repo I'd like to point to that instead so I changed that as well.
        We'll be centralizing all RISC-V Linux related development here as
        that seems to be the saner way to go about it.
      
        I can understand if it's too late to get this into 4.15, but given
        that it's not a code change I was hoping it'd still be OK. It would be
        nice to have the new mailing list and git repo in the release tarballs
        so when people start to find bugs they'll get to the right place"
      
      * tag 'riscv-for-linus-4.15-maintainers' of git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux:
        Update the RISC-V MAINTAINERS file
      c4e0ca7f
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · ba804bb4
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) The per-network-namespace loopback device, and thus its namespace,
          can have its teardown deferred for a long time if a kernel created
          TCP socket closes and the namespace is exiting meanwhile. The kernel
          keeps trying to finish the close sequence until it times out (which
          takes quite some time).
      
          Fix this by forcing the socket closed in this situation, from Dan
          Streetman.
      
       2) Fix regression where we're trying to invoke the update_pmtu method
          on route types (in this case metadata tunnel routes) that don't
          implement the dst_ops method. Fix from Nicolas Dichtel.
      
       3) Fix long standing memory corruption issues in r8169 driver by
          performing the chip statistics DMA programming more correctly. From
          Francois Romieu.
      
       4) Handle local broadcast sends over VRF routes properly, from David
          Ahern.
      
       5) Don't refire the DCCP CCID2 timer endlessly, otherwise the socket
          can never be released. From Alexey Kodanev.
      
       6) Set poll flags properly in VSOCK protocol layer, from Stefan
          Hajnoczi.
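
      As an illustration of item 2 above, here is a minimal user-space
      sketch of guarding an optional method before calling it. The structs
      are simplified stand-ins, and example_update_pmtu() and
      dst_update_pmtu_checked() are hypothetical names chosen for this
      sketch; it is not the code from the actual fix.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel's dst structures. */
struct dst_entry;

struct dst_ops {
    /* Optional method: some route types leave this NULL. */
    void (*update_pmtu)(struct dst_entry *dst, unsigned int mtu);
};

struct dst_entry {
    const struct dst_ops *ops;
    unsigned int pmtu;
};

/* Example method for a route type that does support PMTU updates. */
static void example_update_pmtu(struct dst_entry *dst, unsigned int mtu)
{
    dst->pmtu = mtu;
}

/*
 * Call update_pmtu only when the route type provides it.
 * Returns 1 if the method ran, 0 if it was skipped.
 */
static int dst_update_pmtu_checked(struct dst_entry *dst, unsigned int mtu)
{
    if (dst == NULL || dst->ops == NULL || dst->ops->update_pmtu == NULL)
        return 0;

    dst->ops->update_pmtu(dst, mtu);
    return 1;
}
```

      A route type whose ops leave update_pmtu unset is then skipped
      instead of triggering a call through a NULL pointer.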
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        VSOCK: set POLLOUT | POLLWRNORM for TCP_CLOSING
        dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
        net: vrf: Add support for sends to local broadcast address
        r8169: fix memory corruption on retrieval of hardware statistics.
        net: don't call update_pmtu unconditionally
        net: tcp: close sock if net namespace is exiting
      ba804bb4
    • Linus Torvalds's avatar
      Merge tag 'drm-fixes-for-v4.15-rc10-2' of git://people.freedesktop.org/~airlied/linux · db218549
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "A fairly urgent nouveau regression fix for broken irqs across
        suspend/resume came in. This was broken before but a patch in 4.15 has
        made it much more obviously broken and now s/r fails a lot more often.
      
        The fix removes freeing the irq across s/r which never should have
        been done anyways.
      
        Also two vc4 fixes for a NULL dereference and some misrendering /
        flickering on screen"
      
      * tag 'drm-fixes-for-v4.15-rc10-2' of git://people.freedesktop.org/~airlied/linux:
        drm/nouveau: Move irq setup/teardown to pci ctor/dtor
        drm/vc4: Fix NULL pointer dereference in vc4_save_hang_state()
        drm/vc4: Flush the caches before the bin jobs, as well.
      db218549
    • Stefan Hajnoczi's avatar
      VSOCK: set POLLOUT | POLLWRNORM for TCP_CLOSING · ba3169fc
      Stefan Hajnoczi authored
      
      
      select(2) with wfds but no rfds must return when the socket is shut down
      by the peer.  This way userspace notices socket activity and gets -EPIPE
      from the next write(2).
      
      Currently select(2) does not return for virtio-vsock when a SEND+RCV
      shutdown packet is received.  This is because vsock_poll() only sets
      POLLOUT | POLLWRNORM for TCP_CLOSE, not the TCP_CLOSING state that the
      socket is in when the shutdown is received.
      
      Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ba3169fc
    • Alexey Kodanev's avatar
      dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state · dd5684ec
      Alexey Kodanev authored
      The ccid2_hc_tx_rto_expire() timer callback always restarts the timer
      and can therefore run indefinitely unless it is stopped from outside.
      Commit 120e9dab ("dccp: defer ccid_hc_tx_delete() at dismantle time")
      moved ccid_hc_tx_delete(), which includes the sk_stop_timer() call,
      from dccp_destroy_sock() to sk_destruct(), and since then this has
      happened quite often: the running timer prevents the socket from being
      released, so sk_destruct() is never called.
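
      A toy user-space model of the problem and of the fix named in the
      subject line, assuming that a pending timer holds one reference on
      the socket. The model_sock struct and rto_expire() are hypothetical
      names for this sketch and do not mirror the real DCCP data structures.

```c
#include <stdbool.h>

enum sk_state { SK_OPEN, SK_CLOSED };

/* Hypothetical model: a pending timer holds one reference on the socket. */
struct model_sock {
    enum sk_state state;
    bool timer_armed;
    int refcount;
};

/*
 * Timer expiry handler. The timer that just fired gives up its reference.
 * The handler re-arms the timer (taking a new reference) only while the
 * socket is still open; for a closed socket it returns without re-arming,
 * so the reference count can eventually drop to zero.
 */
static void rto_expire(struct model_sock *sk)
{
    sk->timer_armed = false;
    sk->refcount--;

    if (sk->state == SK_CLOSED)
        return;

    sk->timer_armed = true;
    sk->refcount++;
}
```

      Without the closed-state check, every expiry would take a fresh
      reference and the count would never reach zero.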
      
      Found with LTP/dccp_ipsec tests running on the bonding device,
      which later couldn't be unloaded after the tests were completed:
      
        unregister_netdevice: waiting for bond0 to become free. Usage count = 148
      
      Fixes: 2a91aa39 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      dd5684ec
    • Palmer Dabbelt's avatar
      Update the RISC-V MAINTAINERS file · 6572cc2b
      Palmer Dabbelt authored
      
      
      Now that we're upstream in Linux we've been able to make some
      infrastructure changes so our port works a bit more like other ports.
      Specifically:
      
      * We now have a mailing list specific to the RISC-V Linux port, hosted
        at lists.infradead.org.
      * We now have a kernel.org git tree where work on our port is
        coordinated.
      
      This patch changes the RISC-V maintainers entry to reflect these new
      bits of infrastructure.
      
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
      6572cc2b
  6. Jan 26, 2018
    • Andy Lutomirski's avatar
      x86/mm/64: Tighten up vmalloc_fault() sanity checks on 5-level kernels · 36b3a772
      Andy Lutomirski authored
      On a 5-level kernel, if a non-init mm has a top-level entry, it needs to
      match init_mm's, but the vmalloc_fault() code skipped over the BUG_ON()
      that would have checked it.
      
      While we're at it, get rid of the rather confusing 4-level folded "pgd"
      logic.
      
      Cleans-up: b50858ce ("x86/mm/vmalloc: Add 5-level paging support")
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Neil Berrington <neil.berrington@datacore.com>
      Link: https://lkml.kernel.org/r/2ae598f8c279b0a29baf75df207e6f2fdddc0a1b.1516914529.git.luto@kernel.org
      36b3a772
    • Andy Lutomirski's avatar
      x86/mm/64: Fix vmapped stack syncing on very-large-memory 4-level systems · 5beda7d5
      Andy Lutomirski authored
      Neil Berrington reported a double-fault on a VM with 768GB of RAM that uses
      large amounts of vmalloc space with PTI enabled.
      
      The cause is that load_new_mm_cr3() was never fixed to take the 5-level pgd
      folding code into account, so, on a 4-level kernel, the pgd synchronization
      logic compiles away to exactly nothing.
      
      Interestingly, the problem doesn't trigger with nopti.  I assume this is
      because the kernel is mapped with global pages if we boot with nopti.  The
      sequence of operations when we create a new task is that we first load its
      mm while still running on the old stack (which crashes if the old stack is
      unmapped in the new mm unless the TLB saves us), then we call
      prepare_switch_to(), and then we switch to the new stack.
      prepare_switch_to() pokes the new stack directly, which will populate the
      mapping through vmalloc_fault().  I assume that we're getting lucky on
      non-PTI systems -- the old stack's TLB entry stays alive long enough to
      make it all the way through prepare_switch_to() and switch_to() so that we
      make it to a valid stack.
      
      Fixes: b50858ce ("x86/mm/vmalloc: Add 5-level paging support")
      Reported-and-tested-by: Neil Berrington <neil.berrington@datacore.com>
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Cc: stable@vger.kernel.org
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Link: https://lkml.kernel.org/r/346541c56caed61abbe693d7d2742b4a380c5001.1516914529.git.luto@kernel.org
      5beda7d5
    • Borislav Petkov's avatar
      x86/bugs: Drop one "mitigation" from dmesg · 55fa19d3
      Borislav Petkov authored
      
      
      Make
      
      [    0.031118] Spectre V2 mitigation: Mitigation: Full generic retpoline
      
      into
      
      [    0.031118] Spectre V2: Mitigation: Full generic retpoline
      
      to reduce the redundant "mitigation" strings.
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: riel@redhat.com
      Cc: ak@linux.intel.com
      Cc: peterz@infradead.org
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: jikos@kernel.org
      Cc: luto@amacapital.net
      Cc: dave.hansen@intel.com
      Cc: torvalds@linux-foundation.org
      Cc: keescook@google.com
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: tim.c.chen@linux.intel.com
      Cc: pjt@google.com
      Link: https://lkml.kernel.org/r/20180126121139.31959-5-bp@alien8.de
      55fa19d3
    • Borislav Petkov's avatar
      x86/nospec: Fix header guards names · 7a32fc51
      Borislav Petkov authored
      
      
      ... to adhere to the _ASM_X86_ naming scheme.
      
      No functional change.
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: riel@redhat.com
      Cc: ak@linux.intel.com
      Cc: peterz@infradead.org
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: jikos@kernel.org
      Cc: luto@amacapital.net
      Cc: dave.hansen@intel.com
      Cc: torvalds@linux-foundation.org
      Cc: keescook@google.com
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: tim.c.chen@linux.intel.com
      Cc: gregkh@linux-foundation.org
      Cc: pjt@google.com
      Link: https://lkml.kernel.org/r/20180126121139.31959-3-bp@alien8.de
      7a32fc51
    • Borislav Petkov's avatar
      x86/alternative: Print unadorned pointers · 0e6c16c6
      Borislav Petkov authored
      After commit ad67b74d ("printk: hash addresses printed with %p"),
      pointers are hashed when printed. However, this makes the alternative
      debug output completely useless. Switch to %px in order to see the
      unadorned kernel pointers.
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: riel@redhat.com
      Cc: ak@linux.intel.com
      Cc: peterz@infradead.org
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: jikos@kernel.org
      Cc: luto@amacapital.net
      Cc: dave.hansen@intel.com
      Cc: torvalds@linux-foundation.org
      Cc: keescook@google.com
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: tim.c.chen@linux.intel.com
      Cc: gregkh@linux-foundation.org
      Cc: pjt@google.com
      Link: https://lkml.kernel.org/r/20180126121139.31959-2-bp@alien8.de
      0e6c16c6
    • David Woodhouse's avatar
      x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support · 20ffa1ca
      David Woodhouse authored
      
      
      Expose indirect_branch_prediction_barrier() for use in subsequent patches.
      
      [ tglx: Add IBPB status to spectre_v2 sysfs file ]
      
      Co-developed-by: KarimAllah Ahmed <karahmed@amazon.de>
      Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: ak@linux.intel.com
      Cc: ashok.raj@intel.com
      Cc: dave.hansen@intel.com
      Cc: arjan@linux.intel.com
      Cc: torvalds@linux-foundation.org
      Cc: peterz@infradead.org
      Cc: bp@alien8.de
      Cc: pbonzini@redhat.com
      Cc: tim.c.chen@linux.intel.com
      Cc: gregkh@linux-foundation.org
      Link: https://lkml.kernel.org/r/1516896855-7642-8-git-send-email-dwmw@amazon.co.uk
      20ffa1ca
    • David Woodhouse's avatar
      x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodes · a5b29663
      David Woodhouse authored
      This doesn't refuse to load the affected microcodes; it just refuses to
      use the Spectre v2 mitigation features if they're detected, by clearing
      the appropriate feature bits.
      
      The AMD CPUID bits are handled here too, because hypervisors *may* have
      been exposing those bits even on Intel chips, for fine-grained control
      of what's available.
      
      It is non-trivial to use x86_match_cpu() for this table because that
      doesn't handle steppings. And the approach taken in commit bd9240a1
      almost made me lose my lunch.
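
      The table-driven check described above could look roughly like this
      in plain C. This is a sketch under stated assumptions: the struct and
      function names are hypothetical, the table entries are placeholders
      rather than real model/stepping/revision values, and treating
      "revision at or below the listed one" as affected is an assumption of
      this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One blacklist entry, keyed on model and stepping. */
struct bad_microcode {
    uint8_t model;
    uint8_t stepping;
    uint32_t revision;   /* assumed: revisions <= this value are affected */
};

/* Placeholder entries for illustration only; not real CPU data. */
static const struct bad_microcode bad_table[] = {
    { 0x10, 0x01, 0x20 },
    { 0x10, 0x02, 0x15 },
    { 0x22, 0x03, 0x07 },
};

static bool is_bad_microcode(uint8_t model, uint8_t stepping,
                             uint32_t revision)
{
    for (size_t i = 0; i < sizeof(bad_table) / sizeof(bad_table[0]); i++) {
        if (bad_table[i].model == model &&
            bad_table[i].stepping == stepping)
            return revision <= bad_table[i].revision;
    }
    return false;
}

/* Hypothetical feature bits for the two mitigation interfaces. */
struct cpu_caps {
    bool spec_ctrl;
    bool pred_cmd;
};

/* The microcode still loads; only the feature bits are cleared. */
static void apply_blacklist(struct cpu_caps *caps, uint8_t model,
                            uint8_t stepping, uint32_t revision)
{
    if (is_bad_microcode(model, stepping, revision)) {
        caps->spec_ctrl = false;
        caps->pred_cmd = false;
    }
}
```

      Matching on the stepping as well as the model is the part that a
      model-only lookup cannot express.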
      
      Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: ak@linux.intel.com
      Cc: ashok.raj@intel.com
      Cc: dave.hansen@intel.com
      Cc: karahmed@amazon.de
      Cc: arjan@linux.intel.com
      Cc: torvalds@linux-foundation.org
      Cc: peterz@infradead.org
      Cc: bp@alien8.de
      Cc: pbonzini@redhat.com
      Cc: tim.c.chen@linux.intel.com
      Cc: gregkh@linux-foundation.org
      Link: https://lkml.kernel.org/r/1516896855-7642-7-git-send-email-dwmw@amazon.co.uk
      a5b29663