Skip to content
  1. Jun 01, 2018
  2. Mar 31, 2018
  3. Mar 20, 2018
  4. Feb 21, 2018
  5. Feb 20, 2018
  6. Jan 15, 2018
    • Thomas Gleixner's avatar
      x86/retpoline: Remove compile time warning · b8b9ce4b
      Thomas Gleixner authored
      Remove the compile time warning when CONFIG_RETPOLINE=y and the compiler
      does not have retpoline support. Linus rationale for this is:
      
        It's wrong because it will just make people turn off RETPOLINE, and the
        asm updates - and return stack clearing - that are independent of the
        compiler are likely the most important parts because they are likely the
        ones easiest to target.
      
        And it's annoying because most people won't be able to do anything about
        it. The number of people building their own compiler? Very small. So if
        their distro hasn't got a compiler yet (and pretty much nobody does), the
        warning is just annoying crap.
      
        It is already properly reported as part of the sysfs interface. The
        compile-time warning only encourages bad things.
      
      Fixes: 76b04384
      
       ("x86/retpoline: Add initial retpoline support")
      Requested-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Cc: David Woodhouse <dwmw@amazon.co.uk>
      Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Link: https://lkml.kernel.org/r/CA+55aFzWgquv4i6Mab6bASqYXg3ErV3XDFEYf=GEcCDQg5uAtw@mail.gmail.com
      b8b9ce4b
  7. Jan 12, 2018
    • David Woodhouse's avatar
      x86/retpoline: Add initial retpoline support · 76b04384
      David Woodhouse authored
      
      
      Enable the use of -mindirect-branch=thunk-extern in newer GCC, and provide
      the corresponding thunks. Provide assembler macros for invoking the thunks
      in the same way that GCC does, from native and inline assembler.
      
      This adds X86_FEATURE_RETPOLINE and sets it by default on all CPUs. In
      some circumstances, IBRS microcode features may be used instead, and the
      retpoline can be disabled.
      
      On AMD CPUs if lfence is serialising, the retpoline can be dramatically
      simplified to a simple "lfence; jmp *\reg". A future patch, after it has
      been verified that lfence really is serialising in all circumstances, can
      enable this by setting the X86_FEATURE_RETPOLINE_AMD feature bit in addition
      to X86_FEATURE_RETPOLINE.
      
      Do not align the retpoline in the altinstr section, because there is no
      guarantee that it stays aligned when it's copied over the oldinstr during
      alternative patching.
      
      [ Andi Kleen: Rename the macros, add CONFIG_RETPOLINE option, export thunks]
      [ tglx: Put actual function CALL/JMP in front of the macros, convert to
        	symbolic labels ]
      [ dwmw2: Convert back to numeric labels, merge objtool fixes ]
      
      Signed-off-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Acked-by: default avatarArjan van de Ven <arjan@linux.intel.com>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: gnomes@lxorguk.ukuu.org.uk
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: thomas.lendacky@amd.com
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Jiri Kosina <jikos@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Kees Cook <keescook@google.com>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org>
      Cc: Paul Turner <pjt@google.com>
      Link: https://lkml.kernel.org/r/1515707194-20531-4-git-send-email-dwmw@amazon.co.uk
      76b04384
  8. Nov 16, 2017
  9. Nov 02, 2017
    • Greg Kroah-Hartman's avatar
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman authored
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard...
      b2441318
  10. Aug 21, 2017
    • Matthias Kaehlcke's avatar
      x86/build: Use cc-option to validate stack alignment parameter · 9e8730b1
      Matthias Kaehlcke authored
      With the following commit:
      
        8f918697
      
       ("x86/build: Fix stack alignment for CLang")
      
      cc-option is only used to determine the name of the stack alignment option
      supported by the compiler, but not to verify that the actual parameter
      <option>=N is valid in combination with the other CFLAGS.
      
      This causes problems (as reported by the kbuild robot) with older GCC versions
      which only support stack alignment on a boundary of 16 bytes or higher.
      
      Also use (__)cc_option to add the stack alignment option to CFLAGS to
      make sure only valid options are added.
      
      Reported-by: default avatarkbuild test robot <fengguang.wu@intel.com>
      Signed-off-by: default avatarMatthias Kaehlcke <mka@chromium.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Bernhard.Rosenkranzer@linaro.org
      Cc: Greg Hackmann <ghackmann@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michael Davidson <md@google.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hines <srhines@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dianders@chromium.org
      Fixes: 8f918697 ("x86/build: Fix stack alignment for CLang")
      Link: http://lkml.kernel.org/r/20170817182047.176752-1-mka@chromium.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      9e8730b1
  11. Aug 17, 2017
    • Matthias Kaehlcke's avatar
      x86/build: Fix stack alignment for CLang · 8f918697
      Matthias Kaehlcke authored
      Commit:
      
        d77698df
      
       ("x86/build: Specify stack alignment for clang")
      
      intended to use the same stack alignment for clang as with gcc.
      
      The two compilers use different options to configure the stack alignment
      (gcc: -mpreferred-stack-boundary=n, clang: -mstack-alignment=n).
      
      The above commit assumes that the clang option uses the same parameter
      type as gcc, i.e. that the alignment is specified as 2^n. However clang
      interprets the value of this option literally to use an alignment of n,
      in consequence the stack remains misaligned.
      
      Change the values used with -mstack-alignment to be the actual alignment
      instead of a power of two.
      
      cc-option isn't used here with the typical pattern of KBUILD_CFLAGS +=
      $(call cc-option ...). The reason is that older gcc versions don't
      support the -mpreferred-stack-boundary option, since cc-option doesn't
      verify whether the alternative option is valid it would incorrectly
      select the clang option -mstack-alignment..
      
      Signed-off-by: default avatarMatthias Kaehlcke <mka@chromium.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Bernhard.Rosenkranzer@linaro.org
      Cc: Greg Hackmann <ghackmann@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Michael Davidson <md@google.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephen Hines <srhines@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dianders@chromium.org
      Link: http://lkml.kernel.org/r/20170817004740.170588-1-mka@chromium.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      8f918697
  12. Aug 10, 2017
  13. Jun 25, 2017
    • Matthias Kaehlcke's avatar
      x86/build: Specify stack alignment for clang · d77698df
      Matthias Kaehlcke authored
      For gcc stack alignment is configured with -mpreferred-stack-boundary=N,
      clang has the option -mstack-alignment=N for that purpose. Use the same
      alignment as with gcc.
      
      If the alignment is not specified clang assumes an alignment of
      16 bytes, as required by the standard ABI. However as mentioned in
      d9b0cde9
      
       ("x86-64, gcc: Use -mpreferred-stack-boundary=3 if
      supported") the standard kernel entry on x86-64 leaves the stack
      on an 8-byte boundary, as a consequence clang will keep the stack
      misaligned.
      
      Signed-off-by: default avatarMatthias Kaehlcke <mka@chromium.org>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      d77698df
    • Matthias Kaehlcke's avatar
      x86/build: Use __cc-option for boot code compiler options · 032a2c4f
      Matthias Kaehlcke authored
      
      
      cc-option is used to enable compiler options for the boot code if they
      are available. The macro uses KBUILD_CFLAGS and KBUILD_CPPFLAGS for the
      check, however these flags aren't used to build the boot code, in
      consequence cc-option can yield wrong results. For example
      -mpreferred-stack-boundary=2 is never set with a 64-bit compiler,
      since the setting is only valid for 16 and 32-bit binaries. This
      is also the case for 32-bit kernel builds, because the option -m32 is
      added to KBUILD_CFLAGS after the assignment of REALMODE_CFLAGS.
      
      Use __cc-option instead of cc-option for the boot mode options.
      The macro receives the compiler options as parameter instead of using
      KBUILD_C*FLAGS, for the boot code we pass REALMODE_CFLAGS.
      
      Also use separate statements for the __cc-option checks instead
      of performing them in the initial assignment of REALMODE_CFLAGS since
      the variable is an input of the macro.
      
      Signed-off-by: default avatarMatthias Kaehlcke <mka@chromium.org>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      032a2c4f
  14. Jun 14, 2017
  15. May 24, 2017
  16. May 09, 2017
  17. Apr 19, 2017
  18. Apr 17, 2017
  19. Mar 30, 2017
    • Josh Poimboeuf's avatar
      x86/build: Mostly disable '-maccumulate-outgoing-args' · 3f135e57
      Josh Poimboeuf authored
      
      
      The GCC '-maccumulate-outgoing-args' flag is enabled for most configs,
      mostly because of issues which are no longer relevant.  For most
      configs, and with most recent versions of GCC, it's no longer needed.
      
      Clarify which cases need it, and only enable it for those cases.  Also
      produce a compile-time error for the ftrace graph + mcount + '-Os' case,
      which will otherwise cause runtime failures.
      
      The main benefit of '-maccumulate-outgoing-args' is that it prevents an
      ugly prologue for functions which have aligned stacks.  But removing the
      option also has some benefits: more readable argument saves, smaller
      text size, and (presumably) slightly improved performance.
      
      Here are the object size savings for 32-bit and 64-bit defconfig
      kernels:
      
            text	   data	    bss	     dec	    hex	filename
        10006710	3543328	1773568	15323606	 e9d1d6	vmlinux.x86-32.before
         9706358	3547424	1773568	15027350	 e54c96	vmlinux.x86-32.after
      
            text	   data	    bss	     dec	    hex	filename
        10652105	4537576	 843776	16033457	 f4a6b1	vmlinux.x86-64.before
        10639629	4537576	 843776	16020981	 f475f5	vmlinux.x86-64.after
      
      That comes out to a 3% text size improvement on x86-32 and a 0.1% text
      size improvement on x86-64.
      
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andrew Lutomirski <luto@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Pavel Machek <pavel@ucw.cz>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170316193133.zrj6gug53766m6nn@treble
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      3f135e57
  20. Sep 22, 2016
  21. Jul 27, 2016
    • Kees Cook's avatar
      kbuild: abort build on bad stack protector flag · c965b105
      Kees Cook authored
      Before, the stack protector flag was sanity checked before .config had
      been reprocessed.  This meant the build couldn't be aborted early, and
      only a warning could be emitted followed later by the compiler blowing
      up with an unknown flag.  This has caused a lot of confusion over time,
      so this splits the flag selection from sanity checking and performs the
      sanity checking after the make has been restarted from a reprocessed
      .config, so builds can be aborted as early as possible now.
      
      Additionally moves the x86-specific sanity check to the same location,
      since it suffered from the same warn-then-wait-for-compiler-failure
      problem.
      
      Link: http://lkml.kernel.org/r/20160712223043.GA11664@www.outflux.net
      
      
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Michal Marek <mmarek@suse.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c965b105
    • Kees Cook's avatar
      kbuild: Abort build on bad stack protector flag · 228d96c6
      Kees Cook authored
      
      
      Before, the stack protector flag was sanity checked before .config had
      been reprocessed. This meant the build couldn't be aborted early, and
      only a warning could be emitted followed later by the compiler blowing
      up with an unknown flag. This has caused a lot of confusion over time,
      so this splits the flag selection from sanity checking and performs the
      sanity checking after the make has been restarted from a reprocessed
      .config, so builds can be aborted as early as possible now.
      
      Additionally moves the x86-specific sanity check to the same location,
      since it suffered from the same warn-then-wait-for-compiler-failure
      problem.
      
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.com>
      228d96c6
  22. Apr 22, 2016
    • Luis R. Rodriguez's avatar
      x86/init: Rename EBDA code file · f2d85299
      Luis R. Rodriguez authored
      
      
      This makes it clearer what this is.
      
      Signed-off-by: default avatarLuis R. Rodriguez <mcgrof@kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: andrew.cooper3@citrix.com
      Cc: andriy.shevchenko@linux.intel.com
      Cc: bigeasy@linutronix.de
      Cc: boris.ostrovsky@oracle.com
      Cc: david.vrabel@citrix.com
      Cc: ffainelli@freebox.fr
      Cc: george.dunlap@citrix.com
      Cc: glin@suse.com
      Cc: jgross@suse.com
      Cc: jlee@suse.com
      Cc: josh@joshtriplett.org
      Cc: julien.grall@linaro.org
      Cc: konrad.wilk@oracle.com
      Cc: kozerkov@parallels.com
      Cc: lenb@kernel.org
      Cc: lguest@lists.ozlabs.org
      Cc: linux-acpi@vger.kernel.org
      Cc: lv.zheng@intel.com
      Cc: matt@codeblueprint.co.uk
      Cc: mbizon@freebox.fr
      Cc: rjw@rjwysocki.net
      Cc: robert.moore@intel.com
      Cc: rusty@rustcorp.com.au
      Cc: tiwai@suse.de
      Cc: toshi.kani@hp.com
      Cc: xen-devel@lists.xensource.com
      Link: http://lkml.kernel.org/r/1460592286-300-14-git-send-email-mcgrof@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      f2d85299
    • Luis R. Rodriguez's avatar
      x86/rtc: Replace paravirt rtc check with platform legacy quirk · 8d152e7a
      Luis R. Rodriguez authored
      
      
      We have 4 types of x86 platforms that disable RTC:
      
        * Intel MID
        * Lguest - uses paravirt
        * Xen dom-U - uses paravirt
        * x86 on legacy systems annotated with an ACPI legacy flag
      
      We can consolidate all of these into a platform specific legacy
      quirk set early in boot through i386_start_kernel() and through
      x86_64_start_reservations(). This deals with the RTC quirks which
      we can rely on through the hardware subarch, the ACPI check can
      be dealt with separately.
      
      For Xen things are bit more complex given that the @X86_SUBARCH_XEN
      x86_hardware_subarch is shared on for Xen which uses the PV path for
      both domU and dom0. Since the semantics for differentiating between
      the two are Xen specific we provide a platform helper to help override
      default legacy features -- x86_platform.set_legacy_features(). Use
      of this helper is highly discouraged, its only purpose should be
      to account for the lack of semantics available within your given
      x86_hardware_subarch.
      
      As per 0-day, this bumps the vmlinux size using i386-tinyconfig as
      follows:
      
      TOTAL   TEXT   init.text    x86_early_init_platform_quirks()
      +70     +62    +62          +43
      
      Only 8 bytes overhead total, as the main increase in size is
      all removed via __init.
      
      Suggested-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLuis R. Rodriguez <mcgrof@kernel.org>
      Reviewed-by: default avatarJuergen Gross <jgross@suse.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: andrew.cooper3@citrix.com
      Cc: andriy.shevchenko@linux.intel.com
      Cc: bigeasy@linutronix.de
      Cc: boris.ostrovsky@oracle.com
      Cc: david.vrabel@citrix.com
      Cc: ffainelli@freebox.fr
      Cc: george.dunlap@citrix.com
      Cc: glin@suse.com
      Cc: jlee@suse.com
      Cc: josh@joshtriplett.org
      Cc: julien.grall@linaro.org
      Cc: konrad.wilk@oracle.com
      Cc: kozerkov@parallels.com
      Cc: lenb@kernel.org
      Cc: lguest@lists.ozlabs.org
      Cc: linux-acpi@vger.kernel.org
      Cc: lv.zheng@intel.com
      Cc: matt@codeblueprint.co.uk
      Cc: mbizon@freebox.fr
      Cc: rjw@rjwysocki.net
      Cc: robert.moore@intel.com
      Cc: rusty@rustcorp.com.au
      Cc: tiwai@suse.de
      Cc: toshi.kani@hp.com
      Cc: xen-devel@lists.xensource.com
      Link: http://lkml.kernel.org/r/1460592286-300-5-git-send-email-mcgrof@kernel.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      8d152e7a
  23. Oct 09, 2015
  24. Sep 21, 2015
  25. Aug 13, 2015
  26. Jul 22, 2015
  27. Jun 04, 2015
    • Ingo Molnar's avatar
      x86/asm/entry: Move the arch/x86/syscalls/ definitions to arch/x86/entry/syscalls/ · 1f57d5d8
      Ingo Molnar authored
      
      
      The build time generated syscall definitions are entry code related, move
      them into the arch/x86/entry/ directory.
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      1f57d5d8
    • Ingo Molnar's avatar
      x86/asm/entry, x86/vdso: Move the vDSO code to arch/x86/entry/vdso/ · d603c8e1
      Ingo Molnar authored
      
      
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d603c8e1
  28. Jun 02, 2015
    • Ingo Molnar's avatar
      x86/debug: Remove perpetually broken, unmaintainable dwarf annotations · 131484c8
      Ingo Molnar authored
      
      
      So the dwarf2 annotations in low level assembly code have
      become an increasing hindrance: unreadable, messy macros
      mixed into some of the most security sensitive code paths
      of the Linux kernel.
      
      These debug info annotations don't even buy the upstream
      kernel anything: dwarf driven stack unwinding has caused
      problems in the past so it's out of tree, and the upstream
      kernel only uses the much more robust framepointers based
      stack unwinding method.
      
      In addition to that there's a steady, slow bitrot going
      on with these annotations, requiring frequent fixups.
      There's no tooling and no functionality upstream that
      keeps it correct.
      
      So burn down the sick forest, allowing new, healthier growth:
      
         27 files changed, 350 insertions(+), 1101 deletions(-)
      
      Someone who has the willingness and time to do this
      properly can attempt to reintroduce dwarf debuginfo in x86
      assembly code plus dwarf unwinding from first principles,
      with the following conditions:
      
       - it should be maximally readable, and maximally low-key to
         'ordinary' code reading and maintenance.
      
       - find a build time method to insert dwarf annotations
         automatically in the most common cases, for pop/push
         instructions that manipulate the stack pointer. This could
         be done for example via a preprocessing step that just
         looks for common patterns - plus special annotations for
         the few cases where we want to depart from the default.
         We have hundreds of CFI annotations, so automating most of
         that makes sense.
      
       - it should come with build tooling checks that ensure that
         CFI annotations are sensible. We've seen such efforts from
         the framepointer side, and there's no reason it couldn't be
         done on the dwarf side.
      
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      131484c8
  29. May 17, 2015
    • Ingo Molnar's avatar
      x86: Pack loops tightly as well · 52648e83
      Ingo Molnar authored
      Packing loops tightly (-falign-loops=1) is beneficial to code size:
      
           text        data    bss     dec              filename
       12566391        1617840 1089536 15273767         vmlinux.align.16-byte
       12224951        1617840 1089536 14932327         vmlinux.align.1-byte
       11976567        1617840 1089536 14683943         vmlinux.align.1-byte.funcs-1-byte
       11903735        1617840 1089536 14611111         vmlinux.align.1-byte.funcs-1-byte.loops-1-byte
      
      Which reduces the size of the kernel by another 0.6%, so the
      the total combined size reduction of the alignment-packing
      patches is ~5.5%.
      
      The x86 decoder bandwidth and caching arguments laid out in:
      
        be6cb027
      
       ("x86: Align jump targets to 1-byte boundaries")
      
      apply to loop alignment as well.
      
      Furtermore, modern CPU uarchs have a loop cache/buffer that
      is a L0 cache before even any uop cache, covering a few
      dozen most recently executed instructions.
      
      This loop cache generally does not have the 16-byte alignment
      restrictions of the uop cache.
      
      Now loop alignment can still be beneficial if:
      
       - a loop is cache-hot and its surroundings are not.
      
       - if the loop is so cache hot that the instruction
         flow becomes x86 decoder bandwidth limited
      
      But loop alignment is harmful if:
      
       - a loop is cache-cold
      
       - a loop's surroundings are cache-hot as well
      
       - two cache-hot loops are close to each other
      
       - if the loop fits into the loop cache
      
       - if the code flow is not decoder bandwidth limited
      
      and I'd argue that the latter five scenarios are much
      more common in the kernel, as our hottest loops are
      typically:
      
       - pointer chasing: this should fit into the loop cache
         in most cases and is typically data cache and address
         generation limited
      
       - generic memory ops (memset, memcpy, etc.): these generally
         fit into the loop cache as well, and are likewise data
         cache limited.
      
      So this patch packs loop addresses tightly as well.
      
      Acked-by: default avatarDenys Vlasenko <dvlasenk@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20150410123017.GB19918@gmail.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      52648e83
  30. May 15, 2015
    • Ingo Molnar's avatar
      x86: Align jump targets to 1-byte boundaries · be6cb027
      Ingo Molnar authored
      
      
      The following NOP in a hot function caught my attention:
      
        >   5a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
      
      That's a dead NOP that bloats the function a bit, added for the
      default 16-byte alignment that GCC applies for jump targets.
      
      I realize that x86 CPU manufacturers recommend 16-byte jump
      target alignments (it's in the Intel optimization manual),
      to help their relatively narrow decoder prefetch alignment
      and uop cache constraints, but the cost of that is very
      significant:
      
              text           data       bss         dec      filename
          12566391        1617840   1089536    15273767      vmlinux.align.16-byte
          12224951        1617840   1089536    14932327      vmlinux.align.1-byte
      
      By using 1-byte jump target alignment (i.e. no alignment at all)
      we get an almost 3% reduction in kernel size (!) - and a
      probably similar reduction in I$ footprint.
      
      Now, the usual justification for jump target alignment is the
      following:
      
       - modern decoders tend to have 16-byte (effective) decoder
         prefetch windows. (AMD documents it higher but measurements
         suggest the effective prefetch window on curretn uarchs is
         still around 16 bytes)
      
       - on Intel there's also the uop-cache with cachelines that have
         16-byte granularity and limited associativity.
      
       - older x86 uarchs had a penalty for decoder fetches that crossed
         16-byte boundaries. These limits are mostly gone from recent
         uarchs.
      
      So if a forward jump target is aligned to cacheline boundary then
      prefetches will start from a new prefetch-cacheline and there's
      higher chance for decoding in fewer steps and packing tightly.
      
      But I think that argument is flawed for typical optimized kernel
      code flows: forward jumps often go to 'cold' (uncommon) pieces
      of code, and  aligning cold code to cache lines does not bring a
      lot of advantages  (they are uncommon), while it causes
      collateral damage:
      
       - their alignment 'spreads out' the cache footprint, it shifts
         followup hot code further out
      
       - plus it slows down even 'cold' code that immediately follows 'hot'
         code (like in the above case), which could have benefited from the
         partial cacheline that comes off the end of hot code.
      
      But even in the cache-hot case the 16 byte alignment brings
      disadvantages:
      
       - it spreads out the cache footprint, possibly making the code
         fall out of the L1 I$.
      
       - On Intel CPUs, recent microarchitectures have plenty of
         uop cache (typically doubling every 3 years) - while the
         size of the L1 cache grows much less aggressively. So
         workloads are rarely uop cache limited.
      
      The only situation where alignment might matter are tight
      loops that could fit into a single 16 byte chunk - but those
      are pretty rare in the kernel: if they exist they tend
      to be pointer chasing or generic memory ops, which both tend
      to be cache miss (or cache allocation) intensive and are not
      decoder bandwidth limited.
      
      So the balance of arguments strongly favors packing kernel
      instructions tightly versus maximizing for decoder bandwidth:
      this patch changes the jump target alignment from 16 bytes
      to 1 byte (tightly packed, unaligned).
      
      Acked-by: default avatarDenys Vlasenko <dvlasenk@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20150410120846.GA17101@gmail.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      be6cb027
  31. May 06, 2015
    • H.J. Lu's avatar
      x86/asm: Use -mskip-rax-setup if supported · d9ee948d
      H.J. Lu authored
      
      
      GCC 5 added a compiler option, -mskip-rax-setup, for x86-64. It skips
      setting up the RAX register when SSE is disabled and there are no
      variable arguments passed in vector registers. (According to the x86_64
      ABI, %al is used as a hidden register containing the number of vector
      registers used).
      
      Since the kernel doesn't pass vector registers to functions with
      variable arguments, this option can be used to optimize the x86-64
      kernel.
      
      This GCC feature was suggested by Rasmus Villemoes <linux@rasmusvillemoes.dk>.
      This is the corresponding kernel change using it.
      
      For kernel v3.17:
      
            text   data    bss    dec       filename
        11455921 2204048 5853184 19513153   vmlinux #with -mskip-rax-setup
        11480079 2204048 5853184 19537311   vmlinux
      
      For Kernel v4.0+ - custom config:
      
            text   data    bss    dec       filename
        10231778 3479800 16617472 30329050  vmlinux-gcc5+-mskip-rax-setup
        10268797 3547448 16621568 30437813  vmlinux
      
      Signed-off-by: default avatarH.J. Lu <hjl.tools@gmail.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d9ee948d
  32. Apr 02, 2015
  33. Feb 04, 2015
  34. Sep 16, 2014
  35. Aug 30, 2014