Skip to content
  1. May 17, 2015
    • Ingo Molnar's avatar
      x86: Pack loops tightly as well · 52648e83
      Ingo Molnar authored
      
      
      Packing loops tightly (-falign-loops=1) is beneficial to code size:
      
           text        data    bss     dec              filename
       12566391        1617840 1089536 15273767         vmlinux.align.16-byte
       12224951        1617840 1089536 14932327         vmlinux.align.1-byte
       11976567        1617840 1089536 14683943         vmlinux.align.1-byte.funcs-1-byte
       11903735        1617840 1089536 14611111         vmlinux.align.1-byte.funcs-1-byte.loops-1-byte
      
      Which reduces the size of the kernel by another 0.6%, so the
      the total combined size reduction of the alignment-packing
      patches is ~5.5%.
      
      The x86 decoder bandwidth and caching arguments laid out in:
      
        be6cb027 ("x86: Align jump targets to 1-byte boundaries")
      
      apply to loop alignment as well.
      
      Furtermore, modern CPU uarchs have a loop cache/buffer that
      is a L0 cache before even any uop cache, covering a few
      dozen most recently executed instructions.
      
      This loop cache generally does not have the 16-byte alignment
      restrictions of the uop cache.
      
      Now loop alignment can still be beneficial if:
      
       - a loop is cache-hot and its surroundings are not.
      
       - if the loop is so cache hot that the instruction
         flow becomes x86 decoder bandwidth limited
      
      But loop alignment is harmful if:
      
       - a loop is cache-cold
      
       - a loop's surroundings are cache-hot as well
      
       - two cache-hot loops are close to each other
      
       - if the loop fits into the loop cache
      
       - if the code flow is not decoder bandwidth limited
      
      and I'd argue that the latter five scenarios are much
      more common in the kernel, as our hottest loops are
      typically:
      
       - pointer chasing: this should fit into the loop cache
         in most cases and is typically data cache and address
         generation limited
      
       - generic memory ops (memset, memcpy, etc.): these generally
         fit into the loop cache as well, and are likewise data
         cache limited.
      
      So this patch packs loop addresses tightly as well.
      
      Acked-by: default avatarDenys Vlasenko <dvlasenk@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20150410123017.GB19918@gmail.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      52648e83
  2. May 15, 2015
    • Ingo Molnar's avatar
      x86: Align jump targets to 1-byte boundaries · be6cb027
      Ingo Molnar authored
      
      
      The following NOP in a hot function caught my attention:
      
        >   5a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
      
      That's a dead NOP that bloats the function a bit, added for the
      default 16-byte alignment that GCC applies for jump targets.
      
      I realize that x86 CPU manufacturers recommend 16-byte jump
      target alignments (it's in the Intel optimization manual),
      to help their relatively narrow decoder prefetch alignment
      and uop cache constraints, but the cost of that is very
      significant:
      
              text           data       bss         dec      filename
          12566391        1617840   1089536    15273767      vmlinux.align.16-byte
          12224951        1617840   1089536    14932327      vmlinux.align.1-byte
      
      By using 1-byte jump target alignment (i.e. no alignment at all)
      we get an almost 3% reduction in kernel size (!) - and a
      probably similar reduction in I$ footprint.
      
      Now, the usual justification for jump target alignment is the
      following:
      
       - modern decoders tend to have 16-byte (effective) decoder
         prefetch windows. (AMD documents it higher but measurements
         suggest the effective prefetch window on curretn uarchs is
         still around 16 bytes)
      
       - on Intel there's also the uop-cache with cachelines that have
         16-byte granularity and limited associativity.
      
       - older x86 uarchs had a penalty for decoder fetches that crossed
         16-byte boundaries. These limits are mostly gone from recent
         uarchs.
      
      So if a forward jump target is aligned to cacheline boundary then
      prefetches will start from a new prefetch-cacheline and there's
      higher chance for decoding in fewer steps and packing tightly.
      
      But I think that argument is flawed for typical optimized kernel
      code flows: forward jumps often go to 'cold' (uncommon) pieces
      of code, and  aligning cold code to cache lines does not bring a
      lot of advantages  (they are uncommon), while it causes
      collateral damage:
      
       - their alignment 'spreads out' the cache footprint, it shifts
         followup hot code further out
      
       - plus it slows down even 'cold' code that immediately follows 'hot'
         code (like in the above case), which could have benefited from the
         partial cacheline that comes off the end of hot code.
      
      But even in the cache-hot case the 16 byte alignment brings
      disadvantages:
      
       - it spreads out the cache footprint, possibly making the code
         fall out of the L1 I$.
      
       - On Intel CPUs, recent microarchitectures have plenty of
         uop cache (typically doubling every 3 years) - while the
         size of the L1 cache grows much less aggressively. So
         workloads are rarely uop cache limited.
      
      The only situation where alignment might matter are tight
      loops that could fit into a single 16 byte chunk - but those
      are pretty rare in the kernel: if they exist they tend
      to be pointer chasing or generic memory ops, which both tend
      to be cache miss (or cache allocation) intensive and are not
      decoder bandwidth limited.
      
      So the balance of arguments strongly favors packing kernel
      instructions tightly versus maximizing for decoder bandwidth:
      this patch changes the jump target alignment from 16 bytes
      to 1 byte (tightly packed, unaligned).
      
      Acked-by: default avatarDenys Vlasenko <dvlasenk@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20150410120846.GA17101@gmail.com
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      be6cb027
  3. May 06, 2015
    • H.J. Lu's avatar
      x86/asm: Use -mskip-rax-setup if supported · d9ee948d
      H.J. Lu authored
      
      
      GCC 5 added a compiler option, -mskip-rax-setup, for x86-64. It skips
      setting up the RAX register when SSE is disabled and there are no
      variable arguments passed in vector registers. (According to the x86_64
      ABI, %al is used as a hidden register containing the number of vector
      registers used).
      
      Since the kernel doesn't pass vector registers to functions with
      variable arguments, this option can be used to optimize the x86-64
      kernel.
      
      This GCC feature was suggested by Rasmus Villemoes <linux@rasmusvillemoes.dk>.
      This is the corresponding kernel change using it.
      
      For kernel v3.17:
      
            text   data    bss    dec       filename
        11455921 2204048 5853184 19513153   vmlinux #with -mskip-rax-setup
        11480079 2204048 5853184 19537311   vmlinux
      
      For Kernel v4.0+ - custom config:
      
            text   data    bss    dec       filename
        10231778 3479800 16617472 30329050  vmlinux-gcc5+-mskip-rax-setup
        10268797 3547448 16621568 30437813  vmlinux
      
      Signed-off-by: default avatarH.J. Lu <hjl.tools@gmail.com>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      d9ee948d
  4. Apr 02, 2015
  5. Feb 04, 2015
  6. Sep 16, 2014
  7. Aug 30, 2014
  8. Aug 09, 2014
    • Josh Triplett's avatar
      x86, platform, kconfig: move kvmconfig functionality to a helper · 3aaefce1
      Josh Triplett authored
      
      
      The new mergeconfig helper makes it easier to add other partial
      configurations similar to kvmconfig.  Architecture-independent portions
      of those partial configurations should go in
      kernel/configs/${name}.config, and architecture-dependent portions
      should go in arch/${arch}/configs/${name}.config.
      
      Based on a patch by Luis R. Rodriguez <mcgrof@suse.com>.
      Originally-Signed-off-by: default avatarLuis R. Rodriguez <mcgrof@suse.com>
      
      Modified to make the helper name more general than just virtualization,
      support architecture-dependent and architecture-independent partial
      configurations, move the helper and kvmconfig to
      scripts/kconfig/Makefile, and factor out more of the common file path.
      
      Signed-off-by: default avatarJosh Triplett <josh@joshtriplett.org>
      3aaefce1
    • Vivek Goyal's avatar
      purgatory: core purgatory functionality · 8fc5b4d4
      Vivek Goyal authored
      
      
      Create a stand alone relocatable object purgatory which runs between two
      kernels.  This name, concept and some code has been taken from
      kexec-tools.  Idea is that this code runs after a crash and it runs in
      minimal environment.  So keep it separate from rest of the kernel and in
      long term we will have to practically do no maintenance of this code.
      
      This code also has the logic to do verify sha256 hashes of various
      segments which have been loaded into memory.  So first we verify that the
      kernel we are jumping to is fine and has not been corrupted and make
      progress only if checsums are verified.
      
      This code also takes care of copying some memory contents to backup region.
      
      [sfr@canb.auug.org.au: run host built programs from objtree]
      Signed-off-by: default avatarVivek Goyal <vgoyal@redhat.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: WANG Chao <chaowang@redhat.com>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Signed-off-by: default avatarStephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8fc5b4d4
  9. Jun 05, 2014
  10. May 08, 2014
  11. Apr 22, 2014
  12. Apr 14, 2014
  13. Apr 10, 2014
  14. Mar 20, 2014
  15. Feb 05, 2014
    • Borislav Petkov's avatar
      x86: Disable generation of traditional x87 instructions · b399fe35
      Borislav Petkov authored
      
      
      We recently had the case where wrongly used floating-constant 'E' caused
      the generation of traditional x87 instructions in kernel code and
      wreaking all kinds of havoc.
      
      Disable the generation of those too. This will save people a lot of time
      when trying to debug such issues by erroring out of the build instead of
      let them manifest themselves in very spectacular and happy-crappy ways
      at runtime.
      
      We're using -mno-fp-ret-in-387 in addition to -mno-80387 (which is ==
      -msoft-float) because, as the gcc manpage says:
      
        On machines where a function returns floating-point results in the
        80387 register stack, some floating-point opcodes may be emitted even
        if -msoft-float is used.
      
      so we want to turn off *all* non-integer instructions involving any
      architectural FPU state, unless it is absolutely necessary (and those
      cases need special handling anyway).
      
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Michael Matz <matz@suse.de>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/1391561711-3023-1-git-send-email-bp@alien8.de
      
      
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      b399fe35
  16. Jan 31, 2014
  17. Jan 22, 2014
  18. Dec 20, 2013
    • Kees Cook's avatar
      stackprotector: Unify the HAVE_CC_STACKPROTECTOR logic between architectures · 19952a92
      Kees Cook authored
      
      
      Instead of duplicating the CC_STACKPROTECTOR Kconfig and
      Makefile logic in each architecture, switch to using
      HAVE_CC_STACKPROTECTOR and keep everything in one place. This
      retains the x86-specific bug verification scripts.
      
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Russell King <linux@arm.linux.org.uk>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Shawn Guo <shawn.guo@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-arch@vger.kernel.org
      Link: http://lkml.kernel.org/r/1387481759-14535-2-git-send-email-keescook@chromium.org
      
      
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      19952a92
  19. Dec 10, 2013
    • H. Peter Anvin's avatar
      x86, build: Pass in additional -mno-mmx, -mno-sse options · 8b3b005d
      H. Peter Anvin authored
      
      
      In checkin
      
          5551a34e x86-64, build: Always pass in -mno-sse
      
      we unconditionally added -mno-sse to the main build, to keep newer
      compilers from generating SSE instructions from autovectorization.
      However, this did not extend to the special environments
      (arch/x86/boot, arch/x86/boot/compressed, and arch/x86/realmode/rm).
      Add -mno-sse to the compiler command line for these environments, and
      add -mno-mmx to all the environments as well, as we don't want a
      compiler to generate MMX code either.
      
      This patch also removes a $(cc-option) call for -m32, since we have
      long since stopped supporting compilers too old for the -m32 option,
      and in fact hardcode it in other places in the Makefiles.
      
      Reported-by: default avatarKevin B. Smith <kevin.b.smith@intel.com>
      Cc: Sunil K. Pandey <sunil.k.pandey@intel.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      Cc: H. J. Lu <hjl.tools@gmail.com>
      Link: http://lkml.kernel.org/n/tip-j21wzqv790q834n7yc6g80j1@git.kernel.org
      Cc: <stable@vger.kernel.org> # build fix only
      8b3b005d
  20. Dec 04, 2013
  21. Aug 08, 2013
  22. Jun 23, 2013
  23. May 28, 2013
  24. Dec 21, 2012
    • David Woodhouse's avatar
      x86: Default to ARCH=x86 to avoid overriding CONFIG_64BIT · ffee0de4
      David Woodhouse authored
      
      
      It is easy to waste a bunch of time when one takes a 32-bit .config
      from a test machine and try to build it on a faster 64-bit system, and
      its existing setting of CONFIG_64BIT=n gets *changed* to match the
      build host.  Similarly, if one has an existing build tree it is easy
      to trash an entire build tree that way.
      
      This is because the default setting for $ARCH when discovered from
      'uname' is one of the legacy pre-x86-merge values (i386 or x86_64),
      which effectively force the setting of CONFIG_64BIT to match. We should
      default to ARCH=x86 instead, finally completing the merge that we
      started so long ago.
      
      This patch preserves the behaviour of the legacy ARCH settings for commands
      such as:
      
         make ARCH=x86_64 randconfig
         make ARCH=i386 randconfig
      
      ... since making the value of CONFIG_64BIT actually random in that situation
      is not desirable.
      
      In time, perhaps we can retire this legacy use of the old ARCH= values.
      We already have a way to override values for *any* config option, using
      $KCONFIG_ALLCONFIG, so it could be argued that we don't necessarily need
      to keep ARCH={i386,x86_64} around as a special case just for overriding
      CONFIG_64BIT.
      
      We'd probably at least want to add a way to override config options from
      the command line ('make CONFIG_FOO=y oldconfig') before we talk about doing
      that though.
      
      Signed-off-by: default avatarDavid Woodhouse <David.Woodhouse@intel.com>
      Link: http://lkml.kernel.org/r/1356040315.3198.51.camel@shinybook.infradead.org
      
      
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      ffee0de4
  25. Dec 13, 2012
  26. Oct 16, 2012
  27. Oct 03, 2012
    • Jean Delvare's avatar
      kbuild: Fix gcc -x syntax · b1e0d8b7
      Jean Delvare authored
      
      
      The correct syntax for gcc -x is "gcc -x assembler", not
      "gcc -xassembler". Even though the latter happens to work, the former
      is what is documented in the manual page and thus what gcc wrappers
      such as icecream do expect.
      
      This isn't a cosmetic change. The missing space prevents icecream from
      recognizing compilation tasks it can't handle, leading to silent kernel
      miscompilations.
      
      Besides me, credits go to Michael Matz and Dirk Mueller for
      investigating the miscompilation issue and tracking it down to this
      incorrect -x parameter syntax.
      
      Signed-off-by: default avatarJean Delvare <jdelvare@suse.de>
      Acked-by: default avatarIngo Molnar <mingo@kernel.org>
      Cc: stable@vger.kernel.org
      Cc: Bernhard Walle <bernhard@bwalle.de>
      Cc: Michal Marek <mmarek@suse.cz>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.cz>
      b1e0d8b7
  28. Sep 27, 2012
  29. Sep 21, 2012
    • Jeff Mahoney's avatar
      x86/kbuild: archscripts depends on scripts_basic · 24cc7fb6
      Jeff Mahoney authored
      
      
      While building the SUSE kernel packages, which build the scripts,
      make clean, and then build everything, we have been running into spurious
      build failures. We tracked them down to a simple dependency issue:
      
      $ make mrproper
        CLEAN   arch/x86/tools
        CLEAN   scripts/basic
      $ cp patches/config/x86_64/desktop .config
      $ make archscripts
        HOSTCC  arch/x86/tools/relocs
      /bin/sh: scripts/basic/fixdep: No such file or directory
      make[3]: *** [arch/x86/tools/relocs] Error 1
      make[2]: *** [archscripts] Error 2
      make[1]: *** [sub-make] Error 2
      make: *** [all] Error 2
      
      This was introduced by commit
      6520fe55 (x86, realmode: 16-bit real-mode code support for relocs),
      which added the archscripts dependency to archprepare.
      
      This patch adds the scripts_basic dependency to the x86 archscripts.
      
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Signed-off-by: default avatarMichal Marek <mmarek@suse.cz>
      24cc7fb6
  30. Aug 11, 2012
  31. Jun 24, 2012
  32. May 22, 2012
  33. May 19, 2012
    • H. Peter Anvin's avatar
      x86, realmode: 16-bit real-mode code support for relocs tool · 6520fe55
      H. Peter Anvin authored
      
      
      A new option is added to the relocs tool called '--realmode'.
      This option causes the generation of 16-bit segment relocations
      and 32-bit linear relocations for the real-mode code. When
      the real-mode code is moved to the low-memory during kernel
      initialization, these relocation entries can be used to
      relocate the code properly.
      
      In the assembly code 16-bit segment relocations must be relative
      to the 'real_mode_seg' absolute symbol. Linear relocations must be
      relative to a symbol prefixed with 'pa_'.
      
      16-bit segment relocation is used to load cs:ip in 16-bit code.
      Linear relocations are used in the 32-bit code for relocatable
      data references. They are declared in the linker script of the
      real-mode code.
      
      The relocs tool is moved to arch/x86/tools/relocs.c, and added new
      target archscripts that can be used to build scripts needed building
      an architecture.  be compiled before building the arch/x86 tree.
      
      [ hpa: accelerating this because it detects invalid absolute
        relocations, a serious bug in binutils 2.22.52.0.x which currently
        produces bad kernels. ]
      
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/r/1336501366-28617-2-git-send-email-jarkko.sakkinen@intel.com
      
      
      Signed-off-by: default avatarJarkko Sakkinen <jarkko.sakkinen@intel.com>
      Signed-off-by: default avatarH. Peter Anvin <hpa@linux.intel.com>
      Cc: <stable@vger.kernel.org>
      6520fe55
  34. May 09, 2012
  35. May 05, 2012
  36. Mar 31, 2012
  37. Feb 28, 2012