  1. May 24, 2015
    • x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers · cdeb6048
      Andy Lutomirski authored
      
      
      The early_idt_handlers asm code generates an array of entry
      points spaced nine bytes apart.  It's not really clear from that
      code or from the places that reference it what's going on, and
      the code only works in the first place because GAS never
      generates two-byte JMP instructions when jumping to global
      labels.
      
      Clean up the code to generate the correct array stride (member size)
      explicitly. This should be considerably more robust against
      screw-ups, as GAS will warn if a .fill directive has a negative
      count.  Using '. =' to advance would have been even more robust
      (it would generate an actual error if it tried to move
      backwards), but it would pad with nulls, confusing anyone who
      tries to disassemble the code.  The new scheme should be much
      clearer to future readers.
      
      While we're at it, improve the comments and rename the array and
      common code.
      
      Binutils may start relaxing jumps to non-weak labels.  If so,
      this change will fix our build, and we may need to backport this
      change.
      
      Before, on x86_64:
      
        0000000000000000 <early_idt_handlers>:
           0:   6a 00                   pushq  $0x0
           2:   6a 00                   pushq  $0x0
           4:   e9 00 00 00 00          jmpq   9 <early_idt_handlers+0x9>
                                5: R_X86_64_PC32        early_idt_handler-0x4
        ...
          48:   66 90                   xchg   %ax,%ax
          4a:   6a 08                   pushq  $0x8
          4c:   e9 00 00 00 00          jmpq   51 <early_idt_handlers+0x51>
                                4d: R_X86_64_PC32       early_idt_handler-0x4
        ...
         117:   6a 00                   pushq  $0x0
         119:   6a 1f                   pushq  $0x1f
         11b:   e9 00 00 00 00          jmpq   120 <early_idt_handler>
                                11c: R_X86_64_PC32      early_idt_handler-0x4
      
      After:
      
        0000000000000000 <early_idt_handler_array>:
           0:   6a 00                   pushq  $0x0
           2:   6a 00                   pushq  $0x0
           4:   e9 14 01 00 00          jmpq   11d <early_idt_handler_common>
        ...
          48:   6a 08                   pushq  $0x8
          4a:   e9 d1 00 00 00          jmpq   120 <early_idt_handler_common>
          4f:   cc                      int3
          50:   cc                      int3
        ...
         117:   6a 00                   pushq  $0x0
         119:   6a 1f                   pushq  $0x1f
         11b:   eb 03                   jmp    120 <early_idt_handler_common>
         11d:   cc                      int3
         11e:   cc                      int3
         11f:   cc                      int3
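
The stride arithmetic behind the "After" listing can be sketched in C. This is only an illustration: the constant names and the helper functions are assumptions made for the example, and the values 32 and 9 are read off the listing (entries at 0x0, 0x48 and 0x117, common code at 0x120), not quoted from the patch.

```c
#include <stddef.h>

/* Assumed values, read off the "After" listing above. */
#define NUM_EXCEPTION_VECTORS  32
#define EARLY_IDT_HANDLER_SIZE 9   /* array stride in bytes */

/* Byte offset of the stub for a given vector from the start of the array. */
size_t early_stub_offset(unsigned int vector)
{
    return (size_t)vector * EARLY_IDT_HANDLER_SIZE;
}

/*
 * Number of padding bytes (int3 in the listing) that follow a stub whose
 * instructions take code_len bytes. Returns -1 when the stub is longer
 * than the stride, which is the situation a negative .fill count flags.
 */
int early_stub_padding(size_t code_len)
{
    if (code_len > EARLY_IDT_HANDLER_SIZE)
        return -1;
    return (int)(EARLY_IDT_HANDLER_SIZE - code_len);
}
```

With these helpers, vector 8 lands at offset 0x48 and vector 31 at 0x117, and a stub that ends in a two-byte JMP (6 bytes of code) gets 3 bytes of padding, matching the listing.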
      
      Signed-off-by: Andy Lutomirski <luto@kernel.org>
      Acked-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Binutils <binutils@sourceware.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H.J. Lu <hjl.tools@gmail.com>
      Cc: Jan Beulich <JBeulich@suse.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/ac027962af343b0c599cbfcf50b945ad2ef3d7a8.1432336324.git.luto@kernel.org
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      cdeb6048
  2. May 17, 2015
    • x86/asm/entry/64: Use shorter MOVs from segment registers · adeb5537
      Denys Vlasenko authored
      
      
      The "movw %ds,%cx" instruction needs a 0x66 prefix, while
      "movl %ds,%ecx" does not.
      
      The difference is that the latter form (on 64-bit CPUs)
      overwrites the entire %ecx, not only its lower half.
      
      But subsequent code doesn't depend on the value of upper
      half of %ecx, so we can safely use the shorter instruction.
      
      The new code is also faster than the old one - now we don't
      depend on the old value of %ecx, but this code fragment is
      not performance-critical so it does not matter much.
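
The register semantics described above can be modelled in C as a rough sketch. The function names are made up for the illustration, and the model only covers the 64-bit CPU case this message talks about.

```c
#include <stdint.h>

/* movw %ds,%cx: only the low 16 bits of %ecx are written, so the result
 * still depends on the old upper half. */
uint32_t movw_sreg_to_reg(uint32_t old_ecx, uint16_t selector)
{
    return (old_ecx & 0xffff0000u) | selector;
}

/* movl %ds,%ecx: the entire %ecx is overwritten with the zero-extended
 * selector, so the old value is not an input at all. */
uint32_t movl_sreg_to_reg(uint32_t old_ecx, uint16_t selector)
{
    (void)old_ecx;
    return selector;
}
```

The second function ignoring old_ecx is the "no dependency on the old value" point: the two forms differ only in the upper half, which later code does not read.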
      
      Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Alexei Starovoitov <ast@plumgrid.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Drewry <wad@chromium.org>
      Link: http://lkml.kernel.org/r/1431722346-26585-1-git-send-email-dvlasenk@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      adeb5537
    • x86/asm/head*.S: Change global labels to local · e839004b
      Borislav Petkov authored
      
      
      Make the disassembly look less confusing:
      
        -- head_64.o.before.asm
        ++ head_64.o.after.asm
         0000000000000120 <early_idt_handler>:
          120:	fc                   	cld
          121:	83 3c 24 02          	cmpl   $0x2,(%rsp)
        - 125:	0f 84 9d 00 00 00    	je     1c8 <is_nmi>
        + 125:	0f 84 9d 00 00 00    	je     1c8 <early_idt_handler+0xa8>
          12b:	83 3d 00 00 00 00 02 	cmpl   $0x2,0x0(%rip)        # 132 <early_idt_handler+0x12>
          132:	74 7e                	je     1b2 <early_idt_handler+0x92>
          134:	ff 05 00 00 00 00    	incl   0x0(%rip)        # 13a <early_idt_handler+0x1a>
        @@ -1198,9 +1198,7 @@ Disassembly of section .init.text:
          1bf:	5a                   	pop    %rdx
          1c0:	59                   	pop    %rcx
          1c1:	58                   	pop    %rax
        - 1c2:	ff 0d 00 00 00 00    	decl   0x0(%rip)        # 1c8 <is_nmi>
        -
        -00000000000001c8 <is_nmi>:
        + 1c2:	ff 0d 00 00 00 00    	decl   0x0(%rip)        # 1c8 <early_idt_handler+0xa8>
          1c8:	48 83 c4 10          	add    $0x10,%rsp
          1cc:	48 cf                	iretq
      
        -- head_32.o.before.asm
        ++ head_32.o.after.asm
         0000016c <early_idt_handler>:
          16c:  fc                      cld
          16d:  83 3c 24 02             cmpl   $0x2,(%esp)
        - 171:  74 73                   je     1e6 <is_nmi>
        + 171:  74 73                   je     1e6 <ex_entry+0xc>
          173:  36 83 3d 00 00 00 00    cmpl   $0x2,%ss:0x0
          17a:  02
          17b:  74 5a                   je     1d7 <hlt_loop>
        @@ -483,8 +483,6 @@ Disassembly of section .init.text:
          1dd:  59                      pop    %ecx
          1de:  58                      pop    %eax
          1df:  36 ff 0d 00 00 00 00    decl   %ss:0x0
        -
        -000001e6 <is_nmi>:
          1e6:  83 c4 08                add    $0x8,%esp
          1e9:  cf                      iret
          1ea:  66 90                   xchg   %ax,%ax
      
      No functionality change.
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@linux.intel.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1431793079-11153-1-git-send-email-bp@alien8.de
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e839004b
    • x86: Pack loops tightly as well · 52648e83
      Ingo Molnar authored
      
      
      Packing loops tightly (-falign-loops=1) is beneficial to code size:
      
           text        data    bss     dec              filename
       12566391        1617840 1089536 15273767         vmlinux.align.16-byte
       12224951        1617840 1089536 14932327         vmlinux.align.1-byte
       11976567        1617840 1089536 14683943         vmlinux.align.1-byte.funcs-1-byte
       11903735        1617840 1089536 14611111         vmlinux.align.1-byte.funcs-1-byte.loops-1-byte
      
      Which reduces the size of the kernel by another 0.6%, so the
      total combined size reduction of the alignment-packing
      patches is ~5.5%.
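
For readers who want to redo the arithmetic, here is a small C sketch that works on the text column of the table above. The array and function names are invented for the example. On the text column alone the last step comes out at about 0.6% and the cumulative reduction at a little under 5.3%, so the ~5.5% quoted above is best read as a rough figure.

```c
/* Text sizes in bytes, copied from the table above, in table order:
 * 16-byte alignment, 1-byte jumps, 1-byte functions, 1-byte loops. */
const unsigned long text_size[4] = {
    12566391UL, 12224951UL, 11976567UL, 11903735UL
};

/* Percentage by which 'after' is smaller than 'before'. */
double shrink_percent(unsigned long before, unsigned long after)
{
    return 100.0 * (double)(before - after) / (double)before;
}
```

shrink_percent(text_size[2], text_size[3]) gives the effect of this patch on its own, and shrink_percent(text_size[0], text_size[3]) gives the combined effect of the three alignment patches.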
      
      The x86 decoder bandwidth and caching arguments laid out in:
      
        be6cb027 ("x86: Align jump targets to 1-byte boundaries")
      
      apply to loop alignment as well.
      
      Furthermore, modern CPU uarchs have a loop cache/buffer that
      is an L0 cache before even any uop cache, covering a few
      dozen most recently executed instructions.
      
      This loop cache generally does not have the 16-byte alignment
      restrictions of the uop cache.
      
      Now loop alignment can still be beneficial if:
      
       - a loop is cache-hot and its surroundings are not.
      
       - if the loop is so cache hot that the instruction
         flow becomes x86 decoder bandwidth limited
      
      But loop alignment is harmful if:
      
       - a loop is cache-cold
      
       - a loop's surroundings are cache-hot as well
      
       - two cache-hot loops are close to each other
      
       - if the loop fits into the loop cache
      
       - if the code flow is not decoder bandwidth limited
      
      and I'd argue that the latter five scenarios are much
      more common in the kernel, as our hottest loops are
      typically:
      
       - pointer chasing: this should fit into the loop cache
         in most cases and is typically data cache and address
         generation limited
      
       - generic memory ops (memset, memcpy, etc.): these generally
         fit into the loop cache as well, and are likewise data
         cache limited.
      
      So this patch packs loop addresses tightly as well.
      
      Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20150410123017.GB19918@gmail.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      52648e83
  3. May 16, 2015
  4. May 15, 2015
    • x86: Align jump targets to 1-byte boundaries · be6cb027
      Ingo Molnar authored
      
      
      The following NOP in a hot function caught my attention:
      
        >   5a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
      
      That's a dead NOP that bloats the function a bit, added for the
      default 16-byte alignment that GCC applies for jump targets.
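
Why that NOP is exactly six bytes long can be seen from a short C sketch of the padding calculation. The helper name is an assumption for the example; it is not taken from GCC or the kernel.

```c
#include <stdint.h>

/*
 * Bytes of padding needed so that the next instruction starts at a
 * multiple of 'align'. 'align' must be a power of two; an alignment
 * of 1 never needs padding.
 */
uint64_t align_padding(uint64_t addr, uint64_t align)
{
    return (align - (addr & (align - 1))) & (align - 1);
}
```

For the address in the listing, align_padding(0x5a, 16) is 6, which takes the next instruction to 0x60. With an alignment of 1 the result is always 0, so no NOP is emitted.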
      
      I realize that x86 CPU manufacturers recommend 16-byte jump
      target alignments (it's in the Intel optimization manual),
      to help their relatively narrow decoder prefetch alignment
      and uop cache constraints, but the cost of that is very
      significant:
      
              text           data       bss         dec      filename
          12566391        1617840   1089536    15273767      vmlinux.align.16-byte
          12224951        1617840   1089536    14932327      vmlinux.align.1-byte
      
      By using 1-byte jump target alignment (i.e. no alignment at all)
      we get an almost 3% reduction in kernel size (!) - and a
      probably similar reduction in I$ footprint.
      
      Now, the usual justification for jump target alignment is the
      following:
      
       - modern decoders tend to have 16-byte (effective) decoder
         prefetch windows. (AMD documents it higher but measurements
         suggest the effective prefetch window on current uarchs is
         still around 16 bytes)
      
       - on Intel there's also the uop-cache with cachelines that have
         16-byte granularity and limited associativity.
      
       - older x86 uarchs had a penalty for decoder fetches that crossed
         16-byte boundaries. These limits are mostly gone from recent
         uarchs.
      
      So if a forward jump target is aligned to cacheline boundary then
      prefetches will start from a new prefetch-cacheline and there's
      higher chance for decoding in fewer steps and packing tightly.
      
      But I think that argument is flawed for typical optimized kernel
      code flows: forward jumps often go to 'cold' (uncommon) pieces
      of code, and aligning cold code to cache lines does not bring a
      lot of advantages (they are uncommon), while it causes
      collateral damage:
      
       - their alignment 'spreads out' the cache footprint, it shifts
         followup hot code further out
      
       - plus it slows down even 'cold' code that immediately follows 'hot'
         code (like in the above case), which could have benefited from the
         partial cacheline that comes off the end of hot code.
      
      But even in the cache-hot case the 16 byte alignment brings
      disadvantages:
      
       - it spreads out the cache footprint, possibly making the code
         fall out of the L1 I$.
      
       - On Intel CPUs, recent microarchitectures have plenty of
         uop cache (typically doubling every 3 years) - while the
         size of the L1 cache grows much less aggressively. So
         workloads are rarely uop cache limited.
      
      The only situation where alignment might matter is tight
      loops that could fit into a single 16 byte chunk - but those
      are pretty rare in the kernel: if they exist they tend
      to be pointer chasing or generic memory ops, which both tend
      to be cache miss (or cache allocation) intensive and are not
      decoder bandwidth limited.
      
      So the balance of arguments strongly favors packing kernel
      instructions tightly versus maximizing for decoder bandwidth:
      this patch changes the jump target alignment from 16 bytes
      to 1 byte (tightly packed, unaligned).
      
      Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aswin Chandramouleeswaran <aswin@hp.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jason Low <jason.low2@hp.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tim Chen <tim.c.chen@linux.intel.com>
      Link: http://lkml.kernel.org/r/20150410120846.GA17101@gmail.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      be6cb027
  5. May 14, 2015
  6. May 13, 2015
    • Revert "ARM: rockchip: fix undefined instruction of reset_ctrl_regs" · 3f937cf3
      Heiko Stuebner authored
      
      
      This reverts commit b403125d.
      
      As reported by Chris, both commits
              b403125d "ARM: rockchip: fix undefined instruction of reset_ctrl_regs"
              0ea001d3 "ARM: rockchip: disable dapswjdp during suspend"
      actually fix the same issue and b403125d is the older one, which got
      superseded by 0ea001d3. Therefore revert the obsolete one again.
      
      Reported-by: Chris Zhong <zyw@rock-chips.com>
      Signed-off-by: Heiko Stuebner <heiko@sntech.de>
      3f937cf3
    • ARM: EXYNOS: Fix dereference of ERR_PTR returned by of_genpd_get_from_provider · 0b7dc0ff
      Krzysztof Kozlowski authored
      
      
      ERR_PTR was dereferenced during sub domain parsing, if parent domain
      could not be obtained (because of invalid phandle or deferred
      registration of parent domain).
      
      The Exynos power domain code checked whether
      of_genpd_get_from_provider() returned NULL and in that case it skipped
      that power domain node. However this function returns ERR_PTR or valid
      pointer, not NULL.
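
The difference between the two checks can be shown with a userspace C sketch. Everything here is a simplified stand-in written for the illustration: err_ptr() and is_err() imitate the kernel's ERR_PTR()/IS_ERR() helpers, and struct domain and lookup_parent() are hypothetical, not the Exynos code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace imitations of ERR_PTR() and IS_ERR(): an error code is
 * encoded as a pointer value in the last MAX_ERRNO bytes of the
 * address space. */
void *err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct domain { int id; };  /* hypothetical */

/* Hypothetical lookup that, like the function named above, returns a
 * valid pointer or an encoded error, never NULL. */
struct domain *lookup_parent(struct domain *table, size_t n, int id)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (table[i].id == id)
            return &table[i];
    return err_ptr(-ENOENT);
}

/* The old style of check: treats an encoded error as usable. */
int usable_by_null_check(const struct domain *d)
{
    return d != NULL;
}

/* The kind of check the fix calls for. */
int usable_by_is_err_check(const struct domain *d)
{
    return !is_err(d);
}
```

A failed lookup passes the NULL check and would then be dereferenced, while the is_err() check rejects it.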
      
      Fixes: 0f780751 ("ARM: EXYNOS: add support for sub-power domains")
      Cc: <stable@vger.kernel.org>	[4.0+]
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Kukjin Kim <kgene@kernel.org>
      0b7dc0ff
    • x86: bpf_jit: fix FROM_BE16 and FROM_LE16/32 instructions · 343f845b
      Alexei Starovoitov authored
      
      
      FROM_BE16:
      'ror %reg, 8' doesn't clear upper bits of the register,
      so use additional 'movzwl' insn to zero extend 16 bits into 64
      
      FROM_LE16:
      should zero extend lower 16 bits into 64 bit
      
      FROM_LE32:
      should zero extend lower 32 bits into 64 bit
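
The intended results can be written down as plain C functions on a 64-bit register value. This is a sketch of the semantics on a little-endian host such as x86, not the JIT code itself; the function names are chosen for the example.

```c
#include <stdint.h>

/* FROM_BE16: swap the two low bytes and zero-extend to 64 bits. */
uint64_t from_be16(uint64_t reg)
{
    uint16_t lo = (uint16_t)reg;

    return (uint16_t)((lo << 8) | (lo >> 8));
}

/* What a 16-bit rotate by 8 alone does: the low 16 bits are swapped,
 * but bits 16..63 keep whatever was in the register before. */
uint64_t from_be16_ror_only(uint64_t reg)
{
    return (reg & ~(uint64_t)0xffff) | from_be16(reg);
}

/* FROM_LE16 / FROM_LE32: no byte swap needed on x86, only zero-extension. */
uint64_t from_le16(uint64_t reg)
{
    return reg & 0xffff;
}

uint64_t from_le32(uint64_t reg)
{
    return reg & 0xffffffffu;
}
```

For a register holding 0xdeadbeef11223344, from_be16() yields 0x4433, whereas the rotate-only version leaves 0xdeadbeef11224433, which shows the stale upper bits the fix removes.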
      
      Fixes: 89aa0758 ("net: sock: allow eBPF programs to be attached to sockets")
      Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      343f845b
    • MIPS: IP32: Fix build errors in reset code in DS1685 platform hook. · 4305689d
      Joshua Kinard authored
      
      
      Fix two build errors in reset code introduced in DS1685 platform hook patch.
      
      Signed-off-by: Joshua Kinard <kumba@gentoo.org>
      Fixes: 15beb694: "mips: ip32: add platform data hooks to use DS1685 driver"
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: LKML <linux-kernel@vger.kernel.org>
      Cc: rtc-linux@googlegroups.com
      Cc: Linux MIPS List <linux-mips@linux-mips.org>
      Patchwork: https://patchwork.linux-mips.org/patch/9787/
      
      
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      4305689d
    • MIPS: KVM: Fix unused variable build warning · 5f508c43
      Nicholas Mc Guire authored
      
      
      As James Hogan <james.hogan@imgtec.com> explained,
      kvm_mips_complete_mmio_load() has not yet modified PC at this point, so
      the curr_pc variable and the comments along with it can be dropped.
      
      Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org>
      Link: http://lkml.org/lkml/2015/5/8/422
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Cc: linux-kernel@vger.kernel.org
      Patchwork: https://patchwork.linux-mips.org/patch/9993/
      
      
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      5f508c43
    • MIPS: traps: remove extra Tainted: line from __show_regs() output · 2d2ec2f7
      Petri Gynther authored
      
      
      __show_regs() calls show_regs_print_info(), which already outputs
      the Tainted: information. So, no need to output it twice.
      
      Signed-off-by: Petri Gynther <pgynther@google.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9997/
      
      
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      2d2ec2f7
    • MIPS: Fix wrong CHECKFLAGS (sparse builds) with GCC 5.1 · 73d8f99c
      Aaro Koskinen authored
      
      
      GCC 5.1 defines __REGISTER_PREFIX__ to $. This breaks the sparse
      command line (the build fails with: /bin/sh: syntax error:
      unexpected "(") because make tries to expand what follows the dollar
      sign as a make variable. Prevent that by using a double dollar sign.
      
      Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/10025/
      
      
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      73d8f99c
    • MIPS: Fix a preemption issue with thread's FPU defaults · 03dce595
      Maciej W. Rozycki authored
      
      
      Fix "BUG: using smp_processor_id() in preemptible" reported in accesses
      to thread's FPU defaults: the value to initialise FCSR to at program
      startup, the FCSR r/w mask and the contents of FIR in full FPU
      emulation, removing a regression introduced with 9b26616c [MIPS: Respect
      the ISA level in FCSR handling] and f6843626 [MIPS: math-emu: Set FIR
      feature flags for full emulation].
      
      Use `boot_cpu_data' to obtain the data from, following the approach that
      `cpu_has_*' macros take and avoiding the call to `smp_processor_id' made
      in the reference to `current_cpu_data'.  The contents of FCSR have to be
      consistent across processors in an SMP system, the settings there must
      not change as a thread is migrated across processors.  And the contents
      of FIR are guaranteed to be consistent in FPU emulation, by definition.
      
      Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org>
      Tested-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
      Tested-by: Paul Martin <paul.martin@codethink.co.uk>
      Cc: Markos Chandras <Markos.Chandras@imgtec.com>
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/10030/
      
      
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      03dce595
    • parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures · d045c77c
      Helge Deller authored
      
      
      On architectures where the stack grows upwards (CONFIG_STACK_GROWSUP=y,
      currently parisc and metag only) stack randomization sometimes leads to crashes
      when the stack ulimit is set to lower values than STACK_RND_MASK (which is 8 MB
      by default if not defined in arch-specific headers).
      
      The problem is that when the stack vm_area_struct is set up in fs/exec.c, the
      additional space needed for the stack randomization (as defined by the value of
      STACK_RND_MASK) was not taken into account yet and as such, when the stack
      randomization code added a random offset to the stack start, the stack
      effectively got smaller than what the user defined via rlimit_max(RLIMIT_STACK)
      which then sometimes leads to out-of-stack situations and crashes.
      
      This patch fixes it by adding the maximum possible amount of memory (based on
      STACK_RND_MASK) which theoretically could be added by the stack randomization
      code to the initial stack size. That way, the user-defined stack size is always
      guaranteed to be at minimum what is defined via rlimit_max(RLIMIT_STACK).
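
The effect of the fix can be illustrated with a deliberately simplified C model. It treats the stack as a number of reserved bytes from which the random offset is subtracted; the function and parameter names are assumptions for the example, and the real code in fs/exec.c works on addresses, not on sizes like this.

```c
/* Bytes of stack left after the random offset is taken out of the
 * reserved area. */
unsigned long usable_stack(unsigned long reserved, unsigned long rnd_offset)
{
    return rnd_offset >= reserved ? 0 : reserved - rnd_offset;
}

/* Before the fix: only the stack limit is reserved, so any non-zero
 * random offset eats into what the user asked for. */
unsigned long usable_before_fix(unsigned long limit, unsigned long max_rnd,
                                unsigned long rnd_offset)
{
    (void)max_rnd;
    return usable_stack(limit, rnd_offset);
}

/* After the fix: the largest possible random offset is reserved on top
 * of the limit. Assumes rnd_offset <= max_rnd. */
unsigned long usable_after_fix(unsigned long limit, unsigned long max_rnd,
                               unsigned long rnd_offset)
{
    return usable_stack(limit + max_rnd, rnd_offset);
}
```

With a 1 MB limit, an 8 MB maximum offset and a 3 MB actual offset, the first model leaves no usable stack at all, while the second always leaves at least the 1 MB limit.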
      
      This bug is currently not visible on the metag architecture, because on metag
      STACK_RND_MASK is defined to 0 which effectively disables stack randomization.
      
      The changes to fs/exec.c are inside an "#ifdef CONFIG_STACK_GROWSUP"
      section, so they do not affect platforms other than those where the
      stack grows upwards (parisc and metag).
      
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: linux-parisc@vger.kernel.org
      Cc: James Hogan <james.hogan@imgtec.com>
      Cc: linux-metag@vger.kernel.org
      Cc: stable@vger.kernel.org # v3.16+
      d045c77c
    • ARM: EXYNOS: Don't try to initialize suspend on old DT · e5cbec61
      Julien Grall authored
      
      
      Since commit 8b283c02 ("ARM: exynos4/5: convert pmu wakeup to
      stacked domains"), suspend/resume is not supported on old DT.
      
      However, rather than printing a warning and continuing to boot, the
      kernel crashes right after the warning:
      
      ------------[ cut here ]------------
      
      WARNING: CPU: 1 PID: 1 at arch/arm/mach-exynos/suspend.c:726 exynos_pm_init+0x4c/0xc8()
      Modules linked in:
      CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.1.0-rc3 #1
      Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
      [<c02181c4>] (unwind_backtrace) from [<c0213b2c>] (show_stack+0x10/0x14)
      [<c0213b2c>] (show_stack) from [<c0949890>] (dump_stack+0x70/0x8c)
      [<c0949890>] (dump_stack) from [<c024f0b0>] (warn_slowpath_common+0x74/0xac)
      [<c024f0b0>] (warn_slowpath_common) from [<c024f104>] (warn_slowpath_null+0x1c/0x24)
      [<c024f104>] (warn_slowpath_null) from [<c0cf1d28>] (exynos_pm_init+0x4c/0xc8)
      [<c0cf1d28>] (exynos_pm_init) from [<c0ceaae8>] (init_machine_late+0x1c/0x28)
      [<c0ceaae8>] (init_machine_late) from [<c020aa64>] (do_one_initcall+0x80/0x1d0)
      [<c020aa64>] (do_one_initcall) from [<c0ce8d4c>] (kernel_init_freeable+0x10c/0x1d8)
      [<c0ce8d4c>] (kernel_init_freeable) from [<c0944a2c>] (kernel_init+0x8/0xe4)
      [<c0944a2c>] (kernel_init) from [<c0210e60>] (ret_from_fork+0x14/0x34)
      ---[ end trace 335bd937d409f3c7 ]---
      Outdated DT detected, suspend/resume will NOT work
      Unable to handle kernel NULL pointer dereference at virtual address 00000608
      pgd = c0204000
      [00000608] *pgd=00000000
      Internal error: Oops: 5 [#1] SMP ARM
      Modules linked in:
      CPU: 1 PID: 1 Comm: swapper/0 Tainted: G        W       4.1.0-rc3 #1
      Hardware name: SAMSUNG EXYNOS (Flattened Device Tree)
      task: db06c000 ti: db05a000 task.ti: db05a000
      PC is at exynos_pm_init+0x6c/0xc8
      LR is at exynos_pm_init+0x54/0xc8
      pc : [<c0cf1d48>]    lr : [<c0cf1d30>]    psr: 60000113
      sp : db05bee8  ip : 00000000  fp : 00000000
      r10: 00000116  r9 : c0dab2d4  r8 : d8d5f440
      r7 : c0db7ad8  r6 : c0db7ad8  r5 : 00000000  r4 : c0ceaacc
      r3 : c0eb2aec  r2 : c0951e40  r1 : 00000000  r0 : c0eb2acc
      Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
      Control: 10c5387d  Table: 6020406a  DAC: 00000015
      Process swapper/0 (pid: 1, stack limit = 0xdb05a220)
      Stack: (0xdb05bee8 to 0xdb05c000)
      bee0:                   c0db7ad8 c0d8fe34 c0cf17c8 c0ceaae8 00000000 c020aa64
      bf00: 00000033 c09580b8 db04fd00 c0ed79a4 c0eb1000 c0ce8588 c0ca2bc4 c0353fcc
      bf20: 00000000 c0df358c 60000113 00000000 dbfffba4 00000000 c0ca2bc4 c026654c
      bf40: c0b80134 c0ca1a64 00000007 00000007 c0df3554 c0d6c2f4 00000007 c0d6c2d4
      bf60: c0eb1000 c0ce8588 c0dab2d4 00000116 00000000 c0ce8d4c 00000007 00000007
      bf80: c0ce8588 c0944a24 00000000 c0944a24 00000000 00000000 00000000 00000000
      bfa0: 00000000 c0944a2c 00000000 c0210e60 00000000 00000000 00000000 00000000
      bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
      bfe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000
      [<c0cf1d48>] (exynos_pm_init) from [<c0ceaae8>] (init_machine_late+0x1c/0x28)
      [<c0ceaae8>] (init_machine_late) from [<c020aa64>] (do_one_initcall+0x80/0x1d0)
      [<c020aa64>] (do_one_initcall) from [<c0ce8d4c>] (kernel_init_freeable+0x10c/0x1d8)
      [<c0ce8d4c>] (kernel_init_freeable) from [<c0944a2c>] (kernel_init+0x8/0xe4)
      [<c0944a2c>] (kernel_init) from [<c0210e60>] (ret_from_fork+0x14/0x34)
      Code: e59f005c e59220c0 e5901000 e5832000 (e591e608)
      ---[ end trace 335bd937d409f3c8 ]---
      
      This is happening because pmu_base_addr is only initialized when the
      PMU is an interrupt controller. It's not the case on old DT.
      
      Signed-off-by: Julien Grall <julien.grall@citrix.com>
      Signed-off-by: Kukjin Kim <kgene@kernel.org>
      e5cbec61
    • ARM: dts: Add keep-power-in-suspend to WiFi SDIO node for Peach Boards · b2706879
      Javier Martinez Canillas authored
      
      
      The Marvell mwifiex driver prevents the system from entering a suspend
      state if the card power is not preserved during a suspend/resume cycle.
      
      So Suspend-to-RAM and Suspend-to-idle are failing on Exynos5800 Peach Pi
      and Exynos5420 Peach Pit Chromebooks.
      
      Add the keep-power-in-suspend Power Management property to the SDIO/MMC
      node so the mwifiex suspend handler doesn't fail and the system is able
      to enter into a suspend state.
      
      Signed-off-by: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
      Reviewed-by: Doug Anderson <dianders@chromium.org>
      Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: Kukjin Kim <kgene@kernel.org>
      b2706879
    • MIPS: fix FP mode selection in lieu of .MIPS.abiflags data · 620b1550
      Paul Burton authored
      
      
      Commit 46490b57 ("MIPS: kernel: elf: Improve the overall ABI and FPU
      mode checks") reworked the ELF FP ABI mode selection logic, but when
      CONFIG_MIPS_O32_FP64_SUPPORT is enabled it breaks the use of binaries
      which have no PT_MIPS_ABIFLAGS program header & associated
      .MIPS.abiflags section.
      
      A default mode is selected based upon whether the ELF contains MIPS32 or
      MIPS64 code, but that selection is made in arch_elf_pt_proc.
      arch_elf_pt_proc only executes when a PT_MIPS_ABIFLAGS program header is
      found. If one is not found then arch_elf_pt_proc is never called, and no
      default overall_fp_mode value is selected. When arch_check_elf is
      called, both abi0 & abi1 are MIPS_ABI_FP_UNKNOWN which leads to both
      prog_req & interp_req being set to none_req. none_req matches none of
      the conditions for mode selection at the end of arch_check_elf, so
      overall_fp_mode is left untouched. Finally once mips_set_personality_fp
      is called the BUG() in the default case is then hit & the kernel likely
      panics.
      
      Fix this by moving the selection of a default overall mode to the start
      of arch_check_elf, which runs once per ELF executed regardless of
      whether it has a PT_MIPS_ABIFLAGS program header.
      
      Signed-off-by: Paul Burton <paul.burton@imgtec.com>
      Cc: Markos Chandras <markos.chandras@imgtec.com>
      Cc: Matthew Fortune <matthew.fortune@imgtec.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: stable@vger.kernel.org # v4.0+
      Patchwork: http://patchwork.linux-mips.org/patch/9978/
      
      
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
      620b1550
  7. May 12, 2015
    • arm64: perf: fix memory leak when probing PMU PPIs · 4801ba33
      Will Deacon authored
      
      
      Commit d795ef9a ("arm64: perf: don't warn about missing
      interrupt-affinity property for PPIs") added a check for PPIs so that
      we avoid parsing the interrupt-affinity property for these naturally
      affine interrupts.
      
      Unfortunately, this check can trigger an early (successful) return and
      we will leak the irqs array. This patch fixes the issue by reordering
      the code so that the check is performed before any independent
      allocation.
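      
      The reordering can be sketched as follows. The function name,
      parameters and return convention are hypothetical; the only point
      illustrated is that the early-return check comes before the
      allocation, so the successful early return has nothing to free.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical probe helper: returns 0 on success, -1 on allocation
 * failure. On success *out_irqs is NULL for PPIs, else a fresh array. */
int probe_irqs(bool irq_is_ppi, int nr_irqs, int **out_irqs)
{
	int *irqs;

	*out_irqs = NULL;

	/* Check first: PPIs need no affinity table, so return before
	 * anything has been allocated. */
	if (irq_is_ppi)
		return 0;

	irqs = calloc((size_t)nr_irqs, sizeof(*irqs));
	if (!irqs)
		return -1;

	*out_irqs = irqs;	/* ownership passes to the caller */
	return 0;
}
```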
      
      Reported-by: David Binderman <dcb314@hotmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • ARM: gemini: fix compiler warning due to wrong data type · 31fc835f
      Hans Ulli Kroll authored
      
      
      This patch fixes a compiler warning in gemini_restart()
      issued by commit 7b6d864b ("reboot:arm: reboot_mode
      changes from char to enum reboot_mode").
      
      arch/arm/mach-gemini/board-rut1xx.c:93:2: warning: initialization from incompatible pointer type
      
      The warning is harmless, and the patch does not need to
      be backported to stable kernels.
      
      Fixes: 7b6d864b ("reboot:arm: reboot_mode changes from char to enum reboot_mode.")
      Signed-off-by: Hans Ulli Kroll <ulli.kroll@googlemail.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • ARM: vexpress/tc2: Add interrupt-affinity to the PMU node · 51ef519c
      Sudeep Holla authored
      
      
      Commit 9fd85eb5 ("ARM: pmu: add support for interrupt-affinity
      property") added an optional "interrupt-affinity" property, to specify
      the CPU affinity for each SPI listed in the interrupts property.
      
      Without this property, we get this boot warning:
      
        CPU PMU: Failed to parse <no-node>/interrupt-affinity[0]
      
      This patch adds interrupt-affinity to the PMU node in the
      vexpress-ca15_a7 (a.k.a. TC2) device tree.
      
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • ARM: vexpress/ca9: Add interrupt-affinity to the PMU node · 613880a1
      Robert Schwebel authored
      
      
      Commit 9fd85eb5 ("ARM: pmu: add support for interrupt-affinity
      property") added an optional "interrupt-affinity" property, to specify
      the CPU affinity for each SPI listed in the interrupts property.
      
      Without this property, we get this boot warning:
      
        CPU PMU: Failed to parse <no-node>/interrupt-affinity[0]
      
      This patch adds interrupt-affinity to the PMU node in the
      vexpress-v2p-ca9 device tree.
      
      Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • ARM: vexpress/ca9: Add unified-cache property to l2 cache node · 2004f98a
      Robert Schwebel authored
      
      
      Commit d9d1f3e2 ("ARM: l2c: check that DT files specify the required
      "cache-unified" property") requires device trees to specify this property.
      Without this property, we get this boot warning:
      
      "L2C: device tree omits to specify unified cache"
      
      This patch adds "cache-unified" property to L2 cache node in vexpress
      CA9 device tree.
      
      Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de>
      Acked-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • ARM64: juno: add sp810 support and fix sp804 clock frequency · 3bb1555c
      Sudeep Holla authored
      
      
      The clock generator in IOFPGA generates the two source clocks: 32kHz and
      1MHz for the SP810 System Controller.
      
      The SP810 System Controller selects 32kHz or 1MHz as the sources for
      TIM_CLK[3:0], the SP804 timer clocks. The powerup default is 32kHz but
      the maximum of "refclk" and "timclk" is chosen by the SP810 driver.
      
      This patch adds support for SP810 system controller and also fixes the
      SP804 timer clock frequency.
      
      However, the SP804 driver needs to be enabled on ARM64 to test this,
      which requires the SP804 driver to be moved out of arch/arm.
      
      Fixes: 71f867ec ("arm64: Add Juno board device tree.")
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Olof Johansson <olof@lixom.net>
      Acked-by: Liviu Dudau <Liviu.Dudau@arm.com>
      Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    • MIPS: SMP: Fix build error. · cafb45b2
      Ralf Baechle authored
      
      
        CC      arch/mips/kernel/smp.o
      arch/mips/kernel/smp.c: In function ‘start_secondary’:
      arch/mips/kernel/smp.c:149:2: error: passing argument 2 of ‘cpumask_set_cpu’ discards ‘volatile’ qualifier from pointer target type [-Werror]
        cpumask_set_cpu(cpu, &cpu_callin_map);
        ^
      In file included from ./arch/mips/include/asm/processor.h:14:0,
                       from ./arch/mips/include/asm/thread_info.h:15,
                       from include/linux/thread_info.h:54,
                       from include/asm-generic/preempt.h:4,
                       from arch/mips/include/generated/asm/preempt.h:1,
                       from include/linux/preempt.h:18,
                       from include/linux/interrupt.h:8,
                       from arch/mips/kernel/smp.c:24:
      include/linux/cpumask.h:272:91: note: expected ‘struct cpumask *’ but argument is of type ‘volatile struct cpumask_t *’
       static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
                                                                                                 ^
      arch/mips/kernel/smp.c: In function ‘smp_prepare_boot_cpu’:
      arch/mips/kernel/smp.c:211:2: error: passing argument 2 of ‘cpumask_set_cpu’ discards ‘volatile’ qualifier from pointer target type [-Werror]
        cpumask_set_cpu(0, &cpu_callin_map);
        ^
      In file included from ./arch/mips/include/asm/processor.h:14:0,
                       from ./arch/mips/include/asm/thread_info.h:15,
                       from include/linux/thread_info.h:54,
                       from include/asm-generic/preempt.h:4,
                       from arch/mips/include/generated/asm/preempt.h:1,
                       from include/linux/preempt.h:18,
                       from include/linux/interrupt.h:8,
                       from arch/mips/kernel/smp.c:24:
      include/linux/cpumask.h:272:91: note: expected ‘struct cpumask *’ but argument is of type ‘volatile struct cpumask_t *’
       static inline void cpumask_set_cpu(unsigned int cpu, struct cpumask *dstp)
                                                                                                 ^
      arch/mips/kernel/smp.c: In function ‘__cpu_up’:
      arch/mips/kernel/smp.c:221:10: error: passing argument 2 of ‘cpumask_test_cpu’ discards ‘volatile’ qualifier from pointer target type [-Werror]
        while (!cpumask_test_cpu(cpu, &cpu_callin_map))
                ^
      In file included from ./arch/mips/include/asm/processor.h:14:0,
                       from ./arch/mips/include/asm/thread_info.h:15,
                       from include/linux/thread_info.h:54,
                       from include/asm-generic/preempt.h:4,
                       from arch/mips/include/generated/asm/preempt.h:1,
                       from include/linux/preempt.h:18,
                       from include/linux/interrupt.h:8,
                       from arch/mips/kernel/smp.c:24:
      include/linux/cpumask.h:294:90: note: expected ‘const struct cpumask *’ but argument is of type ‘volatile struct cpumask_t *’
       static inline int cpumask_test_cpu(int cpu, const struct cpumask *cpumask)
                                                                                                ^
      cc1: all warnings being treated as errors
      make[2]: *** [arch/mips/kernel/smp.o] Error 1
      make[1]: *** [arch/mips/kernel] Error 2
      make: *** [arch/mips] Error 2
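      
      The diagnostic above arises whenever the address of a volatile object
      is passed to a function whose parameter is a pointer to a non-volatile
      type. The sketch below uses hypothetical names (struct mask,
      callin_map) rather than the kernel's cpumask API, and it shows only
      one way such an error can be avoided, namely declaring the object
      without the qualifier. The log does not say which approach the commit
      itself took.

```c
/* Hypothetical miniature of a CPU mask and its accessors. */
struct mask {
	unsigned long bits;
};

static inline void mask_set(unsigned int cpu, struct mask *dst)
{
	dst->bits |= 1UL << cpu;
}

static inline int mask_test(unsigned int cpu, const struct mask *src)
{
	return (int)((src->bits >> cpu) & 1UL);
}

/* Declared without "volatile", so &callin_map converts cleanly to
 * "struct mask *" and "const struct mask *". With the qualifier, both
 * calls below would discard it and fail under -Werror. */
static struct mask callin_map;

int mark_and_check(unsigned int cpu)
{
	mask_set(cpu, &callin_map);
	return mask_test(cpu, &callin_map);
}
```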
      
      Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
    • ARM: OMAP2+: Remove bogus struct clk comparison for timer clock · b0897972
      Tony Lindgren authored
      
      
      With recent changes to use determine_rate, comparing two struct clk
      pointers no longer works without clk_is_match, which calls
      __clk_get_hw on the clocks first.
      
      As we've been calling clk_set_parent unconditionally anyway
      because of the bogus comparison, let's just remove the
      check as suggested by Stephen Boyd <sboyd@codeaurora.org>.
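      
      Why a plain pointer comparison can be bogus is easiest to see in a
      small model. The sketch assumes that each consumer holds its own
      handle wrapping a shared hardware clock; the struct and function
      names are hypothetical and are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one hardware clock, many per-consumer handles. */
struct hw_clock {
	const char *name;
};

struct clk_handle {
	struct hw_clock *hw;
};

/* Compare the underlying hardware clocks rather than the handles. */
bool same_hw(const struct clk_handle *a, const struct clk_handle *b)
{
	if (a == b)
		return true;
	if (!a || !b)
		return false;
	return a->hw == b->hw;
}
```

      Two handles obtained for the same clock have different addresses, so
      comparing the handles directly reports a mismatch every time, which is
      why the parent was always being set.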
      
      Cc: Michael Turquette <mturquette@linaro.org>
      Cc: Paul Walmsley <paul@pwsan.com>
      Cc: Tero Kristo <t-kristo@ti.com>
      Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
      Acked-by: Stephen Boyd <sboyd@codeaurora.org>
      Signed-off-by: Tony Lindgren <tony@atomide.com>
  8. May 11, 2015
    • ARM: dove: Add clock-names to CuBox Si5351 clk generator · ba0a1ff8
      Sebastian Hesselbarth authored
      
      
      The Si5351 clock generator on CuBox uses XTAL as its clock reference,
      so name the clock phandle accordingly.
      
      Signed-off-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
    • perf/x86/rapl: Enable Broadwell-U RAPL support · 44b11fee
      Stephane Eranian authored
      
      
      This patch enables support for the RAPL (energy consumption) counters
      on Intel Broadwell-U processors (Model 61).
      
      To use:
      
        $ perf stat -a -I 1000 -e power/energy-cores/,power/energy-pkg/,power/energy-ram/ sleep 10
      
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: jacob.jun.pan@linux.intel.com
      Cc: kan.liang@intel.com
      Cc: peterz@infradead.org
      Cc: sonnyrao@chromium.org
      Link: http://lkml.kernel.org/r/20150423070709.GA4970@thinkpad
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/alternatives: Switch AMD F15h and later to the P6 NOPs · f21262b8
      Borislav Petkov authored
      
      
      Software optimization guides for both F15h and F16h cite those
      NOPs as the optimal ones. A microbenchmark confirms that
      actually even older families are better with the single-insn
      NOPs so switch to them for the alternatives.
      
      The cycle counts below include the loop overhead of the measurement,
      but that overhead is the same across all runs.
      
      	F10h, revE:
      	-----------
      	Running NOP tests, 1000 NOPs x 1000000 repetitions
      
      	K8:
      			      90     288.212282 cycles
      			   66 90     288.220840 cycles
      			66 66 90     288.219447 cycles
      		     66 66 66 90     288.223204 cycles
      		  66 66 90 66 90     571.393424 cycles
      	       66 66 90 66 66 90     571.374919 cycles
      	    66 66 66 90 66 66 90     572.249281 cycles
      	 66 66 66 90 66 66 66 90     571.388651 cycles
      
      	P6:
      			      90     288.214193 cycles
      			   66 90     288.225550 cycles
      			0f 1f 00     288.224441 cycles
      		     0f 1f 40 00     288.225030 cycles
      		  0f 1f 44 00 00     288.233558 cycles
      	       66 0f 1f 44 00 00     324.792342 cycles
      	    0f 1f 80 00 00 00 00     325.657462 cycles
      	 0f 1f 84 00 00 00 00 00     430.246643 cycles
      
      	F14h:
      	----
      	Running NOP tests, 1000 NOPs x 1000000 repetitions
      
      	K8:
      			      90     510.404890 cycles
      			   66 90     510.432117 cycles
      			66 66 90     510.561858 cycles
      		     66 66 66 90     510.541865 cycles
      		  66 66 90 66 90    1014.192782 cycles
      	       66 66 90 66 66 90    1014.226546 cycles
      	    66 66 66 90 66 66 90    1014.334299 cycles
      	 66 66 66 90 66 66 66 90    1014.381205 cycles
      
      	P6:
      			      90     510.436710 cycles
      			   66 90     510.448229 cycles
      			0f 1f 00     510.545100 cycles
      		     0f 1f 40 00     510.502792 cycles
      		  0f 1f 44 00 00     510.589517 cycles
      	       66 0f 1f 44 00 00     510.611462 cycles
      	    0f 1f 80 00 00 00 00     511.166794 cycles
      	 0f 1f 84 00 00 00 00 00     511.651641 cycles
      
      	F15h:
      	-----
      	Running NOP tests, 1000 NOPs x 1000000 repetitions
      
      	K8:
      			      90     243.128396 cycles
      			   66 90     243.129883 cycles
      			66 66 90     243.131631 cycles
      		     66 66 66 90     242.499324 cycles
      		  66 66 90 66 90     481.829083 cycles
      	       66 66 90 66 66 90     481.884413 cycles
      	    66 66 66 90 66 66 90     481.851446 cycles
      	 66 66 66 90 66 66 66 90     481.409220 cycles
      
      	P6:
      			      90     243.127026 cycles
      			   66 90     243.130711 cycles
      			0f 1f 00     243.122747 cycles
      		     0f 1f 40 00     242.497617 cycles
      		  0f 1f 44 00 00     245.354461 cycles
      	       66 0f 1f 44 00 00     361.930417 cycles
      	    0f 1f 80 00 00 00 00     362.844944 cycles
      	 0f 1f 84 00 00 00 00 00     480.514948 cycles
      
      	F16h:
      	-----
      	Running NOP tests, 1000 NOPs x 1000000 repetitions
      
      	K8:
      			      90     507.793298 cycles
      			   66 90     507.789636 cycles
      			66 66 90     507.826490 cycles
      		     66 66 66 90     507.859075 cycles
      		  66 66 90 66 90    1008.663129 cycles
      	       66 66 90 66 66 90    1008.696259 cycles
      	    66 66 66 90 66 66 90    1008.692517 cycles
      	 66 66 66 90 66 66 66 90    1008.755399 cycles
      
      	P6:
      			      90     507.795232 cycles
      			   66 90     507.794761 cycles
      			0f 1f 00     507.834901 cycles
      		     0f 1f 40 00     507.822629 cycles
      		  0f 1f 44 00 00     507.838493 cycles
      	       66 0f 1f 44 00 00     507.908597 cycles
      	    0f 1f 80 00 00 00 00     507.946417 cycles
      	 0f 1f 84 00 00 00 00 00     507.954960 cycles
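      
      The P6 byte sequences listed in the tables above can be collected into
      a lookup table indexed by length. This is a minimal sketch with
      hypothetical names (p6_nops, p6_nop), not the kernel's own table. Note
      that the K8 rows of five bytes and more are two back-to-back NOP
      instructions (for example 66 66 90 followed by 66 90), whereas every
      P6 entry here is a single instruction.

```c
#include <stddef.h>

/* P6-style NOP encodings, one row per length from 1 to 8 bytes.
 * Unused trailing bytes in each row are zero-initialised padding. */
static const unsigned char p6_nops[8][8] = {
	{ 0x90 },
	{ 0x66, 0x90 },
	{ 0x0f, 0x1f, 0x00 },
	{ 0x0f, 0x1f, 0x40, 0x00 },
	{ 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	{ 0x66, 0x0f, 0x1f, 0x44, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x80, 0x00, 0x00, 0x00, 0x00 },
	{ 0x0f, 0x1f, 0x84, 0x00, 0x00, 0x00, 0x00, 0x00 },
};

/* Return the single-instruction NOP of the given length, or NULL if
 * the length is outside 1..8. */
const unsigned char *p6_nop(size_t len)
{
	if (len < 1 || len > 8)
		return NULL;
	return p6_nops[len - 1];
}
```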
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1431332153-18566-2-git-send-email-bp@alien8.de
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • x86/vdso: Fix 'make bzImage' on older distros · ef7254a5
      Oleg Nesterov authored
      
      
      Change HOST_EXTRACFLAGS to include arch/x86/include/uapi along
      with include/uapi.
      
      This looks more consistent, and this fixes "make bzImage" on my
      old distro which doesn't have asm/bitsperlong.h in /usr/include/.
      
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: <stable@vger.kernel.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 6f121e54 ("x86, vdso: Reimplement vdso.so preparation in build-time C")
      Link: http://lkml.kernel.org/r/1431332153-18566-6-git-send-email-bp@alien8.de
      Link: http://lkml.kernel.org/r/20150507165835.GB18652@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • ARC: inline cache flush toggle helpers · 4a8a2245
      Vineet Gupta authored
      
      
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • Vineet Gupta's avatar
      b4f006db
    • ARM: net: delegate filter to kernel interpreter when imm_offset() return value can't fit into 12 bits · 0b59d880
      Nicolas Schichan authored
      
      
      The ARM JIT code emits "ldr rX, [pc, #offset]" to access the literal
      pool. #offset maximum value is 4095 and if the generated code is too
      large, the #offset value can overflow and not point to the expected
      slot in the literal pool. Additionally, when overflow occurs, bits of
      the overflow can end up changing the destination register of the ldr
      instruction.
      
      Fix that by detecting the overflow in imm_offset() and setting a flag
      that is checked for each BPF instruction converted in
      build_body(). As of now it can only be detected in the second pass. As
      a result the second build_body() call can now fail, so add the
      corresponding cleanup code in that case.
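      
      Both the failure mode and the check can be sketched in C. The opcode
      constant 0xe59f0000 is the ARM encoding of "ldr r0, [pc, #0]", with
      the 12-bit immediate in bits 11:0 and the destination register in
      bits 15:12. The flag name, the struct and the helper names below are
      assumptions made for illustration and should not be read as the
      driver's actual code.

```c
#include <stdint.h>

#define FLAG_IMM_OVERFLOW (1u << 0)	/* assumed flag name */

struct jit_ctx {
	unsigned int flags;
};

/* "ldr rd, [pc, #off]" built with no range check: an offset above
 * 0xfff spills into bits 15:12 and changes the destination register. */
uint32_t ldr_pc_unchecked(unsigned int rd, uint32_t off)
{
	return 0xe59f0000u | (rd << 12) | off;
}

/* Checked offset helper: anything that needs more than 12 bits sets
 * the overflow flag and yields a harmless placeholder offset. */
uint32_t imm_offset_checked(struct jit_ctx *ctx, uint32_t off)
{
	if (off > 0xfff) {
		ctx->flags |= FLAG_IMM_OVERFLOW;
		return 0;
	}
	return off;
}

/* Caller side: nonzero means abandon the JIT and use the interpreter. */
int must_fall_back(const struct jit_ctx *ctx)
{
	return (ctx->flags & FLAG_IMM_OVERFLOW) != 0;
}
```

      With an offset of 0x1000 the unchecked encoder produces an instruction
      whose destination field reads r1 instead of r0, which is the register
      corruption described above.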
      
      Using multiple literal pools in the JITed code would require
      lots of intrusive changes to the JIT code (which would better be done
      as a feature instead of a fix). Just delegating to the kernel BPF
      interpreter in that case is a more straightforward, minimal fix that
      is easy to backport.
      
      Fixes: ddecdfce ("ARM: 7259/3: net: JIT compiler for packet filters")
      Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
      Acked-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ARM: net fix emit_udiv() for BPF_ALU | BPF_DIV | BPF_K instruction. · 19fc99d0
      Nicolas Schichan authored
      
      
      In that case, emit_udiv() will be called with rn == ARM_R0 (r_scratch),
      and loading rm first into ARM_R0 will result in the jit_udiv() function
      being called with the same dividend and divisor. Fix that by loading rn
      first into ARM_R1 and then rm into ARM_R0.
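      
      The effect of the move order can be checked with a tiny register-file
      simulation. The array, the enum and the two helpers are hypothetical;
      they model only the two register moves, not the JIT's emit functions.

```c
#include <stdint.h>

enum { R0 = 0, R1 = 1, R4 = 4, NREGS = 16 };

/* Old order: rm goes into R0 first. If rn is R0, its value has already
 * been overwritten by the time it is copied into R1. */
void moves_old(uint32_t regs[NREGS], int rm, int rn)
{
	regs[R0] = regs[rm];
	regs[R1] = regs[rn];
}

/* Fixed order: save rn into R1 before R0 is overwritten with rm. */
void moves_fixed(uint32_t regs[NREGS], int rm, int rn)
{
	regs[R1] = regs[rn];
	regs[R0] = regs[rm];
}
```

      With rm == R4 holding 100 and rn == R0 holding 7, the old order leaves
      100 in both R0 and R1, while the fixed order leaves 100 in R0 and 7 in
      R1.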
      
      Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
      Cc: <stable@vger.kernel.org> # v3.13+
      Fixes: aee636c4 (bpf: do not use reciprocal divide)
      Acked-by: Mircea Gherzan <mgherzan@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>