  1. Mar 17, 2015
  2. Mar 01, 2015
    • mm: add missing __PAGETABLE_{PUD,PMD}_FOLDED defines · c07af4f1
      Kirill A. Shutemov authored
      
      
      Core mm expects __PAGETABLE_{PUD,PMD}_FOLDED to be defined if these page
      table levels are folded.  Usually, these defines are provided by
      <asm-generic/pgtable-nopmd.h> and <asm-generic/pgtable-nopud.h>.
      
      But some architectures fold page table levels in a custom way.  They
      need to define these macros themselves.  This patch adds the missing
      defines.
      
      The patch fixes mm->nr_pmds underflow and eliminates dead __pmd_alloc()
      and __pud_alloc() on architectures without these page table levels.
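      
      As a rough illustration of why the define matters, the sketch below
      shows how a folded-level macro can turn per-level accounting into a
      no-op. The struct and helper names are hypothetical stand-ins, not the
      kernel's definitions, and the underflow scenario (a free path that
      decrements a counter no allocation path ever incremented) is an
      assumption made for the example.

```c
/* Hypothetical stand-in: an architecture that folds the PMD level in a
 * custom way would have to provide this define itself. */
#define __PAGETABLE_PMD_FOLDED

/* Illustrative model of an address space with a PMD page counter. */
struct mm_sketch {
    long nr_pmds;
};

#ifdef __PAGETABLE_PMD_FOLDED
/* Folded: no PMD pages exist, so the accounting helpers do nothing. */
static inline void sk_inc_nr_pmds(struct mm_sketch *mm) { (void)mm; }
static inline void sk_dec_nr_pmds(struct mm_sketch *mm) { (void)mm; }
#else
/* Not folded: every PMD page allocation and free is counted. */
static inline void sk_inc_nr_pmds(struct mm_sketch *mm) { mm->nr_pmds++; }
static inline void sk_dec_nr_pmds(struct mm_sketch *mm) { mm->nr_pmds--; }
#endif

/* Models a teardown path that decrements without a matching increment. */
static inline long sk_teardown_only(void)
{
    struct mm_sketch mm = { 0 };

    sk_dec_nr_pmds(&mm);
    return mm.nr_pmds;
}
```

      With the define present, sk_teardown_only() leaves the counter at 0.
      Removing the define makes the same path drive the counter negative.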
      
      Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Helge Deller <deller@gmx.de>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  3. Feb 28, 2015
  4. Feb 27, 2015
    • arm64: Fix text patching logic when using fixmap · f6242cac
      Marc Zyngier authored
      Patch 2f896d58 ("arm64: use fixmap for text patching") changed
      the way we patch the kernel text, using a fixmap when the kernel or
      modules are flagged as read only.
      
      Unfortunately, a flaw in the logic makes it fall over when patching
      modules without CONFIG_DEBUG_SET_MODULE_RONX enabled:
      
      [...]
      [   32.032636] Call trace:
      [   32.032716] [<fffffe00003da0dc>] __copy_to_user+0x2c/0x60
      [   32.032837] [<fffffe0000099f08>] __aarch64_insn_write+0x94/0xf8
      [   32.033027] [<fffffe000009a0a0>] aarch64_insn_patch_text_nosync+0x18/0x58
      [   32.033200] [<fffffe000009c3ec>] ftrace_modify_code+0x58/0x84
      [   32.033363] [<fffffe000009c4e4>] ftrace_make_nop+0x3c/0x58
      [   32.033532] [<fffffe0000164420>] ftrace_process_locs+0x3d0/0x5c8
      [   32.033709] [<fffffe00001661cc>] ftrace_module_init+0x28/0x34
      [   32.033882] [<fffffe0000135148>] load_module+0xbb8/0xfc4
      [   32.034044] [<fffffe0000135714>] SyS_finit_module+0x94/0xc4
      [...]
      
      This is triggered by the use of virt_to_page() on a module address,
      which ends up pointing to Nowhereland if you're lucky, or corrupting
      your precious data if not.
      
      This patch fixes the logic by mimicking what is done on arm:
      - If we're patching a module and CONFIG_DEBUG_SET_MODULE_RONX is set,
        use vmalloc_to_page().
      - If we're patching the kernel and CONFIG_DEBUG_RODATA is set,
        use virt_to_page().
      - Otherwise, use the provided address, as we can write to it directly.
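      
      The three cases above can be sketched as a small decision function.
      This is a userspace model only: the enum, the struct, and the
      function name are hypothetical, and the two booleans merely stand in
      for the CONFIG_DEBUG_SET_MODULE_RONX and CONFIG_DEBUG_RODATA options.

```c
#include <stdbool.h>

/* How the address to write through is obtained (illustrative names). */
enum sk_patch_map {
    SK_MAP_VMALLOC_TO_PAGE, /* module text, mapped read-only */
    SK_MAP_VIRT_TO_PAGE,    /* kernel text, mapped read-only */
    SK_MAP_DIRECT           /* writable: use the address as given */
};

/* Stand-ins for the two kernel configuration options. */
struct sk_patch_cfg {
    bool debug_set_module_ronx; /* models CONFIG_DEBUG_SET_MODULE_RONX */
    bool debug_rodata;          /* models CONFIG_DEBUG_RODATA */
};

/* Pick the mapping strategy following the three rules in the message. */
static enum sk_patch_map sk_choose_patch_map(bool is_module_addr,
                                             const struct sk_patch_cfg *cfg)
{
    if (is_module_addr && cfg->debug_set_module_ronx)
        return SK_MAP_VMALLOC_TO_PAGE;
    if (!is_module_addr && cfg->debug_rodata)
        return SK_MAP_VIRT_TO_PAGE;
    return SK_MAP_DIRECT;
}
```

      In this model, a module address with only debug_rodata set yields
      SK_MAP_DIRECT, so virt_to_page() is never applied to a module
      address, which is the failing case described above.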
      
      Tested on 4.0-rc1 as a KVM guest.
      
      Reported-by: Richard W.M. Jones <rjones@redhat.com>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Laura Abbott <lauraa@codeaurora.org>
      Tested-by: Richard W.M. Jones <rjones@redhat.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: crypto: increase AES interleave to 4x · 0eee0fbd
      Ard Biesheuvel authored
      
      
      This patch increases the interleave factor for parallel AES modes
      to 4x. This improves performance on Cortex-A57 by ~35%. This is
      due to the 3-cycle latency of AES instructions on the A57's
      relatively deep pipeline (compared to Cortex-A53 where the AES
      instruction latency is only 2 cycles).
      
      At the same time, disable inline expansion of the core AES functions,
      as the performance benefit of this feature is negligible.
      
        Measured on AMD Seattle (using tcrypt.ko mode=500 sec=1):
      
        Baseline (2x interleave, inline expansion)
        ------------------------------------------
        testing speed of async cbc(aes) (cbc-aes-ce) decryption
        test 4 (128 bit key, 8192 byte blocks): 95545 operations in 1 seconds
        test 14 (256 bit key, 8192 byte blocks): 68496 operations in 1 seconds
      
        This patch (4x interleave, no inline expansion)
        -----------------------------------------------
        testing speed of async cbc(aes) (cbc-aes-ce) decryption
        test 4 (128 bit key, 8192 byte blocks): 124735 operations in 1 seconds
        test 14 (256 bit key, 8192 byte blocks): 92328 operations in 1 seconds
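      
      For reference, the relative gain implied by the tcrypt figures above
      can be computed as follows. The helper name is made up for this
      example; the inputs are the operation counts quoted in the log.

```c
/* Relative gain of `after` over `before`, in percent. */
static double sk_gain_percent(double before, double after)
{
    return (after / before - 1.0) * 100.0;
}
```

      Calling sk_gain_percent(95545, 124735) gives roughly 31% for the
      128-bit key case, and sk_gain_percent(68496, 92328) gives roughly
      35% for the 256-bit key case.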
      
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: enable PTE type bit in the mask for pte_modify · 6910fa16
      Feng Kan authored
      
      
      Caught during Trinity testing. pte_modify() does not allow the PTE
      type bits to be modified, which causes the test to hang the system.
      The PTE cannot transition from an inaccessible page (b00) to a valid
      page (b11) because the mask does not allow it. This happens when a
      big block of mmapped memory is set to PROT_NONE, then a small piece
      is broken off and set to PROT_WRITE | PROT_READ, causing a huge page
      split.
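      
      The effect of the mask can be modelled in a few lines. This is a
      sketch under assumptions: the bit positions of the permission flags
      and all SK_ names are illustrative and not copied from the arm64
      headers. Only the b00 and b11 type encodings come from the text
      above.

```c
#include <stdint.h>

/* Illustrative layout: the low two bits hold the PTE type. */
#define SK_PTE_TYPE_MASK 0x3ull       /* b00 = inaccessible, b11 = valid page */
#define SK_PTE_USER      (1ull << 6)  /* hypothetical permission bit */
#define SK_PTE_RDONLY    (1ull << 7)  /* hypothetical permission bit */

/* Only bits covered by `mask` are taken from `newprot`; all other bits
 * of the existing entry are preserved. */
static uint64_t sk_pte_modify(uint64_t pte, uint64_t newprot, uint64_t mask)
{
    return (pte & ~mask) | (newprot & mask);
}
```

      With a mask of SK_PTE_USER | SK_PTE_RDONLY, an entry whose type is
      b00 keeps that type no matter what newprot requests. Adding
      SK_PTE_TYPE_MASK to the mask lets the type change to b11.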
      
      Signed-off-by: Feng Kan <fkan@apm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: mm: remove unused functions and variable prototypes · 06ff87ba
      Yingjoe Chen authored
      The functions __cpu_flush_user_tlb_range and __cpu_flush_kern_tlb_range
      were removed in commit fa48e6f7 ('arm64: mm: Optimise tlb flush logic
      where we have >4K granule'). Global variable cpu_tlb was never used in
      arm64.
      
      Remove them.
      
      Signed-off-by: Yingjoe Chen <yingjoe.chen@mediatek.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: psci: move psci firmware calls out of line · f5e0a12c
      Will Deacon authored
      
      
      An arm64 allmodconfig fails to build with GCC 5 due to __asmeq
      assertions in the PSCI firmware calling code firing due to mcount
      preambles breaking our assumptions about register allocation of function
      arguments:
      
        /tmp/ccDqJsJ6.s: Assembler messages:
        /tmp/ccDqJsJ6.s:60: Error: .err encountered
        /tmp/ccDqJsJ6.s:61: Error: .err encountered
        /tmp/ccDqJsJ6.s:62: Error: .err encountered
        /tmp/ccDqJsJ6.s:99: Error: .err encountered
        /tmp/ccDqJsJ6.s:100: Error: .err encountered
        /tmp/ccDqJsJ6.s:101: Error: .err encountered
      
      This patch fixes the issue by moving the PSCI calls out-of-line into
      their own assembly files, which are safe from the compiler's meddling
      fingers.
      
      Reported-by: Andy Whitcroft <apw@canonical.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
    • arm64: vdso: minor ABI fix for clock_getres · e1b6b6ce
      Nathan Lynch authored
      
      
      The vdso implementation of clock_getres currently returns 0 (success)
      whenever a null timespec is provided by the caller, regardless of the
      clock id supplied.
      
      This behavior is incorrect.  It should fall back to syscall when an
      unrecognized clock id is passed, even when the timespec argument is
      null.  This ensures that clock_getres always returns an error for
      invalid clock ids.
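      
      The intended ordering of checks can be sketched in plain C. The real
      vdso routine is not reproduced here; the clock ids, the resolution
      value, and every sk_ name below are assumptions made for the
      example, and sk_syscall_clock_getres() is only a stand-in for the
      kernel's syscall path.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct sk_timespec {
    long tv_sec;
    long tv_nsec;
};

/* Hypothetical clock ids; SK_MAX_CLOCK bounds what the stand-in
 * syscall accepts. */
enum { SK_CLOCK_REALTIME = 0, SK_CLOCK_MONOTONIC = 1, SK_MAX_CLOCK = 4 };

/* Clocks this model of the vdso fast path knows how to answer. */
static bool sk_vdso_handles(int clk)
{
    return clk == SK_CLOCK_REALTIME || clk == SK_CLOCK_MONOTONIC;
}

/* Stand-in for the syscall: rejects ids it does not know. */
static int sk_syscall_clock_getres(int clk, struct sk_timespec *res)
{
    if (clk < 0 || clk >= SK_MAX_CLOCK)
        return -EINVAL;
    if (res) {
        res->tv_sec = 0;
        res->tv_nsec = 1; /* placeholder resolution */
    }
    return 0;
}

/* Validate the clock id first, and only then look at `res`, so that a
 * null `res` cannot hide an invalid id. */
static int sk_vdso_clock_getres(int clk, struct sk_timespec *res)
{
    if (!sk_vdso_handles(clk))
        return sk_syscall_clock_getres(clk, res);
    if (res) {
        res->tv_sec = 0;
        res->tv_nsec = 1; /* placeholder resolution */
    }
    return 0;
}
```

      In this model, sk_vdso_clock_getres(999, NULL) returns -EINVAL
      rather than 0, while a recognized clock with a null timespec still
      succeeds.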
      
      Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  5. Feb 26, 2015
  6. Feb 25, 2015
  7. Feb 24, 2015
    • metag: Fix KSTK_EIP() and KSTK_ESP() macros · c2996cb2
      James Hogan authored
      
      
      The KSTK_EIP() and KSTK_ESP() macros should return the user program
      counter (PC) and stack pointer (A0StP) of the given task. These are used
      to determine which VMA corresponds to the user stack in
      /proc/<pid>/maps, and for the user PC & A0StP in /proc/<pid>/stat.
      
      However, for Meta the PC & A0StP from the task's kernel context are
      used, resulting in broken output. For example, in the following
      /proc/<pid>/maps output, the 3afff000-3b021000 VMA should be described
      as the stack:
      
        # cat /proc/self/maps
        ...
        100b0000-100b1000 rwxp 00000000 00:00 0          [heap]
        3afff000-3b021000 rwxp 00000000 00:00 0
      
      And in the following /proc/<pid>/stat output, the PC is in kernel code
      (1074234964 = 0x40078654) and the A0StP is in the kernel heap
      (1335981392 = 0x4fa17550):
      
        # cat /proc/self/stat
        51 (cat) R ... 1335981392 1074234964 ...
      
      Fix the definitions of KSTK_EIP() and KSTK_ESP() to use
      task_pt_regs(tsk)->ctx rather than (tsk)->thread.kernel_context. This
      gets the registers from the user context stored after the thread info at
      the base of the kernel stack, which is from the last entry into the
      kernel from userland, regardless of where in the kernel the task may
      have been interrupted. This results in the following more correct
      /proc/<pid>/maps output:
      
        # cat /proc/self/maps
        ...
        0800b000-08070000 r-xp 00000000 00:02 207        /lib/libuClibc-0.9.34-git.so
        ...
        100b0000-100b1000 rwxp 00000000 00:00 0          [heap]
        3afff000-3b021000 rwxp 00000000 00:00 0          [stack]
      
      And /proc/<pid>/stat now correctly reports the PC in libuClibc
      (134320308 = 0x80190b4) and the A0StP in the [stack] region (989864576 =
      0x3b002280):
      
        # cat /proc/self/stat
        51 (cat) R ... 989864576 134320308 ...
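      
      The decimal values in the stat output can be cross-checked against
      the VMA ranges in the maps output with a small range test. The
      helper name is invented for this example; the addresses are the ones
      quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if `addr` lies in the half-open range [start, end), which is
 * how /proc/<pid>/maps describes a VMA. */
static bool sk_in_vma(uint32_t start, uint32_t end, uint32_t addr)
{
    return addr >= start && addr < end;
}
```

      The fixed A0StP value 989864576 (0x3b002280) falls inside
      3afff000-3b021000, and the fixed PC 134320308 (0x80190b4) falls
      inside the libuClibc mapping 0800b000-08070000. The old values
      1335981392 and 1074234964 fall in neither range.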
      
      Reported-by: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
      Reported-by: Vineet Gupta <Vineet.Gupta1@synopsys.com>
      Signed-off-by: James Hogan <james.hogan@imgtec.com>
      Cc: linux-metag@vger.kernel.org
      Cc: <stable@vger.kernel.org> # v3.9+
    • x86/xen: Initialize cr4 shadow for 64-bit PV(H) guests · 5054daa2
      Boris Ostrovsky authored
      Commit 1e02ce4c ("x86: Store a per-cpu shadow copy of CR4")
      introduced CR4 shadows.
      
      These shadows are initialized in early boot code. That commit missed
      the initialization for 64-bit PV(H) guests, which this patch adds.
      
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
    • x86/xen: allow privcmd hypercalls to be preempted · fdfd811d
      David Vrabel authored
      
      
      Hypercalls submitted by user space tools via the privcmd driver can
      take a long time (potentially many tens of seconds) if the hypercall
      has many sub-operations.
      
      A fully preemptible kernel may deschedule such a task in any upcall
      called from a hypercall continuation.
      
      However, in a kernel with voluntary or no preemption, hypercall
      continuations in Xen allow event handlers to be run but the task
      issuing the hypercall will not be descheduled until the hypercall is
      complete and the ioctl returns to user space.  These long running
      tasks may also trigger the kernel's soft lockup detection.
      
      Add xen_preemptible_hcall_begin() and xen_preemptible_hcall_end() to
      bracket hypercalls that may be preempted.  Use these in the privcmd
      driver.
      
      When returning from an upcall, call xen_maybe_preempt_hcall() which
      adds a schedule point if the current task was within a preemptible
      hypercall.
      
      Since _cond_resched() can move the task to a different CPU, clear and
      set xen_in_preemptible_hcall around the call.
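      
      The bracketing and the clear-then-set step can be modelled in
      userspace as below. This is a single-threaded sketch, not the
      kernel code: one global flag stands in for the per-CPU
      xen_in_preemptible_hcall, sk_cond_resched() only counts calls, and
      any further conditions the real schedule point may check are left
      out.

```c
#include <stdbool.h>

/* Models the per-CPU xen_in_preemptible_hcall flag as one global. */
static bool sk_in_preemptible_hcall;

/* Counts how many times the schedule point was taken. */
static int sk_resched_count;

/* Stand-in for _cond_resched(): just records the call. */
static void sk_cond_resched(void)
{
    sk_resched_count++;
}

/* Bracket a hypercall that may be preempted. */
static void sk_preemptible_hcall_begin(void)
{
    sk_in_preemptible_hcall = true;
}

static void sk_preemptible_hcall_end(void)
{
    sk_in_preemptible_hcall = false;
}

/* Called when returning from an upcall. */
static void sk_maybe_preempt_hcall(void)
{
    if (!sk_in_preemptible_hcall)
        return;

    /* Clear the flag first: after rescheduling, the task may resume on
     * a different CPU, so the old CPU must not keep a stale marker. */
    sk_in_preemptible_hcall = false;
    sk_cond_resched();
    sk_in_preemptible_hcall = true;
}
```

      Outside the begin/end bracket, sk_maybe_preempt_hcall() does
      nothing. Inside it, the schedule point runs once per call and the
      flag is set again afterwards.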
      
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    • x86/xen: Make sure X2APIC_ENABLE bit of MSR_IA32_APICBASE is not set · 31795b47
      Boris Ostrovsky authored
      Commit d524165c ("x86/apic: Check x2apic early") tests the
      X2APIC_ENABLE bit of MSR_IA32_APICBASE when CONFIG_X86_X2APIC is off
      and panics the kernel when this bit is set.
      
      Xen's PV guests will pass this MSR read to the hypervisor, which will
      return its version of the MSR, where this bit might be set. Make sure
      we clear it before returning the MSR value to the caller.
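      
      A minimal sketch of the filtering step follows. The function and
      macro names are hypothetical, and the bit is cleared unconditionally
      here for simplicity, which may be broader than what the patch itself
      does. The bit position (bit 10 of the APIC base MSR) is the
      architectural x2APIC enable bit.

```c
#include <stdint.h>

/* x2APIC enable bit in the APIC base MSR (bit 10). */
#define SK_X2APIC_ENABLE (1ull << 10)

/* Mask the x2APIC enable bit out of the value the hypervisor returned,
 * leaving every other bit untouched. */
static uint64_t sk_filter_apicbase(uint64_t hv_value)
{
    return hv_value & ~SK_X2APIC_ENABLE;
}
```

      For a hypervisor value of 0xfee00c00, the caller would see
      0xfee00800: the x2APIC bit is gone while the base address and the
      remaining flag bits are preserved.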
      
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: David Vrabel <david.vrabel@citrix.com>
  8. Feb 23, 2015