  1. Dec 10, 2018
    • arm64: Fix minor issues with the dcache_by_line_op macro · 33309ecd
      Will Deacon authored
      
      
      The dcache_by_line_op macro suffers from a couple of small problems:
      
      First, the GAS directives that are currently being used rely on
      assembler behavior that is not documented, and probably not guaranteed
      to produce the correct behavior going forward. As a result, we end up
      with some undefined symbols in cache.o:
      
      $ nm arch/arm64/mm/cache.o
               ...
               U civac
               ...
               U cvac
               U cvap
               U cvau
      
      This is because the comparisons used to select the operation type in
      the dcache_by_line_op macro compare symbols rather than strings, and
      although GAS appears to do the right thing here (undefined symbols
      with the same name compare equal to each other), it seems unwise to
      rely on this.
      
      Second, when patching in a DC CVAP instruction on CPUs that support it,
      the fallback path consists of a DC CVAU instruction which may be
      affected by CPU errata that require ARM64_WORKAROUND_CLEAN_CACHE.
      
      Solve these issues by unrolling the various maintenance routines and
      using the conditional directives that are documented as operating on
      strings. To avoid the complexity of nested alternatives, we move the
      DC CVAP patching to __clean_dcache_area_pop, falling back to a branch
      to __clean_dcache_area_poc if DCPOP is not supported by the CPU.
      
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Suggested-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: remove arm64ksyms.c · 2a9cee5b
      Mark Rutland authored
      
      
      Now that arm64ksyms.c has been reduced to a stub, let's remove it
      entirely. New exports should be associated with their function
      definition.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: ftrace: use asm EXPORT_SYMBOL() · dbd31962
      Mark Rutland authored
      
      
      For a while now it's been possible to use EXPORT_SYMBOL() in assembly
      files, which allows us to place exports immediately after assembly
      functions, as we do for C functions.
      
      As a step towards removing arm64ksyms.c, let's move the ftrace exports
      to the assembly files the functions are defined in.
      
      There should be no functional change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: string: use asm EXPORT_SYMBOL() · ac0e8c72
      Mark Rutland authored
      
      
      For a while now it's been possible to use EXPORT_SYMBOL() in assembly
      files, which allows us to place exports immediately after assembly
      functions, as we do for C functions.
      
      As a step towards removing arm64ksyms.c, let's move the string routine
      exports to the assembly files the functions are defined in. Routines
      which should only be exported for !KASAN builds are exported using the
      EXPORT_SYMBOL_NOKASAN() helper.
      
      There should be no functional change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: uaccess: use asm EXPORT_SYMBOL() · 56c08ec5
      Mark Rutland authored
      
      
      For a while now it's been possible to use EXPORT_SYMBOL() in assembly
      files, which allows us to place exports immediately after assembly
      functions, as we do for C functions.
      
      As a step towards removing arm64ksyms.c, let's move the uaccess exports
      to the assembly files the functions are defined in.  As we have to
      include <asm/assembler.h>, the existing includes are fixed to follow the
      usual ordering conventions.
      
      There should be no functional change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: page: use asm EXPORT_SYMBOL() · 50fdecb2
      Mark Rutland authored
      
      
      For a while now it's been possible to use EXPORT_SYMBOL() in assembly
      files, which allows us to place exports immediately after assembly
      functions, as we do for C functions.
      
      As a step towards removing arm64ksyms.c, let's move the copy_page and
      clear_page exports to the assembly files the functions are defined in.
      
      There should be no functional change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: smccc: use asm EXPORT_SYMBOL() · 23fe04c0
      Mark Rutland authored
      
      
      For a while now it's been possible to use EXPORT_SYMBOL() in assembly
      files, which allows us to place exports immediately after assembly
      functions, as we do for C functions.
      
      As a step towards removing arm64ksyms.c, let's move the SMCCC exports to
      the assembly file the functions are defined in.
      
      There should be no functional change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: tishift: use asm EXPORT_SYMBOL() · abb77f3d
      Mark Rutland authored
      
      
      For a while now it's been possible to use EXPORT_SYMBOL() in assembly
      files, which allows us to place exports immediately after assembly
      functions, as we do for C functions.
      
      As a step towards removing arm64ksyms.c, let's move the tishift exports
      to the assembly file the functions are defined in.
      
      There should be no functional change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: add EXPORT_SYMBOL_NOKASAN() · 386b3c7b
      Mark Rutland authored
      
      
      So that we can export symbols directly from assembly files, let's make
      use of the generic <asm/export.h>. We have a few symbols that we'll want
      to conditionally export for !KASAN kernel builds, so we add a helper for
      that in <asm/assembler.h>.
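
      The helper can be as simple as the sketch below (this assumes
      EXPORT_SYMBOL() is already usable from assembly via the generic
      <asm/export.h>; the exact definition in <asm/assembler.h> may differ):

        #ifdef CONFIG_KASAN
        #define EXPORT_SYMBOL_NOKASAN(name)
        #else
        #define EXPORT_SYMBOL_NOKASAN(name)   EXPORT_SYMBOL(name)
        #endif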
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: move memstart_addr export inline · 03ef055f
      Mark Rutland authored
      
      
      Since we define memstart_addr in a C file, we can have the export
      immediately after the definition of the symbol, as we do elsewhere.
      
      As a step towards removing arm64ksyms.c, move the export of
      memstart_addr to init.c, where the symbol is defined.
      
      There should be no functional change as a result of this patch.
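
      For reference, the resulting pattern is a simplified sketch along
      these lines (annotations such as __ro_after_init are omitted here):

        #include <linux/export.h>
        #include <linux/types.h>

        s64 memstart_addr = -1;
        EXPORT_SYMBOL(memstart_addr);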
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: remove bitop exports · 2d7c89b0
      Mark Rutland authored
      
      
      Now that the arm64 bitops are inlines built atop of the regular atomics,
      we don't need to export anything.
      
      Remove the redundant exports.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  2. Dec 08, 2018
    • arm64: cmpxchg: Use "K" instead of "L" for ll/sc immediate constraint · 42305099
      Will Deacon authored
      
      
      The "L" AArch64 machine constraint, which we use for the "old" value in
      an LL/SC cmpxchg(), generates an immediate that is suitable for a 64-bit
      logical instruction. However, for cmpxchg() operations on types smaller
      than 64 bits, this constraint can result in an invalid instruction which
      is correctly rejected by GAS, such as EOR W1, W1, #0xffffffff.
      
      Whilst we could special-case the constraint based on the cmpxchg size,
      it's far easier to change the constraint to "K" and put up with using
      a register for large 64-bit immediates. For out-of-line LL/SC atomics,
      this is all moot anyway.
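
      As a rough illustration (a minimal sketch, not the kernel's cmpxchg
      code; the helper name and immediate are made up), this is how a "K"
      constraint is used from inline assembly:

        /*
         * "K" only matches immediates that are valid for 32-bit logical
         * instructions, so the compiler can never hand GAS a 64-bit-only
         * pattern such as 0xffffffff for use with a W register.
         */
        static inline unsigned int keep_low_byte(unsigned int x)
        {
                unsigned int ret;

                asm("and %w0, %w1, %2"
                    : "=r" (ret)
                    : "r" (x), "K" (0xff));
                return ret;
        }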
      
      Reported-by: Robin Murphy <robin.murphy@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: percpu: Rewrite per-cpu ops to allow use of LSE atomics · 959bf2fd
      Will Deacon authored
      
      
      Our percpu code is a bit of an inconsistent mess:
      
        * It rolls its own xchg(), but reuses cmpxchg_local()
        * It uses various different flavours of preempt_{enable,disable}()
        * It returns values even for the non-returning RmW operations
        * It makes no use of LSE atomics outside of the cmpxchg() ops
        * There are individual macros for different sizes of access, but these
          are all funneled through a switch statement rather than dispatched
          directly to the relevant case
      
      This patch rewrites the per-cpu operations to address these shortcomings.
      Whilst the new code is a lot cleaner, the big advantage is that we can
      use the non-returning ST- atomic instructions when we have LSE.
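
      For reference, a minimal sketch of a non-returning per-cpu add using
      an LSE ST- instruction is shown below (the helper name is made up,
      this is not the kernel's <asm/percpu.h>, and it assumes a compiler
      targeting ARMv8.1 or later so that STADD is accepted):

        static inline void pcpu_add_u32(unsigned int *ptr, unsigned int val)
        {
                /*
                 * STADD is the non-returning form of LDADD: it has no
                 * destination register, so there is no value to move back
                 * out of the atomic operation.
                 */
                asm volatile("stadd %w[val], %[ptr]"
                             : [ptr] "+Q" (*ptr)
                             : [val] "r" (val));
        }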
      
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Avoid masking "old" for LSE cmpxchg() implementation · b4f9209b
      Will Deacon authored
      
      
      The CAS instructions implicitly access only the relevant bits of the "old"
      argument, so there is no need for explicit masking via type-casting as
      there is in the LL/SC implementation.
      
      Move the casting into the LL/SC code and remove it altogether for the LSE
      implementation.
      
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: Avoid redundant type conversions in xchg() and cmpxchg() · 5ef3fe4c
      Will Deacon authored
      
      
      Our atomic instructions (either LSE atomics or LDXR/STXR sequences)
      natively support byte, half-word, word and double-word memory accesses
      so there is no need to mask the data register prior to being stored.
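
      A minimal sketch of the point, using a byte-sized exclusive-store
      sequence (illustrative only, not the kernel's xchg() code; the
      function name is made up):

        static inline unsigned char xchg_relaxed_byte(unsigned char *ptr,
                                                      unsigned long val)
        {
                unsigned int old, status;

                /*
                 * LDXRB zero-extends the loaded byte and STXRB stores only
                 * bits [7:0] of the data register, so 'val' does not need
                 * to be masked with 0xff beforehand.
                 */
                asm volatile(
                "1:     ldxrb   %w0, %2\n"
                "       stxrb   %w1, %w3, %2\n"
                "       cbnz    %w1, 1b"
                : "=&r" (old), "=&r" (status), "+Q" (*ptr)
                : "r" (val));

                return old;
        }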
      
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  3. Dec 04, 2018
    • arm64: relocatable: fix inconsistencies in linker script and options · 3bbd3db8
      Ard Biesheuvel authored
      readelf complains about the section layout of vmlinux when building
      with CONFIG_RELOCATABLE=y (for KASLR):
      
        readelf: Warning: [21]: Link field (0) should index a symtab section.
        readelf: Warning: [21]: Info field (0) should index a relocatable section.
      
      Also, it seems that our use of '-pie -shared' is contradictory, and
      thus ambiguous. In general, the way KASLR is wired up at the moment
      is highly tailored to how ld.bfd happens to implement (and conflate)
      PIE executables and shared libraries, so given the current effort to
      support other toolchains, let's fix some of these issues as well.
      
      - Drop the -pie linker argument and just leave -shared. In ld.bfd,
        the differences between them are unclear (except for the ELF type
        of the produced image [0]) but lld chokes on seeing both at the
        same time.
      
      - Rename the .rela output section to .rela.dyn, as is customary for
        shared libraries and PIE executables, so that it is not misidentified
        by readelf as a static relocation section (producing the warnings
        above).
      
      - Pass the -z notext and -z norelro options to explicitly instruct the
        linker to permit text relocations, and to omit the RELRO program
        header (which requires a certain section layout that we don't adhere
        to in the kernel). These are the defaults for current versions of
        ld.bfd.
      
      - Discard the .eh_frame and .gnu.hash sections to prevent them from
        being emitted between .head.text and .text, screwing up the section
        layout.
      
      These changes only affect the ELF image; the resulting binary image
      is unchanged.
      
      [0] b9dce7f1 ("arm64: kernel: force ET_DYN ELF type for ...")
      
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Peter Smith <peter.smith@linaro.org>
      Tested-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  4. Nov 30, 2018
    • arm64/lib: improve CRC32 performance for deep pipelines · efdb25ef
      Ard Biesheuvel authored
      
      
      Improve the performance of the crc32() asm routines by getting rid of
      most of the branches and small-sized loads on the common path.
      
      Instead, use a branchless code path involving overlapping 16 byte
      loads to process the first (length % 32) bytes, and process the
      remainder using a loop that processes 32 bytes at a time.
      
      Tested using the following test program:
      
        #include <stdlib.h>
      
        extern void crc32_le(unsigned short, char const*, int);
      
        int main(void)
        {
          static const char buf[4096];
      
          srand(20181126);
      
          for (int i = 0; i < 100 * 1000 * 1000; i++)
            crc32_le(0, buf, rand() % 1024);
      
          return 0;
        }
      
      On Cortex-A53 and Cortex-A57, the performance regresses, but only very
      slightly. On Cortex-A72, however, the performance improves from
      
        $ time ./crc32
      
        real  0m10.149s
        user  0m10.149s
        sys   0m0.000s
      
      to
      
        $ time ./crc32
      
        real  0m7.915s
        user  0m7.915s
        sys   0m0.000s
      
      Cc: Rui Sun <sunrui26@huawei.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: ftrace: always pass instrumented pc in x0 · 7dc48bf9
      Mark Rutland authored
      
      
      The core ftrace hooks take the instrumented PC in x0, but for some
      reason arm64's prepare_ftrace_return() takes this in x1.
      
      For consistency, let's flip the argument order and always pass the
      instrumented PC in x0.
      
      There should be no functional change as a result of this patch.
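
      Under AAPCS64 the first integer argument is passed in x0, so after
      this change the C side can be read as the sketch below (parameter
      names here are illustrative, not necessarily the kernel's):

        /* The instrumented PC now arrives first, i.e. in x0. */
        void prepare_ftrace_return(unsigned long pc, unsigned long *parent_lr,
                                   unsigned long frame_pointer);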
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Torsten Duwe <duwe@suse.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: ftrace: remove return_regs macros · 49e258e0
      Mark Rutland authored
      
      
      The save_return_regs and restore_return_regs macros are only used by
      return_to_handler, and having them defined out-of-line only serves to
      obscure the logic.
      
      Before we complicate things further, let's clean this up and fold the
      logic directly into return_to_handler, saving a few lines of macro
      boilerplate in the process. At the same time, a missing trailing space
      is added to the comments, fixing a code style violation.
      
      There should be no functional change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Torsten Duwe <duwe@suse.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: ftrace: don't adjust the LR value · 6e803e2e
      Mark Rutland authored
      
      
      The core ftrace code requires that when it is handed the PC of an
      instrumented function, this PC is the address of the instrumented
      instruction. This is necessary so that the core ftrace code can identify
      the specific instrumentation site. Since the instrumentation will be a
      BL, the address of the instrumented instruction is LR - 4 at entry to
      the ftrace code.
      
      This fixup is applied in the mcount_get_pc and mcount_get_pc0 helpers,
      which acquire the PC of the instrumented function.
      
      The mcount_get_lr helper is used to acquire the LR of the instrumented
      function, whose value does not require this adjustment, and cannot be
      adjusted to anything meaningful. No adjustment of this value is made on
      other architectures, including arm. However, arm64 adjusts this value by
      4.
      
      This patch brings arm64 in line with other architectures and removes the
      adjustment of the LR value.
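
      The PC-side fixup described above is unchanged; a minimal sketch of
      the relationship (the helper name here is made up):

        #define AARCH64_INSN_SIZE       4

        /*
         * A BL leaves LR pointing at the instruction after itself, so the
         * instrumented instruction's PC is LR minus one instruction. The
         * LR value itself is passed through unmodified.
         */
        static inline unsigned long instrumented_pc(unsigned long lr)
        {
                return lr - AARCH64_INSN_SIZE;
        }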
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Torsten Duwe <duwe@suse.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: ftrace: enable graph FP test · 5c176aff
      Mark Rutland authored
      
      
      The core ftrace code has an optional sanity check on the frame pointer
      passed by ftrace_graph_caller and return_to_handler. This is cheap,
      useful, and enabled unconditionally on x86, sparc, and riscv.
      
      Let's do the same on arm64, so that we can catch any problems early.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Torsten Duwe <duwe@suse.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: ftrace: use GLOBAL() · e4fe1966
      Mark Rutland authored
      
      
      The global exports of ftrace_call and ftrace_graph_call are somewhat
      painful to read. Let's use the generic GLOBAL() macro to ameliorate
      matters.
      
      There should be no functional change as a result of this patch.
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Torsten Duwe <duwe@suse.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • linkage: add generic GLOBAL() macro · ad697a1a
      Mark Rutland authored
      Declaring a global symbol in assembly is tedious, error-prone, and
      painful to read. While ENTRY() exists, this is supposed to be used for
      function entry points, and this affects alignment in a potentially
      undesirable manner.
      
      Instead, let's add a generic GLOBAL() macro for this, as x86 added
      locally in commit:
      
        95695547 ("x86: asm linkage - introduce GLOBAL macro")
      
      ... thus allowing us to use this more freely in the kernel.
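
      The macro itself is tiny; a sketch of the shape it takes (ASM_NL is
      the architecture's assembler statement separator, usually ';'):

        #ifndef GLOBAL
        #define GLOBAL(name) \
                .globl name ASM_NL \
                name:
        #endif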
      
      Signed-off-by: Mark Rutland <mark.rutland@arm.com>
      Cc: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Torsten Duwe <duwe@suse.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
    • arm64: drop linker script hack to hide __efistub_ symbols · dd6846d7
      Ard Biesheuvel authored
      Commit 1212f7a1 ("scripts/kallsyms: filter arm64's __efistub_
      symbols") updated the kallsyms code to filter out symbols with
      the __efistub_ prefix explicitly, so we no longer require the
      hack in our linker script to emit them as absolute symbols.
      
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Will Deacon <will.deacon@arm.com>