  1. Feb 27, 2021
    • kasan: ensure poisoning size alignment · cde8a7eb
      Andrey Konovalov authored
      A previous change d99f6a10 ("kasan: don't round_up too much") attempted
      to simplify the code by adding a round_up(size) call into kasan_poison().
      While this allows having fewer round_up() calls around the code, it
      results in round_up() being called multiple times.
      
      This patch removes round_up() of size from kasan_poison() and ensures that
      all callers round_up() the size explicitly.  This patch also adds
      WARN_ON() alignment checks for address and size to kasan_poison() and
      kasan_unpoison().
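
      As a rough, hedged illustration of the new contract (not the actual
      mm/kasan code; IS_ALIGNED(), WARN_ON() and round_up() are standard kernel
      helpers, and KASAN_GRANULE_SIZE is the granule size used in this series):

      void kasan_poison(const void *addr, size_t size, u8 value)
      {
              /* Alignment is now the caller's responsibility; warn and bail. */
              if (WARN_ON(!IS_ALIGNED((unsigned long)addr, KASAN_GRANULE_SIZE)))
                      return;
              if (WARN_ON(!IS_ALIGNED(size, KASAN_GRANULE_SIZE)))
                      return;

              /* ... write 'value' into the shadow/tag memory for the range ... */
      }

      /* Illustrative caller: the size is rounded up explicitly beforehand. */
      static void example_poison(void *object, size_t size, u8 value)
      {
              kasan_poison(object, round_up(size, KASAN_GRANULE_SIZE), value);
      }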
      
      Link: https://lkml.kernel.org/r/3ffe8d4a246ae67a8b5e91f65bf98cd7cba9d7b9.1612546384.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cde8a7eb
    • kasan, mm: optimize krealloc poisoning · d12d9ad8
      Andrey Konovalov authored
      
      
      Currently, krealloc() always calls ksize(), which unpoisons the whole
      object including the redzone.  This is inefficient, as kasan_krealloc()
      repoisons the redzone for objects that fit into the same buffer.
      
      This patch changes krealloc() instrumentation to use uninstrumented
      __ksize() that doesn't unpoison the memory.  Instead, kasan_krealloc() is
      changed to unpoison the memory excluding the redzone.
      
      For objects that don't fit into the old allocation, this patch disables
      KASAN accessibility checks when copying memory into a new object instead
      of unpoisoning it.
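
      A hedged sketch of the reworked krealloc() flow described above (the KASAN
      helpers named here exist, but the surrounding structure is illustrative
      and is not the actual slab_common.c code):

      static void *krealloc_sketch(const void *p, size_t new_size, gfp_t flags)
      {
              void *ret;
              size_t ks = p ? __ksize(p) : 0;   /* uninstrumented: no unpoison */

              if (ks >= new_size)
                      /* Unpoisons only [p, p + new_size), not the redzone. */
                      return kasan_krealloc((void *)p, new_size, flags);

              /* Doesn't fit: allocate a new object, copy with checks disabled. */
              ret = kmalloc_track_caller(new_size, flags);
              if (ret && p) {
                      kasan_disable_current();
                      memcpy(ret, p, ks);
                      kasan_enable_current();
              }
              return ret;
      }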
      
      Link: https://lkml.kernel.org/r/9bef90327c9cb109d736c40115684fd32f49e6b0.1612546384.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d12d9ad8
    • kasan, mm: fail krealloc on freed objects · 26a5ca7a
      Andrey Konovalov authored
      
      
      Currently, if krealloc() is called on a freed object with KASAN enabled,
      it allocates and returns a new object, but doesn't copy any memory from
      the old one, as ksize() returns 0.  This makes the caller believe that
      krealloc() succeeded (a KASAN report is printed, though).
      
      This patch adds an accessibility check into __do_krealloc().  If the check
      fails, krealloc() returns NULL.  This check duplicates the one in ksize();
      this is fixed in the following patch.
      
      This patch also adds a KASAN-KUnit test to check krealloc() behaviour when
      it's called on a freed object.
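
      A minimal sketch of the added check (hedged: the exact name of the KASAN
      accessibility helper is an assumption here, and the rest of the function
      is elided):

      static void *__do_krealloc_sketch(const void *p, size_t new_size, gfp_t flags)
      {
              /* Reports a KASAN error and returns false for a freed object. */
              if (unlikely(p != NULL && !kasan_check_byte(p)))
                      return NULL;    /* krealloc() now fails visibly */

              /* ... normal ksize()/allocation/copy path ... */
              return NULL;            /* placeholder for the normal path */
      }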
      
      Link: https://lkml.kernel.org/r/cbcf7b02be0a1ca11de4f833f2ff0b3f2c9b00c8.1612546384.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov...
      26a5ca7a
    • kasan: rework krealloc tests · b87c28b9
      Andrey Konovalov authored
      
      
      This patch reworks KASAN-KUnit tests for krealloc() to:
      
      1. Check both slab and page_alloc based krealloc() implementations.
      2. Allow at least one full granule to fit between old and new sizes for
         each KASAN mode, and check accesses to that granule accordingly.
      
      Link: https://lkml.kernel.org/r/c707f128a2bb9f2f05185d1eb52192cf179cf4fa.1612546384.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b87c28b9
    • kasan: unify large kfree checks · 200072ce
      Andrey Konovalov authored
      
      
      Unify checks in kasan_kfree_large() and in kasan_slab_free_mempool() for
      large allocations as it's done for small kfree() allocations.
      
      With this change, kasan_slab_free_mempool() starts checking that the first
      byte of the memory that's being freed is accessible.
      
      Link: https://lkml.kernel.org/r/14ffc4cd867e0b1ed58f7527e3b748a1b4ad08aa.1612546384.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      200072ce
    • kasan: clean up setting free info in kasan_slab_free · df54b383
      Andrey Konovalov authored
      
      
      Put kasan_stack_collection_enabled() check and kasan_set_free_info() calls
      next to each other.
      
      The way this was previously implemented was a minor optimization that
      relied on the fact that kasan_stack_collection_enabled() is always
      true for generic KASAN.  The confusion that this brings outweighs saving
      a few instructions.
      
      Link: https://lkml.kernel.org/r/f838e249be5ab5810bf54a36ef5072cfd80e2da7.1612546384.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      df54b383
    • kasan: optimize large kmalloc poisoning · 43a219cb
      Andrey Konovalov authored
      
      
      Similarly to kasan_kmalloc(), kasan_kmalloc_large() doesn't need to
      unpoison the object, as it is already unpoisoned by alloc_pages() (or by
      ksize() for krealloc()).
      
      This patch changes kasan_kmalloc_large() to only poison the redzone.
      
      Link: https://lkml.kernel.org/r/33dee5aac0e550ad7f8e26f590c9b02c6129b4a3.1612546384.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      43a219cb
    • kasan, mm: optimize kmalloc poisoning · e2db1a9a
      Andrey Konovalov authored
      
      
      For allocations from kmalloc caches, kasan_kmalloc() always follows
      kasan_slab_alloc().  Currently, both of them unpoison the whole object,
      which is unnecessary.
      
      This patch provides separate implementations for both annotations:
      kasan_slab_alloc() unpoisons the whole object, and kasan_kmalloc() only
      poisons the redzone.
      
      For generic KASAN, the redzone start might not be aligned to
      KASAN_GRANULE_SIZE.  Therefore, the poisoning is split in two parts:
      kasan_poison_last_granule() poisons the unaligned part, and then
      kasan_poison() poisons the rest.
      
      This patch also clarifies alignment guarantees of each of the poisoning
      functions and drops the unnecessary round_up() call for redzone_end.
      
      With this change, the early SLUB cache annotation needs to be changed to
      kasan_slab_alloc(), as kasan_kmalloc() doesn't unpoison objects now.  The
      number of poisoned bytes for objects in this cache stays the same, as
      kmem_cache_node->object_size is equal to sizeof(struct kmem_cache_node).
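
      For generic KASAN, the two-part redzone poisoning described above can be
      sketched roughly as follows (hedged; KASAN_KMALLOC_REDZONE and the helper
      signatures follow the changelog wording rather than the final code):

      static void poison_kmalloc_redzone_sketch(struct kmem_cache *cache,
                                                const void *object, size_t size)
      {
              unsigned long start, end;

              /* Unaligned tail of the object's last granule, handled byte-wise. */
              kasan_poison_last_granule(object, size);

              /* Granule-aligned remainder of the redzone. */
              start = round_up((unsigned long)object + size, KASAN_GRANULE_SIZE);
              end = (unsigned long)object + cache->object_size;
              kasan_poison((void *)start, end - start, KASAN_KMALLOC_REDZONE);
      }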
      
      Link: https://lkml.kernel.org/r/7e3961cb52be380bc412860332063f5f7ce10d13.1612546384.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e2db1a9a
    • kasan, mm: don't save alloc stacks twice · 92850134
      Andrey Konovalov authored
      
      
      Patch series "kasan: optimizations and fixes for HW_TAGS", v4.
      
      This patchset makes the HW_TAGS mode more efficient, mostly by reworking
      poisoning approaches and simplifying/inlining some internal helpers.
      
      With this change, the overhead of HW_TAGS annotations excluding setting
      and checking memory tags is ~3%.  The performance impact caused by tags
      will be unknown until we have hardware that supports MTE.
      
      As a side-effect, this patchset speeds up generic KASAN by ~15%.
      
      This patch (of 13):
      
      Currently KASAN saves allocation stacks in both kasan_slab_alloc() and
      kasan_kmalloc() annotations.  This patch changes KASAN to save allocation
      stacks for slab objects from kmalloc caches in kasan_kmalloc() only, and
      stacks for other slab objects in kasan_slab_alloc() only.
      
      This change requires ____kasan_kmalloc() knowing whether the object
      belongs to a kmalloc cache.  This is implemented by adding a flag field to
      the kasan_info structure.  That flag is only set for kmalloc caches via a
      new kasan_cache_create_kmalloc() annotation.
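
      Hedged sketch of that mechanism (the field name and call sites are
      illustrative; kasan_info is the existing per-cache KASAN metadata):

      struct kasan_cache {
              int alloc_meta_offset;
              int free_meta_offset;
              bool is_kmalloc;        /* new: set only for kmalloc caches */
      };

      void kasan_cache_create_kmalloc(struct kmem_cache *cache)
      {
              cache->kasan_info.is_kmalloc = true;
      }

      /*
       * kasan_slab_alloc() then saves the alloc stack only when the flag is
       * clear, and kasan_kmalloc() only when it is set, so each object's stack
       * is recorded exactly once.
       */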
      
      Link: https://lkml.kernel.org/r/cover.1612546384.git.andreyknvl@google.com
      Link: https://lkml.kernel.org/r/7c673ebca8d00f40a7ad6f04ab9a2bddeeae2097.1612546384.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Branislav Rankov <Branislav.Rankov@arm.com>
      Cc: Kevin Brodsky <kevin.brodsky@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      92850134
    • kasan: use error_report_end tracepoint · d3a61f74
      Alexander Potapenko authored
      
      
      Make it possible to trace KASAN error reporting.  A good use case is
      watching for trace events from userspace to detect and process memory
      corruption reports from the kernel.
      
      Link: https://lkml.kernel.org/r/20210121131915.1331302-4-glider@google.com
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Suggested-by: Marco Elver <elver@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d3a61f74
    • kfence: use error_report_end tracepoint · f2b84d2e
      Alexander Potapenko authored
      
      
      Make it possible to trace KFENCE error reporting.  A good use case is
      watching for trace events from userspace to detect and process memory
      corruption reports from the kernel.
      
      Link: https://lkml.kernel.org/r/20210121131915.1331302-3-glider@google.com
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Suggested-by: Marco Elver <elver@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f2b84d2e
    • tracing: add error_report_end trace point · 9c0dee54
      Alexander Potapenko authored
      
      
      Patch series "Add error_report_end tracepoint to KFENCE and KASAN", v3.
      
      This patchset adds a tracepoint, error_report_end, that is to be used by
      KFENCE, KASAN, and potentially other bug detection tools, when they print
      an error report.  One of the possible use cases is userspace collection of
      kernel error reports: interested parties can subscribe to the tracing
      event via tracefs, and get notified when an error report occurs.
      
      This patch (of 3):
      
      Introduce error_report_end tracepoint.  It can be used in debugging tools
      like KASAN, KFENCE, etc.  to provide extensions to the error reporting
      mechanisms (e.g.  allow tests hook into error reporting, ease error report
      collection from production kernels).  Another benefit would be making use
      of ftrace for debugging or benchmarking the tools themselves.
      
      Should we need it, the tracepoint name leaves us with the possibility to
      introduce a complementary error_report_start tracepoint in the future.
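
      For the userspace-collection use case, a minimal consumer can enable the
      event and read the trace pipe.  The tracefs paths below assume the event
      is exposed under an "error_report" group and that tracefs is mounted at
      /sys/kernel/tracing; both are assumptions in this sketch:

      #include <stdio.h>
      #include <string.h>

      #define ENABLE "/sys/kernel/tracing/events/error_report/error_report_end/enable"
      #define PIPE   "/sys/kernel/tracing/trace_pipe"

      int main(void)
      {
              char line[1024];
              FILE *f = fopen(ENABLE, "w");

              if (!f)
                      return 1;
              fputs("1", f);
              fclose(f);

              f = fopen(PIPE, "r");
              if (!f)
                      return 1;
              while (fgets(line, sizeof(line), f))
                      if (strstr(line, "error_report_end"))
                              printf("kernel error report: %s", line);
              fclose(f);
              return 0;
      }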
      
      Link: https://lkml.kernel.org/r/20210121131915.1331302-1-glider@google.com
      Link: https://lkml.kernel.org/r/20210121131915.1331302-2-glider@google.com
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Suggested-by: Marco Elver <elver@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9c0dee54
    • kfence: report sensitive information based on no_hash_pointers · 35beccf0
      Marco Elver authored
      
      
      We cannot rely on CONFIG_DEBUG_KERNEL to decide if we're running a "debug
      kernel" where we can safely show potentially sensitive information in the
      kernel log.
      
      Instead, simply rely on the newly introduced "no_hash_pointers" to print
      unhashed kernel pointers, as well as decide if our reports can include
      other potentially sensitive information such as registers and corrupted
      bytes.
      
      Link: https://lkml.kernel.org/r/20210223082043.1972742-1-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Cc: Timur Tabi <timur@kernel.org>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      35beccf0
    • MAINTAINERS: add entry for KFENCE · 0825c1d5
      Marco Elver authored
      
      
      Add entry for KFENCE maintainers.
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-10-elver@google.com
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Marco Elver <elver@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: SeongJae Park <sjpark@amazon.de>
      Co-developed-by: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Joern Engel <joern@purestorage.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0825c1d5
    • kfence: add test suite · bc8fbc5f
      Marco Elver authored
      
      
      Add KFENCE test suite, testing various error detection scenarios. Makes
      use of KUnit for test organization. Since KFENCE's interface to obtain
      error reports is via the console, the test verifies that KFENCE outputs
      expected reports to the console.
      
      [elver@google.com: fix typo in test]
        Link: https://lkml.kernel.org/r/X9lHQExmHGvETxY4@elver.google.com
      [elver@google.com: show access type in report]
        Link: https://lkml.kernel.org/r/20210111091544.3287013-2-elver@google.com
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-9-elver@google.com
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Marco Elver <elver@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Co-developed-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.mari...
      bc8fbc5f
    • kfence, Documentation: add KFENCE documentation · 10efe55f
      Marco Elver authored
      
      
      Add KFENCE documentation in dev-tools/kfence.rst, and add to index.
      
      [elver@google.com: add missing copyright header to documentation]
        Link: https://lkml.kernel.org/r/20210118092159.145934-4-elver@google.com
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-8-elver@google.com
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Marco Elver <elver@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Co-developed-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joern Engel <joern@purestorage.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: SeongJae Park <sjpark@amazon.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      10efe55f
    • kfence, kasan: make KFENCE compatible with KASAN · 2b830526
      Alexander Potapenko authored
      
      
      Make KFENCE compatible with KASAN. Currently this helps test KFENCE
      itself, where KASAN can catch potential corruptions to KFENCE state, or
      other corruptions that may be a result of freepointer corruptions in the
      main allocators.
      
      [akpm@linux-foundation.org: merge fixup]
      [andreyknvl@google.com: untag addresses for KFENCE]
        Link: https://lkml.kernel.org/r/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-7-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Co-developed-by: Marco Elver <elver@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joern Engel <joern@purestorage.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: SeongJae Park <sjpark@amazon.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      2b830526
    • mm, kfence: insert KFENCE hooks for SLUB · b89fb5ef
      Alexander Potapenko authored
      
      
      Inserts KFENCE hooks into the SLUB allocator.
      
      To pass the originally requested size to KFENCE, add an argument
      'orig_size' to slab_alloc*(). The additional argument is required to
      preserve the requested original size for kmalloc() allocations, which
      use size classes (e.g. an allocation of 272 bytes will return an object
      of size 512). Therefore, kmem_cache::size does not represent the
      kmalloc-caller's requested size, and we must introduce the argument
      'orig_size' to propagate the originally requested size to KFENCE.
      
      Without the originally requested size, we would not be able to detect
      out-of-bounds accesses for objects placed at the end of a KFENCE object
      page if that object is not equal to the kmalloc-size class it was
      bucketed into.
      
      When KFENCE is disabled, there is no additional overhead, since
      slab_alloc*() functions are __always_inline.
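
      The size-class effect is easy to see in isolation; this standalone sketch
      (plain userspace C, not kernel code, with a simplified power-of-two class
      function) shows why a 272-byte request lands in a 512-byte object and why
      KFENCE needs orig_size to bound accesses:

      #include <stdio.h>

      static size_t size_class(size_t n)      /* simplified kmalloc bucketing */
      {
              size_t c = 32;

              while (c < n)
                      c <<= 1;
              return c;
      }

      int main(void)
      {
              size_t orig_size = 272;
              size_t object_size = size_class(orig_size);     /* 512 */

              printf("requested %zu bytes, object is %zu bytes\n",
                     orig_size, object_size);
              printf("accesses in [%zu, %zu) are out of bounds for the caller\n",
                     orig_size, object_size);
              return 0;
      }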
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-6-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Co-developed-by: Marco Elver <elver@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joern Engel <joern@purestorage.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: SeongJae Park <sjpark@amazon.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b89fb5ef
    • mm, kfence: insert KFENCE hooks for SLAB · d3fb45f3
      Alexander Potapenko authored
      
      
      Inserts KFENCE hooks into the SLAB allocator.
      
      To pass the originally requested size to KFENCE, add an argument
      'orig_size' to slab_alloc*(). The additional argument is required to
      preserve the requested original size for kmalloc() allocations, which
      use size classes (e.g. an allocation of 272 bytes will return an object
      of size 512). Therefore, kmem_cache::size does not represent the
      kmalloc-caller's requested size, and we must introduce the argument
      'orig_size' to propagate the originally requested size to KFENCE.
      
      Without the originally requested size, we would not be able to detect
      out-of-bounds accesses for objects placed at the end of a KFENCE object
      page if that object is not equal to the kmalloc-size class it was
      bucketed into.
      
      When KFENCE is disabled, there is no additional overhead, since
      slab_alloc*() functions are __always_inline.
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-5-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Co-developed-by: Marco Elver <elver@google.com>
      
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Joern Engel <joern@purestorage.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: SeongJae Park <sjpark@amazon.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d3fb45f3
    • kfence: use pt_regs to generate stack trace on faults · d438fabc
      Marco Elver authored
      
      
      Instead of removing the fault handling portion of the stack trace based on
      the fault handler's name, just use struct pt_regs directly.
      
      Change kfence_handle_page_fault() to take a struct pt_regs, and plumb it
      through to kfence_report_error() for out-of-bounds, use-after-free, or
      invalid access errors, where pt_regs is used to generate the stack trace.
      
      If the kernel is a DEBUG_KERNEL, also show registers for more information.
      
      Link: https://lkml.kernel.org/r/20201105092133.2075331-1-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Suggested-by: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Jann Horn <jannh@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d438fabc
    • arm64, kfence: enable KFENCE for ARM64 · 840b2398
      Marco Elver authored
      
      
      Add architecture specific implementation details for KFENCE and enable
      KFENCE for the arm64 architecture. In particular, this implements the
      required interface in <asm/kfence.h>.
      
      KFENCE requires that attributes for pages from its memory pool can
      individually be set. Therefore, force the entire linear map to be mapped
      at page granularity. Doing so may result in extra memory allocated for
      page tables in case rodata=full is not set; however, currently
      CONFIG_RODATA_FULL_DEFAULT_ENABLED=y is the default, and the common case
      is therefore not affected by this change.
      
      [elver@google.com: add missing copyright and description header]
        Link: https://lkml.kernel.org/r/20210118092159.145934-3-elver@google.com
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-4-elver@google.com
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Signed-off-by: Marco Elver <elver@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Co-developed-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Reviewed-by: Mark Rutland <mark.rutland@arm.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joern Engel <joern@purestorage.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: SeongJae Park <sjpark@amazon.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      840b2398
    • x86, kfence: enable KFENCE for x86 · 1dc0da6e
      Alexander Potapenko authored
      
      
      Add architecture specific implementation details for KFENCE and enable
      KFENCE for the x86 architecture. In particular, this implements the
      required interface in <asm/kfence.h> for setting up the pool and
      providing helper functions for protecting and unprotecting pages.
      
      For x86, we need to ensure that the pool uses 4K pages, which is done
      using the set_memory_4k() helper function.
      
      [elver@google.com: add missing copyright and description header]
        Link: https://lkml.kernel.org/r/20210118092159.145934-2-elver@google.com
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-3-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Co-developed-by: Marco Elver <elver@google.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Joern Engel <joern@purestorage.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: SeongJae Park <sjpark@amazon.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1dc0da6e
    • mm: add Kernel Electric-Fence infrastructure · 0ce20dd8
      Alexander Potapenko authored
      
      
      Patch series "KFENCE: A low-overhead sampling-based memory safety error detector", v7.
      
      This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
      low-overhead sampling-based memory safety error detector of heap
      use-after-free, invalid-free, and out-of-bounds access errors.  This
      series enables KFENCE for the x86 and arm64 architectures, and adds
      KFENCE hooks to the SLAB and SLUB allocators.
      
      KFENCE is designed to be enabled in production kernels, and has near
      zero performance overhead. Compared to KASAN, KFENCE trades performance
      for precision. The main motivation behind KFENCE's design is that with
      enough total uptime KFENCE will detect bugs in code paths not typically
      exercised by non-production test workloads. One way to quickly achieve a
      large enough total uptime is when the tool is deployed across a large
      fleet of machines.
      
      KFENCE objects each reside on a dedicated page, at either the left or
      right page boundaries. The pages to the left and right of the object
      page are "guard pages", whose attributes are changed to a protected
      state, and cause page faults on any attempted access to them. Such page
      faults are then intercepted by KFENCE, which handles the fault
      gracefully by reporting a memory access error.
      
      Guarded allocations are set up based on a sample interval (can be set
      via kfence.sample_interval). After expiration of the sample interval,
      the next allocation through the main allocator (SLAB or SLUB) returns a
      guarded allocation from the KFENCE object pool. At this point, the timer
      is reset, and the next allocation is set up after the expiration of the
      interval.
      
      To enable/disable a KFENCE allocation through the main allocator's
      fast-path without overhead, KFENCE relies on static branches via the
      static keys infrastructure. The static branch is toggled to redirect the
      allocation to KFENCE.
      
      The KFENCE memory pool is of fixed size, and if the pool is exhausted no
      further KFENCE allocations occur. The default config is conservative
      with only 255 objects, resulting in a pool size of 2 MiB (with 4 KiB
      pages).
      
      We have verified by running synthetic benchmarks (sysbench I/O,
      hackbench) and production server-workload benchmarks that a kernel with
      KFENCE (using sample intervals 100-500ms) is performance-neutral
      compared to a non-KFENCE baseline kernel.
      
      KFENCE is inspired by GWP-ASan [1], a userspace tool with similar
      properties. The name "KFENCE" is a homage to the Electric Fence Malloc
      Debugger [2].
      
      For more details, see Documentation/dev-tools/kfence.rst added in the
      series -- also viewable here:
      
      	https://raw.githubusercontent.com/google/kasan/kfence/Documentation/dev-tools/kfence.rst
      
      [1] http://llvm.org/docs/GwpAsan.html
      [2] https://linux.die.net/man/3/efence
      
      This patch (of 9):
      
      This adds the Kernel Electric-Fence (KFENCE) infrastructure. KFENCE is a
      low-overhead sampling-based memory safety error detector of heap
      use-after-free, invalid-free, and out-of-bounds access errors.
      
      KFENCE is designed to be enabled in production kernels, and has near
      zero performance overhead. Compared to KASAN, KFENCE trades performance
      for precision. The main motivation behind KFENCE's design is that with
      enough total uptime KFENCE will detect bugs in code paths not typically
      exercised by non-production test workloads. One way to quickly achieve a
      large enough total uptime is when the tool is deployed across a large
      fleet of machines.
      
      KFENCE objects each reside on a dedicated page, at either the left or
      right page boundaries. The pages to the left and right of the object
      page are "guard pages", whose attributes are changed to a protected
      state, and cause page faults on any attempted access to them. Such page
      faults are then intercepted by KFENCE, which handles the fault
      gracefully by reporting a memory access error. To detect out-of-bounds
      writes to memory within the object's page itself, KFENCE also uses
      pattern-based redzones. The following figure illustrates the page
      layout:
      
        ---+-----------+-----------+-----------+-----------+-----------+---
           | xxxxxxxxx | O :       | xxxxxxxxx |       : O | xxxxxxxxx |
           | xxxxxxxxx | B :       | xxxxxxxxx |       : B | xxxxxxxxx |
           | x GUARD x | J : RED-  | x GUARD x | RED-  : J | x GUARD x |
           | xxxxxxxxx | E :  ZONE | xxxxxxxxx |  ZONE : E | xxxxxxxxx |
           | xxxxxxxxx | C :       | xxxxxxxxx |       : C | xxxxxxxxx |
           | xxxxxxxxx | T :       | xxxxxxxxx |       : T | xxxxxxxxx |
        ---+-----------+-----------+-----------+-----------+-----------+---
      
      Guarded allocations are set up based on a sample interval (can be set
      via kfence.sample_interval). After expiration of the sample interval, a
      guarded allocation from the KFENCE object pool is returned to the main
      allocator (SLAB or SLUB). At this point, the timer is reset, and the
      next allocation is set up after the expiration of the interval.
      
      To enable/disable a KFENCE allocation through the main allocator's
      fast-path without overhead, KFENCE relies on static branches via the
      static keys infrastructure. The static branch is toggled to redirect the
      allocation to KFENCE. To date, we have verified by running synthetic
      benchmarks (sysbench I/O, hackbench) that a kernel compiled with KFENCE
      is performance-neutral compared to the non-KFENCE baseline.
      
      For more details, see Documentation/dev-tools/kfence.rst (added later in
      the series).
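
      The static-key gating plus sample-interval behaviour can be summarised in
      a hedged sketch (the static-key and workqueue APIs are real kernel
      facilities; the KFENCE-internal names here are illustrative):

      DEFINE_STATIC_KEY_FALSE(kfence_allocation_key);

      void *kfence_alloc_sketch(struct kmem_cache *s, size_t size, gfp_t flags)
      {
              /* Fast path: a single patched branch while no sample is due. */
              if (!static_branch_unlikely(&kfence_allocation_key))
                      return NULL;    /* take the normal SLAB/SLUB path */

              /*
               * Slow path: close the gate again and hand out one guarded
               * object from the fixed KFENCE pool (illustrative helper).
               */
              return kfence_guarded_alloc(s, size, flags);
      }

      /* A work item re-opens the gate after each kfence.sample_interval. */
      static void toggle_allocation_gate_sketch(struct work_struct *work)
      {
              static_branch_enable(&kfence_allocation_key);
      }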
      
      [elver@google.com: fix parameter description for kfence_object_start()]
        Link: https://lkml.kernel.org/r/20201106092149.GA2851373@elver.google.com
      [elver@google.com: avoid stalling work queue task without allocations]
        Link: https://lkml.kernel.org/r/CADYN=9J0DQhizAGB0-jz4HOBBh+05kMBXb4c0cXMS7Qi5NAJiw@mail.gmail.com
        Link: https://lkml.kernel.org/r/20201110135320.3309507-1-elver@google.com
      [elver@google.com: fix potential deadlock due to wake_up()]
        Link: https://lkml.kernel.org/r/000000000000c0645805b7f982e4@google.com
        Link: https://lkml.kernel.org/r/20210104130749.1768991-1-elver@google.com
      [elver@google.com: add option to use KFENCE without static keys]
        Link: https://lkml.kernel.org/r/20210111091544.3287013-1-elver@google.com
      [elver@google.com: add missing copyright and description headers]
        Link: https://lkml.kernel.org/r/20210118092159.145934-1-elver@google.com
      
      Link: https://lkml.kernel.org/r/20201103175841.3495947-2-elver@google.com
      Signed-off-by: Marco Elver <elver@google.com>
      Signed-off-by: Alexander Potapenko <glider@google.com>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: SeongJae Park <sjpark@amazon.de>
      Co-developed-by: Marco Elver <elver@google.com>
      Reviewed-by: Jann Horn <jannh@google.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Paul E. McKenney <paulmck@kernel.org>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christopher Lameter <cl@linux.com>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hillf Danton <hdanton@sina.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Joern Engel <joern@purestorage.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0ce20dd8
    • mm/early_ioremap.c: use __func__ instead of function name · 87005394
      Stephen Zhang authored
      
      
      It is better to use __func__ instead of the function name.
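
      For example (illustrative message only):

              /* before: the function name is hard-coded in the format string */
              pr_warn("early_ioremap: mapping requested too early\n");

              /* after: __func__ follows renames automatically */
              pr_warn("%s: mapping requested too early\n", __func__);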
      
      Link: https://lkml.kernel.org/r/1611385587-4209-1-git-send-email-stephenzhangzsd@gmail.com
      Signed-off-by: Stephen Zhang <stephenzhangzsd@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      87005394
    • mm/backing-dev.c: use might_alloc() · c1ca59a1
      Daniel Vetter authored
      
      
      Now that my little helper has landed, use it more.  On top of the existing
      check this also uses lockdep through the fs_reclaim annotations.
      
      [akpm@linux-foundation.org: include linux/sched/mm.h]
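
      might_alloc() is a one-line annotation at the top of a function that may
      allocate; a hedged example of the pattern (struct example_ctx is
      hypothetical, might_alloc() comes from <linux/sched/mm.h>):

      static int example_setup(struct example_ctx *ctx, gfp_t gfp)
      {
              /*
               * Documents "may allocate with gfp" and hooks into lockdep via
               * the fs_reclaim annotations.
               */
              might_alloc(gfp);

              ctx->buf = kmalloc(PAGE_SIZE, gfp);
              return ctx->buf ? 0 : -ENOMEM;
      }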
      
      Link: https://lkml.kernel.org/r/20210113135009.3606813-2-daniel.vetter@ffwll.ch
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c1ca59a1
    • mm/dmapool: use might_alloc() · 0f2f89b6
      Daniel Vetter authored
      
      
      Now that my little helper has landed, use it more.  On top of the existing
      check this also uses lockdep through the fs_reclaim annotations.
      
      Link: https://lkml.kernel.org/r/20210113135009.3606813-1-daniel.vetter@ffwll.ch
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0f2f89b6
    • mm: page-flags.h: Typo fix (It -> If) · 4be408ce
      Guo Ren authored
      
      
      The "If" was wrongly spelled as "It".
      
      Link: https://lkml.kernel.org/r/1608959036-91409-1-git-send-email-guoren@kernel.org
      Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Steven Price <steven.price@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      4be408ce
    • mm/zsmalloc.c: use page_private() to access page->private · a6c5e0f7
      Miaohe Lin authored
      
      
      It's recommended to use the helper macro page_private() to access the
      private field of a page.  Use the helper to eliminate direct access.
      
      Link: https://lkml.kernel.org/r/20210203091857.20017-1-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a6c5e0f7
    • zsmalloc: account the number of compacted pages correctly · 23959281
      Rokudo Yan authored
      There are multiple paths that may do zram compaction concurrently:
      1. auto-compaction triggered during memory reclaim
      2. userspace utils writing the zram<id>/compaction node
      
      So multiple threads may call zs_shrinker_scan()/zs_compact() concurrently.
      But pages_compacted is a per zsmalloc-pool variable, and modification of
      the variable is not serialized (though it is done under class->lock).
      There are two issues here:
      1. pages_compacted may not equal the total number of pages freed
         (due to concurrent additions).
      2. zs_shrinker_scan() may not return the correct number of pages freed
         by the current shrinker invocation.
      
      The fix is simple:
      1. account the number of pages freed in zs_compact() locally;
      2. use the atomic variable pages_compacted to accumulate the total.
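
      In outline, the fix looks something like this (hedged sketch following the
      wording above, not the final code):

      static unsigned long zs_compact_sketch(struct zs_pool *pool)
      {
              unsigned long pages_freed = 0;

              /* ... for each size class, under class->lock: pages_freed += n ... */

              /* Publish the total once, via an atomic counter. */
              atomic_long_add(pages_freed, &pool->stats.pages_compacted);

              /* zs_shrinker_scan() can return exactly what this call freed. */
              return pages_freed;
      }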
      
      Link: https://lkml.kernel.org/r/20210202122235.26885-1-wu-yan@tcl.com
      Fixes: 860c707d ("zsmalloc: account the number of compacted pages")
      Signed-off-by: Rokudo Yan <wu-yan@tcl.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      23959281
    • mm/zsmalloc.c: convert to use kmem_cache_zalloc in cache_alloc_zspage() · f0231305
      Miaohe Lin authored
      
      
      We always memset the zspage allocated via cache_alloc_zspage().  So it's
      more convenient to use kmem_cache_zalloc() in cache_alloc_zspage() than
      to have the caller do it manually.
      
      Link: https://lkml.kernel.org/r/20210114120032.25885-1-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f0231305
    • mm: set the sleep_mapped to true for zbud and z3fold · e818e820
      Tian Tao authored
      
      
      zpool driver adds a flag to indicate whether the zpool driver can enter an
      atomic context after mapping.  This patch sets it true for z3fold and
      zbud.
      
      Link: https://lkml.kernel.org/r/1611035683-12732-3-git-send-email-tiantao6@hisilicon.com
      Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reported-by: Mike Galbraith <efault@gmx.de>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Barry Song <song.bao.hua@hisilicon.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e818e820
    • mm/zswap: add the flag can_sleep_mapped · fc6697a8
      Tian Tao authored
      
      
      Patch series "Fix the compatibility of zsmalloc and zswap".
      
      Patch #1 adds a flag to zpool, then zswap used to determine if zpool
      drivers such as zbud/z3fold/zsmalloc will enter an atomic context after
      mapping.
      
      The difference between zbud/z3fold and zsmalloc is that zsmalloc requires
      an atomic context, since its map function runs with preemption disabled,
      while zbud/z3fold do not.  So patch #2 sets the sleep_mapped flag to true,
      indicating that zbud/z3fold can sleep after mapping.  zsmalloc doesn't
      support sleeping after mapping, so the flag is not set to true for it.
      
      This patch (of 2):
      
      Add a flag to zpool, named "can_sleep_mapped", and set it to true for
      zbud/z3fold; the flag is not set for zsmalloc, so its default value is
      false.  zswap can then take the current path if the flag is true; if it
      is false, it copies the data from src to a temporary buffer, unmaps the
      handle, takes the mutex, and processes the buffer instead of src, to
      avoid a sleeping function being called from atomic context.
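
      The load-side logic this describes, in hedged outline (zpool_map_handle()
      and zpool_unmap_handle() are existing zpool APIs, zpool_can_sleep_mapped()
      is the query this series adds; the rest is illustrative):

      static int zswap_load_sketch(struct zpool *pool, unsigned long handle,
                                   size_t length, u8 *tmp)
      {
              u8 *src = zpool_map_handle(pool, handle, ZPOOL_MM_RO);

              if (!zpool_can_sleep_mapped(pool)) {
                      /* Can't sleep while mapped: copy out and unmap first. */
                      memcpy(tmp, src, length);
                      zpool_unmap_handle(pool, handle);
                      src = tmp;
              }

              /* ... take the (sleeping) mutex and decompress from 'src' ... */

              if (zpool_can_sleep_mapped(pool))
                      zpool_unmap_handle(pool, handle);
              return 0;
      }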
      
      [natechancellor@gmail.com: add return value in zswap_frontswap_load]
        Link: https://lkml.kernel.org/r/20210121214804.926843-1-natechancellor@gmail.com
      [tiantao6@hisilicon.com: fix potential memory leak]
        Link: https://lkml.kernel.org/r/1611538365-51811-1-git-send-email-tiantao6@hisilicon.com
      [colin.king@canonical.com: fix potential uninitialized pointer read on tmp]
        Link: https://lkml.kernel.org/r/20210128141728.639030-1-colin.king@canonical.com
      [tiantao6@hisilicon.com: fix variable 'entry' is uninitialized when used]
        Link: https://lkml.kernel.org/r/1611223030-58346-1-git-send-email-tiantao6@hisilicon.com
      Link: https://lkml.kernel.org/r/1611035683-12732-1-git-send-email-tiantao6@hisilicon.com
      
      Link: https://lkml.kernel.org/r/1611035683-12732-2-git-send-email-tiantao6@hisilicon.com
      Signed-off-by: default avatarTian Tao <tiantao6@hisilicon.com>
      Signed-off-by: default avatarNathan Chancellor <natechancellor@gmail.com>
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Reviewed-by: default avatarVitaly Wool <vitaly.wool@konsulko.com>
      Acked-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Reported-by: default avatarMike Galbraith <efault@gmx.de>
      Cc: Barry Song <song.bao.hua@hisilicon.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fc6697a8
    • Randy Dunlap's avatar
      mm: zswap: clean up confusing comment · c0c641d7
      Randy Dunlap authored
      Correct the wording and change one duplicated word ("it") to "it is".
      
      Link: https://lkml.kernel.org/r/20201221042848.13980-1-rdunlap@infradead.org
      Fixes: 0ab0abcf
      
       ("mm/zswap: refactor the get/put routines")
      Signed-off-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Cc: Weijie Yang <weijie.yang@samsung.com>
      Cc: Seth Jennings <sjennings@variantweb.net>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      c0c641d7
    • Miaohe Lin's avatar
      mm/rmap: fix potential pte_unmap on an not mapped pte · 5d5d19ed
      Miaohe Lin authored
      For a PMD-mapped page (usually a THP), pvmw->pte is NULL.  For a
      PTE-mapped THP, pvmw->pte is mapped.  But for HugeTLB pages, pvmw->pte
      is not mapped, although it is set to the relevant page table entry.  So
      page_vma_mapped_walk_done() may call pte_unmap() on a HugeTLB pte that
      was never mapped.  Fix this by checking pvmw->page with PageHuge()
      before calling pte_unmap().
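      
      The fix amounts to a guard in page_vma_mapped_walk_done(); a sketch
      (based on include/linux/rmap.h, shown for illustration):
      
        static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
        {
                /* HugeTLB pte is set to the relevant page table entry, but it
                 * was never mapped, so it must not be pte_unmap()ed. */
                if (pvmw->pte && !PageHuge(pvmw->page))
                        pte_unmap(pvmw->pte);
                if (pvmw->ptl)
                        spin_unlock(pvmw->ptl);
        }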
      
      Link: https://lkml.kernel.org/r/20210127093349.39081-1-linmiaohe@huawei.com
      Fixes: ace71a19
      
       ("mm: introduce page_vma_mapped_walk()")
      Signed-off-by: default avatarHongxiang Lou <louhongxiang@huawei.com>
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Tested-by: default avatarSedat Dilek <sedat.dilek@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Nathan Chancellor <natechancellor@gmail.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Wei Yang <richard.weiyang@linux.alibaba.com>
      Cc: Dmitry Safonov <0x7f454c46@gmail.com>
      Cc: Brian Geffon <bgeffon@google.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5d5d19ed
    • Miaohe Lin's avatar
      mm/rmap: correct obsolete comment of page_get_anon_vma() · ad8a20cf
      Miaohe Lin authored
      Since commit 746b18d4
      
       ("mm: use refcounts for page_lock_anon_vma()"),
      page_lock_anon_vma() has been renamed to page_get_anon_vma() and
      converted to return an anon_vma with an increased refcount, but the
      relevant comment was not updated.
      
      Link: https://lkml.kernel.org/r/20210203093215.31990-1-linmiaohe@huawei.com
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ad8a20cf
    • Miaohe Lin's avatar
      mm/rmap: use page_not_mapped in try_to_unmap() · b7e188ec
      Miaohe Lin authored
      
      
      page_mapcount_is_zero() computes exactly how many mappings a hugepage
      has, only to compare the result against 0.  This is a waste of cpu
      time.  We can use page_not_mapped() instead and save some atomic_read
      cycles.  Remove page_mapcount_is_zero() as it is no longer used, and
      move page_not_mapped() above try_to_unmap() to avoid an "identifier
      undeclared" compilation error.
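      
      For illustration, the check and its use as the rmap-walk termination
      callback look roughly like this (member names follow mm/rmap.c of that
      era; treat this as a sketch rather than the exact diff):
      
        static int page_not_mapped(struct page *page)
        {
                /* Only asks "is any mapping left?", letting page_mapped()
                 * bail out early instead of computing an exact mapcount. */
                return !page_mapped(page);
        }
      
        /* In try_to_unmap(): */
        struct rmap_walk_control rwc = {
                .rmap_one  = try_to_unmap_one,
                .arg       = (void *)flags,
                .done      = page_not_mapped,
                .anon_lock = page_lock_anon_vma_read,
        };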
      
      Link: https://lkml.kernel.org/r/20210130084904.35307-1-linmiaohe@huawei.com
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      b7e188ec
    • Miaohe Lin's avatar
      mm/rmap: fix obsolete comment in __page_check_anon_rmap() · 90aaca85
      Miaohe Lin authored
      Commit 21333b2b
      
       ("ksm: no debug in page_dup_rmap()") has reverted
      page_dup_rmap() to an inline atomic_inc of mapcount.  So page_dup_rmap()
      does not call __page_check_anon_rmap() anymore.
      
      Link: https://lkml.kernel.org/r/20210128110209.50857-1-linmiaohe@huawei.com
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      90aaca85
    • Miaohe Lin's avatar
      mm/rmap: remove unneeded semicolon in page_not_mapped() · e0af87ff
      Miaohe Lin authored
      
      
      Remove extra semicolon without any functional change intended.
      
      Link: https://lkml.kernel.org/r/20210127093425.39640-1-linmiaohe@huawei.com
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e0af87ff
    • Miaohe Lin's avatar
      mm/rmap: correct some obsolete comments of anon_vma · aaf1f990
      Miaohe Lin authored
      Commit 2b575eb6 ("mm: convert anon_vma->lock to a mutex") changed the
      spinlock used to serialize access to the vma list into a mutex, and
      commit 5a505085
      
       ("mm/rmap: Convert the struct anon_vma::mutex to an
      rwsem") further converted that mutex into an rwsem to address a
      scalability problem.  Replace "spinlock" with "rwsem" to bring the
      comments up to date.
      
      Link: https://lkml.kernel.org/r/20210123072459.25903-1-linmiaohe@huawei.com
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Cc: Rik van Riel <riel@surriel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      aaf1f990
    • Miaohe Lin's avatar
      mm/mlock: stop counting mlocked pages when none vma is found · 48b03eea
      Miaohe Lin authored
      
      
      When find_vma() returns NULL, there is no vma satisfying addr < vm_end.
      It is therefore pointless to traverse the vma list below, because there
      is no vma in which to count mlocked pages.  Stop counting mlocked pages
      in this case to save some vma-list traversal cycles.
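      
      A sketch of the early return in count_mm_mlocked_page_nr() (the
      surrounding loop is shown for illustration; the old code fell back to
      mm->mmap when find_vma() returned NULL):
      
        vma = find_vma(mm, start);
        if (vma == NULL)
                return 0;       /* no vma overlaps the range, nothing to count */
      
        for (; vma; vma = vma->vm_next) {
                /* ... count mlocked pages overlapping [start, start + len) ... */
        }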
      
      Link: https://lkml.kernel.org/r/20210204110705.17586-1-linmiaohe@huawei.com
      Signed-off-by: default avatarMiaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: default avatarDavid Hildenbrand <david@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      48b03eea