  1. Mar 25, 2022
      mm: streamline COW logic in do_swap_page() · c145e0b4
      David Hildenbrand authored
      
      
      Currently we have a different COW logic when:
      * triggering a read-fault to swapin first and then triggering a write-fault
        -> do_swap_page() + do_wp_page()
      * triggering a write-fault to swapin
        -> do_swap_page() + do_wp_page() only if we fail reuse in do_swap_page()
      
      The COW logic in do_swap_page() is different from our reuse logic in
      do_wp_page().  The COW logic in do_wp_page() -- page_count() == 1 --
      currently makes sure that we certainly don't have a remaining reference, e.g.,
      via GUP, on the target page we want to reuse: if there is any unexpected
      reference, we have to copy to avoid information leaks.
      
      As do_swap_page() behaves differently, in environments with swap enabled
      we can currently have an unintended information leak from the parent to
      the child, similar to what is known from CVE-2020-29374:
      
      	1. Parent writes to anonymous page
      	-> Page is mapped writable and modified
      	2. Page is swapped out
      	-> Page is unmapped and replaced by swap entry
      	3. fork()
      	-> Swap entries are copied to child
      	4. Child pins page R/O
      	-> Page is mapped R/O into child
      	5. Child unmaps page
      	-> Child still holds GUP reference
      	6. Parent writes to page
      	-> Page is reused in do_swap_page()
      	-> Child can observe changes
      
      Exchanging 2. and 3. should have the same effect.
      
      Let's apply the same COW logic as in do_wp_page(), conditionally trying to
      remove the page from the swapcache after freeing the swap entry, however,
      before actually mapping our page.  We can change the order now that we use
      try_to_free_swap(), which doesn't care about the mapcount, instead of
      reuse_swap_page().
      
      To handle references from the LRU pagevecs, conditionally drain the local
      LRU pagevecs when required, however, don't consider the page_count() when
      deciding whether to drain to keep it simple for now.
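      To illustrate the resulting rule, here is a minimal sketch assuming the
      v5.18-era page APIs (PageLRU(), lru_add_drain(), PageSwapCache(),
      try_to_free_swap(), page_count()); the helper name is made up and the
      real code performs the swapcache removal under the page lock:

          /*
           * Hedged sketch, not the literal diff: mirror the do_wp_page()
           * rule in do_swap_page() -- only map the swapped-in page writable
           * if we certainly hold the last reference.
           */
          static bool swapin_can_reuse(struct page *page)
          {
                  /* References from the local LRU pagevecs would block reuse. */
                  if (!PageLRU(page))
                          lru_add_drain();

                  /* Drop the swapcache reference before the final decision. */
                  if (PageSwapCache(page))
                          try_to_free_swap(page);

                  /* Any unexpected reference (e.g., GUP) forces a copy. */
                  return page_count(page) == 1;
          }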
      
      Link: https://lkml.kernel.org/r/20220131162940.210846-5-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Don Dutile <ddutile@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Liang Zhang <zhangliang5@huawei.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      mm: slightly clarify KSM logic in do_swap_page() · 84d60fdd
      David Hildenbrand authored
      
      
      Let's make it clearer that KSM might only have to copy a page in case we
      have a page in the swapcache, not if we allocated a fresh page and
      bypassed the swapcache.  While at it, add a comment why this is usually
      necessary and merge the two swapcache conditions.
      
      [akpm@linux-foundation.org: fix comment, per David]
      
      Link: https://lkml.kernel.org/r/20220131162940.210846-4-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Don Dutile <ddutile@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Liang Zhang <zhangliang5@huawei.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      mm: optimize do_wp_page() for fresh pages in local LRU pagevecs · d4c47097
      David Hildenbrand authored
      
      
      For example, if a page just got swapped in via a read fault, the LRU
      pagevecs might still hold a reference to the page.  If we trigger a write
      fault on such a page, the additional reference from the LRU pagevecs will
      prohibit reusing the page.
      
      Let's conditionally drain the local LRU pagevecs when we stumble over a
      !PageLRU() page.  We cannot easily drain remote LRU pagevecs and it might
      not be desirable performance-wise.  Consequently, this will only avoid
      copying in some cases.
      
      Add a simple "page_count(page) > 3" check first but keep the
      "page_count(page) > 1 + PageSwapCache(page)" check in place, as we want to
      minimize cases where we remove a page from the swapcache but won't be able
      to reuse it, for example, because another process has it mapped R/O, to
      not affect reclaim.
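      A minimal sketch of the resulting ordering of checks, assuming the
      v5.18-era page APIs; the helper name is made up and details such as
      locking the page are left to the caller:

          /* Hedged sketch of the pre-checks before do_wp_page() tries reuse. */
          static bool wp_can_try_reuse(struct page *page)
          {
                  /* Cheap upper bound: LRU + swapcache + our mapping. */
                  if (page_count(page) > 3)
                          return false;

                  /* The page may still sit in the local LRU pagevecs. */
                  if (!PageLRU(page))
                          lru_add_drain();

                  /*
                   * Keep the stricter check so we only drop the swapcache
                   * entry when reuse is actually likely to succeed.
                   */
                  return page_count(page) <= 1 + PageSwapCache(page);
          }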
      
      We cannot easily handle the following cases and we will always have to
      copy:
      
      (1) The page is referenced in the LRU pagevecs of other CPUs. We really
          would have to drain the LRU pagevecs of all CPUs -- most probably
          copying is much cheaper.
      
      (2) The page is already PageLRU() but is getting moved between LRU
          lists, for example, for activation (e.g., mark_page_accessed()),
          deactivation (MADV_COLD), or lazyfree (MADV_FREE). We'd have to
          drain mostly unconditionally, which might be bad performance-wise.
          Most probably this won't happen too often in practice.
      
      Note that there are other reasons why an anon page might temporarily not
      be PageLRU(): for example, compaction and migration have to isolate LRU
      pages from the LRU lists first (isolate_lru_page()), moving them to
      temporary local lists and clearing PageLRU() and holding an additional
      reference on the page.  In that case, we'll always copy.
      
      This change seems to be fairly effective with the reproducer [1] shared by
      Nadav, as long as writeback is done synchronously, for example, using
      zram.  However, with asynchronous writeback, we'll usually fail to free
      the swapcache because the page is still under writeback: something we
      cannot easily optimize for, and maybe it's not really relevant in
      practice.
      
      [1] https://lkml.kernel.org/r/0480D692-D9B2-429A-9A88-9BBA1331AC3A@gmail.com
      
      Link: https://lkml.kernel.org/r/20220131162940.210846-3-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Don Dutile <ddutile@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jan Kara <jack@suse.cz>
      Cc: Jann Horn <jannh@google.com>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Liang Zhang <zhangliang5@huawei.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      mm: optimize do_wp_page() for exclusive pages in the swapcache · 53a05ad9
      David Hildenbrand authored
      Patch series "mm: COW fixes part 1: fix the COW security issue for THP and swap", v3.
      
      This series attempts to optimize and streamline the COW logic for ordinary
      anon pages and THP anon pages, fixing two remaining instances of
      CVE-2020-29374 in do_swap_page() and do_huge_pmd_wp_page(): information
      can leak from a parent process to a child process via anonymous pages
      shared during fork().
      
      This issue, including other related COW issues, has been summarized in [2]:
      
       "1. Observing Memory Modifications of Private Pages From A Child Process
      
        Long story short: process-private memory might not be as private as you
        think once you fork(): successive modifications of private memory
        regions in the parent process can still be observed by the child
        process, for example, by smart use of vmsplice()+munmap().
      
        The core problem is that pinning pages readable in a child process, such
        as done via the vmsplice system call, can result in a child process
        observing memory modifications done in the parent process the child is
        not supposed to observe. [1] contains an excellent summary and [2]
        contains further details. This issue was assigned CVE-2020-29374 [9].
      
        For this to trigger, it's required to use a fork() without subsequent
        exec(), for example, as used under Android zygote. Without further
        details about an application that forks less-privileged child processes,
        one cannot really say what's actually affected and what's not -- see the
        details section at the end of this mail for a short sshd/openssh analysis.
      
        While commit 17839856 ("gup: document and work around "COW can break
        either way" issue") fixed this issue and resulted in other problems
        (e.g., ptrace on pmem), commit 09854ba9 ("mm: do_wp_page()
        simplification") re-introduced part of the problem unfortunately.
      
        The original reproducer can be modified quite easily to use THP [3] and
        make the issue appear again on upstream kernels. I modified it to use
        hugetlb [4] and it triggers as well. The problem is certainly less
        severe with hugetlb than with THP; it merely highlights that we still
        have plenty of open holes we should be closing/fixing.
      
        Regarding vmsplice(), the only known workaround is to disallow the
        vmsplice() system call ... or disable THP and hugetlb. But who knows
        what else is affected (RDMA? O_DIRECT?) to achieve the same goal -- in
        the end, it's a more generic issue"
      
      This security issue was first reported by Jann Horn on 27 May 2020 and it
      currently affects anonymous pages during swapin, anonymous THP and hugetlb.
      This series tackles anonymous pages during swapin and anonymous THP:
      
       - do_swap_page() for handling COW on PTEs during swapin directly
      
       - do_huge_pmd_wp_page() for handling COW on PMD-mapped THP during write
         faults
      
      With this series, we'll apply the same COW logic we have in do_wp_page()
      to all swappable anon pages: don't reuse (map writable) the page in
      case there are additional references (page_count() != 1). All users of
      reuse_swap_page() are removed, and consequently reuse_swap_page() is
      removed.
      
      In general, we're struggling with the following COW-related issues:
      
      (1) "missed COW": we miss to copy on write and reuse the page (map it
          writable) although we must copy because there are pending references
          from another process to this page. The result is a security issue.
      
      (2) "wrong COW": we copy on write although we wouldn't have to and
          shouldn't: if there are valid GUP references, they will become out
          of sync with the pages mapped into the page table. We fail to detect
          that such a page can be reused safely, especially if never more than
          a single process mapped the page. The result is an intra process
          memory corruption.
      
      (3) "unnecessary COW": we copy on write although we wouldn't have to:
          performance degradation and temporarily increased swap+memory
          consumption can be the result.
      
      While this series fixes (1) for swappable anon pages, it tries to reduce
      reported cases of (3) first, as well and as easily as possible, to limit the
      impact when streamlining.  The individual patches try to describe in
      which cases we will run into (3).
      
      This series certainly makes (2) worse for THP, because a THP will now
      get PTE-mapped on write faults if there are additional references, even
      if there was only ever a single process involved: once PTE-mapped, we'll
      copy each and every subpage and won't reuse any subpage as long as the
      underlying compound page wasn't split.
      
      I'm working on an approach to fix (2) and improve (3): PageAnonExclusive
      to mark anon pages that are exclusive to a single process, allow GUP
      pins only on such exclusive pages, and allow turning exclusive pages
      shared (clearing PageAnonExclusive) only if there are no GUP pins.  Anon
      pages with PageAnonExclusive set never have to be copied during write
      faults, but eventually during fork() if they cannot be turned shared.
      The improved reuse logic in this series will essentially also be the
      logic to reset PageAnonExclusive.  This work will certainly take a
      while, but I'm planning on sharing details before having code fully
      ready.
      
      #1-#5 can be applied independently of the rest. #6-#9 are mostly only
      cleanups related to reuse_swap_page().
      
      Notes:
      * For now, I'll leave hugetlb code untouched: "unnecessary COW" might
        easily break existing setups because hugetlb pages are a scarce resource
        and we could just end up having to crash the application when we run out
        of hugetlb pages. We have to be very careful and the security aspect with
        hugetlb is most certainly less relevant than for unprivileged anon pages.
      * Instead of lru_add_drain() we might actually just drain the lru_add list
        or even just remove the single page of interest from the lru_add list.
        This would require a new helper function, and could be added if the
        conditional lru_add_drain() turns out to be a problem.
      * I extended the test case already included in [1] to also test for the
        newly found do_swap_page() case. I'll send that out separately once/if
        this part was merged.
      
      [1] https://lkml.kernel.org/r/20211217113049.23850-1-david@redhat.com
      [2] https://lore.kernel.org/r/3ae33b08-d9ef-f846-56fb-645e3b9b4c66@redhat.com
      
      This patch (of 9):
      
      Liang Zhang reported [1] that the current COW logic in do_wp_page() is
      sub-optimal when it comes to swap+read fault+write fault of anonymous
      pages that have a single user, visible via a performance degradation in
      the redis benchmark.  Something similar was previously reported [2] by
      Nadav with a simple reproducer.
      
      After we put an anon page into the swapcache and unmapped it from a single
      process, that process might read that page again and refault it read-only.
      If that process then writes to that page, the process is actually the
      exclusive user of the page, however, the COW logic in do_wp_page() won't
      be able to reuse it due to the additional reference from the swapcache.
      
      Let's optimize for pages that have been added to the swapcache but only
      have an exclusive user.  Try removing the swapcache reference if there is
      hope that we're the exclusive user.
      
      We will fail removing the swapcache reference in two scenarios:
      (1) There are additional swap entries referencing the page: copying
          instead of reusing is the right thing to do.
      (2) The page is under writeback: theoretically we might be able to reuse
          in some cases, however, we cannot remove the additional reference
          and will have to copy.
      
      Note that we'll only try removing the page from the swapcache when it's
      highly likely that we'll be the exclusive owner after removing the page
      from the swapcache.  As we're about to map that page writable and redirty
      it, that should not affect reclaim but is rather the right thing to do.
      
      Further, we might have additional references from the LRU pagevecs, which
      will force us to copy instead of being able to reuse.  We'll try handling
      such references for some scenarios next.  Concurrent writeback cannot be
      handled easily and we'll always have to copy.
      
      While at it, remove the superfluous page_mapcount() check: it's
      implicitly covered by the page_count() for ordinary anon pages.
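      As a rough sketch of the idea (assuming the v5.18-era APIs; the helper
      name is made up and error paths are simplified), the reuse path can try
      to drop the swapcache reference under the page lock so that the
      page_count() check succeeds for an exclusive user:

          /* Hedged sketch: the swapcache part of the do_wp_page() reuse path. */
          static vm_fault_t wp_try_reuse_swapcache_page(struct vm_fault *vmf,
                                                        struct page *page)
          {
                  if (PageSwapCache(page) && trylock_page(page)) {
                          /* Fails if the page is still under writeback. */
                          try_to_free_swap(page);

                          if (page_count(page) == 1) {
                                  /* Exclusive user: reuse, map writable. */
                                  unlock_page(page);
                                  wp_page_reuse(vmf);
                                  return VM_FAULT_WRITE;
                          }
                          unlock_page(page);
                  }
                  return 0;       /* caller falls back to copying */
          }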
      
      [1] https://lkml.kernel.org/r/20220113140318.11117-1-zhangliang5@huawei.com
      [2] https://lkml.kernel.org/r/0480D692-D9B2-429A-9A88-9BBA1331AC3A@gmail.com
      
      Link: https://lkml.kernel.org/r/20220131162940.210846-2-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reported-by: Liang Zhang <zhangliang5@huawei.com>
      Reported-by: Nadav Amit <nadav.amit@gmail.com>
      Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Jann Horn <jannh@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Rik van Riel <riel@surriel.com>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Don Dutile <ddutile@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      mm/huge_memory: make is_transparent_hugepage() static · 562beb72
      Miaohe Lin authored
      
      
      It's only used inside huge_memory.c now. Don't export it and make
      it static. We can thus reduce the size of huge_memory.o a bit.
      
      Without this patch:
         text	   data	    bss	    dec	    hex	filename
        32319	   2965	      4	  35288	   89d8	mm/huge_memory.o
      
      With this patch:
         text	   data	    bss	    dec	    hex	filename
        32042	   2957	      4	  35003	   88bb	mm/huge_memory.o
      
      Link: https://lkml.kernel.org/r/20220302082145.12028-1-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Reviewed-by: Yang Shi <shy828301@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: William Kucharski <william.kucharski@oracle.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      userfaultfd/selftests: enable hugetlb remap and remove event testing · 9ae8f2b8
      Mike Kravetz authored
      
      
      With MADV_DONTNEED support added to hugetlb mappings, mremap testing can
      also be enabled for hugetlb.
      
      Modify the tests to use madvise MADV_DONTNEED and MADV_REMOVE instead of
      fallocate hole punch for releasing hugetlb pages.
      
      Link: https://lkml.kernel.org/r/20220215002348.128823-4-mike.kravetz@oracle.com
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Shuah Khan <skhan@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      selftests/vm: add hugetlb madvise MADV_DONTNEED MADV_REMOVE test · c4b6cb88
      Mike Kravetz authored
      
      
      Now that MADV_DONTNEED support for hugetlb is enabled, add corresponding
      tests.  MADV_REMOVE has been enabled for some time, but no tests exist so
      add them as well.
      
      Link: https://lkml.kernel.org/r/20220215002348.128823-3-mike.kravetz@oracle.com
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
      Cc: Peter Xu <peterx@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      mm: enable MADV_DONTNEED for hugetlb mappings · 90e7e7f5
      Mike Kravetz authored
      Patch series "Add hugetlb MADV_DONTNEED support", v3.
      
      Userfaultfd selftests for hugetlb does not perform UFFD_EVENT_REMAP
      testing.  However, mremap support was recently added in commit 550a7d60
      ("mm, hugepages: add mremap() support for hugepage backed vma").  While
      attempting to enable mremap support in the test, it was
      discovered that the mremap test indirectly depends on MADV_DONTNEED.
      
      madvise does not allow MADV_DONTNEED for hugetlb mappings.  However, that
      is primarily due to the check in can_madv_lru_vma().  By simply removing
      the check and adding huge page alignment, MADV_DONTNEED can be made to
      work for hugetlb mappings.
      
      Do note that there is no compelling use case for adding this support.
      This was discussed in the RFC [1].  However, adding support makes sense as
      it is fairly trivial and brings hugetlb functionality more in line with
      'normal' memory.
      
      After enabling support, add selftest for MADV_DONTNEED as well as
      MADV_REMOVE.  Then update userfaultfd selftest.
      
      If the new functionality is accepted, the madvise man page will be updated to
      indicate hugetlb is supported.  It will also be updated to clarify what
      happens to the passed length argument.
      
      This patch (of 3):
      
      MADV_DONTNEED is currently disabled for hugetlb mappings.  This certainly
      makes sense in shared file mappings as the pagecache maintains a reference
      to the page and it will never be freed.  However, it could be useful to
      unmap and free pages in private mappings.  In addition, userfaultfd minor
      fault users may be able to simplify code by using MADV_DONTNEED.
      
      The primary thing preventing MADV_DONTNEED from working on hugetlb
      mappings is a check in can_madv_lru_vma().  To allow support for hugetlb
      mappings create and use a new routine madvise_dontneed_free_valid_vma()
      that allows hugetlb mappings in this specific case.
      
      For normal mappings, madvise requires the start address be PAGE aligned
      and rounds up length to the next multiple of PAGE_SIZE.  Do similarly for
      hugetlb mappings: require start address be huge page size aligned and
      round up length to the next multiple of huge page size.  Use the new
      madvise_dontneed_free_valid_vma routine to check alignment and round up
      length/end.  zap_page_range requires this alignment for hugetlb vmas
      otherwise we will hit BUGs.
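      For illustration, a minimal userspace example of the new behaviour,
      assuming a kernel with this change and at least one free 2 MiB hugetlb
      page (e.g. reserved via /proc/sys/vm/nr_hugepages); start and length
      are hugepage aligned as required:

          #define _GNU_SOURCE
          #include <stdio.h>
          #include <string.h>
          #include <sys/mman.h>

          #define HPAGE_SIZE (2UL * 1024 * 1024)   /* assumes 2 MiB hugepages */

          int main(void)
          {
                  /* Private anonymous hugetlb mapping of one huge page. */
                  char *p = mmap(NULL, HPAGE_SIZE, PROT_READ | PROT_WRITE,
                                 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

                  if (p == MAP_FAILED) {
                          perror("mmap");
                          return 1;
                  }

                  memset(p, 0xab, HPAGE_SIZE);     /* fault the huge page in */

                  /* Discard it again; start/length must be huge page aligned. */
                  if (madvise(p, HPAGE_SIZE, MADV_DONTNEED))
                          perror("madvise(MADV_DONTNEED)");

                  munmap(p, HPAGE_SIZE);
                  return 0;
          }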
      
      Link: https://lkml.kernel.org/r/20220215002348.128823-1-mike.kravetz@oracle.com
      Link: https://lkml.kernel.org/r/20220215002348.128823-2-mike.kravetz@oracle.com
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Axel Rasmussen <axelrasmussen@google.com>
      Cc: Mina Almasry <almasrymina@google.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Shuah Khan <skhan@linuxfoundation.org>
      Cc: Mike Rapoport <rppt@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: disable LOCKDEP when printing reports · c32caa26
      Andrey Konovalov authored
      
      
      If LOCKDEP detects a bug while KASAN is printing a report and if
      panic_on_warn is set, KASAN will not be able to finish.  Disable LOCKDEP
      while KASAN is printing a report.
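      A minimal sketch of the approach, assuming the kernel's lockdep_off()/
      lockdep_on() helpers; the placement relative to the existing report
      locking is simplified here:

          static void start_report(unsigned long *flags, bool sync)
          {
                  /* ... take the report lock, update the KUnit status ... */
                  lockdep_off();  /* a LOCKDEP splat here would cut the report short */
          }

          static void end_report(unsigned long *flags, void *addr)
          {
                  lockdep_on();
                  /* ... handle panic_on_warn, release the report lock ... */
          }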
      
      See https://bugzilla.kernel.org/show_bug.cgi?id=202115 for an example
      of the issue.
      
      Link: https://lkml.kernel.org/r/c48a2a3288200b07e1788b77365c2f02784cfeb4.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: move and hide kasan_save_enable/restore_multi_shot · 80207910
      Andrey Konovalov authored
      
      
       - Move kasan_save_enable/restore_multi_shot() declarations to
         mm/kasan/kasan.h, as there is no need for them to be visible outside
         of KASAN implementation.
      
       - Only define and export these functions when KASAN tests are enabled.
      
       - Move their definitions closer to other test-related code in report.c.
      
      Link: https://lkml.kernel.org/r/6ba637333b78447f027d775f2d55ab1a40f63c99.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: reorder reporting functions · 865bfa28
      Andrey Konovalov authored
      
      
      Move print_error_description()'s, report_suppressed()'s, and
      report_enabled()'s definitions to improve the logical order of function
      definitions in report.c.
      
      No functional changes.
      
      Link: https://lkml.kernel.org/r/82aa926c411e00e76e97e645a551ede9ed0c5e79.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: respect KASAN_BIT_REPORTED in all reporting routines · c068664c
      Andrey Konovalov authored
      
      
      Currently, only kasan_report() checks the KASAN_BIT_REPORTED and
      KASAN_BIT_MULTI_SHOT flags.
      
      Make other reporting routines check these flags as well.
      
      Also add explanatory comments.
      
      Note that the current->kasan_depth check is split out into
      report_suppressed() and only called for kasan_report().
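      A minimal sketch of the shared gate, assuming the existing kasan_flags
      bitmask and the KASAN_BIT_REPORTED/KASAN_BIT_MULTI_SHOT bits; the
      current->kasan_depth handling stays specific to kasan_report():

          /* Hedged sketch: returns true if this report should be dropped. */
          static bool report_suppressed(void)
          {
                  if (test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags))
                          return false;   /* multi-shot: report everything */

                  /* Single-shot: only the first report goes through. */
                  return test_and_set_bit(KASAN_BIT_REPORTED, &kasan_flags);
          }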
      
      Link: https://lkml.kernel.org/r/715e346b10b398e29ba1b425299dcd79e29d58ce.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: add comment about UACCESS regions to kasan_report · 795b760f
      Andrey Konovalov authored
      
      
      Add a comment explaining why kasan_report() is the only reporting function
      that uses user_access_save/restore().
      
      Link: https://lkml.kernel.org/r/1201ca3c2be42c7bd077c53d2e46f4a51dd1476a.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: rename kasan_access_info to kasan_report_info · c965cdd6
      Andrey Konovalov authored
      
      
      Rename kasan_access_info to kasan_report_info, as the latter name better
      reflects the struct's purpose.
      
      Link: https://lkml.kernel.org/r/158a4219a5d356901d017352558c989533a0782c.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: move and simplify kasan_report_async · bb2f967c
      Andrey Konovalov authored
      
      
      Place kasan_report_async() next to the other main reporting routines.
      Also simplify printed information.
      
      Link: https://lkml.kernel.org/r/52d942ef3ffd29bdfa225bbe8e327bc5bda7ab09.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: call print_report from kasan_report_invalid_free · 31c65110
      Andrey Konovalov authored
      
      
      Call print_report() in kasan_report_invalid_free() instead of calling
      printing functions directly.  Compared to the existing implementation of
      kasan_report_invalid_free(), print_report() makes sure that the buggy
      address has metadata before printing it.
      
      The change requires adding a report type field into kasan_access_info and
      using it accordingly.
      
      kasan_report_async() is left as is, as using print_report() will only
      complicate the code.
      
      Link: https://lkml.kernel.org/r/9ea6f0604c5d2e1fb28d93dc6c44232c1f8017fe.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: merge __kasan_report into kasan_report · be8631a1
      Andrey Konovalov authored
      
      
      Merge __kasan_report() into kasan_report().  The code is simple enough to
      be readable without the __kasan_report() helper.
      
      Link: https://lkml.kernel.org/r/c8a125497ef82f7042b3795918dffb81a85a878e.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: restructure kasan_report · b3bb1d70
      Andrey Konovalov authored
      
      
      Restructure kasan_report() to make reviewing the subsequent patches
      easier.
      
      Link: https://lkml.kernel.org/r/ca28042889858b8cc4724d3d4378387f90d7a59d.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: simplify kasan_find_first_bad_addr call sites · b9132800
      Andrey Konovalov authored
      
      
      Move the addr_has_metadata() check into kasan_find_first_bad_addr().
      
      Link: https://lkml.kernel.org/r/a49576f7a23283d786ba61579cb0c5057e8f0b9b.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: split out print_report from __kasan_report · 9d7b7dd9
      Andrey Konovalov authored
      
      
      Split out the part of __kasan_report() that prints things into
      print_report().  One of the subsequent patches makes another error handler
      use print_report() as well.
      
      Includes lower-level changes:
      
       - Allow addr_has_metadata() to accept a tagged address.
      
       - Drop the const qualifier from the fields of kasan_access_info to
         avoid excessive type casts.
      
       - Change the type of the address argument of __kasan_report() and
         end_report() to void * to reduce the number of type casts.
      
      Link: https://lkml.kernel.org/r/9be3ed99dd24b9c4e1c4a848b69a0c6ecefd845e.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: move disable_trace_on_warning to start_report · 0a6e8a07
      Andrey Konovalov authored
      
      
      Move the disable_trace_on_warning() call, which enables the
      /proc/sys/kernel/traceoff_on_warning interface for KASAN bugs, to
      start_report(), so that it functions for all types of KASAN reports.
      
      Link: https://lkml.kernel.org/r/7c066c5de26234ad2cebdd931adfe437f8a95d58.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: move update_kunit_status to start_report · a260d281
      Andrey Konovalov authored
      
      
      Instead of duplicating calls to update_kunit_status() in every error
      report routine, call it once in start_report().  Pass the sync flag as an
      additional argument to start_report().
      
      Link: https://lkml.kernel.org/r/cae5c845a0b6f3c867014e53737cdac56b11edc7.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: check CONFIG_KASAN_KUNIT_TEST instead of CONFIG_KUNIT · 49d9977a
      Andrey Konovalov authored
      
      
      Check the more specific CONFIG_KASAN_KUNIT_TEST config option when
      defining things related to KUnit-compatible KASAN tests instead of
      CONFIG_KUNIT.
      
      Also put the kunit_kasan_status definition next to the definitions of other
      KASAN-related structs.
      
      Link: https://lkml.kernel.org/r/223592d38d2a601a160a3b2b3d5a9f9090350e62.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: simplify kasan_update_kunit_status() and call sites · 3784c299
      Andrey Konovalov authored
      
      
       - Rename kasan_update_kunit_status() to update_kunit_status() (the
         function is static).
      
       - Move the IS_ENABLED(CONFIG_KUNIT) to the function's definition
         instead of duplicating it at call sites.
      
       - Obtain and check current->kunit_test within the function.
      
      Link: https://lkml.kernel.org/r/dac26d811ae31856c3d7666de0b108a3735d962d.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: simplify async check in end_report() · 476b1dc2
      Andrey Konovalov authored
      
      
      Currently, end_report() does not call trace_error_report_end() for bugs
      detected in either async or asymm mode (when kasan_async_fault_possible()
      returns true), as the address of the bad access might be unknown.
      
      However, for asymm mode, the address is known for faults triggered by read
      operations.
      
      Instead of using kasan_async_fault_possible(), simply check that the addr
      is not NULL when calling trace_error_report_end().
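      A minimal sketch of the simplified check in end_report(), assuming the
      existing trace_error_report_end() tracepoint:

          /*
           * Hedged sketch: emit the end-of-report event whenever the faulting
           * address is known; async faults may legitimately pass NULL.
           */
          if (addr)
                  trace_error_report_end(ERROR_DETECTOR_KASAN,
                                         (unsigned long)addr);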
      
      Link: https://lkml.kernel.org/r/1c8ce43f97300300e62c941181afa2eb738965c5.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: print basic stack frame info for SW_TAGS · 1e0f611f
      Andrey Konovalov authored
      
      
      Software Tag-Based mode tags stack allocations when CONFIG_KASAN_STACK
      is enabled. Print task name and id in reports for stack-related bugs.
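      A minimal sketch of the added output, assuming current->comm and
      task_pid_nr(); the exact message format may differ:

          /* Hedged sketch: a stack allocation belongs to the faulting task. */
          pr_err("The buggy address belongs to stack of task %s/%d\n",
                 current->comm, task_pid_nr(current));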
      
      [andreyknvl@google.com: include linux/sched/task_stack.h]
        Link: https://lkml.kernel.org/r/d7598f11a34ed96e508f7640fa038662ed2305ec.1647099922.git.andreyknvl@google.com
      
      Link: https://lkml.kernel.org/r/029aaa87ceadde0702f3312a34697c9139c9fb53.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: improve stack frame info in reports · 16347c31
      Andrey Konovalov authored
      
      
       - Print at least task name and id for reports affecting allocas
         (get_address_stack_frame_info() does not support them).
      
       - Capitalize first letter of each sentence.
      
      Link: https://lkml.kernel.org/r/aa613f097c12f7b75efb17f2618ae00480fb4bc3.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: rearrange stack frame info in reports · 0f9b35f3
      Andrey Konovalov authored
      
      
       - Move printing stack frame info before printing page info.
      
       - Add object_is_on_stack() check to print_address_description() and add
         a corresponding WARNING to kasan_print_address_stack_frame(). This
         looks more in line with the rest of the checks in this function and
         also helps avoid complicating the code logic wrt line breaks.
      
       - Clean up comments related to get_address_stack_frame_info().
      
      Link: https://lkml.kernel.org/r/1ee113a4c111df97d168c820b527cda77a3cac40.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: more line breaks in reports · 038fd2b4
      Andrey Konovalov authored
      
      
      Add a line break after each part that describes the buggy address.
      Improves readability of reports.
      
      Link: https://lkml.kernel.org/r/8682c4558e533cd0f99bdb964ce2fe741f2a9212.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Marco Elver <elver@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: drop addr check from describe_object_addr · 7131c883
      Andrey Konovalov authored
      
      
      Patch series "kasan: report clean-ups and improvements".
      
      A number of clean-up patches for KASAN reporting code.  Most are
      non-functional and only improve readability.
      
      This patch (of 22):
      
      describe_object_addr() used to be called with NULL addr in the early days
      of KASAN.  This no longer happens, so drop the check.
      
      Link: https://lkml.kernel.org/r/cover.1646237226.git.andreyknvl@google.com
      Link: https://lkml.kernel.org/r/761f8e5a6ee040d665934d916a90afe9f322f745.1646237226.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Reviewed-by: Alexander Potapenko <glider@google.com>
      Cc: Marco Elver <elver@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: print virtual mapping info in reports · c056a364
      Andrey Konovalov authored
      
      
      Print virtual mapping range and its creator in reports affecting virtual
      mappings.
      
      Also get physical page pointer for such mappings, so page information gets
      printed as well.
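      A minimal sketch of the idea, assuming the existing vmalloc helpers
      (is_vmalloc_addr(), find_vm_area(), vmalloc_to_page()); the format
      string is illustrative:

          /* Hedged sketch: describe the vmalloc area the address falls into. */
          if (is_vmalloc_addr(addr)) {
                  struct vm_struct *va = find_vm_area(addr);

                  if (va) {
                          pr_err("The buggy address belongs to the virtual mapping at [%px, %px) created by: %pS\n",
                                 va->addr, va->addr + va->size, va->caller);

                          /* Grab the backing page so page info gets printed too. */
                          page = vmalloc_to_page(addr);
                  }
          }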
      
      Link: https://lkml.kernel.org/r/6ebb11210ae21253198e264d4bb0752c1fad67d7.1645548178.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: update function name in comments · 2dfd1bd9
      Peter Collingbourne authored
      
      
      The function kasan_global_oob was renamed to kasan_global_oob_right, but
      the comments referring to it were not updated.  Do so.
      
      Link: https://linux-review.googlesource.com/id/I20faa90126937bbee77d9d44709556c3dd4b40be
      Link: https://lkml.kernel.org/r/20220219012433.890941-1-pcc@google.com
      Signed-off-by: Peter Collingbourne <pcc@google.com>
      Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      mm/kasan: remove unnecessary CONFIG_KASAN option · 09eb911d
      tangmeng authored
      
      
      mm/Makefile already has:
      
        obj-$(CONFIG_KASAN)     += kasan/
      
      So we don't need 'obj-$(CONFIG_KASAN) :=' in mm/kasan/Makefile; delete it
      from there.
      
      Link: https://lkml.kernel.org/r/20220221065421.20689-1-tangmeng@uniontech.com
      Signed-off-by: tangmeng <tangmeng@uniontech.com>
      Reviewed-by: Marco Elver <elver@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Konovalov <andreyknvl@gmail.com>
      Cc: Dmitriy Vyukov <dvyukov@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: test: support async (again) and asymm modes for HW_TAGS · ed6d7444
      Andrey Konovalov authored
      Async mode support has already been implemented in commit e80a76aa
      ("kasan, arm64: tests supports for HW_TAGS async mode") but then got
      accidentally broken in commit 99734b53 ("kasan: detect false-positives
      in tests").
      
      Restore the changes removed by the latter patch and adapt them for asymm
      mode: add a sync_fault flag to kunit_kasan_expectation that only gets set
      if the MTE fault was synchronous, and reenable MTE on such faults in
      tests.
      
      Also rename kunit_kasan_expectation to kunit_kasan_status and move its
      definition to mm/kasan/kasan.h from include/linux/kasan.h, as this
      structure is only internally used by KASAN.  Also put the structure
      definition under IS_ENABLED(CONFIG_KUNIT).
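      A minimal sketch of the resulting structure, as described above (field
      names may differ slightly from the final code):

          #if IS_ENABLED(CONFIG_KUNIT)
          /* Internal to KASAN now; lives in mm/kasan/kasan.h. */
          struct kunit_kasan_status {
                  bool report_found;      /* a report was produced           */
                  bool sync_fault;        /* only set for synchronous faults */
          };
          #endif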
      
      Link: https://lkml.kernel.org/r/133970562ccacc93ba19d754012c562351d4a8c8.1645033139.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Cc: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: improve vmalloc tests · 1a2473f0
      Andrey Konovalov authored
      
      
      Update the existing vmalloc_oob() test to account for the specifics of the
      tag-based modes.  Also add a few new checks and comments.
      
      Add new vmalloc-related tests:
      
       - vmalloc_helpers_tags() to check that exported vmalloc helpers can
         handle tagged pointers.
      
       - vmap_tags() to check that SW_TAGS mode properly tags vmap() mappings.
      
       - vm_map_ram_tags() to check that SW_TAGS mode properly tags
         vm_map_ram() mappings.
      
       - vmalloc_percpu() to check that SW_TAGS mode tags regions allocated
         for __alloc_percpu(). The tagging of per-cpu mappings is best-effort;
         proper tagging is tracked in [1].
      
      [1] https://bugzilla.kernel.org/show_bug.cgi?id=215019
      
      [sfr@canb.auug.org.au: similar to "kasan: test: fix compatibility with FORTIFY_SOURCE"]
        Link: https://lkml.kernel.org/r/20220128144801.73f5ced0@canb.auug.org.au
        Link: https://lkml.kernel.org/r/865c91ba49b90623ab50c7526b79ccb955f544f0.1644950160.git.andreyknvl@google.com
      [andreyknvl@google.com: set_memory_rw/ro() are not exported to modules]
        Link: https://lkml.kernel.org/r/019ac41602e0c4a7dfe96dc8158a95097c2b2ebd.1645554036.git.andreyknvl@google.com
      [akpm@linux-foundation.org: fix build]
      
      Cc: Andrey Konovalov <andreyknvl@gmail.com>
      [andreyknvl@google.com: vmap_tags() and vm_map_ram_tags() pass invalid page array size]
      Link: https://lkml.kernel.org/r/bbdc1c0501c5275e7f26fdb8e2a7b14a40a9f36b.1643047180.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: documentation updates · 8479d7b5
      Andrey Konovalov authored
      Update KASAN documentation:
      
       - Bump Clang version requirement for HW_TAGS as ARM64_MTE depends on
         AS_HAS_LSE_ATOMICS as of commit 2decad92 ("arm64: mte: Ensure
         TIF_MTE_ASYNC_FAULT is set atomically"), which requires Clang 12.
      
       - Add description of the new kasan.vmalloc command line flag.
      
       - Mention that SW_TAGS and HW_TAGS modes now support vmalloc tagging.
      
       - Explicitly say that the "Shadow memory" section is only applicable to
         software KASAN modes.
      
       - Mention that shadow-based KASAN_VMALLOC is supported on arm64.
      
      Link: https://lkml.kernel.org/r/a61189128fa3f9fbcfd9884ff653d401864b8e74.1643047180.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Acked-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      arm64: select KASAN_VMALLOC for SW/HW_TAGS modes · f6f37d93
      Andrey Konovalov authored
      Generic KASAN already selects KASAN_VMALLOC to allow VMAP_STACK to be
      selected unconditionally, see commit acc3042d ("arm64: Kconfig: select
      KASAN_VMALLOC if KANSAN_GENERIC is enabled").
      
      The same change is needed for SW_TAGS KASAN.
      
      HW_TAGS KASAN does not require enabling KASAN_VMALLOC for VMAP_STACK, they
      already work together as is.  Still, selecting KASAN_VMALLOC makes sense
      to keep vmalloc() always protected.  In case any bugs in KASAN's
      vmalloc() support are discovered, the command line kasan.vmalloc flag can
      be used to disable vmalloc() checking.
      
      Select KASAN_VMALLOC for all KASAN modes for arm64.
      
      Link: https://lkml.kernel.org/r/99d6b3ebf57fc1930ff71f9a4a71eea19881b270.1643047180.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Acked-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: allow enabling KASAN_VMALLOC and SW/HW_TAGS · fbefb423
      Andrey Konovalov authored
      
      
      Allow enabling CONFIG_KASAN_VMALLOC with SW_TAGS and HW_TAGS KASAN modes.
      
      Also adjust CONFIG_KASAN_VMALLOC description:
      
       - Mention HW_TAGS support.
      
       - Remove unneeded internal details: they have no place in Kconfig
         description and are already explained in the documentation.
      
      Link: https://lkml.kernel.org/r/bfa0fdedfe25f65e5caa4e410f074ddbac7a0b59.1643047180.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Acked-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: add kasan.vmalloc command line flag · 551b2bcb
      Andrey Konovalov authored
      
      
      Allow disabling vmalloc() tagging for HW_TAGS KASAN via a kasan.vmalloc
      command line switch.
      
      This is a fail-safe switch intended for production systems that enable
      HW_TAGS KASAN.  In case vmalloc() tagging ends up having an issue not
      detected during testing but that manifests in production, kasan.vmalloc
      allows turning vmalloc() tagging off while leaving page_alloc/slab
      tagging on.
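      A minimal sketch of the boot-parameter plumbing, assuming the usual
      early_param() pattern used by the other kasan.* flags; the enum and
      helper names here are illustrative:

          enum kasan_arg_vmalloc {
                  KASAN_ARG_VMALLOC_DEFAULT,
                  KASAN_ARG_VMALLOC_OFF,
                  KASAN_ARG_VMALLOC_ON,
          };

          static enum kasan_arg_vmalloc kasan_arg_vmalloc __ro_after_init;

          /* kasan.vmalloc=off disables vmalloc() tagging only. */
          static int __init early_kasan_flag_vmalloc(char *arg)
          {
                  if (!arg)
                          return -EINVAL;

                  if (!strcmp(arg, "off"))
                          kasan_arg_vmalloc = KASAN_ARG_VMALLOC_OFF;
                  else if (!strcmp(arg, "on"))
                          kasan_arg_vmalloc = KASAN_ARG_VMALLOC_ON;
                  else
                          return -EINVAL;

                  return 0;
          }
          early_param("kasan.vmalloc", early_kasan_flag_vmalloc);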
      
      Link: https://lkml.kernel.org/r/904f6d4dfa94870cc5fc2660809e093fd0d27c3b.1643047180.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Acked-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

      kasan: clean up feature flags for HW_TAGS mode · 241944d1
      Andrey Konovalov authored
      
      
       - Untie kasan_init_hw_tags() code from the default values of
         kasan_arg_mode and kasan_arg_stacktrace.
      
       - Move static_branch_enable(&kasan_flag_enabled) to the end of
         kasan_init_hw_tags_cpu().
      
       - Remove excessive comments in kasan_arg_mode switch.
      
       - Add new comments.
      
      Link: https://lkml.kernel.org/r/76ebb340265be57a218564a497e1f52ff36a3879.1643047180.git.andreyknvl@google.com
      Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
      Acked-by: Marco Elver <elver@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Evgenii Stepanov <eugenis@google.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Collingbourne <pcc@google.com>
      Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>