  May 28, 2022
    • mm: fix is_pinnable_page against a cma page · 1c563432
      Minchan Kim authored
      
      
      Pages in the CMA area can have MIGRATE_ISOLATE as well as MIGRATE_CMA, so
      the current is_pinnable_page() can miss CMA pages whose pageblock is
      MIGRATE_ISOLATE.  Such CMA pages then end up long-term pinned by the
      pin_user_pages() API, so CMA allocations keep failing until the pin is
      released.
      
           CPU 0                                   CPU 1 - Task B
      
      cma_alloc
      alloc_contig_range
                                              pin_user_pages_fast(FOLL_LONGTERM)
      change pageblock as MIGRATE_ISOLATE
                                              internal_get_user_pages_fast
                                              lockless_pages_from_mm
                                              gup_pte_range
                                              try_grab_folio
                                              is_pinnable_page
                                                return true;
                                              So, pinned the page successfully.
      page migration failure with pinned page
                                              ..
                                              .. After 30 sec
                                              unpin_user_page(page)
      
      CMA allocation succeeded after 30 sec.
      
      The CMA allocation path protects against the migration-type change race
      with zone->lock, but all the GUP path needs to know is whether the page is
      in the CMA area, not its exact migration type.  Thus we don't need
      zone->lock; it is enough to check whether the migration type is either
      MIGRATE_ISOLATE or MIGRATE_CMA.
      
      Adding the MIGRATE_ISOLATE check to is_pinnable_page() could reject pinning
      of pages on MIGRATE_ISOLATE pageblocks that are in neither the CMA area nor
      a movable zone, if the page is temporarily unmovable.  However, a migration
      failure caused by an unexpectedly held temporary refcount is a general
      issue, not one specific to MIGRATE_ISOLATE, and MIGRATE_ISOLATE is itself a
      transient state, just like other temporarily elevated refcount problems.
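
      A minimal sketch of the resulting check (simplified, not the exact hunk;
      the real helper also special-cases the zero page):

          static inline bool is_pinnable_page(struct page *page)
          {
          #ifdef CONFIG_CMA
                  /*
                   * Read the pageblock migration type without zone->lock: pages
                   * in the CMA area may be either MIGRATE_CMA or, while
                   * alloc_contig_range() is isolating them, MIGRATE_ISOLATE.
                   */
                  int mt = get_pageblock_migratetype(page);

                  if (mt == MIGRATE_ISOLATE || mt == MIGRATE_CMA)
                          return false;
          #endif
                  return !is_zone_movable_page(page);
          }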
      
      Link: https://lkml.kernel.org/r/20220524171525.976723-1-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: John Hubbard <jhubbard@nvidia.com>
      Acked-by: Paul E. McKenney <paulmck@kernel.org>
      Cc: David Hildenbrand <david@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: filter out swapin error entry in shmem mapping · ba6851b4
      Miaohe Lin authored
      
      
      There might be swapin error entries in the shmem mapping.  Filter them out
      to avoid "Bad swap file entry" complaints.
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-6-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/shmem: fix infinite loop when swap in shmem error at swapoff time · 6cec2b95
      Miaohe Lin authored
      
      
      When a shmem swapin error occurs at swapoff time, there is an infinite loop
      in the while loop in shmem_unuse_inode().  This is because the swapin error
      is currently deliberately ignored, so info->swapped never reaches 0 and we
      can't escape the loop in shmem_unuse().
      
      To fix the issue, a swapin error entry is stored in the mapping when a
      swapin error occurs.  The swapcache page can then be freed and the user
      won't end up with a swap device that can never be swapped off just because
      a sector is bad.  If the page is accessed later, the user process is killed
      so that corrupted data is never consumed; if the page is never accessed,
      the user won't even notice it.
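
      A hedged sketch of the idea (locking and error handling omitted;
      make_swapin_error_entry() is the special-entry constructor used by this
      series):

          /* On swapin failure, replace the swap entry stored in the shmem
           * mapping with a swapin error entry so the swap slot and the
           * swapcache page can be released. */
          swp_entry_t err = make_swapin_error_entry(page);
          void *old = xa_cmpxchg_irq(&mapping->i_pages, index,
                                     swp_to_radix_entry(swap),
                                     swp_to_radix_entry(err), 0);
          if (old == swp_to_radix_entry(swap))
                  swap_free(swap);   /* the mapping no longer references the slot */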
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-5-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reported-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/madvise: free hwpoison and swapin error entry in madvise_free_pte_range · 7b49514f
      Miaohe Lin authored
      
      
      Once the MADV_FREE operation has succeeded, callers can expect they might
      get zero-fill pages if accessing the memory again.  Therefore it should be
      safe to delete the hwpoison entry and swapin error entry.  There is no
      reason to kill the process if it has called MADV_FREE on the range.
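
      A hedged sketch of the corresponding branch in madvise_free_pte_range()
      (simplified; is_swapin_error_entry() is the helper introduced by this
      series):

          entry = pte_to_swp_entry(ptent);
          if (!non_swap_entry(entry)) {
                  /* ordinary swap entry: freed as before */
          } else if (is_hwpoison_entry(entry) ||
                     is_swapin_error_entry(entry)) {
                  /* MADV_FREE was called on the range: just drop the marker pte */
                  pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
          }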
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-4-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Suggested-by: Alistair Popple <apopple@nvidia.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/swapfile: fix lost swap bits in unuse_pte() · 14a762dd
      Miaohe Lin authored
      
      
      This was found by code review only, not by any real report.
      
      When we turn off swapping, the bits stored in the swap ptes could be lost.
      The new rmap-exclusive bit is fine since it is turned into a page flag, but
      the soft-dirty and uffd-wp bits are not preserved.  Add them.
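
      A hedged sketch of the fix in unuse_pte() (not the exact hunk):

          pte_t new_pte = pte_mkold(mk_pte(page, vma->vm_page_prot));

          /* Carry soft-dirty and uffd-wp over from the swap pte. */
          if (pte_swp_soft_dirty(*pte))
                  new_pte = pte_mksoft_dirty(new_pte);
          if (pte_swp_uffd_wp(*pte))
                  new_pte = pte_mkuffd_wp(new_pte);
          set_pte_at(vma->vm_mm, addr, pte, new_pte);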
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-3-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Suggested-by: Peter Xu <peterx@redhat.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/swapfile: unuse_pte can map random data if swap read fails · 9f186f9e
      Miaohe Lin authored
      
      
      Patch series "A few fixup patches for mm", v4.
      
      This series contains a few patches to avoid mapping random data if a swap
      read fails and to fix lost swap bits in unuse_pte().  We also free hwpoison
      and swapin error entries in madvise_free_pte_range(), and so on.  More
      details can be found in the respective changelogs.
      
      
      This patch (of 5):
      
      There is a bug in unuse_pte(): when the swap page happens to be unreadable,
      a page filled with random data is mapped into the user address space.  In
      case of such an error, install a special swap entry indicating that the
      swap read failed into the page table.  The swapcache page can then be freed
      and the user won't end up with a swap device that can never be swapped off
      just because a sector is bad.  If the page is accessed later, the user
      process is killed so that corrupted data is never consumed; if the page is
      never accessed, the user won't even notice it.
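
      A hedged sketch of the idea in unuse_pte() (simplified;
      make_swapin_error_entry() is the entry constructor this patch adds):

          if (unlikely(!PageUptodate(page))) {
                  /* The swap read failed: install a swapin error entry
                   * instead of mapping the page's (random) contents. */
                  pte_t pteval = swp_entry_to_pte(make_swapin_error_entry(page));

                  dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
                  set_pte_at(vma->vm_mm, addr, pte, pteval);
                  swap_free(entry);
          }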
      
      Link: https://lkml.kernel.org/r/20220519125030.21486-1-linmiaohe@huawei.com
      Link: https://lkml.kernel.org/r/20220519125030.21486-2-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Howells <dhowells@redhat.com>
      Cc: NeilBrown <neilb@suse.de>
      Cc: Alistair Popple <apopple@nvidia.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Ralph Campbell <rcampbell@nvidia.com>
      Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: memcg: factor out common parts of memory.{low,min} tests · f079a020
      Michal Koutný authored
      
      
      The memory protection test setup and runtime are almost identical for the
      memory.low and memory.min cases.
      
      This makes modifying the common parts prone to mistakes.  Since the
      protections are similar not only in setup but also in principle, factor
      the common part out.
      
      Past differences between the tests:
      - a missing memory.min is fine (kept),
      - test_memcg_low protected orphaned pagecache (adapted to match
        test_memcg_min: the processes owning the protected memory are kept
        running).
      
      The evaluation in the two tests differs (OOM of the allocator vs. low
      events of the protégés); this difference is kept.
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-6-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: David Vernet <void@manifault.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: memcg: remove protection from top level memcg · 6a359190
      Michal Koutný authored
      
      
      The reclaim is triggered by a memory limit within the subtree, therefore
      the testcase does not need protection configured against external reclaim.
      
      Also, correct the respective comments.
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-5-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: David Vernet <void@manifault.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: memcg: adjust expected reclaim values of protected cgroups · f10b6e9a
      Michal Koutný authored
      
      
      The expected numbers are not easy to derive in closed form (mere protection
      ratios certainly do not apply), therefore use a simulation to obtain them.
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-4-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: David Vernet <void@manifault.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: memcg: expect no low events in unprotected sibling · 1d09069f
      Michal Koutný authored
      This is effectively a revert of commit cdc69458 ("cgroup: account for
      memory_recursiveprot in test_memcg_low()").  The test_memcg_low case will
      fail with memory_recursiveprot until this is resolved in the reclaim code.
      
      However, this patch preserves the existing helpers and variables for later
      use.
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-3-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Reviewed-by: David Vernet <void@manifault.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • selftests: memcg: fix compilation · ff3b72a5
      Michal Koutný authored
      Patch series "memcontrol selftests fixups", v2.
      
      Flushing out the patches to make the memcontrol selftests check the events
      behavior we had consensus about (test_memcg_low fails).
      
      (test_memcg_reclaim and test_memcg_swap_max fail for me now, but that was
      the case even before the refactoring.)
      
      The two bigger changes are:
      - adjustment of the protected values to make the tests succeed with the
        given tolerance,
      - both test_memcg_low and test_memcg_min now check protection of memory in
        populated cgroups (as per Documentation/admin-guide/cgroup-v2.rst,
        memory.min should not apply to empty cgroups, which is not the case
        currently; therefore I unified the tests around the populated case, which
        exposes more broken tests).
      
      
      This patch (of 5):
      
      This fixes mis-applied changes from commit 72b1e03a ("cgroup: account for
      memory_localevents in test_memcg_oom_group_leaf_events()").
      
      Link: https://lkml.kernel.org/r/20220518161859.21565-1-mkoutny@suse.com
      Link: https://lkml.kernel.org/r/20220518161859.21565-2-mkoutny@suse.com
      Signed-off-by: Michal Koutný <mkoutny@suse.com>
      Reviewed-by: David Vernet <void@manifault.com>
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Richard Palethorpe <rpalethorpe@suse.de>
      Cc: Shakeel Butt <shakeelb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: fix z3fold_page_migrate races with z3fold_map · 943fb61d
      Miaohe Lin authored
      Think about the below scenario:
      
      CPU1				CPU2
       z3fold_page_migrate		z3fold_map
        z3fold_page_trylock
        ...
        z3fold_page_unlock
        /* slots still points to old zhdr*/
      				 get_z3fold_header
      				  get slots from handle
      				  get old zhdr from slots
      				  z3fold_page_trylock
      				  return *old* zhdr
        encode_handle(new_zhdr, FIRST|LAST|MIDDLE)
        put_page(page) /* zhdr is freed! */
      				 but zhdr is still used by caller!
      
      z3fold_map() can thus map a freed z3fold page and lead to a use-after-free
      bug.  To fix it, add PAGE_MIGRATED to indicate that the z3fold page has
      been migrated and is about to be released, so that get_z3fold_header()
      won't return such a page.
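
      A hedged sketch of the idea (PAGE_MIGRATED is the flag added by this patch;
      surrounding locking is omitted):

          /* In z3fold_page_migrate(): mark the old page before handing it
           * over, while its lock is still held. */
          set_bit(PAGE_MIGRATED, &page->private);

          /* In get_z3fold_header(): refuse to hand out a migrated page. */
          if (test_bit(PAGE_MIGRATED, &page->private)) {
                  z3fold_page_unlock(zhdr);
                  zhdr = NULL;
          }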
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-10-linmiaohe@huawei.com
      Fixes: 1f862989 ("mm/z3fold.c: support page migration")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: fix z3fold_reclaim_page races with z3fold_free · 04094226
      Miaohe Lin authored
      Think about the below scenario:
      
      CPU1				CPU2
      z3fold_reclaim_page		z3fold_free
       spin_lock(&pool->lock)		 get_z3fold_header -- hold page_lock
       kref_get_unless_zero
      				 kref_put--zhdr->refcount can be 1 now
       !z3fold_page_trylock
        kref_put -- zhdr->refcount is 0 now
         release_z3fold_page
          WARN_ON(!list_empty(&zhdr->buddy)); -- we're on buddy now!
          spin_lock(&pool->lock); -- deadlock here!
      
      z3fold_reclaim_page() might race with z3fold_free() and lead to a pool lock
      deadlock and a zhdr buddy non-empty warning.  To fix this, defer taking the
      refcount until page_lock is held, just as __z3fold_alloc() does.  Note this
      has the side effect that reclaim is no longer aborted when it meets a
      z3fold page that is about to be released.
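
      A hedged sketch of the reordering in z3fold_reclaim_page() (mirroring
      __z3fold_alloc(); simplified):

          if (!z3fold_page_trylock(zhdr)) {
                  zhdr = NULL;
                  continue;       /* try the next page on the LRU */
          }
          if (!kref_get_unless_zero(&zhdr->refcount)) {
                  /* The page is about to be released: don't touch it. */
                  z3fold_page_unlock(zhdr);
                  zhdr = NULL;
                  continue;
          }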
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-9-linmiaohe@huawei.com
      Fixes: dcf5aedb ("z3fold: stricter locking and more careful reclaim")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: always clear PAGE_CLAIMED under z3fold page lock · 4a1c3839
      Miaohe Lin authored
      
      
      Think about the below race window:
      
      CPU1				CPU2
      z3fold_reclaim_page		z3fold_free
       test_and_set_bit PAGE_CLAIMED
       failed to reclaim page
       z3fold_page_lock(zhdr);
       add back to the lru list;
       z3fold_page_unlock(zhdr);
      				 get_z3fold_header
      				 page_claimed=test_and_set_bit PAGE_CLAIMED
      
       clear_bit(PAGE_CLAIMED, &page->private);
      
      				 if (!page_claimed) /* it's false true */
      				  free_handle is not called
      
      free_handle() won't be called in this case, so the z3fold_buddy_slots will
      leak.  Fix it by always clearing PAGE_CLAIMED under the z3fold page lock.
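
      A hedged sketch of the rule this patch enforces (simplified):

          /* PAGE_CLAIMED is only cleared while the z3fold page lock is held,
           * so a concurrent z3fold_free() cannot observe a stale claim. */
          z3fold_page_lock(zhdr);
          clear_bit(PAGE_CLAIMED, &page->private);
          z3fold_page_unlock(zhdr);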
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-8-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: put z3fold page back into unbuddied list when reclaim or migration fails · 6cf9a349
      Miaohe Lin authored
      
      
      When doing z3fold page reclaim or migration, the page is removed from the
      unbuddied list.  If reclaim or migration succeeds, that's fine, as the page
      is released.  But if it fails, the page is currently not put back onto the
      unbuddied list, so the page leaks until the next compaction work, reclaim
      or migration touches it.
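
      A hedged sketch of the failure path (add_to_unbuddied() is z3fold's
      existing helper; the exact call sites are simplified):

          /* Reclaim/migration failed: put the page back onto the unbuddied
           * list so it can be used for allocations again. */
          add_to_unbuddied(pool, zhdr);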
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-7-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • revert "mm/z3fold.c: allow __GFP_HIGHMEM in z3fold_alloc" · f4bad643
      Miaohe Lin authored
      Revert commit f1549cb5 ("mm/z3fold.c: allow __GFP_HIGHMEM in
      z3fold_alloc").
      
      z3fold can't support GFP_HIGHMEM pages now: page_address() is used directly
      everywhere, and the z3fold_header lives on a per-CPU unbuddied list which
      can be accessed at any time.  So remove the support for GFP_HIGHMEM
      allocations from z3fold.
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-6-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: throw warning on failure of trylock_page in z3fold_alloc · 2c0f3514
      Miaohe Lin authored
      
      
      If trylock_page() fails, the page won't be marked as a non-LRU movable
      page.  When such a page is later freed via free_z3fold_page(), it triggers
      a BUG on the PageMovable check in __ClearPageMovable().  Throw a warning on
      trylock_page() failure to guard against this rare case, just as zsmalloc
      does.
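
      A hedged sketch of the allocation path in z3fold_alloc() (modeled on
      zsmalloc; simplified):

          /* A freshly allocated page should never be locked by anyone else,
           * so a trylock failure here indicates a bug. */
          WARN_ON(!trylock_page(page));
          __SetPageMovable(page, pool->inode->i_mapping);
          unlock_page(page);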
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-5-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: remove buggy use of stale list for allocation · df6f0f1d
      Miaohe Lin authored
      
      
      Currently, if z3fold can't find an unbuddied page, it first tries to pull a
      page off the stale list.  But this approach is problematic.  If
      initializing the z3fold page fails later, the page should be freed via
      free_z3fold_page() to clean up the relevant resources instead of calling
      __free_page() directly.  And if the page is successfully reused, it will
      hit a BUG_ON later in __SetPageMovable() because it is already a non-LRU
      movable page, i.e.  PAGE_MAPPING_MOVABLE is already set in page->mapping.
      To fix all of these issues, simply remove the buggy use of the stale list
      for allocation: can_sleep should always be false there, and the reuse code
      path is never really hit now.
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-4-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: fix possible null pointer dereferencing · 7c61c35b
      Miaohe Lin authored
      alloc_slots() could fail to allocate memory under heavy memory pressure, so
      check zhdr->slots against NULL to avoid a future NULL pointer dereference.
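
      A hedged sketch of the check (simplified; placed where the z3fold header
      is initialized):

          slots = alloc_slots(pool, gfp);
          if (!slots)
                  return NULL;    /* caller frees the page and fails gracefully */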
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-3-linmiaohe@huawei.com
      Fixes: fc548865 ("z3fold: simplify freeing slots")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/z3fold: fix sheduling while atomic · 4c6bdb36
      Miaohe Lin authored
      Patch series "A few fixup patches for z3fold".
      
      This series contains a few fixup patches to fix scheduling while atomic,
      fix a possible NULL pointer dereference, fix various race conditions and
      so on.  More details can be found in the respective changelogs.
      
      
      This patch (of 9):
      
      z3fold's page_lock is always held when calling alloc_slots(), so gfp should
      be GFP_ATOMIC to avoid a "scheduling while atomic" bug.
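
      A hedged sketch of the change (illustrative only):

          /* page_lock is held here, so the slots allocation must not sleep. */
          zhdr->slots = alloc_slots(pool, GFP_ATOMIC);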
      
      Link: https://lkml.kernel.org/r/20220429064051.61552-1-linmiaohe@huawei.com
      Link: https://lkml.kernel.org/r/20220429064051.61552-2-linmiaohe@huawei.com
      Fixes: fc548865 ("z3fold: simplify freeing slots")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: split free page with properly free memory accounting and without race · 86d28b07
      Zi Yan authored
      In isolate_single_pageblock(), free pages are checked without holding the
      zone lock, but they can go away by the time split_free_page() takes the
      zone lock.  Check the free page and its order again in split_free_page()
      once the zone lock is held, and bail out if the page is gone or its order
      has changed.
      
      In addition, split_free_page() deleted the free page from the free list
      without updating the free page accounting.  Add the missing free page
      accounting code.
      
      Also fix the type of the order parameter in split_free_page().
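
      A hedged sketch of the recheck and accounting in split_free_page()
      (simplified, not the exact hunk):

          spin_lock_irqsave(&zone->lock, flags);

          /* The candidate was found without zone->lock; it may be gone now. */
          if (!PageBuddy(free_page) || buddy_order(free_page) != order) {
                  spin_unlock_irqrestore(&zone->lock, flags);
                  return false;
          }

          mt = get_pfnblock_migratetype(free_page, free_page_pfn);
          if (likely(!is_migrate_isolate(mt)))
                  __mod_zone_freepage_state(zone, -(1UL << order), mt);

          del_page_from_free_list(free_page, zone, order);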
      
      Link: https://lore.kernel.org/lkml/20220525103621.987185e2ca0079f7b97b856d@linux-foundation.org/
      Link: https://lkml.kernel.org/r/20220526231531.2404977-2-zi.yan@sent.com
      Fixes: b2c9e2fb ("mm: make alloc_contig_range work at pageblock granularity")
      Signed-off-by: Zi Yan <ziy@nvidia.com>
      Reported-by: Doug Berger <opendmb@gmail.com>
        Link: https://lore.kernel.org/linux-mm/c3932a6f-77fe-29f7-0c29-fe6b1c67ab7b@gmail.com/
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Qian Cai <quic_qiancai@quicinc.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Eric Ren <renzhengeek@gmail.com>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Michael Walle <michael@walle.cc>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm: page-isolation: skip isolated pageblock in start_isolate_page_range() · 9b209e55
      Zi Yan authored
      start_isolate_page_range() first isolates the first and the last pageblocks
      in the range and ensures pages across the range boundaries are split during
      isolation.  But it missed the case where the range is <= a pageblock and
      the first and the last pageblocks are the same one, in which case the
      second isolate_single_pageblock() always fails.  To fix it, skip the
      pageblock isolation in the second isolate_single_pageblock() call.
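
      A hedged sketch of the idea in start_isolate_page_range() (the
      skip_isolation parameter is the one added by this patch; simplified):

          bool skip_isolation = false;

          /* isolate the pageblock containing the start of the range */
          ret = isolate_single_pageblock(isolate_start, flags, gfp_flags,
                                         false, skip_isolation);
          if (ret)
                  return ret;

          /* if the range fits in one pageblock, it is already isolated */
          if (isolate_start == isolate_end - pageblock_nr_pages)
                  skip_isolation = true;

          /* isolate the last pageblock, skipping the isolation step if so */
          ret = isolate_single_pageblock(isolate_end, flags, gfp_flags,
                                         true, skip_isolation);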
      
      Link: https://lkml.kernel.org/r/20220526231531.2404977-1-zi.yan@sent.com
      Fixes: 88ee1343 ("mm: fix a potential infinite loop in start_isolate_page_range()")
      Signed-off-by: Zi Yan <ziy@nvidia.com>
      Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
        Link: https://lore.kernel.org/linux-mm/ac65adc0-a7e4-cdfe-a0d8-757195b86293@samsung.com/
      Reported-by: Michael Walle <michael@walle.cc>
      Tested-by: Michael Walle <michael@walle.cc>
        Link: https://lore.kernel.org/linux-mm/8ca048ca8b547e0dd1c95387ee05c23d@walle.cc/
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Doug Berger <opendmb@gmail.com>
      Cc: Eric Ren <renzhengeek@gmail.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Mike Rapoport <rppt@kernel.org>
      Cc: Oscar Salvador <osalvador@suse.de>
      Cc: Qian Cai <quic_qiancai@quicinc.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>