  1. Jul 30, 2022
    • zsmalloc: zs_malloc: return ERR_PTR on failure · c7e6f17b
      Hui Zhu authored
      
      
      zs_malloc returns 0 if it fails.  zs_zpool_malloc returns -1 when
      zs_malloc returns 0.  But -1 makes the return value unclear.

      For example, when zswap_frontswap_store calls zs_malloc through
      zs_zpool_malloc, it returns -1 to its caller, whereas its other return
      values are errno codes such as -EINVAL or -ENODEV.

      This commit changes zs_malloc to return ERR_PTR on failure.  It does not
      simply make zs_zpool_malloc return -ENOMEM because zs_malloc has two
      types of failure:

      - the size is not OK: return -EINVAL
      - memory allocation fails: return -ENOMEM
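
      A minimal userspace sketch of the idea, not the kernel code itself: the
      handle is an unsigned long, and a failure is encoded as a negative errno
      stored in the top of the value range, the way ERR_PTR does.  The names
      sketch_malloc, sketch_pool_malloc, sketch_is_err and the size limit
      SKETCH_MAX_SIZE are hypothetical and only stand in for zs_malloc and
      zs_zpool_malloc.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_ERRNO 4095UL   /* same bound the kernel uses for MAX_ERRNO */
#define SKETCH_MAX_SIZE  4096     /* hypothetical size limit for this sketch */

/* A handle in the top 4095 values of the range encodes -errno. */
static int sketch_is_err(unsigned long handle)
{
	return handle >= (unsigned long)-SKETCH_MAX_ERRNO;
}

/* Stand-in allocator: reports *why* it failed instead of returning 0. */
static unsigned long sketch_malloc(size_t size, int simulate_oom)
{
	if (size == 0 || size > SKETCH_MAX_SIZE)
		return (unsigned long)-EINVAL;   /* size is not OK */
	if (simulate_oom)
		return (unsigned long)-ENOMEM;   /* memory allocation failed */
	return 0x1000UL;                         /* fake, non-error handle */
}

/* Stand-in wrapper: forwards the errno instead of collapsing it to -1. */
static int sketch_pool_malloc(size_t size, int simulate_oom,
			      unsigned long *handle)
{
	unsigned long h = sketch_malloc(size, simulate_oom);

	if (sketch_is_err(h))
		return (int)(long)h;
	*handle = h;
	return 0;
}
```

      With this shape, a caller of sketch_pool_malloc can tell an invalid size
      from an out-of-memory condition, which a bare -1 cannot express.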
      
      Link: https://lkml.kernel.org/r/20220714080757.12161-1-teawater@gmail.com
      Signed-off-by: Hui Zhu <teawater@antgroup.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Nitin Gupta <ngupta@vflare.org>
      Cc: Sergey Senozhatsky <senozhatsky@chromium.org>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      c7e6f17b
    • writeback: remove inode_to_wb_is_valid() · fef3e906
      Xiu Jianfeng authored

      inode_to_wb_is_valid() is no longer used since commit fe55d563
      ("remove inode_congested()"), remove it.
      
      Link: https://lkml.kernel.org/r/20220714084147.140324-1-xiujianfeng@huawei.com
      Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com>
      Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      fef3e906
    • memblock,arm64: expand the static memblock memory table · 450d0e74
      Zhou Guanghui authored
      
      
      On a system using HBM (a Huawei Ascend ARM64 SoC), when a multi-bit ECC
      error occurs the BIOS marks the corresponding area (for example, 2 MB)
      as unusable.  On the next restart, these areas are either not reported
      or reported as EFI_UNUSABLE_MEMORY.  Both cases increase the number of
      memblocks, and the EFI_UNUSABLE_MEMORY case increases it more.
      
      For example, if the EFI_UNUSABLE_MEMORY type is reported:
      ...
      memory[0x92]    [0x0000200834a00000-0x0000200835bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
      memory[0x93]    [0x0000200835c00000-0x0000200835dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
      memory[0x94]    [0x0000200835e00000-0x00002008367fffff], 0x0000000000a00000 bytes on node 7 flags: 0x0
      memory[0x95]    [0x0000200836800000-0x00002008369fffff], 0x0000000000200000 bytes on node 7 flags: 0x4
      memory[0x96]    [0x0000200836a00000-0x0000200837bfffff], 0x0000000001200000 bytes on node 7 flags: 0x0
      memory[0x97]    [0x0000200837c00000-0x0000200837dfffff], 0x0000000000200000 bytes on node 7 flags: 0x4
      memory[0x98]    [0x0000200837e00000-0x000020087fffffff], 0x0000000048200000 bytes on node 7 flags: 0x0
      memory[0x99]    [0x0000200880000000-0x0000200bcfffffff], 0x0000000350000000 bytes on node 6 flags: 0x0
      memory[0x9a]    [0x0000200bd0000000-0x0000200bd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
      memory[0x9b]    [0x0000200bd0200000-0x0000200bd07fffff], 0x0000000000600000 bytes on node 6 flags: 0x0
      memory[0x9c]    [0x0000200bd0800000-0x0000200bd09fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
      memory[0x9d]    [0x0000200bd0a00000-0x0000200fcfffffff], 0x00000003ff600000 bytes on node 6 flags: 0x0
      memory[0x9e]    [0x0000200fd0000000-0x0000200fd01fffff], 0x0000000000200000 bytes on node 6 flags: 0x4
      memory[0x9f]    [0x0000200fd0200000-0x0000200fffffffff], 0x000000002fe00000 bytes on node 6 flags: 0x0
      ...
      
      The EFI memory map is parsed to construct the memblock arrays before the
      memblock arrays can be resized.  As a result, memory regions beyond
      INIT_MEMBLOCK_REGIONS are lost.

      Add a new macro INIT_MEMBLOCK_MEMORY_REGIONS to replace
      INIT_MEMBLOCK_REGIONS to define the size of the static memblock.memory
      array.

      Allow overriding the memblock.memory array size with an
      architecture-defined INIT_MEMBLOCK_MEMORY_REGIONS, and make arm64 set
      INIT_MEMBLOCK_MEMORY_REGIONS to 1024 when CONFIG_EFI is enabled.
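
      The override pattern described above can be sketched in plain C as
      follows.  This is an illustration only: SKETCH_CONFIG_EFI, the
      sketch_region type and the two helper functions are hypothetical, and
      the default of 128 for INIT_MEMBLOCK_REGIONS is assumed here rather than
      taken from this commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical switch standing in for CONFIG_EFI on arm64. */
#define SKETCH_CONFIG_EFI 1

/* "Architecture header": may define the override before the generic code. */
#if SKETCH_CONFIG_EFI
#define INIT_MEMBLOCK_MEMORY_REGIONS 1024
#endif

/* "Generic code": falls back to the common size if there is no override. */
#define INIT_MEMBLOCK_REGIONS 128   /* assumed generic default */
#ifndef INIT_MEMBLOCK_MEMORY_REGIONS
#define INIT_MEMBLOCK_MEMORY_REGIONS INIT_MEMBLOCK_REGIONS
#endif

struct sketch_region {
	unsigned long long base;
	unsigned long long size;
};

/* Only the memory array grows; the reserved array keeps the common size. */
static struct sketch_region sketch_memory[INIT_MEMBLOCK_MEMORY_REGIONS];
static struct sketch_region sketch_reserved[INIT_MEMBLOCK_REGIONS];

static size_t sketch_memory_capacity(void)
{
	return sizeof(sketch_memory) / sizeof(sketch_memory[0]);
}

static size_t sketch_reserved_capacity(void)
{
	return sizeof(sketch_reserved) / sizeof(sketch_reserved[0]);
}
```

      Setting SKETCH_CONFIG_EFI to 0 makes both arrays fall back to the same
      size, which mirrors the behaviour for architectures that do not define
      the override.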
      
      Link: https://lkml.kernel.org/r/20220615102742.96450-1-zhouguanghui1@huawei.com
      Signed-off-by: Zhou Guanghui <zhouguanghui1@huawei.com>
      Acked-by: Mike Rapoport <rppt@linux.ibm.com>
      Tested-by: Darren Hart <darren@os.amperecomputing.com>
      Acked-by: Will Deacon <will@kernel.org>		[arm64]
      Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
      Cc: Xu Qiang <xuqiang36@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      450d0e74
    • mm: remove obsolete comment in do_fault_around() · 0f0b6931
      Miaohe Lin authored

      Since commit 7267ec00 ("mm: postpone page table allocation until we
      have page to map"), do_fault_around is not called with page table lock
      held.  Clean up the corresponding comments.
      
      Link: https://lkml.kernel.org/r/20220716080359.38791-1-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      0f0b6931
    • mm: compaction: include compound page count for scanning in pageblock isolation · b717d6b9
      William Lam authored
      
      
      The number of scanned pages can be lower than the number of isolated pages
      when isolating migratable or free pageblocks.  The metric is
      reported in a trace event and also used in vmstat.

      Some example output from a trace where nr_taken can be greater
      than nr_scanned:
      
      Produced by kernel v5.19-rc6
      kcompactd0-42      [001] .....  1210.268022: mm_compaction_isolate_migratepages: range=(0x107ae4 ~ 0x107c00) nr_scanned=265 nr_taken=255
      [...]
      kcompactd0-42      [001] .....  1210.268382: mm_compaction_isolate_freepages: range=(0x215800 ~ 0x215a00) nr_scanned=13 nr_taken=128
      kcompactd0-42      [001] .....  1210.268383: mm_compaction_isolate_freepages: range=(0x215600 ~ 0x215680) nr_scanned=1 nr_taken=128
      
      mm_compaction_isolate_migratepages does not seem to have this
      behaviour, but for consistency nr_scanned should also be handled on
      that side.

      This behaviour is confusing because the count of isolated pages
      currently accounts for compound pages while the count of scanned pages
      does not.  Given that the number of isolated pages (nr_taken) reported in
      the mm_compaction_isolate_template trace event is on a single-page basis,
      the ambiguity in the reported number of scanned pages can be removed by
      also including the compound page count.
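
      A small userspace model of the accounting may help; it is a sketch under
      stated assumptions and not the compaction code.  The sketch_page type
      and sketch_isolate function are hypothetical: each entry stands for one
      scan step that lands on a page of a given order, and both counters
      advance by the number of base pages that step covers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one scan step over a page of a given order. */
struct sketch_page {
	unsigned int order;   /* 0 for a base page, k for a 2^k compound page */
	int isolated;         /* nonzero if the scanner takes this page */
};

static void sketch_isolate(const struct sketch_page *pages, size_t n,
			   unsigned long *nr_scanned, unsigned long *nr_taken)
{
	*nr_scanned = 0;
	*nr_taken = 0;
	for (size_t i = 0; i < n; i++) {
		unsigned long nr_base = 1UL << pages[i].order;

		/* Count every base page covered, not one per scan step. */
		*nr_scanned += nr_base;
		if (pages[i].isolated)
			*nr_taken += nr_base;
	}
}
```

      Because both counters use the same single-page unit in this model,
      nr_taken can never exceed nr_scanned.  Counting one scanned page per
      step instead would reproduce output like nr_scanned=1 nr_taken=128 from
      the trace above.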
      
      Link: https://lkml.kernel.org/r/20220711202806.22296-1-william.lam@bytedance.com
      Signed-off-by: William Lam <william.lam@bytedance.com>
      Reviewed-by: Punit Agrawal <punit.agrawal@bytedance.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      b717d6b9
    • selftests/vm: skip 128TBswitch on unsupported arch · ac3ced5f
      Adam Sindelar authored
      
      
      The test va_128TBswitch.c exercises a feature only supported on PPC and
      x86_64, but it's run on other 64-bit archs as well.  Before this patch,
      the test did nothing and returned 0 for KSFT_PASS.  This patch makes it
      return the KSFT codes from kselftest.h, including KSFT_SKIP when
      appropriate.
      
      Verified on arm64 and x86_64.
      
      Link: https://lkml.kernel.org/r/20220704123813.427625-1-adam@wowsignal.io
      Signed-off-by: Adam Sindelar <adam@wowsignal.io>
      Cc: David Vernet <void@manifault.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      ac3ced5f
    • selftests/vm: fix errno handling in mrelease_test · 3b8e7f5c
      Adam Sindelar authored

      mrelease_test should return KSFT_SKIP when process_mrelease is not
      defined, but due to a perror call consuming the errno, it returns
      KSFT_FAIL.
      
      This patch decides the exit code before calling perror.
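
      The ordering can be sketched as below.  This is an illustration rather
      than the test's actual code: the helper name sketch_exit_code is
      hypothetical, the KSFT values are copied here on the assumption that
      they match kselftest.h, and ENOSYS is assumed to be the errno reported
      when the syscall is not available.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Assumed to match the values in kselftest.h. */
#define KSFT_PASS 0
#define KSFT_FAIL 1
#define KSFT_SKIP 4

/* Pick the exit code from errno first, then report the error. */
static int sketch_exit_code(const char *msg)
{
	/* perror() may itself modify errno, so read errno before calling it. */
	int res = (errno == ENOSYS) ? KSFT_SKIP : KSFT_FAIL;

	perror(msg);
	return res;
}
```

      Calling perror first and testing errno afterwards is the ordering the
      patch removes, since the test could then see a different errno and
      report KSFT_FAIL for a missing syscall.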
      
      [adam@wowsignal.io: fix remaining instances of errno mishandling]
        Link: https://lkml.kernel.org/r/20220706141602.10159-1-adam@wowsignal.io
      Link: https://lkml.kernel.org/r/20220704173351.19595-1-adam@wowsignal.io
      Fixes: 33776141 ("selftests: vm: add process_mrelease tests")
      Signed-off-by: Adam Sindelar <adam@wowsignal.io>
      Reviewed-by: David Vernet <void@manifault.com>
      Reviewed-by: Suren Baghdasaryan <surenb@google.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      3b8e7f5c
    • mm: memcontrol: do not miss MEMCG_MAX events for enforced allocations · d6e103a7
      Roman Gushchin authored
      
      
      Yafang Shao reported an issue related to the accounting of bpf memory:
      if a bpf map is charged indirectly for memory consumed from an
      interrupt context and allocations are enforced, MEMCG_MAX events are
      not raised.
      
      It's less of an issue in the generic case because subsequent
      allocations from a process context will trigger direct reclaim and
      MEMCG_MAX events will be raised.  However, a bpf map can belong to a
      dying/abandoned memory cgroup, so there will be no allocations from a
      process context and no MEMCG_MAX events will be triggered.
      
      Link: https://lkml.kernel.org/r/20220702033521.64630-1-roman.gushchin@linux.dev
      Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
      Reported-by: Yafang Shao <laoar.shao@gmail.com>
      Acked-by: Shakeel Butt <shakeelb@google.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      d6e103a7
    • filemap: minor cleanup for filemap_write_and_wait_range · ccac11da
      Miaohe Lin authored
      
      
      Restructure the logic in filemap_write_and_wait_range to simplify the code
      and make it more consistent with file_write_and_wait_range. No functional
      change intended.
      
      Link: https://lkml.kernel.org/r/20220627132351.55680-1-linmiaohe@huawei.com
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Reviewed-by: Muchun Song <songmuchun@bytedance.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      ccac11da
    • mm/mmap.c: fix missing call to vm_unacct_memory in mmap_region · 7f82f922
      Miaohe Lin authored

      Since the beginning, charged has been set to 0 to avoid calling
      vm_unacct_memory twice, because vm_unacct_memory would be called by the
      unmap_region above.  But since commit 4f74d2c8 ("vm: remove
      'nr_accounted' calculations from the unmap_vmas() interfaces"),
      unmap_region no longer calls vm_unacct_memory.  So charged should no
      longer be set to 0; otherwise the paired call to vm_unacct_memory is
      missed, which leads to imbalanced accounting.
      
      Link: https://lkml.kernel.org/r/20220618082027.43391-1-linmiaohe@huawei.com
      Fixes: 4f74d2c8 ("vm: remove 'nr_accounted' calculations from the unmap_vmas() interfaces")
      Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      7f82f922
    • android: binder: fix lockdep check on clearing vma · b0cab80e
      Liam Howlett authored
      
      
      When munmapping a vma, the mmap_lock can be downgraded to a read lock
      before calling close() on the file handle.  The binder close() function
      calls binder_alloc_set_vma() to clear the vma address, which now has a
      lockdep check for writing on the mmap_lock.  Change the lockdep check to
      ensure the read lock is held while clearing and keep the write check
      while writing.
      
      Link: https://lkml.kernel.org/r/20220627151857.2316964-1-Liam.Howlett@oracle.com
      Fixes: 472a68df605b ("android: binder: stop saving a pointer to the VMA")
      Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Reported-by: <syzbot+da54fa8d793ca89c741f@syzkaller.appspotmail.com>
      Acked-by: Todd Kjos <tkjos@google.com>
      Cc: "Arve Hjønnevåg" <arve@android.com>
      Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hridya Valsaraju <hridya@google.com>
      Cc: Joel Fernandes <joel@joelfernandes.org>
      Cc: Martijn Coenen <maco@android.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      b0cab80e
    • android: binder: stop saving a pointer to the VMA · a43cfc87
      Liam R. Howlett authored

      Do not record a pointer to a VMA outside of the mmap_lock for later use.
      This is unsafe and there are a number of failure paths *after* the
      recorded VMA pointer may be freed during setup.  There is no callback to
      the driver to clear the saved pointer from generic mm code.  Furthermore,
      the VMA pointer may become stale if any number of VMA operations end up
      freeing the VMA so saving it was fragile to begin with.
      
      Instead, change the binder_alloc struct to record the start address of the
      VMA and use vma_lookup() to get the vma when needed.  Add lockdep
      mmap_lock checks on updates to the vma pointer to ensure the lock is held
      and depend on that lock for synchronization of readers and writers - which
      was already the case anyways, so the smp_wmb()/smp_rmb() was not
      necessary.
      
      [akpm@linux-foundation.org: fix drivers/android/binder_alloc_selftest.c]
      Link: https://lkml.kernel.org/r/20220621140212.vpkio64idahetbyf@revolver
      Fixes: da1b9564 ("android: binder: fix the race mmap and alloc_new_buf_locked")
      Reported-by: <syzbot+58b51ac2b04e388ab7b0@syzkaller.appspotmail.com>
      Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hridya Valsaraju <hridya@google.com>
      Cc: Joel Fernandes <joel@joelfernandes.org>
      Cc: Martijn Coenen <maco@android.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Todd Kjos <tkjos@android.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      a43cfc87
    • mips: rename mt_init to mips_mt_init · 15d2ce71
      Liam R. Howlett authored
      
      
      Move mt_init out of the way for the maple tree.  Use mips_mt prefix to
      match the rest of the functions in the file.
      
      Link: https://lkml.kernel.org/r/20220504002554.654642-2-Liam.Howlett@oracle.com
      Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
      Cc: SeongJae Park <sj@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      15d2ce71
    • mm: shrinkers: fix double kfree on shrinker name · 14773bfa
      Tetsuo Handa authored

      syzbot is reporting a double kfree() at free_prealloced_shrinker() [1],
      because destroy_unused_super() calls free_prealloced_shrinker() even if
      prealloc_shrinker() returned an error.  Explicitly clear the shrinker
      name when prealloc_shrinker() has called kfree().
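
      The general pattern, clearing a pointer right after freeing it so that a
      second cleanup pass is harmless, can be sketched in userspace C.  The
      sketch_shrinker type and both helpers are hypothetical, and free()
      stands in for kfree(); like kfree(), free() does nothing when passed
      NULL.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a shrinker that owns a heap-allocated name. */
struct sketch_shrinker {
	char *name;
};

/* Allocate a private copy of the name; returns 0 on success, -1 on failure. */
static int sketch_set_name(struct sketch_shrinker *s, const char *name)
{
	size_t len = strlen(name) + 1;

	s->name = malloc(len);
	if (!s->name)
		return -1;
	memcpy(s->name, name, len);
	return 0;
}

/* Free the name and clear the pointer so a repeated call is a no-op. */
static void sketch_free_name(struct sketch_shrinker *s)
{
	free(s->name);
	s->name = NULL;
}
```

      In this model an error path and a later teardown path can both call
      sketch_free_name on the same object without freeing the buffer twice.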
      
      [roman.gushchin@linux.dev: zero shrinker->name in all cases where shrinker->name is freed]
        Link: https://lkml.kernel.org/r/YtgteTnQTgyuKUSY@castle
      Link: https://syzkaller.appspot.com/bug?extid=8b481578352d4637f510 [1]
      Link: https://lkml.kernel.org/r/ffa62ece-6a42-2644-16cf-0d33ef32c676@I-love.SAKURA.ne.jp
      Fixes: e33c267a ("mm: shrinkers: provide shrinkers with names")
      Reported-by: syzbot <syzbot+8b481578352d4637f510@syzkaller.appspotmail.com>
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      14773bfa
  2. Jul 27, 2022
  3. Jul 18, 2022