  1. Jun 24, 2023
  2. Jun 20, 2023
    • selftests/mm: fix cross compilation with LLVM · 0518dbe9
      Mark Brown authored
      
      
      Currently the MM selftests attempt to work out the target architecture by
      parsing CROSS_COMPILE or otherwise querying the host machine, and store
      the result in a variable called MACHINE rather than the usual ARCH,
      though as far as I can tell (including for x86_64) the value is the same
      as we would use for the architecture.
      
      When cross compiling with LLVM we don't need a CROSS_COMPILE, since LLVM
      can support many target architectures in a single build, so this logic
      does not work: CROSS_COMPILE is not set and we end up selecting tests for
      the host rather than the target architecture.  Fix this by using the more
      standard ARCH to describe the architecture, taking it from the
      environment if specified.
      
      Link: https://lkml.kernel.org/r/20230614-kselftest-mm-llvm-v1-1-180523f277d3@kernel.org
      Signed-off-by: Mark Brown <broonie@kernel.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Tom Rix <trix@redhat.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mailmap: add entries for Ben Dooks · 823b37e8
      Ben Dooks authored
      
      
      I am going to be losing my sifive.com address soon, and I also realised
      my old Simtec address (from >10 years ago) has not been updated, so
      update .mailmap for both.
      
      Link: https://lkml.kernel.org/r/20230615081820.79485-1-ben.dooks@codethink.co.uk
      Signed-off-by: Ben Dooks <ben.dooks@sifive.com>
      Signed-off-by: Ben Dooks <ben-linux@fluff.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • nilfs2: prevent general protection fault in nilfs_clear_dirty_page() · 782e53d0
      Ryusuke Konishi authored
      
      
      In a syzbot stress test that deliberately causes file system errors on
      nilfs2 with a corrupted disk image, it has been reported that
      nilfs_clear_dirty_page() called from nilfs_clear_dirty_pages() can cause a
      general protection fault.
      
      In nilfs_clear_dirty_pages(), when looking up dirty pages from the page
      cache and calling nilfs_clear_dirty_page() for each dirty page/folio
      retrieved, the back reference from the argument page to "mapping" may have
      been changed to NULL (and possibly others).  It is necessary to check this
      after locking the page/folio.
      
      So, fix this issue by having nilfs_clear_dirty_pages() skip calling
      nilfs_clear_dirty_page() on a page/folio after locking it if the back
      reference "mapping" from the page/folio no longer matches the mapping
      that held the page/folio just before.
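      The compare-after-lock pattern at the heart of the fix can be sketched in plain C. The types below are made-up stand-ins, not the kernel's page and address_space structures:

```c
/* Minimal model of the fix: after locking a page, verify that its back
 * reference still points at the mapping it was looked up from. */
struct mapping { int id; };

struct page {
    struct mapping *mapping;   /* may be cleared concurrently */
    int locked;
    int dirty;
};

static void lock_page(struct page *p)   { p->locked = 1; }
static void unlock_page(struct page *p) { p->locked = 0; }

/* Returns 1 if the page was cleaned, 0 if it was skipped because its
 * back reference no longer matches the expected mapping. */
int clear_dirty_page_checked(struct page *p, struct mapping *expected)
{
    int cleaned = 0;

    lock_page(p);
    /* The crucial re-check: p->mapping may have become NULL (or some
     * other mapping) between the page-cache lookup and the lock. */
    if (p->mapping == expected) {
        p->dirty = 0;
        cleaned = 1;
    }
    unlock_page(p);
    return cleaned;
}
```

      Without the re-check inside the lock, the clean path would dereference a mapping pointer that the lookup saw but that no longer holds.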
      
      Link: https://lkml.kernel.org/r/20230612021456.3682-1-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: <syzbot+53369d11851d8f26735c@syzkaller.appspotmail.com>
      Closes: https://lkml.kernel.org/r/000000000000da4f6b05eb9bf593@google.com
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Revert "mm: vmscan: make global slab shrink lockless" · 71c3ad65
      Qi Zheng authored
      This reverts commit f95bdb70.
      
      Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
      test case [1], which is caused by commit f95bdb70 ("mm: vmscan: make
      global slab shrink lockless").  The root cause is that SRCU has to be
      careful to not frequently check for SRCU read-side critical section exits.
      Therefore, even if no one is currently in the SRCU read-side critical
      section, synchronize_srcu() cannot return quickly.  That's why
      unregister_shrinker() has become slower.
      
      After discussion, we will try to use the refcount+RCU method [2] proposed
      by Dave Chinner to continue to re-implement the lockless slab shrink.  So
      revert the shrinker_srcu related changes first.
      
      [1]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
      [2]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/
      
      Link: https://lkml.kernel.org/r/20230609081518.3039120-8-qi.zheng@linux.dev
      Reported-by: kernel test robot <yujie.liu@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202305230837.db2c233f-yujie.liu@intel.com
      Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Kirill Tkhai <tkhai@ya.ru>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Revert "mm: vmscan: make memcg slab shrink lockless" · 7cee3603
      Qi Zheng authored
      This reverts commit caa05325.
      
      Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
      test case [1], which is caused by commit f95bdb70 ("mm: vmscan: make
      global slab shrink lockless").  The root cause is that SRCU has to be
      careful to not frequently check for SRCU read-side critical section exits.
      Therefore, even if no one is currently in the SRCU read-side critical
      section, synchronize_srcu() cannot return quickly.  That's why
      unregister_shrinker() has become slower.
      
      After discussion, we will try to use the refcount+RCU method [2] proposed
      by Dave Chinner to continue to re-implement the lockless slab shrink.  So
      revert the shrinker_srcu related changes first.
      
      [1]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
      [2]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/
      
      Link: https://lkml.kernel.org/r/20230609081518.3039120-7-qi.zheng@linux.dev
      Reported-by: kernel test robot <yujie.liu@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202305230837.db2c233f-yujie.liu@intel.com
      Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Kirill Tkhai <tkhai@ya.ru>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Revert "mm: vmscan: add shrinker_srcu_generation" · d6ecbcd7
      Qi Zheng authored
      This reverts commit 475733dd.
      
      Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
      test case [1], which is caused by commit f95bdb70 ("mm: vmscan: make
      global slab shrink lockless").  The root cause is that SRCU has to be
      careful to not frequently check for SRCU read-side critical section exits.
      Therefore, even if no one is currently in the SRCU read-side critical
      section, synchronize_srcu() cannot return quickly.  That's why
      unregister_shrinker() has become slower.
      
      We will try to use the refcount+RCU method [2] proposed by Dave Chinner to
      continue to re-implement the lockless slab shrink.  So revert the
      shrinker_srcu related changes first.
      
      [1]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
      [2]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/
      
      Link: https://lkml.kernel.org/r/20230609081518.3039120-6-qi.zheng@linux.dev
      Reported-by: kernel test robot <yujie.liu@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202305230837.db2c233f-yujie.liu@intel.com
      Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Kirill Tkhai <tkhai@ya.ru>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Revert "mm: shrinkers: make count and scan in shrinker debugfs lockless" · 1a554ecc
      Qi Zheng authored
      This reverts commit 20cd1892.
      
      Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
      test case [1], which is caused by commit f95bdb70 ("mm: vmscan: make
      global slab shrink lockless").  The root cause is that SRCU has to be
      careful to not frequently check for SRCU read-side critical section exits.
      Therefore, even if no one is currently in the SRCU read-side critical
      section, synchronize_srcu() cannot return quickly.  That's why
      unregister_shrinker() has become slower.
      
      We will try to use the refcount+RCU method [2] proposed by Dave Chinner to
      continue to re-implement the lockless slab shrink.  So revert the
      shrinker_srcu related changes first.
      
      [1]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
      [2]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/
      
      Link: https://lkml.kernel.org/r/20230609081518.3039120-5-qi.zheng@linux.dev
      Reported-by: kernel test robot <yujie.liu@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202305230837.db2c233f-yujie.liu@intel.com
      Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Kirill Tkhai <tkhai@ya.ru>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Revert "mm: vmscan: hold write lock to reparent shrinker nr_deferred" · c534f7cc
      Qi Zheng authored
      This reverts commit b3cabea3.
      
      Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
      test case [1], which is caused by commit f95bdb70 ("mm: vmscan: make
      global slab shrink lockless"). The root cause is that SRCU has to be careful
      to not frequently check for SRCU read-side critical section exits. Therefore,
      even if no one is currently in the SRCU read-side critical section,
      synchronize_srcu() cannot return quickly. That's why unregister_shrinker()
      has become slower.
      
      We will try to use the refcount+RCU method [2] proposed by Dave Chinner
      to continue to re-implement the lockless slab shrink. Because there will
      be other readers after reverting the shrinker_srcu related changes, so
      it is better to restore to hold read lock to reparent shrinker nr_deferred.
      
      [1]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
      [2]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/
      
      Link: https://lkml.kernel.org/r/20230609081518.3039120-4-qi.zheng@linux.dev
      Reported-by: kernel test robot <yujie.liu@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202305230837.db2c233f-yujie.liu@intel.com
      Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Kirill Tkhai <tkhai@ya.ru>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Revert "mm: vmscan: remove shrinker_rwsem from synchronize_shrinkers()" · 07252b0f
      Qi Zheng authored
      This reverts commit 1643db98.
      
      Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
      test case [1], which is caused by commit f95bdb70 ("mm: vmscan: make
      global slab shrink lockless").  The root cause is that SRCU has to be
      careful to not frequently check for SRCU read-side critical section exits.
      Therefore, even if no one is currently in the SRCU read-side critical
      section, synchronize_srcu() cannot return quickly.  That's why
      unregister_shrinker() has become slower.
      
      We will try to use the refcount+RCU method [2] proposed by Dave Chinner to
      continue to re-implement the lockless slab shrink.  So we still need
      shrinker_rwsem in synchronize_shrinkers() after reverting the
      shrinker_srcu related changes.
      
      [1]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
      [2]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/
      
      Link: https://lkml.kernel.org/r/20230609081518.3039120-3-qi.zheng@linux.dev
      Reported-by: kernel test robot <yujie.liu@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202305230837.db2c233f-yujie.liu@intel.com
      Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Kirill Tkhai <tkhai@ya.ru>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • Revert "mm: shrinkers: convert shrinker_rwsem to mutex" · 47a7c01c
      Qi Zheng authored
      Patch series "revert shrinker_srcu related changes".
      
      
      This patch (of 7):
      
      This reverts commit cf2e309e.
      
      Kernel test robot reports -88.8% regression in stress-ng.ramfs.ops_per_sec
      test case [1], which is caused by commit f95bdb70 ("mm: vmscan: make
      global slab shrink lockless").  The root cause is that SRCU has to be
      careful to not frequently check for SRCU read-side critical section exits.
      Therefore, even if no one is currently in the SRCU read-side critical
      section, synchronize_srcu() cannot return quickly.  That's why
      unregister_shrinker() has become slower.
      
      After discussion, we will try to use the refcount+RCU method [2] proposed
      by Dave Chinner to continue to re-implement the lockless slab shrink.  So
      revert the shrinker_mutex back to shrinker_rwsem first.
      
      [1]. https://lore.kernel.org/lkml/202305230837.db2c233f-yujie.liu@intel.com/
      [2]. https://lore.kernel.org/lkml/ZIJhou1d55d4H1s0@dread.disaster.area/
      
      Link: https://lkml.kernel.org/r/20230609081518.3039120-1-qi.zheng@linux.dev
      Link: https://lkml.kernel.org/r/20230609081518.3039120-2-qi.zheng@linux.dev
      Reported-by: kernel test robot <yujie.liu@intel.com>
      Closes: https://lore.kernel.org/oe-lkp/202305230837.db2c233f-yujie.liu@intel.com
      Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Kirill Tkhai <tkhai@ya.ru>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: Roman Gushchin <roman.gushchin@linux.dev>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Yujie Liu <yujie.liu@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • nilfs2: fix buffer corruption due to concurrent device reads · 679bd7eb
      Ryusuke Konishi authored
      
      
      As a result of analysis of a syzbot report, it turned out that in three
      cases where nilfs2 allocates block device buffers directly via sb_getblk,
      concurrent reads to the device can corrupt the allocated buffers.
      
      Nilfs2 uses sb_getblk for the segment summary blocks that make up a log
      header, for the super root block that is the log trailer, and when moving
      and writing the second superblock after a filesystem resize.
      
      In any of these, since the uptodate flag is not set when storing metadata
      to be written in the allocated buffers, the stored metadata will be
      overwritten if a device read of the same block occurs concurrently before
      the write.  This causes metadata corruption and misbehavior in the log
      write itself, causing warnings in nilfs_btree_assign() as reported.
      
      Fix these issues by setting the uptodate flag on the buffer head at first
      use or before modifying each buffer obtained with sb_getblk, and clearing
      the flag on failure.
      
      When setting the uptodate flag, the lock_buffer/unlock_buffer pair is used
      to perform necessary exclusive control, and the buffer is filled to ensure
      that uninitialized bytes are not mixed into the data read from others.  As
      for buffers for segment summary blocks, they are filled incrementally, so
      if the uptodate flag was unset on their allocation, set the flag and zero
      fill the buffer once at that point.
      
      Also, regarding the superblock move routine, the starting point of the
      memset call to zerofill the block is incorrectly specified, which can
      cause a buffer overflow on file systems with block sizes greater than
      4KiB.  In addition, if the superblock is moved within a large block, it is
      necessary to assume the possibility that the data in the superblock will
      be destroyed by zero-filling before copying.  So fix these potential
      issues as well.
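      The superblock-move hazard can be modelled in user space. The block and superblock sizes below are invented for illustration, and the real routine operates on buffer heads rather than a raw byte array; the point is the ordering: save the superblock before zero-filling, and keep the memset bounded to the block.

```c
#include <string.h>

enum { BLOCK_SIZE = 8192, SB_SIZE = 1024 };  /* illustrative sizes */

/* Move a superblock image to a new offset inside the same large block. */
void move_sb_in_block(unsigned char *block, size_t old_off, size_t new_off)
{
    unsigned char sb[SB_SIZE];

    /* Save the superblock first: zero-filling before copying would
     * destroy it when the move stays within the same block. */
    memcpy(sb, block + old_off, SB_SIZE);

    /* Zero-fill bounded to the block, never past its end. */
    memset(block, 0, BLOCK_SIZE);

    memcpy(block + new_off, sb, SB_SIZE);
}
```

      The two bugs described above correspond to a memset whose start (and therefore length) is miscomputed, overrunning the block, and to zero-filling before the copy, clobbering the superblock data being moved.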
      
      Link: https://lkml.kernel.org/r/20230609035732.20426-1-konishi.ryusuke@gmail.com
      Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: <syzbot+31837fe952932efc8fb9@syzkaller.appspotmail.com>
      Closes: https://lkml.kernel.org/r/00000000000030000a05e981f475@google.com
      Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • scripts/gdb: fix SB_* constants parsing · 6a59cb51
      Florian Fainelli authored
      
      After f15afbd3 ("fs: fix undefined behavior in bit shift for
      SB_NOUSER") the constants were changed from plain integers which
      LX_VALUE() can parse to constants using the BIT() macro which causes the
      following:
      
      Reading symbols from build/linux-custom/vmlinux...done.
      Traceback (most recent call last):
        File "/home/fainelli/work/buildroot/output/arm64/build/linux-custom/vmlinux-gdb.py", line 25, in <module>
          import linux.constants
        File "/home/fainelli/work/buildroot/output/arm64/build/linux-custom/scripts/gdb/linux/constants.py", line 5
          LX_SB_RDONLY = ((((1UL))) << (0))
      
      Use LX_GDBPARSED() which does not suffer from that issue.
      
      Fixes: f15afbd3 ("fs: fix undefined behavior in bit shift for SB_NOUSER")
      Link: https://lkml.kernel.org/r/20230607221337.2781730-1-florian.fainelli@broadcom.com
      Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
      Acked-by: Christian Brauner <brauner@kernel.org>
      Cc: Hao Ge <gehao@kylinos.cn>
      Cc: Jan Kiszka <jan.kiszka@siemens.com>
      Cc: Kieran Bingham <kbingham@kernel.org>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Pankaj Raghav <p.raghav@samsung.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • scripts: fix the gfp flags header path in gfp-translate · 2049a7d0
      Prathu Baronia authored
      The gfp flags have been moved to gfp_types.h, so update the path in the
      gfp-translate script accordingly.
      
      Link: https://lkml.kernel.org/r/20230608154450.21758-1-prathubaronia2011@gmail.com
      Fixes: cb5a065b ("headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h>")
      Signed-off-by: Prathu Baronia <prathubaronia2011@gmail.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Masahiro Yamada <masahiroy@kernel.org>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Nicolas Schier <nicolas@fjasle.eu>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Yury Norov <yury.norov@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • udmabuf: revert 'Add support for mapping hugepages (v4)' · b7cb3821
      Mike Kravetz authored
      This effectively reverts commit 16c243e9 ("udmabuf: Add support for
      mapping hugepages (v4)").  Recently, Junxiao Chang found a BUG with page
      map counting as described here [1].  This issue pointed out that the
      udmabuf driver was making direct use of subpages of hugetlb pages.  This
      is not a good idea, and no other mm code attempts such use.  In addition
      to the mapcount issue, this also causes issues with hugetlb vmemmap
      optimization and page poisoning.
      
      For now, remove hugetlb support.
      
      If udmabuf wants to be used on hugetlb mappings, it should be changed to
      only use complete hugetlb pages.  This will require different alignment
      and size requirements on the UDMABUF_CREATE API.
      
      [1] https://lore.kernel.org/linux-mm/20230512072036.1027784-1-junxiao.chang@intel.com/
      
      Link: https://lkml.kernel.org/r/20230608204927.88711-1-mike.kravetz@oracle.com
      Fixes: 16c243e9 ("udmabuf: Add support for mapping hugepages (v4)")
      Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Acked-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
      Acked-by: Gerd Hoffmann <kraxel@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Dongwon Kim <dongwon.kim@intel.com>
      Cc: James Houghton <jthoughton@google.com>
      Cc: Jerome Marchand <jmarchan@redhat.com>
      Cc: Junxiao Chang <junxiao.chang@intel.com>
      Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Muchun Song <muchun.song@linux.dev>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/khugepaged: fix iteration in collapse_file · c8a8f3b4
      David Stevens authored
      Remove an unnecessary call to xas_set(index) when iterating over the
      target range in collapse_file.  The extra call to xas_set reset the xas
      cursor to the top of the tree, causing the xas_next call on the next
      iteration to walk the tree to index instead of advancing to index+1.  This
      returned the same page again, which would cause collapse_file to fail
      because the page is already locked.
      
      This bug was hidden when CONFIG_DEBUG_VM was set.  When that config was
      used, the xas_load in a subsequent VM_BUG_ON assert would walk xas from
      the top of the tree to index, causing the xas_next call on the next loop
      iteration to advance the cursor as expected.
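      The cursor behavior can be reproduced with a toy model: a plain integer cursor standing in for the XArray state, with `cur_set`/`cur_next` as invented analogs of xas_set/xas_next.

```c
/* Toy model of the bug: re-seeding the cursor with the same index on
 * every iteration means "next" keeps returning index+1, the same slot,
 * instead of advancing through the range. */
struct cursor { int pos; };

static void cur_set(struct cursor *c, int index) { c->pos = index; }
static int  cur_next(struct cursor *c)           { return ++c->pos; }

/* Buggy iteration: the unnecessary reset pins the cursor. */
int next_slot_buggy(struct cursor *c, int index)
{
    cur_set(c, index);   /* extra reset, as in the bug */
    return cur_next(c);
}

/* Fixed iteration: seed once outside the loop, then just advance. */
int next_slot_fixed(struct cursor *c)
{
    return cur_next(c);
}
```

      In collapse_file the fix is simply to drop the extra xas_set() call so that xas_next() advances the cursor on each loop iteration.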
      
      Link: https://lkml.kernel.org/r/20230607053135.2087354-1-stevensd@google.com
      Fixes: a2e17cc2 ("mm/khugepaged: maintain page cache uptodate flag")
      Signed-off-by: David Stevens <stevensd@chromium.org>
      Reviewed-by: Peter Xu <peterx@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Jiaqi Yan <jiaqiyan@google.com>
      Cc: Kirill A. Shutemov <kirill@shutemov.name>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Yang Shi <shy828301@gmail.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • memfd: check for non-NULL file_seals in memfd_create() syscall · 935d44ac
      Roberto Sassu authored
      Ensure that file_seals is non-NULL before using it in the memfd_create()
      syscall.  memfd_file_seals_ptr() can return a NULL pointer when
      CONFIG_SHMEM=n, which would oops the kernel.
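      A minimal sketch of the guard, using simplified stand-in types; only the F_SEAL_SEAL value mirrors the real UAPI constant, and file_seals_ptr() is an invented analog of memfd_file_seals_ptr().

```c
#include <stddef.h>

#define F_SEAL_SEAL 0x0001   /* matches the UAPI value */

struct memfd { unsigned int seals; };

/* Stand-in for memfd_file_seals_ptr(): returns NULL when the backing
 * store has no sealing support, as with CONFIG_SHMEM=n. */
unsigned int *file_seals_ptr(struct memfd *m, int has_sealing)
{
    return has_sealing ? &m->seals : NULL;
}

int memfd_set_initial_seals(struct memfd *m, int has_sealing)
{
    unsigned int *file_seals = file_seals_ptr(m, has_sealing);

    /* The fix: check for NULL before dereferencing; with no sealing
     * support there is simply nothing to seal. */
    if (file_seals)
        *file_seals = F_SEAL_SEAL;
    return 0;
}
```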
      
      Link: https://lkml.kernel.org/r/20230607132427.2867435-1-roberto.sassu@huaweicloud.com
      Fixes: 47b9012e ("shmem: add sealing support to hugetlb-backed memfd")
      Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
      Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/vmalloc: do not output a spurious warning when huge vmalloc() fails · 95a301ee
      Lorenzo Stoakes authored
      In __vmalloc_area_node() we always warn_alloc() when an allocation
      performed by vm_area_alloc_pages() fails unless it was due to a pending
      fatal signal.
      
      However, huge page allocations instigated either by vmalloc_huge() or
      __vmalloc_node_range() (or by a caller that invokes these, like kvmalloc()
      or kvmalloc_node()) always fall back to order-0 allocations if the huge
      page allocation fails.
      
      This renders the warning useless and noisy, especially as all callers
      appear to be aware of this fallback.  This has already resulted in at
      least one bug report from a user who was confused by it (see link).
      
      Therefore, simply update the code to only output this warning for order-0
      pages when no fatal signal is pending.
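      The narrowed warning condition reduces to a small predicate; this is an illustrative sketch, not the kernel's code.

```c
#include <stdbool.h>

/* A failed huge-page attempt (page_order > 0) falls back to order-0
 * and is not worth a warning; only a genuine order-0 failure with no
 * fatal signal pending warrants warn_alloc(). */
bool should_warn_alloc_failure(unsigned int page_order,
                               bool fatal_signal_pending)
{
    return page_order == 0 && !fatal_signal_pending;
}
```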
      
      Link: https://bugzilla.suse.com/show_bug.cgi?id=1211410
      Link: https://lkml.kernel.org/r/20230605201107.83298-1-lstoakes@gmail.com
      Fixes: 80b1d8fd ("mm: vmalloc: correct use of __GFP_NOWARN mask in __vmalloc_area_node()")
      Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • mm/mprotect: fix do_mprotect_pkey() limit check · 77795f90
      Liam R. Howlett authored
      do_mprotect_pkey() can still incorrectly return success if there is a gap
      that spans to or beyond the end address passed in.  Update the check to
      ensure that the end address has indeed been seen.
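      The idea behind the updated check can be modelled as a range-coverage walk over sorted VMAs. Types are simplified stand-ins, and -12 stands in for -ENOMEM.

```c
struct vma { unsigned long start, end; };

/* Walk VMAs covering [start, end); return 0 only if the walk actually
 * reaches `end`. A gap before a vma, or a trailing gap that spans to or
 * beyond `end`, means the operation must not report success. */
int check_range_covered(const struct vma *vmas, int n,
                        unsigned long start, unsigned long end)
{
    unsigned long seen = start;

    for (int i = 0; i < n; i++) {
        if (vmas[i].start > seen)
            return -12;            /* gap before this vma */
        if (vmas[i].end > seen)
            seen = vmas[i].end;
    }
    return seen >= end ? 0 : -12;  /* gap at or beyond the tail */
}
```

      The bug was the tail case: the walk could stop short of `end` and still be reported as success.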
      
      Link: https://lore.kernel.org/all/CABi2SkXjN+5iFoBhxk71t3cmunTk-s=rB4T7qo0UQRh17s49PQ@mail.gmail.com/
      Link: https://lkml.kernel.org/r/20230606182912.586576-1-Liam.Howlett@oracle.com
      Fixes: 82f95134 ("mm/mprotect: fix do_mprotect_pkey() return on error")
      Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Reported-by: Jeff Xu <jeffxu@chromium.org>
      Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
      Acked-by: David Hildenbrand <david@redhat.com>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    • writeback: fix dereferencing NULL mapping->host on writeback_page_template · 54abe19e
      Rafael Aquini authored
      When commit 19343b5b ("mm/page-writeback: introduce tracepoint for
      wait_on_page_writeback()") repurposed the writeback_dirty_page trace event
      as a template to create its new wait_on_page_writeback trace event, it
      ended up opening a window to NULL pointer dereference crashes due to the
      (infrequent) occurrence of a race where an access to a page in the
      swap-cache happens concurrently with the moment this page is being written
      to disk and the tracepoint is enabled:
      
          BUG: kernel NULL pointer dereference, address: 0000000000000040
          #PF: supervisor read access in kernel mode
          #PF: error_code(0x0000) - not-present page
          PGD 800000010ec0a067 P4D 800000010ec0a067 PUD 102353067 PMD 0
          Oops: 0000 [#1] PREEMPT SMP PTI
          CPU: 1 PID: 1320 Comm: shmem-worker Kdump: loaded Not tainted 6.4.0-rc5+ #13
          Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20230301gitf80f052277c8-1.fc37 03/01/2023
          RIP: 0010:trace_event_raw_event_writeback_folio_template+0x76/0xf0
          Code: 4d 85 e4 74 5c 49 8b 3c 24 e8 06 98 ee ff 48 89 c7 e8 9e 8b ee ff ba 20 00 00 00 48 89 ef 48 89 c6 e8 fe d4 1a 00 49 8b 04 24 <48> 8b 40 40 48 89 43 28 49 8b 45 20 48 89 e7 48 89 43 30 e8 a2 4d
          RSP: 0000:ffffaad580b6fb60 EFLAGS: 00010246
          RAX: 0000000000000000 RBX: ffff90e38035c01c RCX: 0000000000000000
          RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff90e38035c044
          RBP: ffff90e38035c024 R08: 0000000000000002 R09: 0000000000000006
          R10: ffff90e38035c02e R11: 0000000000000020 R12: ffff90e380bac000
          R13: ffffe3a7456d9200 R14: 0000000000001b81 R15: ffffe3a7456d9200
          FS:  00007f2e4e8a15c0(0000) GS:ffff90e3fbc80000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000000040 CR3: 00000001150c6003 CR4: 0000000000170ee0
          Call Trace:
           <TASK>
           ? __die+0x20/0x70
           ? page_fault_oops+0x76/0x170
           ? kernelmode_fixup_or_oops+0x84/0x110
           ? exc_page_fault+0x65/0x150
           ? asm_exc_page_fault+0x22/0x30
           ? trace_event_raw_event_writeback_folio_template+0x76/0xf0
           folio_wait_writeback+0x6b/0x80
           shmem_swapin_folio+0x24a/0x500
           ? filemap_get_entry+0xe3/0x140
           shmem_get_folio_gfp+0x36e/0x7c0
           ? find_busiest_group+0x43/0x1a0
           shmem_fault+0x76/0x2a0
           ? __update_load_avg_cfs_rq+0x281/0x2f0
           __do_fault+0x33/0x130
           do_read_fault+0x118/0x160
           do_pte_missing+0x1ed/0x2a0
           __handle_mm_fault+0x566/0x630
           handle_mm_fault+0x91/0x210
           do_user_addr_fault+0x22c/0x740
           exc_page_fault+0x65/0x150
           asm_exc_page_fault+0x22/0x30
      
       This problem arises from the fact that the repurposed
       writeback_dirty_page trace event code was written assuming that every
       pointer to mapping (struct address_space) would come from a
       file-mapped page-cache object, so mapping->host would always be
       populated.  That was a valid assumption before commit 19343b5b.  The
       swap-cache address space (swapper_spaces), however, doesn't populate
       its ->host (struct inode) pointer, thus leading to the crashes in the
       aforementioned corner case.
      
       Commit 19343b5b ended up breaking the assignment of __entry->name and
      __entry->ino for the wait_on_page_writeback tracepoint -- both dependent
      on mapping->host carrying a pointer to a valid inode.  The assignment of
      __entry->name was fixed by commit 68f23b89 ("memcg: fix a crash in
      wb_workfn when a device disappears"), and this commit fixes the remaining
      case, for __entry->ino.
      
      Link: https://lkml.kernel.org/r/20230606233613.1290819-1-aquini@redhat.com
       Fixes: 19343b5b ("mm/page-writeback: introduce tracepoint for wait_on_page_writeback()")
       Signed-off-by: Rafael Aquini <aquini@redhat.com>
       Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      54abe19e
  3. Jun 13, 2023
    • Nhat Pham's avatar
      zswap: do not shrink if cgroup may not zswap · 0bdf0efa
      Nhat Pham authored
      Before storing a page, zswap first checks if the number of stored pages
      exceeds the limit specified by memory.zswap.max, for each cgroup in the
      hierarchy.  If this limit is reached or exceeded, then zswap shrinking is
      triggered and short-circuits the store attempt.
      
       However, since zswap's LRU is not memcg-aware, this can create the
      following pathological behavior: the cgroup whose zswap limit is 0 will
      evict pages from other cgroups continually, without lowering its own zswap
      usage.  This means the shrinking will continue until the need for swap
      ceases or the pool becomes empty.
      
      As a result of this, we observe a disproportionate amount of zswap
      writeback and a perpetually small zswap pool in our experiments, even
      though the pool limit is never hit.
      
      More generally, a cgroup might unnecessarily evict pages from other
      cgroups before we drive the memcg back below its limit.
      
       This patch fixes the issue by rejecting the zswap store attempt
       without shrinking the pool when obj_cgroup_may_zswap() returns false.
      
       [akpm@linux-foundation.org: fix return of uninitialized value]
      [akpm@linux-foundation.org: s/ENOSPC/ENOMEM/]
      Link: https://lkml.kernel.org/r/20230530222440.2777700-1-nphamcs@gmail.com
      Link: https://lkml.kernel.org/r/20230530232435.3097106-1-nphamcs@gmail.com
       Fixes: f4840ccf ("zswap: memcg accounting")
       Signed-off-by: Nhat Pham <nphamcs@gmail.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Cc: Yosry Ahmed <yosryahmed@google.com>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      0bdf0efa
    • Mike Kravetz's avatar
      page cache: fix page_cache_next/prev_miss off by one · 9425c591
      Mike Kravetz authored
      Ackerley Tng reported an issue with hugetlbfs fallocate here[1].  The
      issue showed up after the conversion of hugetlb page cache lookup code to
      use page_cache_next_miss.  Code in hugetlb fallocate, userfaultfd and GUP
       is now using page_cache_next_miss to determine if a page is present in
       the page cache.  The following statement is used:
      
      	present = page_cache_next_miss(mapping, index, 1) != index;
      
      There are two issues with page_cache_next_miss when used in this way.
      1) If the passed value for index is equal to the 'wrap-around' value,
         the same index will always be returned.  This wrap-around value is 0,
         so 0 will be returned even if page is present at index 0.
      2) If there is no gap in the range passed, the last index in the range
         will be returned.  When passed a range of 1 as above, the passed
         index value will be returned even if the page is present.
       The end result is that the statement above will NEVER indicate a page
       is present in the cache, even if it is.
      
      As noted by Ackerley in [1], users can see this by hugetlb fallocate
      incorrectly returning EEXIST if pages are already present in the file.  In
      addition, hugetlb pages will not be included in core dumps if they need to
      be brought in via GUP.  userfaultfd UFFDIO_COPY also uses this code and
      will not notice pages already present in the cache.  It may try to
      allocate a new page and potentially return ENOMEM as opposed to EEXIST.
      
      Both page_cache_next_miss and page_cache_prev_miss have similar issues.
      Fix by:
      - Check for index equal to 'wrap-around' value and do not exit early.
      - If no gap is found in range, return index outside range.
      - Update function description to say 'wrap-around' value could be
        returned if passed as index.
      
      [1] https://lore.kernel.org/linux-mm/cover.1683069252.git.ackerleytng@google.com/
      
      Link: https://lkml.kernel.org/r/20230602225747.103865-2-mike.kravetz@oracle.com
       Fixes: d0ce0e47 ("mm/hugetlb: convert hugetlb fault paths to use alloc_hugetlb_folio()")
       Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
       Reported-by: Ackerley Tng <ackerleytng@google.com>
       Reviewed-by: Ackerley Tng <ackerleytng@google.com>
       Tested-by: Ackerley Tng <ackerleytng@google.com>
      Cc: Erdem Aktas <erdemaktas@google.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Muchun Song <songmuchun@bytedance.com>
      Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
      Cc: Vishal Annapurve <vannapurve@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      9425c591
    • Luís Henriques's avatar
      ocfs2: check new file size on fallocate call · 26a6ffff
      Luís Henriques authored
      
      
      When changing a file size with fallocate() the new size isn't being
      checked.  In particular, the FSIZE ulimit isn't being checked, which makes
      fstest generic/228 fail.  Simply adding a call to inode_newsize_ok() fixes
      this issue.
      
      Link: https://lkml.kernel.org/r/20230529152645.32680-1-lhenriques@suse.de
       Signed-off-by: Luís Henriques <lhenriques@suse.de>
       Reviewed-by: Mark Fasheh <mark@fasheh.com>
       Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      26a6ffff
    • John Keeping's avatar
      mailmap: add entry for John Keeping · 0e4d4ef9
      John Keeping authored
      
      
      Map my corporate address to my personal one, as I am leaving the
      company.
      
      Link: https://lkml.kernel.org/r/20230531144839.1157112-1-john@keeping.me.uk
       Signed-off-by: John Keeping <john@keeping.me.uk>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      0e4d4ef9
    • Kefeng Wang's avatar
      mm/damon/core: fix divide error in damon_nr_accesses_to_accesses_bp() · 5ff6e2ff
      Kefeng Wang authored
       If 'aggr_interval' is smaller than 'sample_interval', max_nr_accesses
       in damon_nr_accesses_to_accesses_bp() becomes zero, which leads to a
       divide error.  Fix it by validating the two values in
       damon_set_attrs(), similar to the checks on the other attrs.
      
      Link: https://lkml.kernel.org/r/20230527032101.167788-1-wangkefeng.wang@huawei.com
       Fixes: 2f5bef5a ("mm/damon/core: update monitoring results for new monitoring attributes")
       Reported-by: <syzbot+841a46899768ec7bec67@syzkaller.appspotmail.com>
      Closes: https://syzkaller.appspot.com/bug?extid=841a46899768ec7bec67
      Link: https://lore.kernel.org/damon/00000000000055fc4e05fc975bc2@google.com/
       Reviewed-by: SeongJae Park <sj@kernel.org>
       Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      5ff6e2ff
    • Benjamin Segall's avatar
      epoll: ep_autoremove_wake_function should use list_del_init_careful · 2192bba0
      Benjamin Segall authored
       autoremove_wake_function uses list_del_init_careful, so epoll's more
       aggressive variant should too.  It only doesn't because it was copied
       from an older version of wait.c rather than the most recent one.
      
      [bsegall@google.com: add comment]
        Link: https://lkml.kernel.org/r/xm26bki0ulsr.fsf_-_@google.com
      Link: https://lkml.kernel.org/r/xm26pm6hvfer.fsf@google.com
       Fixes: a16ceb13 ("epoll: autoremove wakers even more aggressively")
       Signed-off-by: Ben Segall <bsegall@google.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christian Brauner <brauner@kernel.org>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      2192bba0
    • Haibo Li's avatar
      mm/gup_test: fix ioctl fail for compat task · 4f572f00
      Haibo Li authored
      
      
       When tools/testing/selftests/mm/gup_test.c is compiled as a 32-bit
       binary and then run on an arm64 kernel, it reports "ioctl:
       Inappropriate ioctl for device".
       
       Fix it by filling in compat_ioctl in gup_test_fops.
      
      Link: https://lkml.kernel.org/r/20230526022125.175728-1-haibo.li@mediatek.com
       Signed-off-by: Haibo Li <haibo.li@mediatek.com>
       Acked-by: David Hildenbrand <david@redhat.com>
      Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      4f572f00
    • Ryusuke Konishi's avatar
      nilfs2: reject devices with insufficient block count · 92c5d1b8
      Ryusuke Konishi authored
      
      
      The current sanity check for nilfs2 geometry information lacks checks for
      the number of segments stored in superblocks, so even for device images
      that have been destructively truncated or have an unusually high number of
      segments, the mount operation may succeed.
      
       This causes out-of-bounds block I/O on file system block reads or log
       writes to the segments, the latter in particular causing
       "a_ops->writepages" to repeatedly fail and sync_inodes_sb() to hang.
      
      Fix this issue by checking the number of segments stored in the superblock
      and avoiding mounting devices that can cause out-of-bounds accesses.  To
      eliminate the possibility of overflow when calculating the number of
      blocks required for the device from the number of segments, this also adds
      a helper function to calculate the upper bound on the number of segments
      and inserts a check using it.
      
      Link: https://lkml.kernel.org/r/20230526021332.3431-1-konishi.ryusuke@gmail.com
       Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
       Reported-by: <syzbot+7d50f1e54a12ba3aeae2@syzkaller.appspotmail.com>
        Link: https://syzkaller.appspot.com/bug?extid=7d50f1e54a12ba3aeae2
       Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      92c5d1b8
    • Luís Henriques's avatar
      ocfs2: fix use-after-free when unmounting read-only filesystem · 50d92788
      Luís Henriques authored
      
      
       It's trivial to trigger a use-after-free bug in the ocfs2 quotas code
       using fstest generic/452.  After a read-only remount, quotas are
       suspended and the ocfs2_mem_dqinfo is freed through
       ->ocfs2_local_free_info().  When the filesystem is unmounted, a UAF
       access to the oinfo eventually causes a crash.
       
      BUG: KASAN: slab-use-after-free in timer_delete+0x54/0xc0
      Read of size 8 at addr ffff8880389a8208 by task umount/669
      ...
      Call Trace:
       <TASK>
       ...
       timer_delete+0x54/0xc0
       try_to_grab_pending+0x31/0x230
       __cancel_work_timer+0x6c/0x270
       ocfs2_disable_quotas.isra.0+0x3e/0xf0 [ocfs2]
       ocfs2_dismount_volume+0xdd/0x450 [ocfs2]
       generic_shutdown_super+0xaa/0x280
       kill_block_super+0x46/0x70
       deactivate_locked_super+0x4d/0xb0
       cleanup_mnt+0x135/0x1f0
       ...
       </TASK>
      
      Allocated by task 632:
       kasan_save_stack+0x1c/0x40
       kasan_set_track+0x21/0x30
       __kasan_kmalloc+0x8b/0x90
       ocfs2_local_read_info+0xe3/0x9a0 [ocfs2]
       dquot_load_quota_sb+0x34b/0x680
       dquot_load_quota_inode+0xfe/0x1a0
       ocfs2_enable_quotas+0x190/0x2f0 [ocfs2]
       ocfs2_fill_super+0x14ef/0x2120 [ocfs2]
       mount_bdev+0x1be/0x200
       legacy_get_tree+0x6c/0xb0
       vfs_get_tree+0x3e/0x110
       path_mount+0xa90/0xe10
       __x64_sys_mount+0x16f/0x1a0
       do_syscall_64+0x43/0x90
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      Freed by task 650:
       kasan_save_stack+0x1c/0x40
       kasan_set_track+0x21/0x30
       kasan_save_free_info+0x2a/0x50
       __kasan_slab_free+0xf9/0x150
       __kmem_cache_free+0x89/0x180
       ocfs2_local_free_info+0x2ba/0x3f0 [ocfs2]
       dquot_disable+0x35f/0xa70
       ocfs2_susp_quotas.isra.0+0x159/0x1a0 [ocfs2]
       ocfs2_remount+0x150/0x580 [ocfs2]
       reconfigure_super+0x1a5/0x3a0
       path_mount+0xc8a/0xe10
       __x64_sys_mount+0x16f/0x1a0
       do_syscall_64+0x43/0x90
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      Link: https://lkml.kernel.org/r/20230522102112.9031-1-lhenriques@suse.de
       Signed-off-by: Luís Henriques <lhenriques@suse.de>
       Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
       Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Changwei Ge <gechangwei@live.cn>
      Cc: Gang He <ghe@suse.com>
      Cc: Jun Piao <piaojun@huawei.com>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      50d92788
    • Lorenzo Stoakes's avatar
      lib/test_vmalloc.c: avoid garbage in page array · 9f6c6ad1
      Lorenzo Stoakes authored
      It turns out that alloc_pages_bulk_array() does not treat the page_array
      parameter as an output parameter, but rather reads the array and skips any
      entries that have already been allocated.
      
      This is somewhat unexpected and breaks this test, as we allocate the pages
      array uninitialised on the assumption it will be overwritten.
      
      As a result, the test was referencing uninitialised data and causing the
      PFN to not be valid and thus a WARN_ON() followed by a null pointer deref
      and panic.
      
       In addition, this is an array of pointers, not of struct page objects,
       so we need only allocate an array with pointer-sized elements.
      
      We solve both problems by simply using kcalloc() and referencing
      sizeof(struct page *) rather than sizeof(struct page).
      
      Link: https://lkml.kernel.org/r/20230524082424.10022-1-lstoakes@gmail.com
       Fixes: 869cb29a ("lib/test_vmalloc.c: add vm_map_ram()/vm_unmap_ram() test case")
       Signed-off-by: Lorenzo Stoakes <lstoakes@gmail.com>
       Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
       Reviewed-by: Baoquan He <bhe@redhat.com>
      Cc: Christoph Hellwig <hch@infradead.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      9f6c6ad1
    • Ryusuke Konishi's avatar
      nilfs2: fix possible out-of-bounds segment allocation in resize ioctl · fee5eaec
      Ryusuke Konishi authored
      Syzbot reports that in its stress test for resize ioctl, the log writing
      function nilfs_segctor_do_construct hits a WARN_ON in
      nilfs_segctor_truncate_segments().
      
      It turned out that there is a problem with the current implementation of
      the resize ioctl, which changes the writable range on the device (the
      range of allocatable segments) at the end of the resize process.
      
      This order is necessary for file system expansion to avoid corrupting the
       superblock at the trailing edge.  However, in the case of a file system
      shrink, if log writes occur after truncating out-of-bounds trailing
      segments and before the resize is complete, segments may be allocated from
      the truncated space.
      
      The userspace resize tool was fine as it limits the range of allocatable
      segments before performing the resize, but it can run into this issue if
      the resize ioctl is called alone.
      
      Fix this issue by changing nilfs_sufile_resize() to update the range of
      allocatable segments immediately after successful truncation of segment
      space in case of file system shrink.
      
      Link: https://lkml.kernel.org/r/20230524094348.3784-1-konishi.ryusuke@gmail.com
       Fixes: 4e33f9ea ("nilfs2: implement resize ioctl")
       Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
       Reported-by: <syzbot+33494cd0df2ec2931851@syzkaller.appspotmail.com>
      Closes: https://lkml.kernel.org/r/0000000000005434c405fbbafdc5@google.com
       Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      fee5eaec
    • Ricardo Ribalda's avatar
      riscv/purgatory: remove PGO flags · 88ac3bbc
      Ricardo Ribalda authored
      If profile-guided optimization is enabled, the purgatory ends up with
      multiple .text sections.  This is not supported by kexec and crashes the
      system.
      
      Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-4-b05c520b7296@chromium.org
       Fixes: 93045705 ("kernel/kexec_file.c: split up __kexec_load_puragory")
       Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
       Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
      Cc: <stable@vger.kernel.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Philipp Rudo <prudo@redhat.com>
      Cc: Ross Zwisler <zwisler@google.com>
      Cc: Simon Horman <horms@kernel.org>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Rix <trix@redhat.com>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      88ac3bbc
    • Ricardo Ribalda's avatar
      powerpc/purgatory: remove PGO flags · 20188bac
      Ricardo Ribalda authored
      If profile-guided optimization is enabled, the purgatory ends up with
      multiple .text sections.  This is not supported by kexec and crashes the
      system.
      
      Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-3-b05c520b7296@chromium.org
       Fixes: 93045705 ("kernel/kexec_file.c: split up __kexec_load_puragory")
       Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: <stable@vger.kernel.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Palmer Dabbelt <palmer@rivosinc.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Philipp Rudo <prudo@redhat.com>
      Cc: Ross Zwisler <zwisler@google.com>
      Cc: Simon Horman <horms@kernel.org>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Rix <trix@redhat.com>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      20188bac
    • Ricardo Ribalda's avatar
      x86/purgatory: remove PGO flags · 97b6b9cb
      Ricardo Ribalda authored
      If profile-guided optimization is enabled, the purgatory ends up with
      multiple .text sections.  This is not supported by kexec and crashes the
      system.
      
      Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-2-b05c520b7296@chromium.org
       Fixes: 93045705 ("kernel/kexec_file.c: split up __kexec_load_puragory")
       Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
      Cc: <stable@vger.kernel.org>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Palmer Dabbelt <palmer@rivosinc.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Philipp Rudo <prudo@redhat.com>
      Cc: Ross Zwisler <zwisler@google.com>
      Cc: Simon Horman <horms@kernel.org>
      Cc: Steven Rostedt (Google) <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Rix <trix@redhat.com>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      97b6b9cb
    • Ricardo Ribalda's avatar
      kexec: support purgatories with .text.hot sections · 8652d44f
      Ricardo Ribalda authored
      Patch series "kexec: Fix kexec_file_load for llvm16 with PGO", v7.
      
      When upreving llvm I realised that kexec stopped working on my test
      platform.
      
       The reason seems to be that, due to PGO, there are multiple .text
       sections in the purgatory, and kexec does not support that.
      
      
      This patch (of 4):
      
      Clang16 links the purgatory text in two sections when PGO is in use:
      
        [ 1] .text             PROGBITS         0000000000000000  00000040
             00000000000011a1  0000000000000000  AX       0     0     16
        [ 2] .rela.text        RELA             0000000000000000  00003498
             0000000000000648  0000000000000018   I      24     1     8
        ...
        [17] .text.hot.        PROGBITS         0000000000000000  00003220
             000000000000020b  0000000000000000  AX       0     0     1
        [18] .rela.text.hot.   RELA             0000000000000000  00004428
             0000000000000078  0000000000000018   I      24    17     8
      
       Both of them have their range [sh_addr ... sh_addr+sh_size] in the
       area pointed to by `e_entry`.
       
       This causes image->start to be calculated twice, once for .text and
       again for .text.hot.  The second calculation leaves image->start at a
       random location.
      
      Because of this, the system crashes immediately after:
      
      kexec_core: Starting new kernel
      
      Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-0-b05c520b7296@chromium.org
      Link: https://lkml.kernel.org/r/20230321-kexec_clang16-v7-1-b05c520b7296@chromium.org
       Fixes: 93045705 ("kernel/kexec_file.c: split up __kexec_load_puragory")
       Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
       Reviewed-by: Ross Zwisler <zwisler@google.com>
       Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
       Reviewed-by: Philipp Rudo <prudo@redhat.com>
      Cc: Albert Ou <aou@eecs.berkeley.edu>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Borislav Petkov (AMD) <bp@alien8.de>
      Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nathan Chancellor <nathan@kernel.org>
      Cc: Nicholas Piggin <npiggin@gmail.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Palmer Dabbelt <palmer@rivosinc.com>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Simon Horman <horms@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Rix <trix@redhat.com>
      Cc: <stable@vger.kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      8652d44f
    • Peter Xu's avatar
      mm/uffd: allow vma to merge as much as possible · 5543d3c4
      Peter Xu authored
       We used to not pass in the pgoff correctly when registering and
       unregistering uffd regions.  It caused incorrect behavior on vma
       merging, and mergeable vmas could be left separate after the ioctls
       return.
      
      For example, when we have:
      
        vma1(range 0-9, with uffd), vma2(range 10-19, no uffd)
      
      Then someone unregisters uffd on range (5-9), it should logically become:
      
        vma1(range 0-4, with uffd), vma2(range 5-19, no uffd)
      
      But with current code we'll have:
      
        vma1(range 0-4, with uffd), vma3(range 5-9, no uffd), vma2(range 10-19, no uffd)
      
      This patch allows such merge to happen correctly before ioctl returns.
      
       This behavior seems to have existed since the first day of uffd.
       Since the pgoff for vma_merge() is only used to identify the
       possibility of vma merging, and what we did here was always pass in a
       pgoff smaller than what we should, there should be no side effect
       other than the missed merge.  Let's still tentatively copy stable for
       this, even though I don't see anything else going wrong besides the
       vma being split (which is mostly not user visible).
      
      Link: https://lkml.kernel.org/r/20230517190916.3429499-3-peterx@redhat.com
       Fixes: 86039bd3 ("userfaultfd: add new syscall to provide memory externalization")
       Signed-off-by: Peter Xu <peterx@redhat.com>
       Reported-by: Lorenzo Stoakes <lstoakes@gmail.com>
       Acked-by: Lorenzo Stoakes <lstoakes@gmail.com>
       Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
       Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      5543d3c4
    • Peter Xu's avatar
      mm/uffd: fix vma operation where start addr cuts part of vma · 270aa010
      Peter Xu authored
      Patch series "mm/uffd: Fix vma merge/split", v2.
      
      This series contains two patches that fix vma merge/split for userfaultfd
      on two separate issues.
      
       Patch 1 fixes a regression since 6.1+ due to something we overlooked
       when converting to the maple tree APIs.  The plan is to use patch 1 to
       replace the commit "2f628010799e (mm: userfaultfd: avoid passing an
       invalid range to vma_merge())" in the mm-hotfixes-unstable tree if
       possible, so as to bring the uffd vma operations back in line with the
       rest of the code.
       
       Patch 2 fixes a long-standing issue where a vma can be left unmerged,
       even though it could be merged, after either uffd register or
       unregister.
      
       Many thanks to Lorenzo for noticing this issue in the assert movement
       patch, looking into the problem, and providing a reproducer for the
       unmerged vma issue [1].
      
      [1] https://gist.github.com/lorenzo-stoakes/a11a10f5f479e7a977fc456331266e0e
      
      
      This patch (of 2):
      
       It seems vma merging on the uffd paths is broken for both register
       and unregister: right now we can feed wrong parameters to vma_merge(),
       which was caught by a recent patch from Lorenzo Stoakes that moved the
       asserts upwards in vma_merge():
      
      https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
      
      It's possible that "start" is contained within vma but not clamped to its
      start.  We need to convert this into either "cannot merge" case or "can
      merge" case 4 which permits subdivision of prev by assigning vma to prev. 
      As we loop, each subsequent VMA will be clamped to the start.
      
      This patch will eliminate the report and make sure vma_merge() calls will
      become legal again.
      
       One thing to mention is that the "Fixes: 29417d29" below is there
       only to help explain where the warning can start to trigger; the real
       commit to fix is 69dbe6da.  Commit 29417d29 helps us identify the
       issue, but we may want to keep it in Fixes too, to make tracking
       easier for kernel backporters.
      
      Link: https://lkml.kernel.org/r/20230517190916.3429499-1-peterx@redhat.com
      Link: https://lkml.kernel.org/r/20230517190916.3429499-2-peterx@redhat.com
       Fixes: 69dbe6da ("userfaultfd: use maple tree iterator to iterate VMAs")
       Signed-off-by: Peter Xu <peterx@redhat.com>
       Reported-by: Mark Rutland <mark.rutland@arm.com>
       Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
       Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Closes: https://lore.kernel.org/all/ZFunF7DmMdK05MoF@FVFF77S0Q05N.cambridge.arm.com/
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: Mike Rapoport (IBM) <rppt@kernel.org>
      Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      270aa010
    • Arnd Bergmann's avatar
      radix-tree: move declarations to header · bde1597d
      Arnd Bergmann authored
      
      
      The xarray.c file contains the only call to radix_tree_node_rcu_free(),
      and carries its own local extern declaration for it.  As a result, the
      function definition causes a missing-prototype warning:
      
      lib/radix-tree.c:288:6: error: no previous prototype for 'radix_tree_node_rcu_free' [-Werror=missing-prototypes]
      
      Instead, move the declaration of this function into a new header that
      both files can include, and do the same for the radix_tree_node_cachep
      variable, which has the same underlying problem but does not trigger a
      warning with gcc.
      
      [zhangpeng.00@bytedance.com: fix building radix tree test suite]
        Link: https://lkml.kernel.org/r/20230521095450.21332-1-zhangpeng.00@bytedance.com
      Link: https://lkml.kernel.org/r/20230516194212.548910-1-arnd@kernel.org
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarPeng Zhang <zhangpeng.00@bytedance.com>
      Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      bde1597d
    • Ryusuke Konishi's avatar
      nilfs2: fix incomplete buffer cleanup in nilfs_btnode_abort_change_key() · 2f012f2b
      Ryusuke Konishi authored
      
      
      A syzbot fault injection test reported that nilfs_btnode_create_block, a
      helper function that allocates a new node block for b-trees, causes a
      kernel BUG for disk images where the file system block size is smaller
      than the page size.
      
      This was due to unexpected flags on the newly allocated buffer head: it
      turned out that the buffer flags were not cleared by
      nilfs_btnode_abort_change_key() after an error occurred during a b-tree
      update operation, and that the buffer was later reused in that state.
      
      Fix this issue by using nilfs_btnode_delete() to abandon the unused
      preallocated buffer in nilfs_btnode_abort_change_key().
      
      Link: https://lkml.kernel.org/r/20230513102428.10223-1-konishi.ryusuke@gmail.com
      Signed-off-by: default avatarRyusuke Konishi <konishi.ryusuke@gmail.com>
      Reported-by: default avatar <syzbot+b0a35a5c1f7e846d3b09@syzkaller.appspotmail.com>
      Closes: https://lkml.kernel.org/r/000000000000d1d6c205ebc4d512@google.com
      Tested-by: default avatarRyusuke Konishi <konishi.ryusuke@gmail.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      2f012f2b
  4. May 29, 2023
    • Linus Torvalds's avatar
      Merge tag 'trace-v6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace · 8b817fde
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "User events:
      
         - Use long instead of int for storing the enable set/clear bit, as it
           was found that big endian machines could end up using the wrong
           bits.
      
         - Split allocating mm and attaching it. This keeps the allocation
           separate from the registration and avoids various races.
      
         - Remove RCU locking around pin_user_pages_remote() as that can
           schedule. The RCU protection is no longer needed with the above
           split of mm allocation and attaching.
      
         - Rename the "link" fields of the various structs to something more
           meaningful.
      
         - Add comments around user_event_mm struct usage and locking
           requirements.
      
        Timerlat tracer:
      
         - Fix a missed wakeup of the timerlat thread caused by the timerlat
           interrupt triggering while tracing is off. The timer interrupt
           handler needs to always wake up the timerlat thread regardless of
           whether tracing is enabled; otherwise it will never wake up.
      
        Histograms:
      
         - Fix a regression that broke the "stacktrace" modifier for
           variables. That modifier cannot be used for values, but can be
           used for variables that are passed from one histogram to the
           next. It broke when the restriction was added for values, because
           the variable logic used the same code.
      
         - Rename the special field "stacktrace" to "common_stacktrace".
      
           Special fields (that are not actually part of the event, but can
           act just like event fields, like 'comm' and 'timestamp') should be
           prefixed with 'common_' for consistency. To keep backward
           compatibility, 'stacktrace' can still be used (as with the special
           field 'cpu'), but can be overridden if the event has a field called
           'stacktrace'.
      
         - Update the synthetic event selftests to use the new name (synthetic
           events are created by histograms)
      
        Tracing bootup selftests:
      
         - Reorganize the code to keep artifacts of the selftests not compiled
           in when selftests are not configured.
      
         - Add various cond_resched() around the selftest code, as the
           softlock watchdog was triggering much more often. It appears that
           the kernel runs slower now with full debugging enabled.
      
         - While debugging ftrace with ftrace (using an instance ring buffer
           instead of the top level one), I found that the selftests were
           disabling prints to the debug instance.
      
           This should not happen, as the selftests only disable printing to
           the main buffer as the selftests examine the main buffer to see if
           it has what it expects, and prints can make the tests fail.
      
           Make the selftests only disable printing to the toplevel buffer,
           and leave the instance buffers alone"
      
      * tag 'trace-v6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
        tracing: Have function_graph selftest call cond_resched()
        tracing: Only make selftest conditionals affect the global_trace
        tracing: Make tracing_selftest_running/delete nops when not used
        tracing: Have tracer selftests call cond_resched() before running
        tracing: Move setting of tracing_selftest_running out of register_tracer()
        tracing/selftests: Update synthetic event selftest to use common_stacktrace
        tracing: Rename stacktrace field to common_stacktrace
        tracing/histograms: Allow variables to have some modifiers
        tracing/user_events: Document user_event_mm one-shot list usage
        tracing/user_events: Rename link fields for clarity
        tracing/user_events: Remove RCU lock while pinning pages
        tracing/user_events: Split up mm alloc and attach
        tracing/timerlat: Always wakeup the timerlat thread
        tracing/user_events: Use long vs int for atomic bit ops
      8b817fde